Minimum variance geographic sampling
NASA Technical Reports Server (NTRS)
Terrell, G. R. (Principal Investigator)
1980-01-01
Resource inventories require samples with geographical scatter, sometimes not as widely spaced as would be hoped. A simple model of correlation over distances is used to create a minimum variance unbiased estimator of population means. The fitting procedure is illustrated with data used to estimate Missouri corn acreage.
R. L. Czaplewski
2009-01-01
The minimum variance multivariate composite estimator is a relatively simple sequential estimator for complex sampling designs (Czaplewski 2009). Such designs combine a probability sample of expensive field data with multiple censuses and/or samples of relatively inexpensive multi-sensor, multi-resolution remotely sensed data. Unfortunately, the multivariate composite...
Some refinements on the comparison of areal sampling methods via simulation
Jeffrey Gove
2017-01-01
The design of forest inventories and development of new sampling methods useful in such inventories normally have a two-fold target of design unbiasedness and minimum variance in mind. Many considerations such as costs go into the choices of sampling method for operational and other levels of inventory. However, the variance in terms of meeting a specified level of...
DOE Office of Scientific and Technical Information (OSTI.GOV)
Beer, M.
1980-12-01
The maximum likelihood method for the multivariate normal distribution is applied to the case of several individual eigenvalues. Correlated Monte Carlo estimates of the eigenvalue are assumed to follow this prescription and aspects of the assumption are examined. Monte Carlo cell calculations using the SAM-CE and VIM codes for the TRX-1 and TRX-2 benchmark reactors, and SAM-CE full core results are analyzed with this method. Variance reductions of a few percent to a factor of 2 are obtained from maximum likelihood estimation as compared with the simple average and the minimum variance individual eigenvalue. The numerical results verify that the use of sample variances and correlation coefficients in place of the corresponding population statistics still leads to nearly minimum variance estimation for a sufficient number of histories and aggregates.
Winslow, Stephen D; Pepich, Barry V; Martin, John J; Hallberg, George R; Munch, David J; Frebis, Christopher P; Hedrick, Elizabeth J; Krop, Richard A
2006-01-01
The United States Environmental Protection Agency's Office of Ground Water and Drinking Water has developed a single-laboratory quantitation procedure: the lowest concentration minimum reporting level (LCMRL). The LCMRL is the lowest true concentration for which future recovery is predicted to fall, with high confidence (99%), between 50% and 150%. The procedure takes into account precision and accuracy. Multiple concentration replicates are processed through the entire analytical method and the data are plotted as measured sample concentration (y-axis) versus true concentration (x-axis). If the data support an assumption of constant variance over the concentration range, an ordinary least-squares regression line is drawn; otherwise, a variance-weighted least-squares regression is used. Prediction interval lines of 99% confidence are drawn about the regression. At the points where the prediction interval lines intersect with data quality objective lines of 50% and 150% recovery, lines are dropped to the x-axis. The higher of the two values is the LCMRL. The LCMRL procedure is flexible because the data quality objectives (50-150%) and the prediction interval confidence (99%) can be varied to suit program needs. The LCMRL determination is performed during method development only. A simpler procedure for verification of data quality objectives at a given minimum reporting level (MRL) is also presented. The verification procedure requires a single set of seven samples taken through the entire method procedure. If the calculated prediction interval is contained within data quality recovery limits (50-150%), the laboratory performance at the MRL is verified.
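To make the LCMRL geometry concrete, the sketch below fits an ordinary least-squares line to measured vs. true concentration, draws a 99% prediction band, and takes the larger of the two concentrations where the band meets the 50% and 150% recovery lines. It is a minimal illustration of the constant-variance case only; the function name, the grid search, and the band formula are illustrative choices, not the EPA procedure itself.

```python
import numpy as np
from scipy import stats

def lcmrl_sketch(true_conc, measured_conc, conf=0.99, lo=0.50, hi=1.50):
    """Rough sketch of the LCMRL geometry: OLS fit of measured vs. true
    concentration, a 99% prediction band, and its intersections with the
    50%/150% recovery lines. Assumes constant variance (the rule would
    switch to a variance-weighted fit otherwise)."""
    x, y = np.asarray(true_conc, float), np.asarray(measured_conc, float)
    n = x.size
    b1, b0 = np.polyfit(x, y, 1)                     # slope, intercept
    resid = y - (b0 + b1 * x)
    s = np.sqrt(np.sum(resid**2) / (n - 2))          # residual std. error
    t = stats.t.ppf(1 - (1 - conf) / 2, n - 2)
    xbar, sxx = x.mean(), np.sum((x - x.mean())**2)

    def pred_band(xg):
        half = t * s * np.sqrt(1 + 1/n + (xg - xbar)**2 / sxx)
        yhat = b0 + b1 * xg
        return yhat - half, yhat + half

    # On a fine grid, find where the prediction band crosses the 50% and
    # 150% recovery lines; the LCMRL is the larger of the two crossings.
    grid = np.linspace(x.min(), x.max(), 2000)
    lower, upper = pred_band(grid)
    x_lo = grid[np.argmax(lower >= lo * grid)]       # lower band meets 50% line
    x_hi = grid[np.argmax(upper <= hi * grid)]       # upper band meets 150% line
    return max(x_lo, x_hi)
```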
Diallel analysis for sex-linked and maternal effects.
Zhu, J; Weir, B S
1996-01-01
Genetic models including sex-linked and maternal effects as well as autosomal gene effects are described. Monte Carlo simulations were conducted to compare efficiencies of estimation by minimum norm quadratic unbiased estimation (MINQUE) and restricted maximum likelihood (REML) methods. MINQUE(1), which has 1 for all prior values, has a similar efficiency to MINQUE(θ), which requires prior estimates of parameter values. MINQUE(1) has the advantage over REML of unbiased estimation and convenient computation. An adjusted unbiased prediction (AUP) method is developed for predicting random genetic effects. AUP is desirable for its easy computation and unbiasedness of both mean and variance of predictors. The jackknife procedure is appropriate for estimating the sampling variances of estimated variances (or covariances) and of predicted genetic effects. A t-test based on jackknife variances is applicable for detecting significance of variation. Worked examples from mice and silkworm data are given in order to demonstrate variance and covariance estimation and genetic effect prediction.
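The abstract leans on the jackknife for the sampling variances of estimated variance components and predicted effects. Below is a minimal, generic delete-one jackknife variance sketch; the MINQUE(1)/AUP machinery itself is not reproduced, and `estimator` stands in for whatever statistic is being resampled.

```python
import numpy as np

def jackknife_variance(data, estimator):
    """Delete-one jackknife estimate of the sampling variance of a statistic.
    `estimator` maps a 1-D array to a scalar (e.g. a variance-component
    estimate); the genetic-model details are not reproduced here."""
    data = np.asarray(data, float)
    n = data.size
    leave_one_out = np.array([estimator(np.delete(data, i)) for i in range(n)])
    theta_bar = leave_one_out.mean()
    return (n - 1) / n * np.sum((leave_one_out - theta_bar) ** 2)

# Example: jackknife variance of the sample variance of simulated data
rng = np.random.default_rng(1)
x = rng.normal(size=50)
print(jackknife_variance(x, np.var))
```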
Superresolution SAR Imaging Algorithm Based on Mvm and Weighted Norm Extrapolation
NASA Astrophysics Data System (ADS)
Zhang, P.; Chen, Q.; Li, Z.; Tang, Z.; Liu, J.; Zhao, L.
2013-08-01
In this paper, we present an extrapolation approach that uses a minimum weighted norm constraint and minimum variance spectrum estimation to improve synthetic aperture radar (SAR) resolution. The minimum variance method is a robust high-resolution spectrum estimator. Based on the theory of SAR imaging, the signal model of SAR imagery is shown to be amenable to data extrapolation methods for improving the resolution of the SAR image. The method is used to extrapolate the effective bandwidth in the phase history domain, and better results are obtained compared with the adaptive weighted norm extrapolation (AWNE) method and the traditional imaging method, using both simulated and actual measured data.
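For reference, the minimum variance (Capon) spectral estimator used as a building block here evaluates P_MV(f) = 1 / (a^H R^-1 a) with a steering vector a and a sample covariance R. The sketch below is a generic 1-D version with diagonal loading, not the paper's SAR extrapolation pipeline; `n_freq` and `load` are illustrative parameters.

```python
import numpy as np

def capon_spectrum(snapshots, n_freq=256, load=1e-3):
    """Minimum variance (Capon) spectral estimate.
    `snapshots`: complex array of shape (n_snapshots, m) of m-sample data
    vectors used to form the sample covariance. Diagonal loading keeps the
    inversion stable."""
    X = np.asarray(snapshots)
    m = X.shape[1]
    R = X.conj().T @ X / X.shape[0]
    R = R + load * np.trace(R).real / m * np.eye(m)
    Rinv = np.linalg.inv(R)
    freqs = np.linspace(-0.5, 0.5, n_freq, endpoint=False)
    spec = np.empty(n_freq)
    for k, f in enumerate(freqs):
        a = np.exp(2j * np.pi * f * np.arange(m))      # steering vector
        spec[k] = 1.0 / np.real(a.conj() @ Rinv @ a)   # P_MV(f) = 1 / a^H R^-1 a
    return freqs, spec
```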
Approximate sample size formulas for the two-sample trimmed mean test with unequal variances.
Luh, Wei-Ming; Guo, Jiin-Huarng
2007-05-01
Yuen's two-sample trimmed mean test statistic is one of the most robust methods to apply when variances are heterogeneous. The present study develops formulas for the sample size required for the test. The formulas are applicable for the cases of unequal variances, non-normality and unequal sample sizes. Given the specified alpha and the power (1-beta), the minimum sample size needed by the proposed formulas under various conditions is less than is given by the conventional formulas. Moreover, given a specified size of sample calculated by the proposed formulas, simulation results show that Yuen's test can achieve statistical power which is generally superior to that of the approximate t test. A numerical example is provided.
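For context, Yuen's statistic that the sample-size formulas target compares trimmed means using winsorized variances. The sketch below implements the standard form of the test (20% trimming by default); it is not the authors' sample-size formulas.

```python
import numpy as np
from scipy import stats

def yuen_test(x, y, trim=0.2):
    """Yuen's two-sample test on trimmed means with winsorized variances.
    Returns (t statistic, approximate df, two-sided p-value)."""
    def trimmed_parts(a):
        a = np.sort(np.asarray(a, float))
        n = a.size
        g = int(np.floor(trim * n))            # number trimmed from each tail
        h = n - 2 * g                          # effective sample size
        tmean = a[g:n - g].mean()
        w = a.copy()
        w[:g], w[n - g:] = a[g], a[n - g - 1]  # winsorize the tails
        swin2 = w.var(ddof=1)                  # winsorized variance
        d = (n - 1) * swin2 / (h * (h - 1))
        return tmean, d, h

    t1, d1, h1 = trimmed_parts(x)
    t2, d2, h2 = trimmed_parts(y)
    t_stat = (t1 - t2) / np.sqrt(d1 + d2)
    df = (d1 + d2) ** 2 / (d1 ** 2 / (h1 - 1) + d2 ** 2 / (h2 - 1))
    p = 2 * stats.t.sf(abs(t_stat), df)
    return t_stat, df, p
```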
Portfolio optimization with mean-variance model
NASA Astrophysics Data System (ADS)
Hoe, Lam Weng; Siew, Lam Weng
2016-06-01
Investors wish to achieve the target rate of return at the minimum level of risk in their investment. Portfolio optimization is an investment strategy that can be used to minimize the portfolio risk while achieving the target rate of return. The mean-variance model has been proposed for portfolio optimization; it is an optimization model that aims to minimize the portfolio risk, measured by the portfolio variance. The objective of this study is to construct the optimal portfolio using the mean-variance model. The data of this study consist of weekly returns of 20 component stocks of the FTSE Bursa Malaysia Kuala Lumpur Composite Index (FBMKLCI). The results of this study show that the composition of the optimal portfolio differs across the component stocks. Moreover, investors can obtain the target return at the minimum level of risk with the constructed optimal mean-variance portfolio.
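A minimal sketch of the mean-variance step described here: minimize the portfolio variance w'Sw subject to full investment and a target mean return, using sample moments of a returns matrix. The long-only bound and the simulated data are assumptions for illustration, not details taken from the study.

```python
import numpy as np
from scipy.optimize import minimize

def mean_variance_weights(returns, target_return):
    """Minimize portfolio variance w' S w subject to a target mean return and
    fully invested weights. `returns` is a (T, n) matrix of (e.g. weekly)
    asset returns; the long-only bound is an illustrative assumption."""
    mu = returns.mean(axis=0)
    S = np.cov(returns, rowvar=False)
    n = mu.size
    cons = [{"type": "eq", "fun": lambda w: w.sum() - 1.0},
            {"type": "eq", "fun": lambda w: w @ mu - target_return}]
    bounds = [(0.0, 1.0)] * n
    res = minimize(lambda w: w @ S @ w, np.full(n, 1.0 / n),
                   bounds=bounds, constraints=cons)
    return res.x

# Usage with simulated weekly returns for 20 "stocks"
rng = np.random.default_rng(0)
R = rng.normal(0.002, 0.02, size=(260, 20))
w = mean_variance_weights(R, target_return=0.002)
print(w.round(3), float(w @ np.cov(R, rowvar=False) @ w))
```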
Estimation of transformation parameters for microarray data.
Durbin, Blythe; Rocke, David M
2003-07-22
Durbin et al. (2002), Huber et al. (2002) and Munson (2001) independently introduced a family of transformations (the generalized-log family) which stabilizes the variance of microarray data up to the first order. We introduce a method for estimating the transformation parameter in tandem with a linear model based on the procedure outlined in Box and Cox (1964). We also discuss means of finding transformations within the generalized-log family which are optimal under other criteria, such as minimum residual skewness and minimum mean-variance dependency. R and Matlab code and test data are available from the authors on request.
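For reference, the generalized-log (glog) family referred to here has the form glog(y) = ln(y + sqrt(y^2 + c)); the paper's contribution is estimating the transformation parameter (written c below) jointly with a linear model via a Box-Cox-style procedure, which this sketch does not reproduce.

```python
import numpy as np

def glog(y, c):
    """Generalized-log transformation glog(y) = ln(y + sqrt(y^2 + c)).
    The parameter c is what the paper estimates in tandem with a linear
    model; here it is simply supplied by the caller."""
    y = np.asarray(y, float)
    return np.log(y + np.sqrt(y ** 2 + c))

# For small y the transform behaves like a shifted log of c, which is what
# stabilizes the variance of low-intensity readings; for large y it
# approaches ln(2y).
print(glog(np.array([10.0, 100.0, 1000.0]), c=50.0))
```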
Petruzzellis, Francesco; Palandrani, Chiara; Savi, Tadeja; Alberti, Roberto; Nardini, Andrea; Bacaro, Giovanni
2017-12-01
The choice of the best sampling strategy to capture mean values of functional traits for a species/population, while maintaining information about traits' variability and minimizing the sampling size and effort, is an open issue in functional trait ecology. Intraspecific variability (ITV) of functional traits strongly influences sampling size and effort. However, while adequate information is available about intraspecific variability between individuals (ITV_BI) and among populations (ITV_POP), relatively few studies have analyzed intraspecific variability within individuals (ITV_WI). Here, we provide an analysis of ITV_WI of two foliar traits, namely specific leaf area (SLA) and osmotic potential (π), in a population of Quercus ilex L. We assessed the baseline ITV_WI level of variation between the two traits and provide the minimum and optimal sampling sizes needed to take ITV_WI into account, comparing sampling optimization outputs with those previously proposed in the literature. Different factors accounted for different amounts of variance in the two traits. SLA variance was mostly spread within individuals (43.4% of the total variance), while π variance was mainly spread between individuals (43.2%). Strategies that did not account for all the canopy strata produced mean values not representative of the sampled population. The minimum size to adequately capture the studied functional traits corresponded to 5 leaves taken randomly from 5 individuals, while the most accurate and feasible sampling size was 4 leaves taken randomly from 10 individuals. We demonstrate that the spatial structure of the canopy can significantly affect trait variability. Moreover, different strategies could be implemented for different traits during sampling surveys. We partially confirm sampling sizes previously proposed in the recent literature and encourage future analyses involving different traits.
A Robust Statistics Approach to Minimum Variance Portfolio Optimization
NASA Astrophysics Data System (ADS)
Yang, Liusha; Couillet, Romain; McKay, Matthew R.
2015-12-01
We study the design of portfolios under a minimum risk criterion. The performance of the optimized portfolio relies on the accuracy of the estimated covariance matrix of the portfolio asset returns. For large portfolios, the number of available market returns is often of similar order to the number of assets, so that the sample covariance matrix performs poorly as a covariance estimator. Additionally, financial market data often contain outliers which, if not correctly handled, may further corrupt the covariance estimation. We address these shortcomings by studying the performance of a hybrid covariance matrix estimator based on Tyler's robust M-estimator and on Ledoit-Wolf's shrinkage estimator while assuming samples with heavy-tailed distribution. Employing recent results from random matrix theory, we develop a consistent estimator of (a scaled version of) the realized portfolio risk, which is minimized by optimizing online the shrinkage intensity. Our portfolio optimization method is shown via simulations to outperform existing methods both for synthetic and real market data.
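As a simplified illustration of the shrinkage half of this construction, the sketch below plugs a plain Ledoit-Wolf covariance estimate into the closed-form global-minimum-variance weights w = S^-1 1 / (1' S^-1 1). The hybrid Tyler M-estimation step and the random-matrix-theory tuning of the shrinkage intensity studied in the paper are not reproduced.

```python
import numpy as np
from sklearn.covariance import LedoitWolf

def gmv_weights(returns):
    """Global-minimum-variance weights from a Ledoit-Wolf shrinkage
    covariance estimate of the asset returns (T x n matrix)."""
    S = LedoitWolf().fit(returns).covariance_
    ones = np.ones(S.shape[0])
    w = np.linalg.solve(S, ones)      # S^-1 1
    return w / w.sum()                # normalize to sum to one

rng = np.random.default_rng(2)
R = rng.standard_t(df=4, size=(250, 50)) * 0.01   # heavy-tailed toy returns
w = gmv_weights(R)
print(w[:5].round(4), w.sum())
```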
Patterns and Prevalence of Core Profile Types in the WPPSI Standardization Sample.
ERIC Educational Resources Information Center
Glutting, Joseph J.; McDermott, Paul A.
1990-01-01
Found most representative subtest profiles for 1,200 children comprising standardization sample of Wechsler Preschool and Primary Scale of Intelligence (WPPSI). Grouped scaled scores from WPPSI subtests according to similar level and shape using sequential minimum-variance cluster analysis with independent replications. Obtained final solution of…
Post-stratified estimation: with-in strata and total sample size recommendations
James A. Westfall; Paul L. Patterson; John W. Coulston
2011-01-01
Post-stratification is used to reduce the variance of estimates of the mean. Because the stratification is not fixed in advance, within-strata sample sizes can be quite small. The survey statistics literature provides some guidance on minimum within-strata sample sizes; however, the recommendations and justifications are inconsistent and apply broadly for many...
Bernard R. Parresol
1993-01-01
In the context of forest modeling, it is often reasonable to assume a multiplicative heteroscedastic error structure to the data. Under such circumstances ordinary least squares no longer provides minimum variance estimates of the model parameters. Through study of the error structure, a suitable error variance model can be specified and its parameters estimated. This...
MSEBAG: a dynamic classifier ensemble generation based on `minimum-sufficient ensemble' and bagging
NASA Astrophysics Data System (ADS)
Chen, Lei; Kamel, Mohamed S.
2016-01-01
In this paper, we propose a dynamic classifier system, MSEBAG, which is characterised by searching for the 'minimum-sufficient ensemble' and bagging at the ensemble level. It adopts an 'over-generation and selection' strategy and aims to achieve a good bias-variance trade-off. In the training phase, MSEBAG first searches for the 'minimum-sufficient ensemble', which maximises the in-sample fitness with the minimal number of base classifiers. Then, starting from the 'minimum-sufficient ensemble', a backward stepwise algorithm is employed to generate a collection of ensembles. The objective is to create a collection of ensembles with a descending fitness on the data, as well as a descending complexity in the structure. MSEBAG dynamically selects the ensembles from the collection for the decision aggregation. The extended adaptive aggregation (EAA) approach, a bagging-style algorithm performed at the ensemble level, is employed for this task. EAA searches for the competent ensembles using a score function, which takes into consideration both the in-sample fitness and the confidence of the statistical inference, and averages the decisions of the selected ensembles to label the test pattern. The experimental results show that the proposed MSEBAG outperforms the benchmarks on average.
Quantizing and sampling considerations in digital phased-locked loops
NASA Technical Reports Server (NTRS)
Hurst, G. T.; Gupta, S. C.
1974-01-01
The quantizer problem is first considered. The conditions under which the uniform white sequence model for the quantizer error is valid are established independent of the sampling rate. An equivalent spectral density is defined for the quantizer error resulting in an effective SNR value. This effective SNR may be used to determine quantized performance from infinitely fine quantized results. Attention is given to sampling rate considerations. Sampling rate characteristics of the digital phase-locked loop (DPLL) structure are investigated for the infinitely fine quantized system. The predicted phase error variance equation is examined as a function of the sampling rate. Simulation results are presented and a method is described which enables the minimum required sampling rate to be determined from the predicted phase error variance equations.
NASA Technical Reports Server (NTRS)
Meneghini, Robert; Kim, Hyokyung
2016-01-01
For an airborne or spaceborne radar, the precipitation-induced path attenuation can be estimated from the measurements of the normalized surface cross section, sigma 0, in the presence and absence of precipitation. In one implementation, the mean rain-free estimate and its variability are found from a lookup table (LUT) derived from previously measured data. For the Dual-frequency Precipitation Radar aboard the Global Precipitation Measurement satellite, the nominal table consists of the statistics of the rain-free sigma 0 over a 0.5 deg x 0.5 deg latitude-longitude grid using a three-month set of input data. However, a problem with the LUT is an insufficient number of samples in many cells. An alternative table is constructed by a stepwise procedure that begins with the statistics over a 0.25 deg x 0.25 deg grid. If the number of samples in a cell is too small, the area is expanded, cell by cell, choosing at each step the cell that minimizes the variance of the data. The question arises, however, as to whether the selected region corresponds to the smallest variance. To address this question, a second type of variable-averaging grid is constructed using all possible spatial configurations and computing the variance of the data within each region. Comparisons of the standard deviations for the fixed and variable-averaged grids are given as a function of incidence angle and surface type using a three-month set of data. The advantage of variable spatial averaging is that the average standard deviation can be reduced relative to the fixed grid while satisfying the minimum sample requirement.
Maximum Likelihood and Minimum Distance Applied to Univariate Mixture Distributions.
ERIC Educational Resources Information Center
Wang, Yuh-Yin Wu; Schafer, William D.
This Monte-Carlo study compared modified Newton (NW), expectation-maximization algorithm (EM), and minimum Cramer-von Mises distance (MD), used to estimate parameters of univariate mixtures of two components. Data sets were fixed at size 160 and manipulated by mean separation, variance ratio, component proportion, and non-normality. Results…
Zeng, Xing; Chen, Cheng; Wang, Yuanyuan
2012-12-01
In this paper, a new beamformer which combines the eigenspace-based minimum variance (ESBMV) beamformer with the Wiener postfilter is proposed for medical ultrasound imaging. The primary goal of this work is to further improve the medical ultrasound imaging quality on the basis of the ESBMV beamformer. In this method, we optimize the ESBMV weights with a Wiener postfilter. With the optimization of the Wiener postfilter, the output power of the new beamformer becomes closer to the actual signal power at the imaging point than the ESBMV beamformer. Different from the ordinary Wiener postfilter, the output signal and noise power needed in calculating the Wiener postfilter are estimated respectively by the orthogonal signal subspace and noise subspace constructed from the eigenstructure of the sample covariance matrix. We demonstrate the performance of the new beamformer when resolving point scatterers and a cyst phantom using both simulated data and experimental data and compare it with the delay-and-sum (DAS), the minimum variance (MV) and the ESBMV beamformer. We use the full width at half maximum (FWHM) and the peak-side-lobe level (PSL) to quantify the performance of imaging resolution and the contrast ratio (CR) to quantify the performance of imaging contrast. The FWHM of the new beamformer is only 15%, 50% and 50% of those of the DAS, MV and ESBMV beamformers, while the PSL is 127.2 dB, 115 dB and 60 dB lower. What is more, an improvement of 239.8%, 232.5% and 32.9% in CR using simulated data and an improvement of 814%, 1410.7% and 86.7% in CR using experimental data are achieved compared to the DAS, MV and ESBMV beamformers, respectively. In addition, the effect of the sound speed error is investigated by artificially overestimating the speed used in calculating the propagation delay, and the results show that the new beamformer provides better robustness against sound speed errors. Therefore, the proposed beamformer offers a better performance than the DAS, MV and ESBMV beamformers, showing its potential in medical ultrasound imaging. Copyright © 2012 Elsevier B.V. All rights reserved.
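For orientation, the sketch below shows the core minimum variance (Capon) apodization step that the ESBMV and Wiener-postfiltered variants build on: per-pixel weights w = R^-1 a / (a' R^-1 a) computed from spatially smoothed, diagonally loaded channel data. The subarray length and loading factor are illustrative assumptions, and the eigenspace projection and Wiener postfilter of the paper are not included.

```python
import numpy as np

def mv_beamform_sample(channel_data, n_sub, load=1e-2):
    """Minimum variance apodization for one imaging point.
    `channel_data`: pre-delayed (time-aligned) samples from the M elements.
    Spatial smoothing over subarrays of length `n_sub` plus diagonal loading
    stabilizes the covariance estimate; ESBMV would additionally project the
    weights onto the signal eigen-subspace of R."""
    x = np.asarray(channel_data, float)
    M = x.size
    K = M - n_sub + 1
    subs = np.stack([x[k:k + n_sub] for k in range(K)])   # subarray snapshots
    R = subs.T @ subs / K
    R = R + load * np.trace(R) / n_sub * np.eye(n_sub)
    a = np.ones(n_sub)                                    # steering after delays
    Rinv_a = np.linalg.solve(R, a)
    w = Rinv_a / (a @ Rinv_a)                             # w = R^-1 a / (a' R^-1 a)
    return w @ subs.mean(axis=0)                          # beamformed output sample
```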
Large amplitude MHD waves upstream of the Jovian bow shock
NASA Technical Reports Server (NTRS)
Goldstein, M. L.; Smith, C. W.; Matthaeus, W. H.
1983-01-01
Observations of large amplitude magnetohydrodynamics (MHD) waves upstream of Jupiter's bow shock are analyzed. The waves are found to be right circularly polarized in the solar wind frame which suggests that they are propagating in the fast magnetosonic mode. A complete spectral and minimum variance eigenvalue analysis of the data was performed. The power spectrum of the magnetic fluctuations contains several peaks. The fluctuations at 2.3 mHz have a direction of minimum variance along the direction of the average magnetic field. The direction of minimum variance of these fluctuations lies at approximately 40 deg. to the magnetic field and is parallel to the radial direction. We argue that these fluctuations are waves excited by protons reflected off the Jovian bow shock. The inferred speed of the reflected protons is about two times the solar wind speed in the plasma rest frame. A linear instability analysis is presented which suggests an explanation for many of the observed features of the observations.
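The minimum variance eigenvalue analysis mentioned here diagonalizes the covariance matrix of the measured field components; the eigenvector with the smallest eigenvalue gives the minimum variance direction. A minimal version:

```python
import numpy as np

def minimum_variance_direction(B):
    """Classical minimum variance analysis of magnetic field data.
    `B` is an (N, 3) array of field vectors; returns the unit eigenvector of
    the field covariance matrix with the smallest eigenvalue (the minimum
    variance direction) together with the three eigenvalues."""
    B = np.asarray(B, float)
    M = np.cov(B, rowvar=False)             # 3x3 variance/covariance matrix
    eigvals, eigvecs = np.linalg.eigh(M)    # eigenvalues in ascending order
    return eigvecs[:, 0], eigvals

# Toy anisotropic field: the recovered direction is the low-variance axis
rng = np.random.default_rng(0)
B = rng.normal(size=(500, 3)) * np.array([5.0, 2.0, 0.3])
n_min, lams = minimum_variance_direction(B)
print(n_min, lams)
```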
GIS-based niche modeling for mapping species' habitats
Rotenberry, J.T.; Preston, K.L.; Knick, S.
2006-01-01
Ecological "niche modeling" using presence-only locality data and large-scale environmental variables provides a powerful tool for identifying and mapping suitable habitat for species over large spatial extents. We describe a niche modeling approach that identifies a minimum (rather than an optimum) set of basic habitat requirements for a species, based on the assumption that constant environmental relationships in a species' distribution (i.e., variables that maintain a consistent value where the species occurs) are most likely to be associated with limiting factors. Environmental variables that take on a wide range of values where a species occurs are less informative because they do not limit a species' distribution, at least over the range of variation sampled. This approach is operationalized by partitioning Mahalanobis D² (standardized difference between values of a set of environmental variables for any point and mean values for those same variables calculated from all points at which a species was detected) into independent components. The smallest of these components represents the linear combination of variables with minimum variance; increasingly larger components represent larger variances and are increasingly less limiting. We illustrate this approach using the California Gnatcatcher (Polioptila californica Brewster) and provide SAS code to implement it.
NASA Astrophysics Data System (ADS)
Dilla, Shintia Ulfa; Andriyana, Yudhie; Sudartianto
2017-03-01
Acid rain causes many harmful effects. It is formed by two strong acids, sulfuric acid (H2SO4) and nitric acid (HNO3), where sulfuric acid is derived from SO2 and nitric acid from NOx (x = 1, 2). The purpose of the research is to determine the influence of the SO4 and NO3 levels contained in rain on the acidity (pH) of rainwater. The data are incomplete panel data with a two-way error component model. Panel data are a collection of observations on individuals made repeatedly over time; the panel is said to be incomplete if individuals have different numbers of observations. The model used in this research is a random effects model (REM). Minimum variance quadratic unbiased estimation (MIVQUE) is used to estimate the variance error components, while maximum likelihood estimation is used to estimate the parameters. As a result, we obtain the following model: Ŷ* = 0.41276446 - 0.00107302X1 + 0.00215470X2.
Overlap between treatment and control distributions as an effect size measure in experiments.
Hedges, Larry V; Olkin, Ingram
2016-03-01
The proportion π of treatment group observations that exceed the control group mean has been proposed as an effect size measure for experiments that randomly assign independent units into 2 groups. We give the exact distribution of a simple estimator of π based on the standardized mean difference and use it to study the small sample bias of this estimator. We also give the minimum variance unbiased estimator of π under 2 models, one in which the variance of the mean difference is known and one in which the variance is unknown. We show how to use the relation between the standardized mean difference and the overlap measure to compute confidence intervals for π and show that these results can be used to obtain unbiased estimators, large sample variances, and confidence intervals for 3 related effect size measures based on the overlap. Finally, we show how the effect size π can be used in a meta-analysis. (c) 2016 APA, all rights reserved.
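A simple plug-in version of the estimator discussed here is π̂ = Φ(d), with d the standardized mean difference; the sketch below implements only this, not the exact small-sample distribution or the minimum variance unbiased estimators derived in the paper.

```python
import numpy as np
from scipy.stats import norm

def overlap_pi_hat(treatment, control):
    """Plug-in estimate of pi, the proportion of treatment observations
    exceeding the control mean: pi_hat = Phi(d), where d is the standardized
    mean difference based on the pooled standard deviation."""
    t, c = np.asarray(treatment, float), np.asarray(control, float)
    n_t, n_c = t.size, c.size
    s_pooled = np.sqrt(((n_t - 1) * t.var(ddof=1) + (n_c - 1) * c.var(ddof=1))
                       / (n_t + n_c - 2))
    d = (t.mean() - c.mean()) / s_pooled
    return norm.cdf(d)

rng = np.random.default_rng(4)
print(overlap_pi_hat(rng.normal(0.5, 1, 40), rng.normal(0.0, 1, 40)))
```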
Cruz-Ramírez, Nicandro; Acosta-Mesa, Héctor Gabriel; Mezura-Montes, Efrén; Guerra-Hernández, Alejandro; Hoyos-Rivera, Guillermo de Jesús; Barrientos-Martínez, Rocío Erandi; Gutiérrez-Fragoso, Karina; Nava-Fernández, Luis Alonso; González-Gaspar, Patricia; Novoa-del-Toro, Elva María; Aguilera-Rueda, Vicente Josué; Ameca-Alducin, María Yaneli
2014-01-01
The bias-variance dilemma is a well-known and important problem in Machine Learning. It basically relates the generalization capability (goodness of fit) of a learning method to its corresponding complexity. When we have enough data at hand, it is possible to use these data in such a way as to minimize overfitting (the risk of selecting a complex model that generalizes poorly). Unfortunately, there are many situations where we simply do not have this required amount of data. Thus, we need to find methods capable of efficiently exploiting the available data while avoiding overfitting. Different metrics have been proposed to achieve this goal: the Minimum Description Length principle (MDL), Akaike's Information Criterion (AIC) and Bayesian Information Criterion (BIC), among others. In this paper, we focus on crude MDL and empirically evaluate its performance in selecting models with a good balance between goodness of fit and complexity: the so-called bias-variance dilemma, decomposition or tradeoff. Although the graphical interaction between these dimensions (bias and variance) is ubiquitous in the Machine Learning literature, few works present experimental evidence to recover such interaction. In our experiments, we argue that the resulting graphs allow us to gain insights that are difficult to unveil otherwise: that crude MDL naturally selects balanced models in terms of bias-variance, which need not necessarily be the gold-standard ones. We carry out these experiments using a specific model: a Bayesian network. In spite of these motivating results, we should also not overlook three other components that may significantly affect the final model selection: the search procedure, the noise rate and the sample size.
A comparison of coronal and interplanetary current sheet inclinations
NASA Technical Reports Server (NTRS)
Behannon, K. W.; Burlaga, L. F.; Hundhausen, A. J.
1983-01-01
The HAO white light K-coronameter observations show that the inclination of the heliospheric current sheet at the base of the corona can be either large (nearly vertical with respect to the solar equator) or small during Carrington rotations 1660-1666, and even on a single solar rotation. Voyager 1 and 2 magnetic field observations show crossings of the heliospheric current sheet at distances from the Sun of 1.4 and 2.8 AU. Two cases are considered, one in which the corresponding coronameter data indicate a nearly vertical (north-south) current sheet and another in which a nearly horizontal, near equatorial current sheet is indicated. For the crossings of the vertical current sheet, a variance analysis based on hour averages of the magnetic field data gave a minimum variance direction consistent with a steep inclination. The horizontal current sheet was observed by Voyager as a region of mixed polarity and low speeds lasting several days, consistent with multiple crossings of a horizontal but irregular and fluctuating current sheet at 1.4 AU. However, variance analysis of individual current sheet crossings in this interval using 1.92 s averages did not give minimum variance directions consistent with a horizontal current sheet.
Obtaining Reliable Predictions of Terrestrial Energy Coupling From Real-Time Solar Wind Measurement
NASA Technical Reports Server (NTRS)
Weimer, Daniel R.
2001-01-01
The first draft of a manuscript titled "Variable time delays in the propagation of the interplanetary magnetic field" has been completed, for submission to the Journal of Geophysical Research. In the preparation of this manuscript all data and analysis programs had been updated to the highest temporal resolution possible, at 16 seconds or better. The program which computes the "measured" IMF propagation time delays from these data has also undergone another improvement. In another significant development, a technique has been developed in order to predict IMF phase plane orientations, and the resulting time delays, using only measurements from a single satellite at L1. The "minimum variance" method is used for this computation. Further work will be done on optimizing the choice of several parameters for the minimum variance calculation.
Todd, Helena; Mirawdeli, Avin; Costelloe, Sarah; Cavenagh, Penny; Davis, Stephen; Howell, Peter
2014-12-01
Riley stated that the minimum speech sample length necessary to compute his stuttering severity estimates was 200 syllables. This was investigated. Procedures supplied for the assessment of readers and non-readers were examined to see whether they give equivalent scores. Recordings of spontaneous speech samples from 23 young children (aged between 2 years 8 months and 6 years 3 months) and 31 older children (aged between 10 years 0 months and 14 years 7 months) were made. Riley's severity estimates were scored on extracts of different lengths. The older children provided spontaneous and read samples, which were scored for severity according to reader and non-reader procedures. Analysis of variance supported the use of 200-syllable-long samples as the minimum necessary for obtaining severity scores. There was no significant difference in SSI-3 scores for the older children when the reader and non-reader procedures were used. Samples that are 200 syllables long are the minimum appropriate for obtaining stable Riley severity scores. The procedural variants provide similar severity scores.
van Breukelen, Gerard J P; Candel, Math J J M
2018-06-10
Cluster randomized trials evaluate the effect of a treatment on persons nested within clusters, where treatment is randomly assigned to clusters. Current equations for the optimal sample size at the cluster and person level assume that the outcome variances and/or the study costs are known and homogeneous between treatment arms. This paper presents efficient yet robust designs for cluster randomized trials with treatment-dependent costs and treatment-dependent unknown variances, and compares these with 2 practical designs. First, the maximin design (MMD) is derived, which maximizes the minimum efficiency (minimizes the maximum sampling variance) of the treatment effect estimator over a range of treatment-to-control variance ratios. The MMD is then compared with the optimal design for homogeneous variances and costs (balanced design), and with that for homogeneous variances and treatment-dependent costs (cost-considered design). The results show that the balanced design is the MMD if the treatment-to-control cost ratio is the same at both design levels (cluster, person) and within the range for the treatment-to-control variance ratio. It is still highly efficient and better than the cost-considered design if the cost ratio is within the range for the squared variance ratio. Outside that range, the cost-considered design is better and highly efficient, but it is not the MMD. An example shows sample size calculation for the MMD, and the computer code (SPSS and R) is provided as supplementary material. The MMD is recommended for trial planning if the study costs are treatment-dependent and homogeneity of variances cannot be assumed. © 2018 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd.
Yuan, Yuan-Yuan; Zhou, Yu-Bi; Sun, Jing; Deng, Juan; Bai, Ying; Wang, Jie; Lu, Xue-Feng
2017-06-01
The contents of elements in Nitraria roborowskii samples from fifteen different regions were determined by inductively coupled plasma-optical emission spectrometry (ICP-OES), and the elemental characteristics were analyzed by principal component analysis. The results indicated that 18 mineral elements were determined in N. roborowskii, of which V could not be detected. In addition, Na, K and Ca showed high concentrations. Ti showed the maximum content variance, while K showed the minimum. Four principal components were obtained from the original data. The cumulative variance contribution rate was 81.542%, and the variance contribution of the first principal component was 44.997%, indicating that Cr, Fe, P and Ca were the characteristic elements of N. roborowskii. Thus, the established method is simple and precise and can be used for the determination of mineral elements in N. roborowskii Kom. fruits. The elemental distribution characteristics among N. roborowskii fruits are related to geographical origin, which was clearly revealed by PCA. All the results will provide a good basis for the comprehensive utilization of N. roborowskii. Copyright© by the Chinese Pharmaceutical Association.
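A minimal sketch of the PCA step on a regions-by-elements content matrix (standardized first, since element contents differ by orders of magnitude). The data below are random placeholders, not the measured contents; the paper's four retained components explain about 81.5% of the variance.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Rows are sampling regions, columns are measured element contents
# (Na, K, Ca, Ti, Cr, Fe, P, ...). Placeholder data for illustration only.
rng = np.random.default_rng(0)
X = rng.lognormal(mean=2.0, sigma=0.5, size=(15, 18))   # 15 regions x 18 elements

Z = StandardScaler().fit_transform(X)    # put elements on a common scale
pca = PCA(n_components=4).fit(Z)
print(pca.explained_variance_ratio_, pca.explained_variance_ratio_.sum())
```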
Federal Register 2010, 2011, 2012, 2013, 2014
2010-07-14
... for drought-based temporary variance of the reservoir elevations and minimum flow releases at the Dead... temporary variance to the reservoir elevation and minimum flow requirements at the Hoist Development. The...: (1) Releasing a minimum flow of 75 cubic feet per second (cfs) from the Hoist Reservoir, instead of...
RFI in hybrid loops - Simulation and experimental results.
NASA Technical Reports Server (NTRS)
Ziemer, R. E.; Nelson, D. R.; Raghavan, H. R.
1972-01-01
A digital simulation of an imperfect second-order hybrid phase-locked loop (HPLL) operating in radio frequency interference (RFI) is described. Its performance is characterized in terms of phase error variance and phase error probability density function (PDF). Monte-Carlo simulation is used to show that the HPLL can be superior to the conventional phase-locked loops in RFI backgrounds when minimum phase error variance is the goodness criterion. Similar experimentally obtained data are given in support of the simulation data.
Solar-cycle dependence of a model turbulence spectrum using IMP and ACE observations over 38 years
NASA Astrophysics Data System (ADS)
Burger, R. A.; Nel, A. E.; Engelbrecht, N. E.
2014-12-01
Ab initio modulation models require a number of turbulence quantities as input for any reasonable diffusion tensor. While turbulence transport models describe the radial evolution of such quantities, they in turn require observations in the inner heliosphere as input values. So far we have concentrated on solar minimum conditions (e.g. Engelbrecht and Burger 2013, ApJ), but are now looking at long-term modulation, which requires turbulence data over at least a solar magnetic cycle. As a start we analyzed 1-minute resolution data for the N-component of the magnetic field, from 1974 to 2012, covering about two solar magnetic cycles (initially using IMP and then ACE data). We assume a very simple three-stage power-law frequency spectrum, calculate the integral from the highest to the lowest frequency, and fit it to variances calculated with lags from 5 minutes to 80 hours. From the fit we then obtain not only the asymptotic variance at large lags, but also the spectral indices of the inertial and the energy ranges, as well as the breakpoint between the inertial and energy range (bendover scale) and between the energy and cutoff range (cutoff scale). All values given here are preliminary. The cutoff range is a constraint imposed in order to ensure a finite energy density; the spectrum is forced to be either flat or to decrease with decreasing frequency in this range. Given that cosmic rays sample magnetic fluctuations over long periods in their transport through the heliosphere, we average the spectra over at least 27 days. We find that the variance of the N-component has a clear solar cycle dependence, with smaller values (~6 nT²) during solar minimum and larger values during solar maximum periods (~17 nT²), well correlated with the magnetic field magnitude (e.g. Smith et al. 2006, ApJ). Whereas the inertial range spectral index (-1.65 ± 0.06) does not show a significant solar cycle variation, the energy range index (-1.1 ± 0.3) seems to be anti-correlated with the variance (Bieber et al. 1993, JGR); both indices show close to normal distributions. In contrast, the variance (e.g. Burlaga and Ness, 1998, JGR), and both the bendover scale (see Ruiz et al. 2014, Solar Physics) and the cutoff scale appear to be log-normally distributed.
Mixed model approaches for diallel analysis based on a bio-model.
Zhu, J; Weir, B S
1996-12-01
A MINQUE(1) procedure, which is the minimum norm quadratic unbiased estimation (MINQUE) method with 1 for all the prior values, is suggested for estimating variance and covariance components in a bio-model for diallel crosses. Unbiasedness and efficiency of estimation were compared for MINQUE(1), restricted maximum likelihood (REML) and MINQUE(θ), which uses parameter values for the prior values. MINQUE(1) is almost as efficient as MINQUE(θ) for unbiased estimation of genetic variance and covariance components. The bio-model is efficient and robust for estimating variance and covariance components for maternal and paternal effects as well as for nuclear effects. A procedure of adjusted unbiased prediction (AUP) is proposed for predicting random genetic effects in the bio-model. The jackknife procedure is suggested for estimation of sampling variances of estimated variance and covariance components and of predicted genetic effects. Worked examples are given for estimation of variance and covariance components and for prediction of genetic merits.
Mesoscale Gravity Wave Variances from AMSU-A Radiances
NASA Technical Reports Server (NTRS)
Wu, Dong L.
2004-01-01
A variance analysis technique is developed here to extract gravity wave (GW) induced temperature fluctuations from NOAA AMSU-A (Advanced Microwave Sounding Unit-A) radiance measurements. By carefully removing the instrument/measurement noise, the algorithm can produce reliable GW variances with a minimum detectable value as small as 0.1 K². Preliminary analyses with AMSU-A data show that GW variance maps in the stratosphere have very similar distributions to those found with the UARS MLS (Upper Atmosphere Research Satellite Microwave Limb Sounder). However, AMSU-A offers better horizontal and temporal resolution for observing regional GW variability, such as activity over sub-Antarctic islands.
Analysis of conditional genetic effects and variance components in developmental genetics.
Zhu, J
1995-12-01
A genetic model with additive-dominance effects and genotype x environment interactions is presented for quantitative traits with time-dependent measures. The genetic model for phenotypic means at time t conditional on phenotypic means measured at previous time (t-1) is defined. Statistical methods are proposed for analyzing conditional genetic effects and conditional genetic variance components. Conditional variances can be estimated by minimum norm quadratic unbiased estimation (MINQUE) method. An adjusted unbiased prediction (AUP) procedure is suggested for predicting conditional genetic effects. A worked example from cotton fruiting data is given for comparison of unconditional and conditional genetic variances and additive effects.
The Variance of Solar Wind Magnetic Fluctuations: Solutions and Further Puzzles
NASA Technical Reports Server (NTRS)
Roberts, D. A.; Goldstein, M. L.
2006-01-01
We study the dependence of the variance directions of the magnetic field in the solar wind as a function of scale, radial distance, and Alfvenicity. The study resolves the question of why different studies have arrived at widely differing values for the maximum-to-minimum power (≈3:1 up to ≈20:1). This is due to the decreasing anisotropy with increasing time interval chosen for the variance, and is a direct result of the "spherical polarization" of the waves which follows from the near constancy of |B|. The reason for the magnitude-preserving evolution is still unresolved. Moreover, while the long-known tendency for the minimum variance to lie along the mean field also follows from this view (as shown by Barnes many years ago), there is no theory for why the minimum variance follows the field direction as the Parker angle changes. We show that this turning is quite generally true in Alfvenic regions over a wide range of heliocentric distances. The fact that non-Alfvenic regions, while still showing strong power anisotropies, tend to have a much broader range of angles between the minimum variance and the mean field makes it unlikely that the cause of the variance turning is to be found in a turbulence mechanism. There are no obvious alternative mechanisms, leaving us with another intriguing puzzle.
On the Likely Utility of Hybrid Weights Optimized for Variances in Hybrid Error Covariance Models
NASA Astrophysics Data System (ADS)
Satterfield, E.; Hodyss, D.; Kuhl, D.; Bishop, C. H.
2017-12-01
Because of imperfections in ensemble data assimilation schemes, one cannot assume that the ensemble covariance is equal to the true error covariance of a forecast. Previous work demonstrated how information about the distribution of true error variances given an ensemble sample variance can be revealed from an archive of (observation-minus-forecast, ensemble-variance) data pairs. Here, we derive a simple and intuitively compelling formula to obtain the mean of this distribution of true error variances given an ensemble sample variance from (observation-minus-forecast, ensemble-variance) data pairs produced by a single run of a data assimilation system. This formula takes the form of a Hybrid weighted average of the climatological forecast error variance and the ensemble sample variance. Here, we test the extent to which these readily obtainable weights can be used to rapidly optimize the covariance weights used in Hybrid data assimilation systems that employ weighted averages of static covariance models and flow-dependent ensemble based covariance models. Univariate data assimilation and multi-variate cycling ensemble data assimilation are considered. In both cases, it is found that our computationally efficient formula gives Hybrid weights that closely approximate the optimal weights found through the simple but computationally expensive process of testing every plausible combination of weights.
Genetic parameters of legendre polynomials for first parity lactation curves.
Pool, M H; Janss, L L; Meuwissen, T H
2000-11-01
Variance components of the covariance function coefficients in a random regression test-day model were estimated by Legendre polynomials up to a fifth order for first-parity records of Dutch dairy cows using Gibbs sampling. Two Legendre polynomials of equal order were used to model the random part of the lactation curve, one for the genetic component and one for permanent environment. Test-day records from cows registered between 1990 to 1996 and collected by regular milk recording were available. For the data set, 23,700 complete lactations were selected from 475 herds sired by 262 sires. Because the application of a random regression model is limited by computing capacity, we investigated the minimum order needed to fit the variance structure in the data sufficiently. Predictions of genetic and permanent environmental variance structures were compared with bivariate estimates on 30-d intervals. A third-order or higher polynomial modeled the shape of variance curves over DIM with sufficient accuracy for the genetic and permanent environment part. Also, the genetic correlation structure was fitted with sufficient accuracy by a third-order polynomial, but, for the permanent environmental component, a fourth order was needed. Because equal orders are suggested in the literature, a fourth-order Legendre polynomial is recommended in this study. However, a rank of three for the genetic covariance matrix and of four for permanent environment allows a simpler covariance function with a reduced number of parameters based on the eigenvalues and eigenvectors.
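For reference, the Legendre covariables used in such a random regression test-day model are obtained by rescaling days in milk (DIM) to [-1, 1] and evaluating the polynomial basis. A minimal sketch follows; the DIM range and the use of the plain (unnormalized) Legendre basis are assumptions, since normalization conventions differ between software packages.

```python
import numpy as np
from numpy.polynomial.legendre import legvander

def legendre_basis(dim, order, dim_min=5, dim_max=305):
    """Legendre covariables for a random regression test-day model.
    Days in milk are rescaled to [-1, 1]; `order` is the highest polynomial
    degree (third or fourth order in the abstract)."""
    x = 2.0 * (np.asarray(dim, float) - dim_min) / (dim_max - dim_min) - 1.0
    return legvander(x, order)            # shape (n_records, order + 1)

# Basis for a handful of test days, third-order fit for the genetic part
print(legendre_basis(np.array([10, 60, 150, 280]), order=3))
```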
Analyzing thematic maps and mapping for accuracy
Rosenfield, G.H.
1982-01-01
Two problems which exist while attempting to test the accuracy of thematic maps and mapping are: (1) evaluating the accuracy of thematic content, and (2) evaluating the effects of the variables on thematic mapping. Statistical analysis techniques are applicable to both these problems and include techniques for sampling the data and determining their accuracy. In addition, techniques for hypothesis testing, or inferential statistics, are used when comparing the effects of variables. A comprehensive and valid accuracy test of a classification project, such as thematic mapping from remotely sensed data, includes the following components of statistical analysis: (1) sample design, including the sample distribution, sample size, size of the sample unit, and sampling procedure; and (2) accuracy estimation, including estimation of the variance and confidence limits. Careful consideration must be given to the minimum sample size necessary to validate the accuracy of a given classification category. The results of an accuracy test are presented in a contingency table sometimes called a classification error matrix. Usually the rows represent the interpretation, and the columns represent the verification. The diagonal elements represent the correct classifications. The remaining elements of the rows represent errors by commission, and the remaining elements of the columns represent the errors of omission. For tests of hypothesis that compare variables, the general practice has been to use only the diagonal elements from several related classification error matrices. These data are arranged in the form of another contingency table. The columns of the table represent the different variables being compared, such as different scales of mapping. The rows represent the blocking characteristics, such as the various categories of classification. The values in the cells of the tables might be the counts of correct classification or the binomial proportions of these counts divided by either the row totals or the column totals from the original classification error matrices. In hypothesis testing, when the results of tests of multiple sample cases prove to be significant, some form of statistical test must be used to separate any results that differ significantly from the others. In the past, many analyses of the data in this error matrix were made by comparing the relative magnitudes of the percentage of correct classifications, for either individual categories, the entire map or both. More rigorous analyses have used data transformations and (or) two-way classification analysis of variance. A more sophisticated step of data analysis techniques would be to use the entire classification error matrices using the methods of discrete multivariate analysis or of multivariate analysis of variance.
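The accuracy summaries described for the classification error matrix can be computed directly; a minimal sketch, with rows as interpretation and columns as verification as stated above:

```python
import numpy as np

def accuracy_from_error_matrix(E):
    """Accuracy summaries from a classification error matrix.
    Rows = interpreted (mapped) classes, columns = verified (reference)
    classes. Returns overall accuracy plus per-class commission and
    omission error proportions."""
    E = np.asarray(E, float)
    correct = np.diag(E)
    overall = correct.sum() / E.sum()
    commission = 1.0 - correct / E.sum(axis=1)   # errors along the rows
    omission = 1.0 - correct / E.sum(axis=0)     # errors along the columns
    return overall, commission, omission

E = np.array([[50,  3,  2],
              [ 5, 40,  5],
              [ 2,  4, 44]])
print(accuracy_from_error_matrix(E))
```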
Optical tomographic detection of rheumatoid arthritis with computer-aided classification schemes
NASA Astrophysics Data System (ADS)
Klose, Christian D.; Klose, Alexander D.; Netz, Uwe; Beuthan, Jürgen; Hielscher, Andreas H.
2009-02-01
A recent research study has shown that combining multiple parameters drawn from optical tomographic images leads to better classification results in identifying human finger joints that are affected or not affected by rheumatoid arthritis (RA). Building on the findings of the previous study, this article presents an advanced computer-aided classification approach for interpreting optical image data to detect RA in finger joints. Additional data are used including, for example, maximum and minimum values of the absorption coefficient as well as their ratios and image variances. Classification performances obtained by the proposed method were evaluated in terms of sensitivity, specificity, Youden index and area under the curve (AUC). Results were compared to different benchmarks ("gold standards"): magnetic resonance, ultrasound and clinical evaluation. Maximum accuracies (AUC=0.88) were reached when combining minimum/maximum ratios and image variances and using ultrasound as the gold standard.
Noise and drift analysis of non-equally spaced timing data
NASA Technical Reports Server (NTRS)
Vernotte, F.; Zalamansky, G.; Lantz, E.
1994-01-01
Generally, it is possible to obtain equally spaced timing data from oscillators. The measurement of the drifts and noises affecting oscillators is then performed by using a variance (Allan variance, modified Allan variance, or time variance) or a system of several variances (multivariance method). However, in some cases, several samples, or even several sets of samples, are missing. In the case of millisecond pulsar timing data, for instance, observations are quite irregularly spaced in time. Nevertheless, since some observations are very close together (one minute) and since the timing data sequence is very long (more than ten years), information on both short-term and long-term stability is available. Unfortunately, a direct variance analysis is not possible without interpolating missing data. Different interpolation algorithms (linear interpolation, cubic spline) are used to calculate variances in order to verify that they neither lose information nor add erroneous information. A comparison of the results of the different algorithms is given. Finally, the multivariance method was adapted to the measurement sequence of the millisecond pulsar timing data: the responses of each variance of the system are calculated for each type of noise and drift, with the same missing samples as in the pulsar timing sequence. An estimation of precision, dynamics, and separability of this method is given.
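For reference, the (non-overlapping) Allan variance for equally spaced fractional-frequency data is half the mean squared difference of successive averages over intervals of length tau. The sketch below computes it for a regular series; handling the irregular pulsar sampling via interpolation or the multivariance method is exactly what the abstract addresses and is not reproduced here.

```python
import numpy as np

def allan_variance(y, m):
    """Non-overlapping Allan variance of equally spaced fractional-frequency
    data `y` at averaging factor `m` (tau = m * tau0). Irregularly spaced
    data would first have to be interpolated onto a regular grid."""
    y = np.asarray(y, float)
    n_bins = y.size // m
    ybar = y[:n_bins * m].reshape(n_bins, m).mean(axis=1)   # averages over tau
    return 0.5 * np.mean(np.diff(ybar) ** 2)

rng = np.random.default_rng(3)
white_fm = rng.normal(size=100000)
for m in (1, 10, 100):
    print(m, allan_variance(white_fm, m))   # ~1/m scaling for white FM noise
```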
Mutch, Sarah A.; Gadd, Jennifer C.; Fujimoto, Bryant S.; Kensel-Hammes, Patricia; Schiro, Perry G.; Bajjalieh, Sandra M.; Chiu, Daniel T.
2013-01-01
This protocol describes a method to determine both the average number and variance of proteins in the few to tens of copies in isolated cellular compartments, such as organelles and protein complexes. Other currently available protein quantification techniques either provide an average number but lack information on the variance or are not suitable for reliably counting proteins present in the few to tens of copies. This protocol entails labeling the cellular compartment with fluorescent primary-secondary antibody complexes, TIRF (total internal reflection fluorescence) microscopy imaging of the cellular compartment, digital image analysis, and deconvolution of the fluorescence intensity data. A minimum of 2.5 days is required to complete the labeling, imaging, and analysis of a set of samples. As an illustrative example, we describe in detail the procedure used to determine the copy number of proteins in synaptic vesicles. The same procedure can be applied to other organelles or signaling complexes. PMID:22094731
Static vs stochastic optimization: A case study of FTSE Bursa Malaysia sectorial indices
NASA Astrophysics Data System (ADS)
Mamat, Nur Jumaadzan Zaleha; Jaaman, Saiful Hafizah; Ahmad, Rokiah@Rozita
2014-06-01
Traditional portfolio optimization methods such as Markowitz's mean-variance model and the semi-variance model utilize static expected return and volatility risk from historical data to generate an optimal portfolio. The optimal portfolio may not truly be optimal in reality because maximum and minimum values in the data may strongly influence the expected return and volatility risk values. This paper considers the distributions of assets' return and volatility risk to determine a more realistic optimized portfolio. For illustration purposes, the sectorial indices data of FTSE Bursa Malaysia are employed. The results show that stochastic optimization provides a more stable information ratio.
NASA Astrophysics Data System (ADS)
Haji Heidari, Mehdi; Mozaffarzadeh, Moein; Manwar, Rayyan; Nasiriavanaki, Mohammadreza
2018-02-01
In recent years, minimum variance (MV) beamforming has been widely studied due to its high resolution and contrast in B-mode ultrasound imaging (USI). However, the performance of the MV beamformer is degraded in the presence of noise, as a result of inaccurate covariance matrix estimation, which leads to a low-quality image. Second harmonic imaging (SHI) provides many advantages over conventional pulse-echo USI, such as enhanced axial and lateral resolutions. However, the low signal-to-noise ratio (SNR) is a major problem in SHI. In this paper, the eigenspace-based minimum variance (EIBMV) beamformer is employed for second harmonic USI. Tissue harmonic imaging (THI) is achieved by the pulse inversion (PI) technique. Using the EIBMV weights, instead of the MV ones, leads to reduced sidelobes and improved contrast, without compromising the high resolution of the MV beamformer (even in the presence of strong noise). In addition, we have investigated the effects of variations of the important parameters in computing EIBMV weights, i.e., K, L, and δ, on the resolution and contrast obtained in SHI. The results are evaluated using numerical data (using point target and cyst phantoms), and the proper parameters of EIBMV are indicated for THI.
Hydraulic geometry of river cross sections; theory of minimum variance
Williams, Garnett P.
1978-01-01
This study deals with the rates at which mean velocity, mean depth, and water-surface width increase with water discharge at a cross section on an alluvial stream. Such relations often follow power laws, the exponents in which are called hydraulic exponents. The Langbein (1964) minimum-variance theory is examined in regard to its validity and its ability to predict observed hydraulic exponents. The variables used with the theory were velocity, depth, width, bed shear stress, friction factor, slope (energy gradient), and stream power. Slope is often constant, in which case only velocity, depth, width, shear and friction factor need be considered. The theory was tested against a wide range of field data from various geographic areas of the United States. The original theory was intended to produce only the average hydraulic exponents for a group of cross sections in a similar type of geologic or hydraulic environment. The theory does predict these average exponents with a reasonable degree of accuracy. An attempt to forecast the exponents at any selected cross section was moderately successful. Empirical equations are more accurate than the minimum variance, Gauckler-Manning, or Chezy methods. Predictions of the exponent of width are most reliable, the exponent of depth fair, and the exponent of mean velocity poor. (Woodard-USGS)
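To make the notion of hydraulic exponents concrete, the sketch below fits the at-a-station power laws w ∝ Q^b, d ∝ Q^f, v ∝ Q^m by log-log least squares; continuity (Q = w·d·v) forces b + f + m ≈ 1. The data are synthetic, not the field data analyzed in the study.

```python
import numpy as np

def hydraulic_exponent(Q, y):
    """Slope of log(y) versus log(Q): the hydraulic exponent in y = c * Q^exp."""
    return np.polyfit(np.log(Q), np.log(y), 1)[0]

# Synthetic at-a-station data (exponents chosen so that b + f + m = 1).
rng = np.random.default_rng(3)
Q = np.exp(rng.uniform(0, 4, 50))                        # discharge
w = 5.0 * Q ** 0.26 * np.exp(rng.normal(0, 0.05, 50))    # water-surface width
d = 0.4 * Q ** 0.40 * np.exp(rng.normal(0, 0.05, 50))    # mean depth
v = Q / (w * d)                                          # velocity from continuity
b, f, m = (hydraulic_exponent(Q, y) for y in (w, d, v))
print(round(b, 2), round(f, 2), round(m, 2), "sum =", round(b + f + m, 2))
```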
Feasibility Study for Design of a Biocybernetic Communication System
1975-08-01
electrode for the Within Words variance and Between Words variance for each of the 255 data samples in the 6-sec epoch. If a given sample point was not...contributing to the computer classification of the word, the ratio of the two variances (i.e., the F-statistic) should be small. On the other hand...if the Between Word variance was significantly higher than the Within Word variance for a given sample point, we can assume with some confidence
Multi-Sensor Optimal Data Fusion Based on the Adaptive Fading Unscented Kalman Filter
Gao, Bingbing; Hu, Gaoge; Gao, Shesheng; Gu, Chengfan
2018-01-01
This paper presents a new optimal data fusion methodology based on the adaptive fading unscented Kalman filter for multi-sensor nonlinear stochastic systems. This methodology has a two-level fusion structure: at the bottom level, an adaptive fading unscented Kalman filter based on the Mahalanobis distance is developed and serves as local filters to improve the adaptability and robustness of local state estimations against process-modeling error; at the top level, an unscented transformation-based multi-sensor optimal data fusion for the case of N local filters is established according to the principle of linear minimum variance to calculate globally optimal state estimation by fusion of local estimations. The proposed methodology effectively refrains from the influence of process-modeling error on the fusion solution, leading to improved adaptability and robustness of data fusion for multi-sensor nonlinear stochastic systems. It also achieves globally optimal fusion results based on the principle of linear minimum variance. Simulation and experimental results demonstrate the efficacy of the proposed methodology for INS/GNSS/CNS (inertial navigation system/global navigation satellite system/celestial navigation system) integrated navigation. PMID:29415509
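The top-level fusion step rests on the principle of linear minimum variance: independent local estimates are combined with inverse-covariance weights. The sketch below shows that principle for two local estimates of the same state; it is a simplified illustration, not the paper's unscented-transformation-based fusion, and the example numbers are invented.

```python
import numpy as np

def lmv_fuse(estimates, covariances):
    """Linear minimum-variance fusion of independent local estimates:
    P = (sum_i P_i^-1)^-1,  x = P @ sum_i (P_i^-1 @ x_i)."""
    infos = [np.linalg.inv(P) for P in covariances]
    P = np.linalg.inv(sum(infos))
    x = P @ sum(Pi_inv @ xi for Pi_inv, xi in zip(infos, estimates))
    return x, P

# Two local filters estimating the same 2-D state with different accuracy.
x1, P1 = np.array([1.0, 2.0]), np.diag([0.5, 1.0])
x2, P2 = np.array([1.2, 1.8]), np.diag([1.0, 0.25])
x, P = lmv_fuse([x1, x2], [P1, P2])
print(x.round(3), np.diag(P).round(3))   # fused state and its (smaller) variances
```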
Joint Adaptive Mean-Variance Regularization and Variance Stabilization of High Dimensional Data.
Dazard, Jean-Eudes; Rao, J Sunil
2012-07-01
The paper addresses a common problem in the analysis of high-dimensional high-throughput "omics" data, which is parameter estimation across multiple variables in a set of data where the number of variables is much larger than the sample size. Among the problems posed by this type of data are that variable-specific estimators of variances are not reliable and variable-wise test statistics have low power, both due to a lack of degrees of freedom. In addition, it has been observed in this type of data that the variance increases as a function of the mean. We introduce a non-parametric adaptive regularization procedure that is innovative in that: (i) it employs a novel "similarity statistic"-based clustering technique to generate local-pooled or regularized shrinkage estimators of population parameters, (ii) the regularization is done jointly on population moments, benefiting from C. Stein's result on inadmissibility, which implies that the usual sample variance estimator is improved by a shrinkage estimator using information contained in the sample mean. From these joint regularized shrinkage estimators, we derive regularized t-like statistics and show in simulation studies that they offer more statistical power in hypothesis testing than their standard sample counterparts, regular common-value shrinkage estimators, or statistics that simply ignore the information contained in the sample mean. Finally, we show that these estimators feature interesting properties of variance stabilization and normalization that can be used for preprocessing high-dimensional multivariate data. The method is available as an R package, called 'MVR' ('Mean-Variance Regularization'), downloadable from the CRAN website.
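The following is a generic, hedged illustration of the underlying idea, shrinking per-variable variances toward a pooled value and forming a regularized t-like statistic. It is not the MVR package's algorithm: the shrinkage weight lam is fixed by hand here, whereas adaptive, clustering-based choices of the pooling are exactly what such methods optimize.

```python
import numpy as np

def shrunken_variances(X, lam=0.5):
    """Shrink each variable's sample variance toward the median variance.
    X: samples x variables; lam in [0, 1] is an assumed, fixed shrinkage weight."""
    s2 = X.var(axis=0, ddof=1)
    return (1 - lam) * s2 + lam * np.median(s2)

def regularized_t(Xa, Xb, lam=0.5):
    """t-like statistic using shrunken variances from both groups."""
    na, nb = Xa.shape[0], Xb.shape[0]
    va = shrunken_variances(Xa, lam)
    vb = shrunken_variances(Xb, lam)
    return (Xa.mean(0) - Xb.mean(0)) / np.sqrt(va / na + vb / nb)

rng = np.random.default_rng(4)
Xa = rng.normal(0.0, 1, size=(5, 1000))    # small n, many variables
Xb = rng.normal(0.5, 1, size=(5, 1000))
print(regularized_t(Xa, Xb)[:5].round(2))
```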
ERIC Educational Resources Information Center
Stapleton, Laura M.
2008-01-01
This article discusses replication sampling variance estimation techniques that are often applied in analyses using data from complex sampling designs: jackknife repeated replication, balanced repeated replication, and bootstrapping. These techniques are used with traditional analyses such as regression, but are currently not used with structural…
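As a minimal illustration of one of these replication techniques, the sketch below computes a delete-one-cluster (JK1) jackknife variance for a simple mean. Real complex-survey applications also carry replicate weights and finite-population corrections, which are omitted here, and the cluster means are simulated.

```python
import numpy as np

def jackknife_variance(cluster_means):
    """JK1 replicate variance of the overall mean: drop one cluster at a time,
    recompute the estimate, and scale the squared deviations by (G - 1) / G."""
    G = len(cluster_means)
    full = np.mean(cluster_means)
    reps = np.array([np.mean(np.delete(cluster_means, g)) for g in range(G)])
    return (G - 1) / G * np.sum((reps - full) ** 2)

rng = np.random.default_rng(5)
clusters = rng.normal(50, 5, size=12)   # e.g., 12 primary sampling unit means
print(jackknife_variance(clusters))
```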
flowVS: channel-specific variance stabilization in flow cytometry.
Azad, Ariful; Rajwa, Bartek; Pothen, Alex
2016-07-28
Comparing phenotypes of heterogeneous cell populations from multiple biological conditions is at the heart of scientific discovery based on flow cytometry (FC). When the biological signal is measured by the average expression of a biomarker, standard statistical methods require that variance be approximately stabilized in populations to be compared. Since the mean and variance of a cell population are often correlated in fluorescence-based FC measurements, a preprocessing step is needed to stabilize the within-population variances. We present a variance-stabilization algorithm, called flowVS, that removes the mean-variance correlations from cell populations identified in each fluorescence channel. flowVS transforms each channel from all samples of a data set by the inverse hyperbolic sine (asinh) transformation. For each channel, the parameters of the transformation are optimally selected by Bartlett's likelihood-ratio test so that the populations attain homogeneous variances. The optimum parameters are then used to transform the corresponding channels in every sample. flowVS is therefore an explicit variance-stabilization method that stabilizes within-population variances in each channel by evaluating the homoskedasticity of clusters with a likelihood-ratio test. With two publicly available datasets, we show that flowVS removes the mean-variance dependence from raw FC data and makes the within-population variance relatively homogeneous. We demonstrate that alternative transformation techniques such as flowTrans, flowScape, logicle, and FCSTrans might not stabilize variance. Besides flow cytometry, flowVS can also be applied to stabilize variance in microarray data. With a publicly available data set we demonstrate that flowVS performs as well as the VSN software, a state-of-the-art approach developed for microarrays. The homogeneity of variance in cell populations across FC samples is desirable when extracting features uniformly and comparing cell populations with different levels of marker expressions. The newly developed flowVS algorithm solves the variance-stabilization problem in FC and microarrays by optimally transforming data with the help of Bartlett's likelihood-ratio test. On two publicly available FC datasets, flowVS stabilizes within-population variances more evenly than the available transformation and normalization techniques. flowVS-based variance stabilization can help in performing comparison and alignment of phenotypically identical cell populations across different samples. flowVS and the datasets used in this paper are publicly available in Bioconductor.
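A stripped-down sketch of the core idea, selecting an asinh cofactor that makes within-population variances most homogeneous as judged by Bartlett's test, is shown below. It is not the flowVS implementation; the cofactor grid, the toy populations, and the restriction to a single channel are assumptions for illustration.

```python
import numpy as np
from scipy.stats import bartlett

def best_asinh_cofactor(populations, cofactors):
    """Pick the asinh cofactor that makes within-population variances most
    homogeneous, judged by Bartlett's statistic over the transformed data."""
    best, best_stat = None, np.inf
    for c in cofactors:
        transformed = [np.arcsinh(p / c) for p in populations]
        stat, _ = bartlett(*transformed)
        if stat < best_stat:
            best, best_stat = c, stat
    return best, best_stat

# Toy fluorescence populations whose variance grows with the mean.
rng = np.random.default_rng(6)
pops = [rng.normal(mu, 0.1 * mu + 5, 500) for mu in (50, 200, 800)]
c, stat = best_asinh_cofactor(pops, cofactors=np.logspace(0, 3, 30))
print("cofactor =", round(c, 1), "Bartlett statistic =", round(stat, 1))
```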
NASA Astrophysics Data System (ADS)
Kohán, Balázs; Tyler, Jonathan; Jones, Matthew; Kern, Zoltán
2017-04-01
Water stable isotopes are important natural tracers in the hydrological cycle on global, regional and local scales. Daily precipitation water samples were collected from 70 sites over the British Isles on the 23rd, 24th, and 25th January, 2012 [1]. Samples were collected as part of a pilot study for the British Isotopes in Rainfall Project, a community engagement initiative, in collaboration with volunteer weather observers and the UK Met Office. The spatial correlation structure of daily precipitation stable oxygen isotope composition (δ18OP) has been explored by variogram analysis [2]. Since the variograms from the raw data suggested a pronounced trend, owing to the spatial trend discussed in the original study [1], a second-order polynomial trend was removed from the raw δ18OP data and variograms were calculated on the residuals. Directional experimental semivariograms were calculated (steps: 10°, tolerance: 30°) and aggregated into variogram surface plots to explore the spatial dependence structure of daily δ18OP. Each daily data set produced distinct variogram plots.
- A well expressed anisotropic structure can be seen for Jan 23. The lowest and highest variance was observed in the SW-NE and NNE-SSW direction, respectively. Meteorological observations showed that the majority of the atmospheric flow was SW on this day, so the direction of low variance seems to reflect this flow direction, while the maximum variance might reflect the moisture variance near the elongation of the frontal system.
- A less characteristic but still evident anisotropic structure was found for Jan 24, when a warm front passed the British Isles perpendicular to the east coast, leading to a characteristic east-west δ18OP gradient suggestive of progressive rainout. The low-variance central zone has a 100 km radius, which might correspond well to the width of the warm front zone. Although the axis of minimum variance was similarly SW-NE, the zone of maximum variance was broader and practically perpendicular to it. In this case, however, the directions of the axes appear misaligned with the flow direction.
- We could not observe similar characteristic patterns in the last variogram, calculated from the Jan 25 data set.
These preliminary results suggest that variogram analysis is a promising approach to link δ18OP patterns to atmospheric processes. Funding: NKFIH SNN118205 / ARRS N1-0054.
References: 1. Tyler, J. J., Jones, M., Arrowsmith, C., Allott, T., & Leng, M. J. (2016). Spatial patterns in the oxygen isotope composition of daily rainfall in the British Isles. Climate Dynamics 47:1971-1987. 2. Webster, R., & Oliver, M. A. (2007). Geostatistics for Environmental Scientists. John Wiley & Sons, Chichester.
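For readers unfamiliar with the tool, the sketch below computes a plain isotropic experimental semivariogram on detrended values; the directional binning (10° steps, 30° tolerance) and variogram-surface aggregation used in the study are omitted, and the site coordinates and residuals are simulated.

```python
import numpy as np

def semivariogram(coords, values, lags, tol):
    """Isotropic experimental semivariogram: gamma(h) is the mean of
    0.5*(z_i - z_j)^2 over pairs whose separation lies within each lag +/- tol."""
    d = np.sqrt(((coords[:, None, :] - coords[None, :, :]) ** 2).sum(-1))
    sq = 0.5 * (values[:, None] - values[None, :]) ** 2
    iu = np.triu_indices(len(values), k=1)        # each pair counted once
    d, sq = d[iu], sq[iu]
    return np.array([sq[(d > h - tol) & (d <= h + tol)].mean() for h in lags])

# Toy detrended d18O residuals at random site coordinates (km).
rng = np.random.default_rng(7)
xy = rng.uniform(0, 500, size=(70, 2))
z = rng.normal(0, 1, 70)
print(semivariogram(xy, z, lags=np.arange(50, 301, 50), tol=25).round(2))
```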
Darbani, Behrooz; Stewart, C Neal; Noeparvar, Shahin; Borg, Søren
2014-10-20
This report investigates for the first time the potential inter-treatment bias source of cell number for gene expression studies. Cell-number bias can affect gene expression analysis when comparing samples with unequal total cellular RNA content or with different RNA extraction efficiencies. For maximal reliability of analysis, therefore, comparisons should be performed at the cellular level. This could be accomplished using an appropriate correction method that can detect and remove the inter-treatment bias for cell-number. Based on inter-treatment variations of reference genes, we introduce an analytical approach to examine the suitability of correction methods by considering the inter-treatment bias as well as the inter-replicate variance, which allows use of the best correction method with minimum residual bias. Analyses of RNA sequencing and microarray data showed that the efficiencies of correction methods are influenced by the inter-treatment bias as well as the inter-replicate variance. Therefore, we recommend inspecting both of the bias sources in order to apply the most efficient correction method. As an alternative correction strategy, sequential application of different correction approaches is also advised. Copyright © 2014 Elsevier B.V. All rights reserved.
Refining a case-mix measure for nursing homes: Resource Utilization Groups (RUG-III).
Fries, B E; Schneider, D P; Foley, W J; Gavazzi, M; Burke, R; Cornelius, E
1994-07-01
A case-mix classification system for nursing home residents is developed, based on a sample of 7,658 residents in seven states. Data included a broad assessment of resident characteristics, corresponding to items of the Minimum Data Set, and detailed measurement of nursing staff care time over a 24-hour period and therapy staff time over a 1-week period. The Resource Utilization Groups, Version III (RUG-III) system, with 44 distinct groups, achieves 55.5% variance explanation of total (nursing and therapy) per diem cost and meets goals of clinical validity and payment incentives. The mean resource use (case-mix index) of groups spans a nine-fold range. The RUG-III system improves on an earlier version not only by increasing the variance explanation (from 43%), but, more importantly, by identifying residents with "high tech" procedures (e.g., ventilators, respirators, and parenteral feeding) and those with cognitive impairments; by using better multiple activities of daily living; and by providing explicit qualifications for the Medicare nursing home benefit. RUG-III is being implemented for nursing home payment in 11 states (six as part of a federal multistate demonstration) and can be used in management, staffing level determination, and quality assurance.
Sample size considerations for clinical research studies in nuclear cardiology.
Chiuzan, Cody; West, Erin A; Duong, Jimmy; Cheung, Ken Y K; Einstein, Andrew J
2015-12-01
Sample size calculation is an important element of research design that investigators need to consider in the planning stage of the study. Funding agencies and research review panels request a power analysis, for example, to determine the minimum number of subjects needed for an experiment to be informative. Calculating the right sample size is crucial to gaining accurate information and ensures that research resources are used efficiently and ethically. The simple question "How many subjects do I need?" does not always have a simple answer. Before calculating the sample size requirements, a researcher must address several aspects, such as purpose of the research (descriptive or comparative), type of samples (one or more groups), and data being collected (continuous or categorical). In this article, we describe some of the most frequent methods for calculating the sample size with examples from nuclear cardiology research, including for t tests, analysis of variance (ANOVA), non-parametric tests, correlation, Chi-squared tests, and survival analysis. For the ease of implementation, several examples are also illustrated via user-friendly free statistical software.
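As one concrete instance of the calculations described, here is the familiar normal-approximation sample size for a two-sided two-sample t test; the effect size, standard deviation, alpha, and power in the example are illustrative values, not figures from the article.

```python
from scipy.stats import norm

def n_per_group(delta, sigma, alpha=0.05, power=0.80):
    """Approximate sample size per group for a two-sided two-sample t test:
    n = 2 * (z_{1-alpha/2} + z_{power})^2 * sigma^2 / delta^2."""
    z = norm.ppf(1 - alpha / 2) + norm.ppf(power)
    return 2 * (z * sigma / delta) ** 2

# e.g., detect a difference of 5 units with SD 10 at alpha = 0.05 and 80% power.
print(round(n_per_group(delta=5, sigma=10)))   # about 63 per group
```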
On the design of classifiers for crop inventories
NASA Technical Reports Server (NTRS)
Heydorn, R. P.; Takacs, H. C.
1986-01-01
Crop proportion estimators that use classifications of satellite data to correct, in an additive way, a given estimate acquired from ground observations are discussed. A linear version of these estimators is optimal, in terms of minimum variance, when the regression of the ground observations onto the satellite observations is linear. When this regression is not linear, but the reverse regression (satellite observations onto ground observations) is linear, the estimator is suboptimal but still has certain appealing variance properties. In this paper, expressions are derived for those regressions which relate the intercepts and slopes to conditional classification probabilities. These expressions are then used to discuss the question of classifier designs that can lead to low-variance crop proportion estimates. Variance expressions for these estimates in terms of classifier omission and commission errors are also derived.
Estimation of distribution overlap of urn models.
Hampton, Jerrad; Lladser, Manuel E
2012-01-01
A classical problem in statistics is estimating the expected coverage of a sample, which has had applications in gene expression, microbial ecology, optimization, and even numismatics. Here we consider a related extension of this problem to random samples of two discrete distributions. Specifically, we estimate what we call the dissimilarity probability of a sample, i.e., the probability of a draw from one distribution not being observed in [Formula: see text] draws from another distribution. We show our estimator of dissimilarity to be a [Formula: see text]-statistic and a uniformly minimum variance unbiased estimator of dissimilarity over the largest appropriate range of [Formula: see text]. Furthermore, despite the non-Markovian nature of our estimator when applied sequentially over [Formula: see text], we show it converges uniformly in probability to the dissimilarity parameter, and we present criteria when it is approximately normally distributed and admits a consistent jackknife estimator of its variance. As proof of concept, we analyze V35 16S rRNA data to discern between various microbial environments. Other potential applications concern any situation where dissimilarity of two discrete distributions may be of interest. For instance, in SELEX experiments, each urn could represent a random RNA pool and each draw a possible solution to a particular binding site problem over that pool. The dissimilarity of these pools is then related to the probability of finding binding site solutions in one pool that are absent in the other.
Correcting for Systematic Bias in Sample Estimates of Population Variances: Why Do We Divide by n-1?
ERIC Educational Resources Information Center
Mittag, Kathleen Cage
An important topic presented in introductory statistics courses is the estimation of population parameters using samples. Students learn that when estimating population variances using sample data, we always get an underestimate of the population variance if we divide by n rather than n-1. One implication of this correction is that the degree of…
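A two-line simulation makes the point concrete: dividing by n underestimates the population variance by a factor of (n-1)/n on average, while dividing by n-1 is unbiased. The numbers below are arbitrary illustration values.

```python
import numpy as np

rng = np.random.default_rng(8)
n, sigma2, trials = 5, 4.0, 200_000
x = rng.normal(0, np.sqrt(sigma2), size=(trials, n))

biased = ((x - x.mean(1, keepdims=True)) ** 2).sum(1) / n          # divide by n
unbiased = ((x - x.mean(1, keepdims=True)) ** 2).sum(1) / (n - 1)  # divide by n-1

print(biased.mean().round(3))     # close to sigma2 * (n-1)/n = 3.2
print(unbiased.mean().round(3))   # close to sigma2 = 4.0
```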
A test of source-surface model predictions of heliospheric current sheet inclination
NASA Technical Reports Server (NTRS)
Burton, M. E.; Crooker, N. U.; Siscoe, G. L.; Smith, E. J.
1994-01-01
The orientation of the heliospheric current sheet predicted from a source surface model is compared with the orientation determined from minimum-variance analysis of International Sun-Earth Explorer (ISEE) 3 magnetic field data at 1 AU near solar maximum. Of the 37 cases analyzed, 28 have minimum variance normals that lie orthogonal to the predicted Parker spiral direction. For these cases, the correlation coefficient between the predicted and measured inclinations is 0.6. However, for the subset of 14 cases for which transient signatures (either interplanetary shocks or bidirectional electrons) are absent, the agreement in inclinations improves dramatically, with a correlation coefficient of 0.96. These results validate not only the use of the source surface model as a predictor but also the previously questioned usefulness of minimum variance analysis across complex sector boundaries. In addition, the results imply that interplanetary dynamics have little effect on current sheet inclination at 1 AU. The dependence of the correlation on transient occurrence suggests that the leading edge of a coronal mass ejection (CME), where transient signatures are detected, disrupts the heliospheric current sheet but that the sheet re-forms between the trailing legs of the CME. In this way the global structure of the heliosphere, reflected both in the source surface maps and in the interplanetary sector structure, can be maintained even when the CME occurrence rate is high.
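For reference, minimum-variance analysis itself is a short eigen-decomposition of the magnetic variance matrix; the eigenvector belonging to the smallest eigenvalue estimates the discontinuity (current sheet) normal. The sketch below applies it to an invented current-sheet crossing, not to ISEE 3 data.

```python
import numpy as np

def minimum_variance_normal(B):
    """Minimum-variance analysis: B is an (N, 3) series of field vectors.
    Returns the unit eigenvector of the magnetic variance matrix with the
    smallest eigenvalue (the estimated normal) and all three eigenvalues."""
    M = np.cov(B, rowvar=False)        # 3x3 variance matrix of the components
    vals, vecs = np.linalg.eigh(M)     # eigenvalues in ascending order
    return vecs[:, 0], vals

# Toy current-sheet crossing: Bx rotates, Bz (the normal component) stays steady.
t = np.linspace(-1, 1, 400)
B = np.column_stack([5 * np.tanh(5 * t), np.cos(3 * t), 0.3 + 0.02 * t])
B += np.random.default_rng(9).normal(0, 0.05, B.shape)
n, lam = minimum_variance_normal(B)
print(n.round(3), lam.round(3))        # normal close to the z direction
```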
2dFLenS and KiDS: determining source redshift distributions with cross-correlations
NASA Astrophysics Data System (ADS)
Johnson, Andrew; Blake, Chris; Amon, Alexandra; Erben, Thomas; Glazebrook, Karl; Harnois-Deraps, Joachim; Heymans, Catherine; Hildebrandt, Hendrik; Joudaki, Shahab; Klaes, Dominik; Kuijken, Konrad; Lidman, Chris; Marin, Felipe A.; McFarland, John; Morrison, Christopher B.; Parkinson, David; Poole, Gregory B.; Radovich, Mario; Wolf, Christian
2017-03-01
We develop a statistical estimator to infer the redshift probability distribution of a photometric sample of galaxies from its angular cross-correlation in redshift bins with an overlapping spectroscopic sample. This estimator is a minimum-variance weighted quadratic function of the data: a quadratic estimator. This extends and modifies the methodology presented by McQuinn & White. The derived source redshift distribution is degenerate with the source galaxy bias, which must be constrained via additional assumptions. We apply this estimator to constrain source galaxy redshift distributions in the Kilo-Degree imaging survey through cross-correlation with the spectroscopic 2-degree Field Lensing Survey, presenting results first as a binned step-wise distribution in the range z < 0.8, and then building a continuous distribution using a Gaussian process model. We demonstrate the robustness of our methodology using mock catalogues constructed from N-body simulations, and comparisons with other techniques for inferring the redshift distribution.
Federal Register 2010, 2011, 2012, 2013, 2014
2011-01-07
... drought-based temporary variance of the Martin Project rule curve and minimum flow releases at the Yates... requesting a drought-based temporary variance to the Martin Project rule curve. The rule curve variance...
Software for the grouped optimal aggregation technique
NASA Technical Reports Server (NTRS)
Brown, P. M.; Shaw, G. W. (Principal Investigator)
1982-01-01
The grouped optimal aggregation technique produces minimum variance, unbiased estimates of acreage and production for countries, zones (states), or any designated collection of acreage strata. It uses yield predictions, historical acreage information, and direct acreage estimates from satellite data. The acreage strata are grouped in such a way that the ratio model over historical acreage provides a smaller variance than if the model were applied to each individual stratum. An optimal weighting matrix, based on historical acreages, provides the link between incomplete direct acreage estimates and the total, current acreage estimate.
Estimating fluvial wood discharge from timelapse photography with varying sampling intervals
NASA Astrophysics Data System (ADS)
Anderson, N. K.
2013-12-01
There is recent focus on calculating wood budgets for streams and rivers to help inform management decisions, ecological studies and carbon/nutrient cycling models. Most work has measured in situ wood in temporary storage along stream banks or estimated wood inputs from banks. Little effort has been devoted to monitoring and quantifying wood in transport during high flows. This paper outlines a procedure for estimating total seasonal wood loads using non-continuous coarse interval sampling and examines differences in estimation between sampling at 1, 5, 10 and 15 minutes. Analysis is performed on wood transport for the Slave River in Northwest Territories, Canada. Relative to the 1 minute dataset, precision decreased by 23%, 46% and 60% for the 5, 10 and 15 minute datasets, respectively. Five and 10 minute sampling intervals provided unbiased equal variance estimates of 1 minute sampling, whereas 15 minute intervals were biased towards underestimation by 6%. Stratifying estimates by day and by discharge increased precision over non-stratification by 4% and 3%, respectively. Not including wood transported during ice break-up, the total minimum wood load estimated at this site is 3300 ± 800 m³ for the 2012 runoff season. The vast majority of the imprecision in total wood volumes came from variance in estimating average volume per log. [Figure caption: comparison of proportions and variance across sample intervals, using bootstrap sampling to achieve equal n; each trial was sampled with n = 100 and repeated 10,000 times, then averaged to obtain an estimate for each sample interval; dashed lines represent values from the one-minute dataset.]
Wonnapinij, Passorn; Chinnery, Patrick F.; Samuels, David C.
2010-01-01
In cases of inherited pathogenic mitochondrial DNA (mtDNA) mutations, a mother and her offspring generally have large and seemingly random differences in the amount of mutated mtDNA that they carry. Comparisons of measured mtDNA mutation level variance values have become an important issue in determining the mechanisms that cause these large random shifts in mutation level. These variance measurements have been made with samples of quite modest size, which should be a source of concern because higher-order statistics, such as variance, are poorly estimated from small sample sizes. We have developed an analysis of the standard error of variance from a sample of size n, and we have defined error bars for variance measurements based on this standard error. We calculate variance error bars for several published sets of measurements of mtDNA mutation level variance and show how the addition of the error bars alters the interpretation of these experimental results. We compare variance measurements from human clinical data and from mouse models and show that the mutation level variance is clearly higher in the human data than it is in the mouse models at both the primary oocyte and offspring stages of inheritance. We discuss how the standard error of variance can be used in the design of experiments measuring mtDNA mutation level variance. Our results show that variance measurements based on fewer than 20 measurements are generally unreliable and ideally more than 50 measurements are required to reliably compare variances with less than a 2-fold difference. PMID:20362273
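A common normal-theory approximation makes the small-sample point tangible: the standard error of a sample variance is roughly s²·sqrt(2/(n-1)), so error bars on variance shrink slowly with n. The sketch below uses this generic approximation, not the paper's exact derivation, and the data are simulated.

```python
import numpy as np

def variance_with_error_bar(x):
    """Sample variance and its approximate standard error under normality:
    SE(s^2) is roughly s^2 * sqrt(2 / (n - 1))."""
    x = np.asarray(x, float)
    n, s2 = x.size, x.var(ddof=1)
    return s2, s2 * np.sqrt(2.0 / (n - 1))

rng = np.random.default_rng(10)
for n in (10, 20, 50):
    s2, se = variance_with_error_bar(rng.normal(0, 1, n))
    print(n, round(s2, 3), "+/-", round(se, 3))
```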
Meta-analysis with missing study-level sample variance data.
Chowdhry, Amit K; Dworkin, Robert H; McDermott, Michael P
2016-07-30
We consider a study-level meta-analysis with a normally distributed outcome variable and possibly unequal study-level variances, where the object of inference is the difference in means between a treatment and control group. A common complication in such an analysis is missing sample variances for some studies. A frequently used approach is to impute the weighted (by sample size) mean of the observed variances (mean imputation). Another approach is to include only those studies with variances reported (complete case analysis). Both mean imputation and complete case analysis are only valid under the missing-completely-at-random assumption, and even then the inverse variance weights produced are not necessarily optimal. We propose a multiple imputation method employing gamma meta-regression to impute the missing sample variances. Our method takes advantage of study-level covariates that may be used to provide information about the missing data. Through simulation studies, we show that multiple imputation, when the imputation model is correctly specified, is superior to competing methods in terms of confidence interval coverage probability and type I error probability when testing a specified group difference. Finally, we describe a similar approach to handling missing variances in cross-over studies. Copyright © 2016 John Wiley & Sons, Ltd.
An Analysis of Variance Framework for Matrix Sampling.
ERIC Educational Resources Information Center
Sirotnik, Kenneth
Significant cost savings can be achieved with the use of matrix sampling in estimating population parameters from psychometric data. The statistical design is intuitively simple, using the framework of the two-way classification analysis of variance technique. For example, the mean and variance are derived from the performance of a certain grade…
Charged particle tracking at Titan, and further applications
NASA Astrophysics Data System (ADS)
Bebesi, Zsofia; Erdos, Geza; Szego, Karoly
2016-04-01
We use the CAPS ion data of Cassini to investigate the dynamics and origin of Titan's atmospheric ions. We developed a 4th order Runge-Kutta method to calculate particle trajectories in a time reversed scenario. The test particle magnetic field environment imitates the curved magnetic environment in the vicinity of Titan. The minimum variance directions along the S/C trajectory have been calculated for all available Titan flybys, and we assumed a homogeneous field that is perpendicular to the minimum variance direction. Using this method the magnetic field lines have been calculated along the flyby orbits so we could select those observational intervals when Cassini and the upper atmosphere of Titan were magnetically connected. We have also taken the Kronian magnetodisc into consideration, and used different upstream magnetic field approximations depending on whether Titan was located inside of the magnetodisc current sheet, or in the lobe regions. We also discuss the code's applicability to comets.
Microstructure of the IMF turbulences at 2.5 AU
NASA Technical Reports Server (NTRS)
Mavromichalaki, H.; Vassilaki, A.; Marmatsouri, L.; Moussas, X.; Quenby, J. J.; Smith, E. J.
1995-01-01
A detailed analysis of small period (15-900 sec) magnetohydrodynamic (MHD) turbulences of the interplanetary magnetic field (IMF) has been made using Pioneer-11 high time resolution data (0.75 sec) inside a Corotating Interaction Region (CIR) at a heliocentric distance of 2.5 AU in 1973. The methods used are the hodogram analysis, the minimum variance matrix analysis and the coherence analysis. The minimum variance analysis gives evidence of linearly polarized wave modes. Coherence analysis has shown that the field fluctuations are dominated by fast magnetosonic modes with periods of 15 sec to 15 min. However, it is also shown that some small amplitude Alfven waves are present in the trailing edge of this region with characteristic periods (15-200 sec). The observed wave modes are locally generated and possibly attributed to the scattering of Alfven wave energy into random magnetosonic waves.
tscvh R Package: Computation of the two samples test on microarray-sequencing data
NASA Astrophysics Data System (ADS)
Fajriyah, Rohmatul; Rosadi, Dedi
2017-12-01
We present a new R package, tscvh (two samples cross-variance homogeneity). The package implements the cross-variance statistical test proposed and introduced by Fajriyah ([3] and [4]), based on the cross-variance concept. The test can be used as an alternative test for a significant difference between two means when the sample size is small, a situation that commonly arises in bioinformatics research. Based on its statistical distribution, the p-value can also be provided. The package is built under the assumption of homogeneity of variance between samples.
A two-step sensitivity analysis for hydrological signatures in Jinhua River Basin, East China
NASA Astrophysics Data System (ADS)
Pan, S.; Fu, G.; Chiang, Y. M.; Xu, Y. P.
2016-12-01
Owing to model complexity and the large number of parameters, calibration and sensitivity analysis are difficult processes for distributed hydrological models. In this study, a two-step sensitivity analysis approach is proposed for analyzing the hydrological signatures in Jinhua River Basin, East China, using the Distributed Hydrology-Soil-Vegetation Model (DHSVM). A rough sensitivity analysis is first conducted via analysis of variance to obtain a preliminary set of influential parameters, greatly reducing the number of parameters considered to sixteen. Afterwards, the sixteen parameters are further analyzed with a variance-based global sensitivity analysis, i.e., Sobol's method, to achieve robust sensitivity rankings and parameter contributions. Parallel computing is applied to reduce the computational burden of the variance-based sensitivity analysis. The results reveal that only a small number of model parameters are significantly sensitive, including the rain LAI multiplier, lateral conductivity, porosity, field capacity, wilting point of clay loam, understory monthly LAI, understory minimum resistance, and root zone depths of croplands. Finally, several hydrological signatures are used for investigating the performance of DHSVM. Results show that high values of the efficiency criteria did not guarantee good performance on the hydrological signatures. For most samples from Sobol's sensitivity analysis, water yield was simulated very well. However, lowest and maximum annual daily runoffs were underestimated, and most seven-day minimum runoffs were overestimated. Nevertheless, good performance on these three signatures still occurs in a number of samples. Analysis of peak flow shows that small and medium floods are simulated well, while large floods are slightly underestimated. The work in this study supports further multi-objective calibration of the DHSVM model and indicates where to improve the reliability and credibility of model simulation.
Robust linear discriminant analysis with distance based estimators
NASA Astrophysics Data System (ADS)
Lim, Yai-Fung; Yahaya, Sharipah Soaad Syed; Ali, Hazlina
2017-11-01
Linear discriminant analysis (LDA) is one of the supervised classification techniques concerning the relationship between a categorical variable and a set of continuous variables. The main objective of LDA is to create a function to distinguish between populations and to allocate future observations to previously defined populations. Under the assumptions of normality and homoscedasticity, LDA yields the optimal linear discriminant rule (LDR) between two or more groups. However, the optimality of LDA relies heavily on the sample mean and pooled sample covariance matrix, which are known to be sensitive to outliers. To alleviate these problems, a new robust LDA using distance-based estimators known as the minimum variance vector (MVV) is proposed in this study. The MVV estimators were used in place of the classical sample mean and classical sample covariance to form a robust linear discriminant rule (RLDR). Simulation and real data studies were conducted to examine the performance of the proposed RLDR, measured in terms of misclassification error rates. The computational results showed that the proposed RLDR is better than the classical LDR and comparable with the existing robust LDR.
Efficiently estimating salmon escapement uncertainty using systematically sampled data
Reynolds, Joel H.; Woody, Carol Ann; Gove, Nancy E.; Fair, Lowell F.
2007-01-01
Fish escapement is generally monitored using nonreplicated systematic sampling designs (e.g., via visual counts from towers or hydroacoustic counts). These sampling designs support a variety of methods for estimating the variance of the total escapement. Unfortunately, all the methods give biased results, with the magnitude of the bias being determined by the underlying process patterns. Fish escapement commonly exhibits positive autocorrelation and nonlinear patterns, such as diurnal and seasonal patterns. For these patterns, poor choice of variance estimator can needlessly increase the uncertainty managers have to deal with in sustaining fish populations. We illustrate the effect of sampling design and variance estimator choice on variance estimates of total escapement for anadromous salmonids from systematic samples of fish passage. Using simulated tower counts of sockeye salmon Oncorhynchus nerka escapement on the Kvichak River, Alaska, five variance estimators for nonreplicated systematic samples were compared to determine the least biased. Using the least biased variance estimator, four confidence interval estimators were compared for expected coverage and mean interval width. Finally, five systematic sampling designs were compared to determine the design giving the smallest average variance estimate for total annual escapement. For nonreplicated systematic samples of fish escapement, all variance estimators were positively biased. Compared to the other estimators, the least biased estimator reduced bias by 12% to 98% on average. All confidence intervals gave effectively identical results. Replicated systematic sampling designs consistently provided the smallest average estimated variance among those compared.
Minimum-variance Brownian motion control of an optically trapped probe.
Huang, Yanan; Zhang, Zhipeng; Menq, Chia-Hsiang
2009-10-20
This paper presents a theoretical and experimental investigation of the Brownian motion control of an optically trapped probe. The Langevin equation is employed to describe the motion of the probe experiencing random thermal force and optical trapping force. Since active feedback control is applied to suppress the probe's Brownian motion, actuator dynamics and measurement delay are included in the equation. The equation of motion is simplified to a first-order linear differential equation and transformed to a discrete model for the purpose of controller design and data analysis. The derived model is experimentally verified by comparing the model prediction to the measured response of a 1.87 µm trapped probe subject to proportional control. It is then employed to design the optimal controller that minimizes the variance of the probe's Brownian motion. Theoretical analysis is derived to evaluate the control performance of a specific optical trap. Both experiment and simulation are used to validate the design as well as theoretical analysis, and to illustrate the performance envelope of the active control. Moreover, adaptive minimum variance control is implemented to maintain the optimal performance in the case in which the system is time varying when operating the actively controlled optical trap in a complex environment.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Aziz, Mohd Khairul Bazli Mohd, E-mail: mkbazli@yahoo.com; Yusof, Fadhilah, E-mail: fadhilahy@utm.my; Daud, Zalina Mohd, E-mail: zalina@ic.utm.my
Recently, many rainfall network design techniques have been developed, discussed and compared by many researchers. Present-day hydrological studies require higher levels of accuracy from collected data. In numerous basins, the rain gauge stations are located without clear scientific understanding. In this study, an attempt is made to redesign the rain gauge network for Johor, Malaysia in order to meet the required level of accuracy preset by rainfall data users. The existing network of 84 rain gauges in Johor is optimized and redesigned into new locations by using rainfall, humidity, solar radiation, temperature and wind speed data collected during the monsoon season (November-February) of 1975 until 2008. This study used the combination of a geostatistical method (variance-reduction method) and simulated annealing as the optimization algorithm during the redesign process. The result shows that the new rain gauge locations provide the minimum value of estimated variance. This shows that the combination of the geostatistical method (variance-reduction method) and simulated annealing is successful in the development of the new optimum rain gauge system.
Identification, Characterization, and Utilization of Adult Meniscal Progenitor Cells
2017-11-01
approach including row scaling and Ward's minimum variance method was chosen. This analysis revealed two groups of four samples each.
Sampling in freshwater environments: suspended particle traps and variability in the final data.
Barbizzi, Sabrina; Pati, Alessandra
2008-11-01
This paper reports a practical method to estimate measurement uncertainty, including the sampling contribution, derived from the approach implemented by Ramsey for soil investigations. The methodology has been applied to estimate the measurement uncertainty (sampling and analysis) of (137)Cs activity concentration (Bq kg(-1)) and total carbon content (%) in suspended particle sampling in a freshwater ecosystem. Uncertainty estimates for the between-location, sampling, and analysis components have been evaluated. For the considered measurands, the relative expanded measurement uncertainties are 12.3% for (137)Cs and 4.5% for total carbon. For (137)Cs, the measurement (sampling + analysis) variance gives the major contribution to the total variance, while for total carbon the spatial variance is the dominant contributor to the total variance. The limitations and advantages of this basic method are discussed.
NASA Technical Reports Server (NTRS)
Yamauchi, Yohei; Suess, Steven T.; Sakurai, Takashi
2002-01-01
Ulysses observations have shown that pressure balance structures (PBSs) are a common feature in high-latitude, fast solar wind near solar minimum. Previous studies of Ulysses/SWOOPS plasma data suggest these PBSs may be remnants of coronal polar plumes. Here we find support for this suggestion in an analysis of PBS magnetic structure. We used Ulysses magnetometer data and applied a minimum variance analysis to magnetic discontinuities in PBSs. We found that PBSs preferentially contain tangential discontinuities, as opposed to rotational discontinuities and to non-PBS regions in the solar wind. This suggests that PBSs contain structures like current sheets or plasmoids that may be associated with network activity at the base of plumes.
NASA Technical Reports Server (NTRS)
Yamauchi, Y.; Suess, Steven T.; Sakurai, T.; Whitaker, Ann F. (Technical Monitor)
2001-01-01
Ulysses observations have shown that pressure balance structures (PBSs) are a common feature in high-latitude, fast solar wind near solar minimum. Previous studies of Ulysses/SWOOPS plasma data suggest these PBSs may be remnants of coronal polar plumes. Here we find support for this suggestion in an analysis of PBS magnetic structure. We used Ulysses magnetometer data and applied a minimum variance analysis to discontinuities. We found that PBSs preferentially contain tangential discontinuities, as opposed to rotational discontinuities and to non-PBS regions in the solar wind. This suggests that PBSs contain structures like current sheets or plasmoids that may be associated with network activity at the base of plumes.
A new approach to importance sampling for the simulation of false alarms. [in radar systems
NASA Technical Reports Server (NTRS)
Lu, D.; Yao, K.
1987-01-01
In this paper a modified importance sampling technique for improving the convergence of Importance Sampling is given. By using this approach to estimate low false alarm rates in radar simulations, the number of Monte Carlo runs can be reduced significantly. For one-dimensional exponential, Weibull, and Rayleigh distributions, a uniformly minimum variance unbiased estimator is obtained. For the Gaussian distribution the estimator in this approach is uniformly better than that of the previously known Importance Sampling approach. For a cell averaging system, by combining this technique and group sampling, the reduction in Monte Carlo runs for a reference cell of 20 and a false alarm rate of 1E-6 is on the order of 170 as compared to the previously known Importance Sampling approach.
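As a generic illustration of why importance sampling helps at such low false alarm rates (not the authors' modified scheme), the sketch below estimates P(X > T) near 1E-6 for an exponential test statistic by drawing from a heavier-tailed exponential proposal and weighting by the likelihood ratio; the proposal rate is an arbitrary choice for the example.

```python
import numpy as np

def is_false_alarm_prob(T, lam=0.1, n=10_000, seed=11):
    """Importance-sampling estimate of P(X > T) for X ~ Exponential(1),
    using a heavier-tailed Exponential(lam) proposal with lam < 1."""
    rng = np.random.default_rng(seed)
    x = rng.exponential(1.0 / lam, n)              # draws from the proposal g
    w = np.exp(-(1.0 - lam) * x) / lam             # likelihood ratio f(x)/g(x)
    est = np.mean((x > T) * w)
    se = np.std((x > T) * w, ddof=1) / np.sqrt(n)
    return est, se

T = -np.log(1e-6)                  # threshold giving a 1e-6 false alarm rate
est, se = is_false_alarm_prob(T)
print(est, "+/-", se, "exact:", np.exp(-T))
```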
An Optimal Estimation Method to Obtain Surface Layer Turbulent Fluxes from Profile Measurements
NASA Astrophysics Data System (ADS)
Kang, D.
2015-12-01
In the absence of direct turbulence measurements, the turbulence characteristics of the atmospheric surface layer are often derived from measurements of the surface layer mean properties based on Monin-Obukhov Similarity Theory (MOST). This approach requires two levels of the ensemble mean wind, temperature, and water vapor, from which the fluxes of momentum, sensible heat, and water vapor can be obtained. When only one measurement level is available, the roughness heights and the assumed properties of the corresponding variables at the respective roughness heights are used. In practice, the temporal mean with a large number of samples is used in place of the ensemble mean. However, in many situations the samples of data are taken from multiple levels. It is thus desirable to derive the boundary layer flux properties using all measurements. In this study, we used an optimal estimation approach to derive surface layer properties based on all available measurements. This approach assumes that the samples are taken from a population whose ensemble mean profile follows the MOST. An optimized estimate is obtained when the results yield a minimum cost function, defined as a weighted summation of the error variances at each sample altitude. The weights are based on the sample data variance and the altitude of the measurements. This method was applied to measurements in the marine atmospheric surface layer from a small boat using a radiosonde on a tethered balloon, where temperature and relative humidity profiles in the lowest 50 m were made repeatedly in about 30 minutes. We will present the resultant fluxes and the derived MOST mean profiles using different sets of measurements. The advantage of this method over the 'traditional' methods will be illustrated. Some limitations of this optimization method will also be discussed. Its application to quantify the effects of the marine surface layer environment on radar and communication signal propagation will be shown as well.
Stratum variance estimation for sample allocation in crop surveys. [Great Plains Corridor
NASA Technical Reports Server (NTRS)
Perry, C. R., Jr.; Chhikara, R. S. (Principal Investigator)
1980-01-01
The problem of determining stratum variances needed in achieving an optimum sample allocation for crop surveys by remote sensing is investigated by considering an approach based on the concept of stratum variance as a function of the sampling unit size. A methodology using the existing and easily available information of historical crop statistics is developed for obtaining initial estimates of stratum variances. The procedure is applied to estimate stratum variances for wheat in the U.S. Great Plains and is evaluated based on the numerical results thus obtained. It is shown that the proposed technique is viable and performs satisfactorily, using a conservative value for the field size and crop statistics from the small political subdivision level, when the estimated stratum variances are compared to those obtained using the LANDSAT data.
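Once stratum variances are in hand, the optimum (Neyman) allocation assigns sample units in proportion to N_h · S_h, the stratum size times its standard deviation. The sketch below shows the arithmetic with invented stratum sizes and standard deviations, not the Great Plains figures.

```python
import numpy as np

def neyman_allocation(N_h, S_h, n_total):
    """Optimum sample allocation: n_h proportional to N_h * S_h."""
    w = np.asarray(N_h, float) * np.asarray(S_h, float)
    return np.round(n_total * w / w.sum()).astype(int)

# Illustrative strata: sizes (sampling units) and estimated acreage SDs.
N_h = [120, 300, 80]
S_h = [15.0, 40.0, 9.0]
print(neyman_allocation(N_h, S_h, n_total=100))   # -> [12 83  5]
```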
Generalized Variance Function Applications in Forestry
James Alegria; Charles T. Scott
1991-01-01
Adequately predicting the sampling errors of tabular data can reduce printing costs by eliminating the need to publish separate sampling error tables. Two generalized variance functions (GVFs) found in the literature and three GVFs derived for this study were evaluated for their ability to predict the sampling error of tabular forestry estimates. The recommended GVFs...
Uncertainty in Population Estimates for Endangered Animals and Improving the Recovery Process
Haines, Aaron M.; Zak, Matthew; Hammond, Katie; Scott, J. Michael; Goble, Dale D.; Rachlow, Janet L.
2013-01-01
Simple Summary The objective of our study was to evaluate the mention of uncertainty (i.e., variance) associated with population size estimates within U.S. recovery plans for endangered animals. To do this we reviewed all finalized recovery plans for listed terrestrial vertebrate species. We found that more recent recovery plans reported more estimates of population size and uncertainty. Also, bird and mammal recovery plans reported more estimates of population size and uncertainty. We recommend that updated recovery plans combine uncertainty of population size estimates with a minimum detectable difference to aid in successful recovery. Abstract United States recovery plans contain biological information for a species listed under the Endangered Species Act and specify recovery criteria to provide basis for species recovery. The objective of our study was to evaluate whether recovery plans provide uncertainty (e.g., variance) with estimates of population size. We reviewed all finalized recovery plans for listed terrestrial vertebrate species to record the following data: (1) if a current population size was given, (2) if a measure of uncertainty or variance was associated with current estimates of population size and (3) if population size was stipulated for recovery. We found that 59% of completed recovery plans specified a current population size, 14.5% specified a variance for the current population size estimate and 43% specified population size as a recovery criterion. More recent recovery plans reported more estimates of current population size, uncertainty and population size as a recovery criterion. Also, bird and mammal recovery plans reported more estimates of population size and uncertainty compared to reptiles and amphibians. We suggest the use of calculating minimum detectable differences to improve confidence when delisting endangered animals and we identified incentives for individuals to get involved in recovery planning to improve access to quantitative data. PMID:26479531
Analysis and application of minimum variance discrete time system identification
NASA Technical Reports Server (NTRS)
Kaufman, H.; Kotob, S.
1975-01-01
An on-line minimum variance parameter identifier is developed which embodies both accuracy and computational efficiency. The formulation results in a linear estimation problem with both additive and multiplicative noise. The resulting filter which utilizes both the covariance of the parameter vector itself and the covariance of the error in identification is proven to be mean square convergent and mean square consistent. The MV parameter identification scheme is then used to construct a stable state and parameter estimation algorithm.
A Bayesian sequential design with adaptive randomization for 2-sided hypothesis test.
Yu, Qingzhao; Zhu, Lin; Zhu, Han
2017-11-01
Bayesian sequential and adaptive randomization designs are gaining popularity in clinical trials thanks to their potential to reduce the number of required participants and save resources. We propose a Bayesian sequential design with adaptive randomization rates so as to more efficiently allocate newly recruited patients to different treatment arms. In this paper, we consider 2-arm clinical trials. Patients are allocated to the 2 arms with a randomization rate chosen to achieve minimum variance for the test statistic. Algorithms are presented to calculate the optimal randomization rate, critical values, and power for the proposed design. Sensitivity analysis is implemented to check the influence on the design of changing the prior distributions. Simulation studies are applied to compare the proposed method and traditional methods in terms of power and actual sample sizes. Simulations show that, when the total sample size is fixed, the proposed design can obtain greater power and/or require a smaller actual sample size than the traditional Bayesian sequential design. Finally, we apply the proposed method to a real data set and compare the results with the Bayesian sequential design without adaptive randomization in terms of sample sizes. The proposed method can further reduce the required sample size. Copyright © 2017 John Wiley & Sons, Ltd.
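For intuition about the minimum-variance allocation, with known arm standard deviations the randomization rate that minimizes the variance of the difference-in-means statistic is the Neyman fraction r = σ₁/(σ₁+σ₂). The check below compares this allocation with equal allocation; the numbers are illustrative and this is not the paper's Bayesian algorithm.

```python
import numpy as np

def optimal_randomization_rate(sigma1, sigma2):
    """Allocation fraction to arm 1 that minimizes Var(mean1 - mean2)
    for a fixed total sample size: r = sigma1 / (sigma1 + sigma2)."""
    return sigma1 / (sigma1 + sigma2)

def diff_variance(r, n, sigma1, sigma2):
    """Variance of the difference in sample means with allocation fraction r."""
    return sigma1 ** 2 / (r * n) + sigma2 ** 2 / ((1 - r) * n)

n, s1, s2 = 200, 2.0, 1.0
r_opt = optimal_randomization_rate(s1, s2)
print(round(r_opt, 3),
      round(diff_variance(r_opt, n, s1, s2), 4),   # optimal allocation
      round(diff_variance(0.5, n, s1, s2), 4))     # equal allocation (larger)
```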
Frequency-domain beamformers using conjugate gradient techniques for speech enhancement.
Zhao, Shengkui; Jones, Douglas L; Khoo, Suiyang; Man, Zhihong
2014-09-01
A multiple-iteration constrained conjugate gradient (MICCG) algorithm and a single-iteration constrained conjugate gradient (SICCG) algorithm are proposed to realize the widely used frequency-domain minimum-variance-distortionless-response (MVDR) beamformers and the resulting algorithms are applied to speech enhancement. The algorithms are derived based on the Lagrange method and the conjugate gradient techniques. The implementations of the algorithms avoid any form of explicit or implicit autocorrelation matrix inversion. Theoretical analysis establishes formal convergence of the algorithms. Specifically, the MICCG algorithm is developed based on a block adaptation approach and it generates a finite sequence of estimates that converge to the MVDR solution. For limited data records, the estimates of the MICCG algorithm are better than the conventional estimators and equivalent to the auxiliary vector algorithms. The SICCG algorithm is developed based on a continuous adaptation approach with a sample-by-sample updating procedure and the estimates asymptotically converge to the MVDR solution. An illustrative example using synthetic data from a uniform linear array is studied and an evaluation on real data recorded by an acoustic vector sensor array is demonstrated. Performance of the MICCG algorithm and the SICCG algorithm are compared with the state-of-the-art approaches.
NASA Astrophysics Data System (ADS)
Sudharsanan, Subramania I.; Mahalanobis, Abhijit; Sundareshan, Malur K.
1990-12-01
Discrete frequency domain design of Minimum Average Correlation Energy filters for optical pattern recognition introduces an implementational limitation of circular correlation. An alternative methodology which uses space domain computations to overcome this problem is presented. The technique is generalized to construct an improved synthetic discriminant function which satisfies the conflicting requirements of reduced noise variance and sharp correlation peaks to facilitate ease of detection. A quantitative evaluation of the performance characteristics of the new filter is conducted and is shown to compare favorably with the well known Minimum Variance Synthetic Discriminant Function and the space domain Minimum Average Correlation Energy filter, which are special cases of the present design.
Applications of active adaptive noise control to jet engines
NASA Technical Reports Server (NTRS)
Shoureshi, Rahmat; Brackney, Larry
1993-01-01
During phase 2 research on the application of active noise control to jet engines, the development of multiple-input/multiple-output (MIMO) active adaptive noise control algorithms and acoustic/controls models for turbofan engines was considered. Specific goals for this research phase included: (1) implementation of a MIMO adaptive minimum variance active noise controller; and (2) turbofan engine model development. A minimum variance control law for adaptive active noise control has been developed, simulated, and implemented for single-input/single-output (SISO) systems. Since acoustic systems tend to be distributed, multiple sensors and actuators are more appropriate. As such, the SISO minimum variance controller was extended to the MIMO case. Simulation and experimental results are presented. A state-space model of a simplified gas turbine engine is developed using the bond graph technique. The model retains important system behavior, yet is of low enough order to be useful for controller design. Expansion of the model to include multiple stages and spools is also discussed.
Milliren, Carly E; Evans, Clare R; Richmond, Tracy K; Dunn, Erin C
2018-06-06
Recent advances in multilevel modeling allow for modeling non-hierarchical levels (e.g., youth in non-nested schools and neighborhoods) using cross-classified multilevel models (CCMM). Current practice is to cluster samples from one context (e.g., schools) and utilize the observations however they are distributed from the second context (e.g., neighborhoods). However, it is unknown whether an uneven distribution of sample size across these contexts leads to incorrect estimates of random effects in CCMMs. Using the school and neighborhood data structure in Add Health, we examined the effect of neighborhood sample size imbalance on the estimation of variance parameters in models predicting BMI. We differentially assigned students from a given school to neighborhoods within that school's catchment area using three scenarios of (im)balance. 1000 random datasets were simulated for each of five combinations of school- and neighborhood-level variance and imbalance scenarios, for a total of 15,000 simulated data sets. For each simulation, we calculated 95% CIs for the variance parameters to determine whether the true simulated variance fell within the interval. Across all simulations, the "true" school and neighborhood variance parameters were estimated 93-96% of the time. Only 5% of models failed to capture neighborhood variance; 6% failed to capture school variance. These results suggest that there is no systematic bias in the ability of CCMM to capture the true variance parameters regardless of the distribution of students across neighborhoods. Ongoing efforts to use CCMM are warranted and can proceed without concern for the sample imbalance across contexts. Copyright © 2018 Elsevier Ltd. All rights reserved.
Estimating acreage by double sampling using LANDSAT data
NASA Technical Reports Server (NTRS)
Pont, F.; Horwitz, H.; Kauth, R. (Principal Investigator)
1982-01-01
Double sampling techniques employing LANDSAT data for estimating the acreage of corn and soybeans were investigated and evaluated. The evaluation was based on estimated costs and correlations between two existing procedures having differing cost/variance characteristics, and included consideration of their individual merits when coupled with a fictional 'perfect' procedure of zero bias and variance. Two features of the analysis are: (1) the simultaneous estimation of two or more crops; and (2) the imposition of linear cost constraints among two or more types of resource. A reasonably realistic operational scenario was postulated. The costs were estimated from current experience with the measurement procedures involved, and the correlations were estimated from a set of 39 LACIE-type sample segments located in the U.S. Corn Belt. For a fixed variance of the estimate, double sampling with the two existing LANDSAT measurement procedures can result in a 25% or 50% cost reduction. Double sampling which included the fictional perfect procedure results in a more cost-effective combination when it is used with the lower-cost/higher-variance representative of the existing procedures.
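The cost/variance trade-off being evaluated can be illustrated with the textbook double-sampling-for-regression formulas rather than the study's own procedures: the estimator variance is approximately S²[(1 − ρ²)/n + ρ²/n′] for n expensive (ground) segments and n′ cheap (LANDSAT-only) segments, and the optimal split under a linear cost constraint follows from a Lagrange argument. All numbers below are hypothetical.

```python
import numpy as np

def optimal_double_sampling(S2, rho, c_expensive, c_cheap, budget):
    """Classical optimal allocation for double sampling with regression.
    Approximate variance: S2 * ((1 - rho**2) / n + rho**2 / n_prime),
    subject to the cost constraint c_expensive*n + c_cheap*n_prime = budget."""
    # Optimal sizes are proportional to sqrt(variance share / unit cost).
    a = np.sqrt((1.0 - rho**2) / c_expensive)
    b = np.sqrt(rho**2 / c_cheap)
    n = budget * a / (c_expensive * a + c_cheap * b)
    n_prime = budget * b / (c_expensive * a + c_cheap * b)
    var = S2 * ((1.0 - rho**2) / n + rho**2 / n_prime)
    return n, n_prime, var

# Hypothetical numbers: ground segments 20x as costly as LANDSAT-only segments.
n, n_prime, var = optimal_double_sampling(S2=1.0, rho=0.9, c_expensive=20.0,
                                          c_cheap=1.0, budget=1000.0)
print(round(n, 1), round(n_prime, 1), round(var, 5))
```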
Automatic Bayes Factors for Testing Equality- and Inequality-Constrained Hypotheses on Variances.
Böing-Messing, Florian; Mulder, Joris
2018-05-03
In comparing characteristics of independent populations, researchers frequently expect a certain structure of the population variances. These expectations can be formulated as hypotheses with equality and/or inequality constraints on the variances. In this article, we consider the Bayes factor for testing such (in)equality-constrained hypotheses on variances. Application of Bayes factors requires specification of a prior under every hypothesis to be tested. However, specifying subjective priors for variances based on prior information is a difficult task. We therefore consider so-called automatic or default Bayes factors. These methods avoid the need for the user to specify priors by using information from the sample data. We present three automatic Bayes factors for testing variances. The first is a Bayes factor with equal priors on all variances, where the priors are specified automatically using a small share of the information in the sample data. The second is the fractional Bayes factor, where a fraction of the likelihood is used for automatic prior specification. The third is an adjustment of the fractional Bayes factor such that the parsimony of inequality-constrained hypotheses is properly taken into account. The Bayes factors are evaluated by investigating different properties such as information consistency and large sample consistency. Based on this evaluation, it is concluded that the adjusted fractional Bayes factor is generally recommendable for testing equality- and inequality-constrained hypotheses on variances.
Spectral analysis comparisons of Fourier-theory-based methods and minimum variance (Capon) methods
NASA Astrophysics Data System (ADS)
Garbanzo-Salas, Marcial; Hocking, Wayne. K.
2015-09-01
In recent years, adaptive (data-dependent) methods have been introduced into many areas where Fourier spectral analysis has traditionally been used. Although the data-dependent methods are often advanced as being superior to Fourier methods, they do require some finesse in choosing the order of the relevant filters. In performing comparisons, we have found some concerns about the mappings, particularly when related to cases involving many spectral lines or even continuous spectral signals. Using numerical simulations, several comparisons between Fourier transform procedures and the minimum variance method (MVM) have been performed. For multiple frequency signals, the MVM resolves most of the frequency content only for filters that have more degrees of freedom than the number of distinct spectral lines in the signal. In the case of a Gaussian spectral approximation, the MVM will always underestimate the width, and can misplace the location of a spectral line in some circumstances. Large filters can be used to improve results with multiple frequency signals, but are computationally inefficient. Significant biases can occur when using the MVM to study spectral information or echo power from the atmosphere. Artifacts and artificial narrowing of turbulent layers are examples of such impacts.
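For reference, the minimum variance (Capon) spectral estimator being compared is P(f) ∝ 1 / (e(f)ᴴ R⁻¹ e(f)), with R the order-p autocorrelation matrix and e(f) the steering vector; the filter-order sensitivity discussed above corresponds to the choice of p. A minimal sketch on synthetic data follows; the signal frequencies, filter order, and diagonal loading are arbitrary choices.

```python
import numpy as np

def capon_spectrum(x, p, freqs):
    """Minimum variance (Capon) spectral estimate 1 / (e^H R^{-1} e);
    normalization conventions differ by a factor of the filter order p."""
    x = np.asarray(x, dtype=float)
    N = len(x)
    r = np.array([np.dot(x[:N - k], x[k:]) / N for k in range(p)])   # biased ACF
    R = np.array([[r[abs(i - j)] for j in range(p)] for i in range(p)])
    R += 1e-6 * r[0] * np.eye(p)            # small diagonal load for stability
    Rinv = np.linalg.inv(R)
    e_all = [np.exp(-2j * np.pi * f * np.arange(p)) for f in freqs]
    return np.array([1.0 / np.real(np.conj(e) @ Rinv @ e) for e in e_all])

# Two well-separated sinusoids in noise; with p = 20 degrees of freedom both
# lines appear as peaks, while a much smaller p may merge them.
rng = np.random.default_rng(1)
n = np.arange(1024)
x = np.sin(2 * np.pi * 0.10 * n) + np.sin(2 * np.pi * 0.25 * n) \
    + 0.5 * rng.standard_normal(1024)
freqs = np.linspace(0.0, 0.5, 512)
psd = capon_spectrum(x, p=20, freqs=freqs)
print(round(freqs[np.argmax(psd)], 3))      # near 0.10 or 0.25
```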
The performance of matched-field track-before-detect methods using shallow-water Pacific data.
Tantum, Stacy L; Nolte, Loren W; Krolik, Jeffrey L; Harmanci, Kerem
2002-07-01
Matched-field track-before-detect processing, which extends the concept of matched-field processing to include modeling of the source dynamics, has recently emerged as a promising approach for maintaining the track of a moving source. In this paper, optimal Bayesian and minimum variance beamforming track-before-detect algorithms which incorporate a priori knowledge of the source dynamics in addition to the underlying uncertainties in the ocean environment are presented. A Markov model is utilized for the source motion as a means of capturing the stochastic nature of the source dynamics without assuming uniform motion. In addition, the relationship between optimal Bayesian track-before-detect processing and minimum variance track-before-detect beamforming is examined, revealing how an optimal tracking philosophy may be used to guide the modification of existing beamforming techniques to incorporate track-before-detect capabilities. Further, the benefits of implementing an optimal approach over conventional methods are illustrated through application of these methods to shallow-water Pacific data collected as part of the SWellEX-1 experiment. The results show that incorporating Markovian dynamics for the source motion provides marked improvement in the ability to maintain target track without the use of a uniform velocity hypothesis.
VizieR Online Data Catalog: AGNs in submm-selected Lockman Hole galaxies (Serjeant+, 2010)
NASA Astrophysics Data System (ADS)
Serjeant, S.; Negrello, M.; Pearson, C.; Mortier, A.; Austermann, J.; Aretxaga, I.; Clements, D.; Chapman, S.; Dye, S.; Dunlop, J.; Dunne, L.; Farrah, D.; Hughes, D.; Lee, H. M.; Matsuhara, H.; Ibar, E.; Im, M.; Jeong, W.-S.; Kim, S.; Oyabu, S.; Takagi, T.; Wada, T.; Wilson, G.; Vaccari, M.; Yun, M.
2013-11-01
We present a comparison of the SCUBA half degree extragalactic survey (SHADES) at 450μm, 850μm and 1100μm with deep guaranteed time 15μm AKARI FU-HYU survey data and Spitzer guaranteed time data at 3.6-24μm in the Lockman hole east. The AKARI data were analysed using bespoke software based in part on the drizzling and minimum-variance matched filtering developed for SHADES, and were cross-calibrated against ISO fluxes. (2 data files).
Eaton, Jeffrey W.; Bao, Le
2017-01-01
Objectives The aim of the study was to propose and demonstrate an approach to allow additional nonsampling uncertainty about HIV prevalence measured at antenatal clinic sentinel surveillance (ANC-SS) in model-based inferences about trends in HIV incidence and prevalence. Design Mathematical model fitted to surveillance data with Bayesian inference. Methods We introduce a variance inflation parameter σinfl² that accounts for the uncertainty of nonsampling errors in ANC-SS prevalence. It is additive to the sampling error variance. Three approaches are tested for estimating σinfl² using ANC-SS and household survey data from 40 subnational regions in nine countries in sub-Saharan Africa, as defined in UNAIDS 2016 estimates. Methods were compared using in-sample fit and out-of-sample prediction of ANC-SS data, fit to household survey prevalence data, and the computational implications. Results Introducing the additional variance parameter σinfl² increased the error variance around ANC-SS prevalence observations by a median of 2.7 times (interquartile range 1.9–3.8). Using only sampling error in ANC-SS prevalence (σinfl² = 0), coverage of 95% prediction intervals was 69% in out-of-sample prediction tests. This increased to 90% after introducing the additional variance parameter σinfl². The revised probabilistic model improved model fit to household survey prevalence and increased epidemic uncertainty intervals most during the early epidemic period before 2005. Estimating σinfl² did not increase the computational cost of model fitting. Conclusions We recommend estimating nonsampling error in ANC-SS as an additional parameter in Bayesian inference using the Estimation and Projection Package model. This approach may prove useful for incorporating other data sources such as routine prevalence from Prevention of mother-to-child transmission testing into future epidemic estimates. PMID:28296801
Robert B. Thomas; Jack Lewis
1993-01-01
Time-stratified sampling of sediment for estimating suspended load is introduced and compared to selection at list time (SALT) sampling. Both methods provide unbiased estimates of load and variance. The magnitude of the variance of the two methods is compared using five storm populations of suspended sediment flux derived from turbidity data. Under like conditions,...
Statistical classification techniques for engineering and climatic data samples
NASA Technical Reports Server (NTRS)
Temple, E. C.; Shipman, J. R.
1981-01-01
Fisher's sample linear discriminant function is modified through an appropriate alteration of the common sample variance-covariance matrix. The alteration consists of adding nonnegative values to the eigenvalues of the sample variance-covariance matrix. The desired result of this modification is to increase the number of correct classifications by the new linear discriminant function over Fisher's function. This study is limited to the two-group discriminant problem.
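The described alteration, adding nonnegative values to the eigenvalues of the pooled sample variance-covariance matrix before forming the discriminant, can be sketched as follows. This is a generic illustration, not the study's exact procedure; the constant delta added to every eigenvalue and the simulated two-group data are hypothetical.

```python
import numpy as np

def modified_fisher_discriminant(X1, X2, delta=0.1):
    """Fisher's linear discriminant with the pooled covariance matrix altered
    by adding a nonnegative constant `delta` to each of its eigenvalues."""
    n1, n2 = len(X1), len(X2)
    m1, m2 = X1.mean(axis=0), X2.mean(axis=0)
    S_pooled = ((n1 - 1) * np.cov(X1, rowvar=False) +
                (n2 - 1) * np.cov(X2, rowvar=False)) / (n1 + n2 - 2)
    # Alter the spectrum: S_mod = V diag(eigvals + delta) V'.
    eigvals, V = np.linalg.eigh(S_pooled)
    S_mod = V @ np.diag(eigvals + delta) @ V.T
    w = np.linalg.solve(S_mod, m1 - m2)          # discriminant direction
    threshold = w @ (m1 + m2) / 2.0              # midpoint classification rule
    return w, threshold

# Toy two-group example.
rng = np.random.default_rng(2)
X1 = rng.multivariate_normal([0, 0], [[1, 0.8], [0.8, 1]], size=50)
X2 = rng.multivariate_normal([1, 1], [[1, 0.8], [0.8, 1]], size=50)
w, t = modified_fisher_discriminant(X1, X2, delta=0.1)
print(round((X1 @ w > t).mean(), 2))   # fraction of group 1 classified as group 1
```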
Bootstrap Estimation and Testing for Variance Equality.
ERIC Educational Resources Information Center
Olejnik, Stephen; Algina, James
The purpose of this study was to develop a single procedure for comparing population variances which could be used across different distribution forms. Bootstrap methodology was used to estimate the variability of the sample variance statistic when the population distribution was normal, platykurtic and leptokurtic. The data for the study were generated and…
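The core resampling step, re-estimating the sample variance from bootstrap resamples to gauge its variability without assuming a distributional form, can be sketched as below; this is a generic illustration, not the study's program, and the heavy-tailed example distribution and replication counts are arbitrary.

```python
import numpy as np

def bootstrap_variance_ci(x, n_boot=5000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the population variance,
    making no assumption about shape (normal, platykurtic or leptokurtic)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    boot_vars = np.empty(n_boot)
    for b in range(n_boot):
        resample = rng.choice(x, size=len(x), replace=True)
        boot_vars[b] = resample.var(ddof=1)
    lower, upper = np.quantile(boot_vars, [alpha / 2, 1 - alpha / 2])
    return x.var(ddof=1), boot_vars.std(ddof=1), (lower, upper)

# Example with a heavy-tailed (leptokurtic) sample.
rng = np.random.default_rng(3)
sample = rng.standard_t(df=5, size=100)
var_hat, boot_se, ci = bootstrap_variance_ci(sample)
print(round(var_hat, 2), round(boot_se, 2), tuple(round(v, 2) for v in ci))
```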
Minimum number of measurements for evaluating soursop (Annona muricata L.) yield.
Sánchez, C F B; Teodoro, P E; Londoño, S; Silva, L A; Peixoto, L A; Bhering, L L
2017-05-31
Repeatability studies on fruit species are of great importance to identify the minimum number of measurements necessary to accurately select superior genotypes. This study aimed to identify the most efficient method to estimate the repeatability coefficient (r) and predict the minimum number of measurements needed for a more accurate evaluation of soursop (Annona muricata L.) genotypes based on fruit yield. Sixteen measurements of fruit yield from 71 soursop genotypes were carried out between 2000 and 2016. In order to estimate r with the best accuracy, four procedures were used: analysis of variance, principal component analysis based on the correlation matrix, principal component analysis based on the phenotypic variance and covariance matrix, and structural analysis based on the correlation matrix. The minimum number of measurements needed to predict the actual value of individuals was estimated. Principal component analysis using the phenotypic variance and covariance matrix provided the most accurate estimates of both r and the number of measurements required for accurate evaluation of fruit yield in soursop. Our results indicate that selection of soursop genotypes with high fruit yield can be performed based on the third and fourth measurements in the early years and/or based on the eighth and ninth measurements at more advanced stages.
Parsons, Helen M; Ludwig, Christian; Günther, Ulrich L; Viant, Mark R
2007-01-01
Background Classifying nuclear magnetic resonance (NMR) spectra is a crucial step in many metabolomics experiments. Since several multivariate classification techniques depend upon the variance of the data, it is important to first minimise any contribution from unwanted technical variance arising from sample preparation and analytical measurements, and thereby maximise any contribution from wanted biological variance between different classes. The generalised logarithm (glog) transform was developed to stabilise the variance in DNA microarray datasets, but has rarely been applied to metabolomics data. In particular, it has not been rigorously evaluated against other scaling techniques used in metabolomics, nor tested on all forms of NMR spectra including 1-dimensional (1D) 1H, projections of 2D 1H, 1H J-resolved (pJRES), and intact 2D J-resolved (JRES). Results Here, the effects of the glog transform are compared against two commonly used variance stabilising techniques, autoscaling and Pareto scaling, as well as unscaled data. The four methods are evaluated in terms of the effects on the variance of NMR metabolomics data and on the classification accuracy following multivariate analysis, the latter achieved using principal component analysis followed by linear discriminant analysis. For two of three datasets analysed, classification accuracies were highest following glog transformation: 100% accuracy for discriminating 1D NMR spectra of hypoxic and normoxic invertebrate muscle, and 100% accuracy for discriminating 2D JRES spectra of fish livers sampled from two rivers. For the third dataset, pJRES spectra of urine from two breeds of dog, the glog transform and autoscaling achieved equal highest accuracies. Additionally we extended the glog algorithm to effectively suppress noise, which proved critical for the analysis of 2D JRES spectra. Conclusion We have demonstrated that the glog and extended glog transforms stabilise the technical variance in NMR metabolomics datasets. This significantly improves the discrimination between sample classes and has resulted in higher classification accuracies compared to unscaled, autoscaled or Pareto scaled data. Additionally we have confirmed the broad applicability of the glog approach using three disparate datasets from different biological samples using 1D NMR spectra, 1D projections of 2D JRES spectra, and intact 2D JRES spectra. PMID:17605789
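One common parameterisation of the generalised logarithm is glog(x) = log(x + sqrt(x² + λ)). The sketch below contrasts it with autoscaling and Pareto scaling on a simulated two-component (multiplicative plus additive) noise model rather than real NMR data; the noise levels and the choice of λ are illustrative, and the noise-suppressing extension described in the paper is not included.

```python
import numpy as np

def glog(x, lam):
    """Generalised logarithm: ~log(2x) for large x, variance-stabilising near zero."""
    x = np.asarray(x, dtype=float)
    return np.log(x + np.sqrt(x**2 + lam))

def autoscale(X):
    """Centre each variable and divide by its standard deviation."""
    return (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

def pareto_scale(X):
    """Centre each variable and divide by the square root of its standard deviation."""
    return (X - X.mean(axis=0)) / np.sqrt(X.std(axis=0, ddof=1))

# Simulated "peak intensities": multiplicative (20%) plus additive (SD 5) noise.
rng = np.random.default_rng(4)
true_intensity = np.linspace(1, 1000, 200)
spectra = true_intensity * (1 + 0.2 * rng.standard_normal((50, 200))) \
          + 5.0 * rng.standard_normal((50, 200))
lam = (5.0 / 0.2) ** 2   # additive variance / multiplicative variance for this model
print(spectra.std(axis=0)[::80].round(1))               # SD grows with intensity
print(glog(spectra, lam).std(axis=0)[::80].round(2))    # roughly constant SD
```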
Kriging analysis of mean annual precipitation, Powder River Basin, Montana and Wyoming
Karlinger, M.R.; Skrivan, James A.
1981-01-01
Kriging is a statistical estimation technique for regionalized variables which exhibit an autocorrelation structure. Such structure can be described by a semi-variogram of the observed data. The kriging estimate at any point is a weighted average of the data, where the weights are determined using the semi-variogram and an assumed drift, or lack of drift, in the data. Block, or areal, estimates can also be calculated. The kriging algorithm, based on unbiased and minimum-variance estimates, involves a linear system of equations to calculate the weights. Kriging variances can then be used to give confidence intervals of the resulting estimates. Mean annual precipitation in the Powder River basin, Montana and Wyoming, is an important variable when considering restoration of coal-strip-mining lands of the region. Two kriging analyses involving data at 60 stations were made--one assuming no drift in precipitation, and one a partial quadratic drift simulating orographic effects. Contour maps of estimates of mean annual precipitation were similar for both analyses, as were the corresponding contours of kriging variances. Block estimates of mean annual precipitation were made for two subbasins. Runoff estimates were 1-2 percent of the kriged block estimates. (USGS)
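The weights behind the unbiased, minimum-variance kriging estimate come from a small linear system built from the semi-variogram. Below is a minimal ordinary-kriging sketch (no drift) with an assumed exponential semi-variogram and made-up station data, not the Powder River fit; the kriging variance returned is what underlies the confidence intervals mentioned.

```python
import numpy as np

def exponential_semivariogram(h, nugget=0.0, sill=1.0, rng_param=50.0):
    """Assumed exponential semi-variogram model gamma(h)."""
    return nugget + (sill - nugget) * (1.0 - np.exp(-h / rng_param))

def ordinary_kriging(coords, values, target):
    """Unbiased, minimum-variance (ordinary kriging) estimate at `target`.
    Solves the kriging system for the weights and a Lagrange multiplier."""
    n = len(values)
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    gamma = exponential_semivariogram(d)
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = gamma
    A[n, n] = 0.0
    b = np.ones(n + 1)
    b[:n] = exponential_semivariogram(np.linalg.norm(coords - target, axis=1))
    sol = np.linalg.solve(A, b)
    weights, mu = sol[:n], sol[n]
    estimate = weights @ values
    kriging_variance = weights @ b[:n] + mu   # basis for confidence intervals
    return estimate, kriging_variance

# Toy precipitation-like data at 5 stations (coordinates in km, values in inches).
coords = np.array([[0.0, 0.0], [30.0, 5.0], [10.0, 40.0], [60.0, 20.0], [45.0, 55.0]])
values = np.array([14.0, 12.5, 16.0, 11.0, 15.2])
est, kvar = ordinary_kriging(coords, values, target=np.array([25.0, 25.0]))
print(round(est, 2), round(kvar, 3))
```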
Thermospheric mass density model error variance as a function of time scale
NASA Astrophysics Data System (ADS)
Emmert, J. T.; Sutton, E. K.
2017-12-01
In the increasingly crowded low-Earth orbit environment, accurate estimation of orbit prediction uncertainties is essential for collision avoidance. Poor characterization of such uncertainty can result in unnecessary and costly avoidance maneuvers (false positives) or disregard of a collision risk (false negatives). Atmospheric drag is a major source of orbit prediction uncertainty, and is particularly challenging to account for because it exerts a cumulative influence on orbital trajectories and is therefore not amenable to representation by a single uncertainty parameter. To address this challenge, we examine the variance of measured accelerometer-derived and orbit-derived mass densities with respect to predictions by thermospheric empirical models, using the data-minus-model variance as a proxy for model uncertainty. Our analysis focuses mainly on the power spectrum of the residuals, and we construct an empirical model of the variance as a function of time scale (from 1 hour to 10 years), altitude, and solar activity. We find that the power spectral density approximately follows a power-law process but with an enhancement near the 27-day solar rotation period. The residual variance increases monotonically with altitude between 250 and 550 km. There are two components to the variance dependence on solar activity: one component is 180 degrees out of phase (largest variance at solar minimum), and the other component lags 2 years behind solar maximum (largest variance in the descending phase of the solar cycle).
Breslow, Norman E.; Lumley, Thomas; Ballantyne, Christie M; Chambless, Lloyd E.; Kulich, Michal
2009-01-01
The case-cohort study involves two-phase sampling: simple random sampling from an infinite super-population at phase one and stratified random sampling from a finite cohort at phase two. Standard analyses of case-cohort data involve solution of inverse probability weighted (IPW) estimating equations, with weights determined by the known phase two sampling fractions. The variance of parameter estimates in (semi)parametric models, including the Cox model, is the sum of two terms: (i) the model based variance of the usual estimates that would be calculated if full data were available for the entire cohort; and (ii) the design based variance from IPW estimation of the unknown cohort total of the efficient influence function (IF) contributions. This second variance component may be reduced by adjusting the sampling weights, either by calibration to known cohort totals of auxiliary variables correlated with the IF contributions or by their estimation using these same auxiliary variables. Both adjustment methods are implemented in the R survey package. We derive the limit laws of coefficients estimated using adjusted weights. The asymptotic results suggest practical methods for construction of auxiliary variables that are evaluated by simulation of case-cohort samples from the National Wilms Tumor Study and by log-linear modeling of case-cohort data from the Atherosclerosis Risk in Communities Study. Although not semiparametric efficient, estimators based on adjusted weights may come close to achieving full efficiency within the class of augmented IPW estimators. PMID:20174455
Analysis of 20 magnetic clouds at 1 AU during a solar minimum
NASA Astrophysics Data System (ADS)
Gulisano, A. M.; Dasso, S.; Mandrini, C. H.; Démoulin, P.
We study 20 magnetic clouds, observed in situ by the spacecraft Wind, at the Lagrangian point L1, from 22 August, 1995, to 7 November, 1997. In previous works, assuming a cylindrical symmetry for the local magnetic configuration and a satellite trajectory crossing the axis of the cloud, we obtained their orientations using a minimum variance analysis. In this work we compute the orientations and magnetic configurations using a non-linear simultaneous fit of the geometric and physical parameters for a linear force-free model, including the possibility of a nonzero impact parameter. We quantify global magnitudes such as the relative magnetic helicity per unit length and compare the values found with both methods (minimum variance and the simultaneous fit). FULL TEXT IN SPANISH
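The minimum variance analysis used as the baseline method reduces to an eigen-decomposition of the magnetic field covariance matrix over the cloud interval: the eigenvector with the smallest eigenvalue estimates the direction along which B varies least. A minimal sketch on synthetic field data in place of the Wind measurements:

```python
import numpy as np

def minimum_variance_analysis(B):
    """Minimum variance analysis of an (N, 3) magnetic field time series.
    Returns eigenvalues (ascending) and eigenvectors (columns) of the
    magnetic variance matrix; the first column is the minimum variance direction."""
    B = np.asarray(B, dtype=float)
    M = np.cov(B, rowvar=False)           # 3 x 3 variance matrix of Bx, By, Bz
    eigvals, eigvecs = np.linalg.eigh(M)  # eigh returns ascending eigenvalues
    return eigvals, eigvecs

# Synthetic interval: large field rotation in the x-y plane, small fluctuations
# along z, so z should be recovered as the minimum variance direction.
rng = np.random.default_rng(5)
t = np.linspace(-1, 1, 400)
B = np.column_stack([10 * np.cos(np.pi * t / 2),
                     10 * np.sin(np.pi * t / 2),
                     2 + 0.3 * rng.standard_normal(t.size)])
eigvals, eigvecs = minimum_variance_analysis(B)
print(eigvals.round(2))
print(np.abs(eigvecs[:, 0]).round(2))   # close to (0, 0, 1)
```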
Effects of low sampling rate in the digital data-transition tracking loop
NASA Technical Reports Server (NTRS)
Mileant, A.; Million, S.; Hinedi, S.
1994-01-01
This article describes the performance of the all-digital data-transition tracking loop (DTTL) with coherent and noncoherent sampling using nonlinear theory. The effects of few samples per symbol and of noncommensurate sampling and symbol rates are addressed and analyzed. Their impact on the probability density and variance of the phase error are quantified through computer simulations. It is shown that the performance of the all-digital DTTL approaches its analog counterpart when the sampling and symbol rates are noncommensurate (i.e., the number of samples per symbol is an irrational number). The loop signal-to-noise ratio (SNR) (inverse of phase error variance) degrades when the number of samples per symbol is an odd integer but degrades even further for even integers.
Oregon ground-water quality and its relation to hydrogeological factors; a statistical approach
Miller, T.L.; Gonthier, J.B.
1984-01-01
An appraisal of Oregon ground-water quality was made using existing data accessible through the U.S. Geological Survey computer system. The data available for about 1,000 sites were separated by aquifer units and hydrologic units. Selected statistical moments were described for 19 constituents including major ions. About 96 percent of all sites in the data base were sampled only once. The sample data were classified by aquifer unit and hydrologic unit and analysis of variance was run to determine if significant differences exist between the units within each of these two classifications for the same 19 constituents on which statistical moments were determined. Results of the analysis of variance indicated both classification variables performed about the same, but aquifer unit did provide more separation for some constituents. Samples from the Rogue River basin were classified by location within the flow system and type of flow system. The samples were then analyzed using analysis of variance on 14 constituents to determine if there were significant differences between subsets classified by flow path. Results of this analysis were not definitive, but classification as to the type of flow system did indicate potential for segregating water-quality data into distinct subsets. (USGS)
NASA Astrophysics Data System (ADS)
Feng, Wenjie; Wu, Shenghe; Yin, Yanshu; Zhang, Jiajia; Zhang, Ke
2017-07-01
A training image (TI) can be regarded as a database of spatial structures and their low to higher order statistics used in multiple-point geostatistics (MPS) simulation. Presently, there are a number of methods to construct a series of candidate TIs (CTIs) for MPS simulation based on a modeler's subjective criteria. The spatial structures of TIs are often various, meaning that the compatibilities of different CTIs with the conditioning data are different. Therefore, evaluation and optimal selection of CTIs before MPS simulation is essential. This paper proposes a CTI evaluation and optimal selection method based on minimum data event distance (MDevD). In the proposed method, a set of MDevD properties are established through calculation of the MDevD of conditioning data events in each CTI. Then, CTIs are evaluated and ranked according to the mean value and variance of the MDevD properties. The smaller the mean value and variance of an MDevD property are, the more compatible the corresponding CTI is with the conditioning data. In addition, data events with low compatibility in the conditioning data grid can be located to help modelers select a set of complementary CTIs for MPS simulation. The MDevD property can also help to narrow the range of the distance threshold for MPS simulation. The proposed method was evaluated using three examples: a 2D categorical example, a 2D continuous example, and an actual 3D oil reservoir case study. To illustrate the method, a C++ implementation of the method is attached to the paper.
A de-noising method using the improved wavelet threshold function based on noise variance estimation
NASA Astrophysics Data System (ADS)
Liu, Hui; Wang, Weida; Xiang, Changle; Han, Lijin; Nie, Haizhao
2018-01-01
The precise and efficient noise variance estimation is very important for the processing of all kinds of signals while using the wavelet transform to analyze signals and extract signal features. In view of the problem that the accuracy of traditional noise variance estimation is greatly affected by the fluctuation of noise values, this study puts forward the strategy of using the two-state Gaussian mixture model to classify the high-frequency wavelet coefficients in the minimum scale, which takes both the efficiency and accuracy into account. According to the noise variance estimation, a novel improved wavelet threshold function is proposed by combining the advantages of hard and soft threshold functions, and on the basis of the noise variance estimation algorithm and the improved wavelet threshold function, the research puts forth a novel wavelet threshold de-noising method. The method is tested and validated using random signals and bench test data of an electro-mechanical transmission system. The test results indicate that the wavelet threshold de-noising method based on the noise variance estimation shows preferable performance in processing the testing signals of the electro-mechanical transmission system: it can effectively eliminate the interference of transient signals including voltage, current, and oil pressure and maintain the dynamic characteristics of the signals favorably.
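A minimal sketch of the two ingredients discussed, a noise-level estimate from the finest-scale detail coefficients and a threshold function that interpolates between the hard and soft rules, is given below. It assumes the PyWavelets package, uses the common MAD-based noise estimate rather than the paper's two-state Gaussian mixture classification, and the interpolation parameter alpha is a hypothetical stand-in for the authors' improved threshold function.

```python
import numpy as np
import pywt

def mad_sigma(detail_coeffs):
    """Robust noise SD estimate from the finest-scale detail coefficients."""
    return np.median(np.abs(detail_coeffs)) / 0.6745

def improved_threshold(c, thr, alpha=0.5):
    """Interpolates between hard (alpha=0) and soft (alpha=1) thresholding:
    coefficients below the threshold are zeroed, those above are shrunk by
    alpha * thr, keeping more signal energy than pure soft thresholding."""
    return np.where(np.abs(c) <= thr, 0.0, c - alpha * thr * np.sign(c))

def wavelet_denoise(x, wavelet="db4", level=4, alpha=0.5):
    coeffs = pywt.wavedec(x, wavelet, level=level)
    sigma = mad_sigma(coeffs[-1])                      # finest-scale details
    thr = sigma * np.sqrt(2.0 * np.log(len(x)))        # universal threshold
    denoised = [coeffs[0]] + [improved_threshold(c, thr, alpha) for c in coeffs[1:]]
    return pywt.waverec(denoised, wavelet)[:len(x)]

# Noisy test signal.
rng = np.random.default_rng(6)
t = np.linspace(0, 1, 2048)
clean = np.sin(2 * np.pi * 5 * t) + 0.5 * np.sign(np.sin(2 * np.pi * 2 * t))
noisy = clean + 0.3 * rng.standard_normal(t.size)
denoised = wavelet_denoise(noisy)
print(round(np.std(noisy - clean), 3), round(np.std(denoised - clean), 3))
```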
DOE Office of Scientific and Technical Information (OSTI.GOV)
Luis, Alfredo
The use of Renyi entropy as an uncertainty measure alternative to variance leads to the study of states with quantum fluctuations below the levels established by Gaussian states, which are the position-momentum minimum uncertainty states according to variance. We examine the quantum properties of states with exponential wave functions, which combine reduced fluctuations with practical feasibility.
Picone, Marco; Bergamin, Martina; Delaney, Eugenia; Ghirardini, Annamaria Volpi; Kusk, Kresten Ole
2018-01-01
The early-life stages of development of the calanoid copepod Acartia tonsa from egg to copepodite I is proposed as an endpoint for assessing sediment toxicity by exposing newly released eggs directly onto the sediment-water interface. A preliminary study of 5 sediment samples collected in the lagoon of Venice highlighted that the larval development rate (LDR) and the early-life stages (ELS) mortality endpoints with A. tonsa are more sensitive than the standard amphipod mortality test; moreover LDR resulted in a more reliable endpoint than ELS mortality, due to the interference of the sediment with the recovery of unhatched eggs and dead larvae. The LDR data collected in a definitive study of 48 sediment samples from the Venice Lagoon has been analysed together with the preliminary data to evaluate the statistical performances of the bioassay (among replicate variance and minimum significant difference between samples and control) and to investigate the possible correlation with sediment chemistry and physical properties. The results showed that statistical performances of the LDR test with A. tonsa correspond with the outcomes of other tests applied to the sediment-water interface (Strongylocentrotus purpuratus embryotoxicity test), sediments (Neanthes arenaceodentata survival and growth test) and porewater (S. purpuratus); the LDR endpoint did, however, show a slightly higher variance as compared with other tests used in the Lagoon of Venice, such as 10-d amphipod lethality test and larval development with sea urchin and bivalves embryos. Sediment toxicity data highlighted the high sensitivity and the clear ability of the larval development to discriminate among sediments characterized by different levels of contamination. The data of the definitive study evidenced that inhibition of the larval development was not affected by grain-size and the organic carbon content of the sediment; in contrast, a strong correlation between inhibition of the larval development and the sediment concentrations of some metals (Cu, Hg, Pb, Zn), acid-volatile sulphides (AVS), polychlorinated biphenyls (PCBs) and polynuclear aromatic hydrocarbons (PAHs) was found. No correlation was found with DDTs, hexachlorobenzene and organotin compounds. Copyright © 2017 Elsevier Inc. All rights reserved.
Spatial Prediction and Optimized Sampling Design for Sodium Concentration in Groundwater
Shabbir, Javid; M. AbdEl-Salam, Nasser; Hussain, Tajammal
2016-01-01
Sodium is an integral part of water, and its excessive amount in drinking water causes high blood pressure and hypertension. In the present paper, the spatial distribution of sodium concentration in drinking water is modeled and optimized sampling designs for selecting sampling locations are calculated for three divisions in Punjab, Pakistan. Universal kriging and Bayesian universal kriging are used to predict the sodium concentrations. Spatial simulated annealing is used to generate optimized sampling designs. Different estimation methods (i.e., maximum likelihood, restricted maximum likelihood, ordinary least squares, and weighted least squares) are used to estimate the parameters of the variogram model (i.e., exponential, Gaussian, spherical and cubic). It is concluded that Bayesian universal kriging fits better than universal kriging. It is also observed that the universal kriging predictor provides minimum mean universal kriging variance for both adding and deleting locations during sampling design. PMID:27683016
Simulating future uncertainty to guide the selection of survey designs for long-term monitoring
Garman, Steven L.; Schweiger, E. William; Manier, Daniel J.; Gitzen, Robert A.; Millspaugh, Joshua J.; Cooper, Andrew B.; Licht, Daniel S.
2012-01-01
A goal of environmental monitoring is to provide sound information on the status and trends of natural resources (Messer et al. 1991, Theobald et al. 2007, Fancy et al. 2009). When monitoring observations are acquired by measuring a subset of the population of interest, probability sampling as part of a well-constructed survey design provides the most reliable and legally defensible approach to achieve this goal (Cochran 1977, Olsen et al. 1999, Schreuder et al. 2004; see Chapters 2, 5, 6, 7). Previous works have described the fundamentals of sample surveys (e.g. Hansen et al. 1953, Kish 1965). Interest in survey designs and monitoring over the past 15 years has led to extensive evaluations and new developments of sample selection methods (Stevens and Olsen 2004), of strategies for allocating sample units in space and time (Urquhart et al. 1993, Overton and Stehman 1996, Urquhart and Kincaid 1999), and of estimation (Lesser and Overton 1994, Overton and Stehman 1995) and variance properties (Larsen et al. 1995, Stevens and Olsen 2003) of survey designs. Carefully planned, “scientific” (Chapter 5) survey designs have become a standard in contemporary monitoring of natural resources. Based on our experience with the long-term monitoring program of the US National Park Service (NPS; Fancy et al. 2009; Chapters 16, 22), operational survey designs tend to be selected using the following procedures. For a monitoring indicator (i.e. variable or response), a minimum detectable trend requirement is specified, based on the minimum level of change that would result in meaningful change (e.g. degradation). A probability of detecting this trend (statistical power) and an acceptable level of uncertainty (Type I error; see Chapter 2) within a specified time frame (e.g. 10 years) are specified to ensure timely detection. Explicit statements of the minimum detectable trend, the time frame for detecting the minimum trend, power, and acceptable probability of Type I error (α) collectively form the quantitative sampling objective.
Prediction of episodic acidification in North-eastern USA: An empirical/mechanistic approach
Davies, T.D.; Tranter, M.; Wigington, P.J.; Eshleman, K.N.; Peters, N.E.; Van Sickle, J.; DeWalle, David R.; Murdoch, Peter S.
1999-01-01
Observations from the US Environmental Protection Agency's Episodic Response Project (ERP) in the North-eastern United States are used to develop an empirical/mechanistic scheme for prediction of the minimum values of acid neutralizing capacity (ANC) during episodes. An acidification episode is defined as a hydrological event during which ANC decreases. The pre-episode ANC is used to index the antecedent condition, and the stream flow increase reflects how much the relative contributions of sources of waters change during the episode. As much as 92% of the total variation in the minimum ANC in individual catchments can be explained (with levels of explanation >70% for nine of the 13 streams) by a multiple linear regression model that includes pre-episode ANC and change in discharge as independent variables. The predictive scheme is demonstrated to be regionally robust, with the regional variance explained ranging from 77 to 83%. The scheme is not successful for each ERP stream, and reasons are suggested for the individual failures. The potential for applying the predictive scheme to other watersheds is demonstrated by testing the model with data from the Panola Mountain Research Watershed in the South-eastern United States, where the variance explained by the model was 74%. The model can also be utilized to assess 'chemically new' and 'chemically old' water sources during acidification episodes.
Empirical data and the variance-covariance matrix for the 1969 Smithsonian Standard Earth (2)
NASA Technical Reports Server (NTRS)
Gaposchkin, E. M.
1972-01-01
The empirical data used in the 1969 Smithsonian Standard Earth (2) are presented. The variance-covariance matrix, or the normal equations, used for correlation analysis, are considered. The format and contents of the matrix, available on magnetic tape, are described and a sample printout is given.
ERIC Educational Resources Information Center
Vista, Alvin; Care, Esther
2011-01-01
Background: Research on gender differences in intelligence has focused mostly on samples from Western countries and empirical evidence on gender differences from Southeast Asia is relatively sparse. Aims: This article presents results on gender differences in variance and means on a non-verbal intelligence test using a national sample of public…
An evaluation of soil sampling for 137Cs using various field-sampling volumes.
Nyhan, J W; White, G C; Schofield, T G; Trujillo, G
1983-05-01
The sediments from a liquid effluent receiving area at the Los Alamos National Laboratory and soils from an intensive study area in the fallout pathway of Trinity were sampled for 137Cs using 25-, 500-, 2500- and 12,500-cm3 field sampling volumes. A highly replicated sampling program was used to determine mean concentrations and inventories of 137Cs at each site, as well as estimates of spatial, aliquoting, and counting variance components of the radionuclide data. The sampling methods were also analyzed as a function of soil size fractions collected in each field sampling volume and of the total cost of the program for a given variation in the radionuclide survey results. Coefficients of variation (CV) of 137Cs inventory estimates ranged from 0.063 to 0.14 for Mortandad Canyon sediments, whereas CV values for Trinity soils were observed from 0.38 to 0.57. Spatial variance components of 137Cs concentration data were usually found to be larger than either the aliquoting or counting variance estimates and were inversely related to field sampling volume at the Trinity intensive site. Subsequent optimization studies of the sampling schemes demonstrated that each aliquot should be counted once, and that only 2-4 aliquots out of as many as 30 collected need be assayed for 137Cs. The optimization studies showed that as sample costs increased to 45 man-hours of labor per sample, the variance of the mean 137Cs concentration decreased dramatically, but decreased very little with additional labor.
Respondent-driven sampling as Markov chain Monte Carlo.
Goel, Sharad; Salganik, Matthew J
2009-07-30
Respondent-driven sampling (RDS) is a recently introduced, and now widely used, technique for estimating disease prevalence in hidden populations. RDS data are collected through a snowball mechanism, in which current sample members recruit future sample members. In this paper we present RDS as Markov chain Monte Carlo importance sampling, and we examine the effects of community structure and the recruitment procedure on the variance of RDS estimates. Past work has assumed that the variance of RDS estimates is primarily affected by segregation between healthy and infected individuals. We examine an illustrative model to show that this is not necessarily the case, and that bottlenecks anywhere in the networks can substantially affect estimates. We also show that variance is inflated by a common design feature in which the sample members are encouraged to recruit multiple future sample members. The paper concludes with suggestions for implementing and evaluating RDS studies.
Age-specific survival of male golden-cheeked warblers on the Fort Hood Military Reservation, Texas
Duarte, Adam; Hines, James E.; Nichols, James D.; Hatfield, Jeffrey S.; Weckerly, Floyd W.
2014-01-01
Population models are essential components of large-scale conservation and management plans for the federally endangered Golden-cheeked Warbler (Setophaga chrysoparia; hereafter GCWA). However, existing models are based on vital rate estimates calculated using relatively small data sets that are now more than a decade old. We estimated more current, precise adult and juvenile apparent survival (Φ) probabilities and their associated variances for male GCWAs. In addition to providing estimates for use in population modeling, we tested hypotheses about spatial and temporal variation in Φ. We assessed whether a linear trend in Φ or a change in the overall mean Φ corresponded to an observed increase in GCWA abundance during 1992-2000 and if Φ varied among study plots. To accomplish these objectives, we analyzed long-term GCWA capture-resight data from 1992 through 2011, collected across seven study plots on the Fort Hood Military Reservation using a Cormack-Jolly-Seber model structure within program MARK. We also estimated Φ process and sampling variances using a variance-components approach. Our results did not provide evidence of site-specific variation in adult Φ on the installation. Because of a lack of data, we could not assess whether juvenile Φ varied spatially. We did not detect a strong temporal association between GCWA abundance and Φ. Mean estimates of Φ for adult and juvenile male GCWAs for all years analyzed were 0.47 with a process variance of 0.0120 and a sampling variance of 0.0113 and 0.28 with a process variance of 0.0076 and a sampling variance of 0.0149, respectively. Although juvenile Φ did not differ greatly from previous estimates, our adult Φ estimate suggests previous GCWA population models were overly optimistic with respect to adult survival. These updated Φ probabilities and their associated variances will be incorporated into new population models to assist with GCWA conservation decision making.
The use of spatio-temporal correlation to forecast critical transitions
NASA Astrophysics Data System (ADS)
Karssenberg, Derek; Bierkens, Marc F. P.
2010-05-01
Complex dynamical systems may have critical thresholds at which the system shifts abruptly from one state to another. Such critical transitions have been observed in systems ranging from the human body system to financial markets and the Earth system. Forecasting the timing of critical transitions before they are reached is of paramount importance because critical transitions are associated with a large shift in dynamical regime of the system under consideration. However, it is hard to forecast critical transitions, because the state of the system shows relatively little change before the threshold is reached. Recently, it was shown that increased spatio-temporal autocorrelation and variance can serve as alternative early warning signal for critical transitions. However, thus far these second order statistics have not been used for forecasting in a data assimilation framework. Here we show that the use of spatio-temporal autocorrelation and variance in the state of the system reduces the uncertainty in the predicted timing of critical transitions compared to classical approaches that use the value of the system state only. This is shown by assimilating observed spatio-temporal autocorrelation and variance into a dynamical system model using a Particle Filter. We adapt a well-studied distributed model of a logistically growing resource with a fixed grazing rate. The model describes the transition from an underexploited system with high resource biomass to overexploitation as grazing pressure crosses the critical threshold, which is a fold bifurcation. To represent limited prior information, we use a large variance in the prior probability distributions of model parameters and the system driver (grazing rate). First, we show that the rate of increase in spatio-temporal autocorrelation and variance prior to reaching the critical threshold is relatively consistent across the uncertainty range of the driver and parameter values used. This indicates that an increase in spatio-temporal autocorrelation and variance are consistent predictors of a critical transition, even under the condition of a poorly defined system. Second, we perform data assimilation experiments using an artificial exhaustive data set generated by one realization of the model. To mimic real-world sampling, an observational data set is created from this exhaustive data set. This is done by sampling on a regular spatio-temporal grid, supplemented by sampling locations at a short distance. Spatial and temporal autocorrelation in this observational data set is calculated for different spatial and temporal separation (lag) distances. To assign appropriate weights to observations (here, autocorrelation values and variance) in the Particle Filter, the covariance matrix of the error in these observations is required. This covariance matrix is estimated using Monte Carlo sampling, selecting a different random position of the sampling network relative to the exhaustive data set for each realization. At each update moment in the Particle Filter, observed autocorrelation values are assimilated into the model and the state of the model is updated. Using this approach, it is shown that the use of autocorrelation reduces the uncertainty in the forecasted timing of a critical transition compared to runs without data assimilation. The performance of the use of spatial autocorrelation versus temporal autocorrelation depends on the timing and number of observational data. This study is restricted to a single model only. 
However, it is becoming increasingly clear that spatio-temporal autocorrelation and variance can be used as early warning signals for a large number of systems. Thus, it is expected that spatio-temporal autocorrelation and variance are valuable in data assimilation frameworks in a large number of dynamical systems.
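The two early-warning observations assimilated here, window variance and lag-1 autocorrelation of the state, are straightforward to compute; the sketch below applies them to a crude, hypothetical discretisation of a logistically growing resource under slowly increasing grazing. The parameter values, step size, and noise level are all illustrative, and the Particle Filter itself is not shown.

```python
import numpy as np

def rolling_indicators(x, window):
    """Rolling-window variance and lag-1 autocorrelation of a state time series,
    the two early warning indicators of an approaching critical transition."""
    x = np.asarray(x, dtype=float)
    variances, autocorrs = [], []
    for i in range(window, len(x) + 1):
        w = x[i - window:i]
        variances.append(w.var(ddof=1))
        autocorrs.append(np.corrcoef(w[:-1], w[1:])[0, 1])
    return np.array(variances), np.array(autocorrs)

# Logistic resource with slowly increasing grazing pressure (fold bifurcation).
rng = np.random.default_rng(7)
steps, x = 3000, 9.0
grazing = np.linspace(1.0, 2.7, steps)        # driver drifts toward the threshold
series = np.empty(steps)
for t in range(steps):
    growth = x * (1 - x / 10.0)
    loss = grazing[t] * x**2 / (1.0 + x**2)
    x = max(x + 0.1 * (growth - loss) + 0.05 * rng.standard_normal(), 0.01)
    series[t] = x
var, ac1 = rolling_indicators(series[:2500], window=500)   # before the collapse
print(round(var[0], 3), round(var[-1], 3))    # variance typically rises near the threshold
print(round(ac1[0], 3), round(ac1[-1], 3))    # lag-1 autocorrelation typically rises too
```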
The Effect of Minimum Wages on Youth Employment in Canada: A Panel Study.
ERIC Educational Resources Information Center
Yuen, Terence
2003-01-01
Canadian panel data 1988-90 were used to compare estimates of minimum-wage effects based on a low-wage/high-worker sample and a low-wage-only sample. Minimum-wage effect for the latter is nearly zero. Different results for low-wage subgroups suggest a significant effect for those with longer low-wage histories. (Contains 26 references.) (SK)
Variance partitioning of stream diatom, fish, and invertebrate indicators of biological condition
Zuellig, Robert E.; Carlisle, Daren M.; Meador, Michael R.; Potapova, Marina
2012-01-01
Stream indicators used to make assessments of biological condition are influenced by many possible sources of variability. To examine this issue, we used multiple-year and multiple-reach diatom, fish, and invertebrate data collected from 20 least-disturbed and 46 developed stream segments between 1993 and 2004 as part of the US Geological Survey National Water Quality Assessment Program. We used a variance-component model to summarize the relative and absolute magnitude of 4 variance components (among-site, among-year, site × year interaction, and residual) in indicator values (observed/expected ratio [O/E] and regional multimetric indices [MMI]) among assemblages and between basin types (least-disturbed and developed). We used multiple-reach samples to evaluate discordance in site assessments of biological condition caused by sampling variability. Overall, patterns in variance partitioning were similar among assemblages and basin types with one exception. Among-site variance dominated the relative contribution to the total variance (64–80% of total variance), residual variance (sampling variance) accounted for more variability (8–26%) than interaction variance (5–12%), and among-year variance was always negligible (0–0.2%). The exception to this general pattern was for invertebrates at least-disturbed sites where variability in O/E indicators was partitioned between among-site and residual (sampling) variance (among-site = 36%, residual = 64%). This pattern was not observed for fish and diatom indicators (O/E and regional MMI). We suspect that unexplained sampling variability is what largely remained after the invertebrate indicators (O/E predictive models) had accounted for environmental differences among least-disturbed sites. The influence of sampling variability on discordance of within-site assessments was assemblage or basin-type specific. Discordance among assessments was nearly 2× greater in developed basins (29–31%) than in least-disturbed sites (15–16%) for invertebrates and diatoms, whereas discordance among assessments based on fish did not differ between basin types (least-disturbed = 16%, developed = 17%). Assessments made using invertebrate and diatom indicators from a single reach disagreed with other samples collected within the same stream segment nearly ⅓ of the time in developed basins, compared to ⅙ for all other cases.
Estimation of the simple correlation coefficient.
Shieh, Gwowen
2010-11-01
This article investigates some unfamiliar properties of the Pearson product-moment correlation coefficient for the estimation of the simple correlation coefficient. Although Pearson's r is biased, except for limited situations, and the minimum variance unbiased estimator has been proposed in the literature, researchers routinely employ the sample correlation coefficient in their practical applications because of its simplicity and popularity. In order to support such practice, this study examines the mean squared errors of r and several prominent formulas. The results reveal specific situations in which the sample correlation coefficient performs better than the unbiased and nearly unbiased estimators, facilitating recommendation of r as an effect size index for the strength of linear association between two variables. In addition, related issues of estimating the squared simple correlation coefficient are also considered.
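A small simulation in the spirit of the comparison, contrasting the mean squared error of the raw sample correlation r with one common approximately-unbiased adjustment, r[1 + (1 − r²)/(2(n − 3))]; this is an illustration of the bias-variance trade-off, not the exact set of formulas examined in the article, and the sample sizes and correlations are arbitrary.

```python
import numpy as np

def adjusted_r(r, n):
    """Approximately unbiased correlation estimate (Olkin-Pratt-type adjustment)."""
    return r * (1.0 + (1.0 - r**2) / (2.0 * (n - 3)))

def mse_comparison(rho, n, n_sim=5000, seed=0):
    """Monte Carlo MSE of the raw sample correlation versus the adjusted form."""
    rng = np.random.default_rng(seed)
    cov = np.array([[1.0, rho], [rho, 1.0]])
    r_raw = np.empty(n_sim)
    for s in range(n_sim):
        xy = rng.multivariate_normal([0.0, 0.0], cov, size=n)
        r_raw[s] = np.corrcoef(xy[:, 0], xy[:, 1])[0, 1]
    r_adj = adjusted_r(r_raw, n)
    return np.mean((r_raw - rho)**2), np.mean((r_adj - rho)**2)

# Which estimator has the smaller MSE depends on the true correlation.
for rho in (0.0, 0.4, 0.8):
    mse_r, mse_adj = mse_comparison(rho, n=20)
    print(rho, round(mse_r, 4), round(mse_adj, 4))
```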
Bayesian Factor Analysis When Only a Sample Covariance Matrix Is Available
ERIC Educational Resources Information Center
Hayashi, Kentaro; Arav, Marina
2006-01-01
In traditional factor analysis, the variance-covariance matrix or the correlation matrix has often been a form of inputting data. In contrast, in Bayesian factor analysis, the entire data set is typically required to compute the posterior estimates, such as Bayes factor loadings and Bayes unique variances. We propose a simple method for computing…
NASA Astrophysics Data System (ADS)
Lizana, A.; Foldyna, M.; Stchakovsky, M.; Georges, B.; Nicolas, D.; Garcia-Caurel, E.
2013-03-01
High sensitivity of spectroscopic ellipsometry and reflectometry for the characterization of thin films can strongly decrease when layers, typically metals, absorb a significant fraction of the light. In this paper, we propose a solution to overcome this drawback using total internal reflection ellipsometry (TIRE) and exciting a surface longitudinal wave: a plasmon-polariton. As in the attenuated total reflectance technique, TIRE exploits a minimum in the intensity of reflected transversal magnetic (TM) polarized light and enhances the sensitivity of standard methods to thicknesses of absorbing films. Samples under study were stacks of three films, ZnO : Al/Ag/ZnO : Al, deposited on glass substrates. The thickness of the silver layer varied from sample to sample. We performed measurements with a UV-visible phase-modulated ellipsometer, an IR Mueller ellipsometer and a UV-NIR reflectometer. We used the variance-covariance formalism to evaluate the sensitivity of the ellipsometric data to different parameters of the optical model. Results have shown that using TIRE doubled the sensitivity to the silver layer thickness when compared with the standard ellipsometry. Moreover, the thickness of the ZnO : Al layer below the silver layer can be reliably quantified, unlike for the fit of the standard ellipsometry data, which is limited by the absorption of the silver layer.
A Minimum Variance Algorithm for Overdetermined TOA Equations with an Altitude Constraint.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Romero, Louis A; Mason, John J.
We present a direct (non-iterative) method for solving for the location of a radio frequency (RF) emitter, or an RF navigation receiver, using four or more time of arrival (TOA) measurements and an assumed altitude above an ellipsoidal earth. Both the emitter tracking problem and the navigation application are governed by the same equations, but with slightly different interpretations of several variables. We treat the assumed altitude as a soft constraint, with a specified noise level, just as the TOA measurements are handled, with their respective noise levels. With 4 or more TOA measurements and the assumed altitude, the problem is overdetermined and is solved in the weighted least squares sense for the 4 unknowns, the 3-dimensional position and time. We call the new technique the TAQMV (TOA Altitude Quartic Minimum Variance) algorithm, and it achieves the minimum possible error variance for given levels of TOA and altitude estimate noise. The method algebraically produces four solutions: the least-squares solution, and potentially three other low-residual solutions, if they exist. In the lightly overdetermined cases where multiple local minima in the residual error surface are more likely to occur, this algebraic approach can produce all of the minima even when an iterative approach fails to converge. Algorithm performance in terms of solution error variance and divergence rate for the baseline (iterative) and proposed approaches is given in tables.
Theodorsson-Norheim, E
1986-08-01
Multiple t tests at a fixed p level are frequently used to analyse biomedical data where analysis of variance followed by multiple comparisons, or the adjustment of the p values according to Bonferroni, would be more appropriate. The Kruskal-Wallis test is a nonparametric 'analysis of variance' which may be used to compare several independent samples. The present program is written in an elementary subset of BASIC and will perform the Kruskal-Wallis test followed by multiple comparisons between the groups on practically any computer programmable in BASIC.
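A minimal Python analogue of that workflow is sketched below, assuming SciPy. The follow-up comparisons here are Bonferroni-adjusted pairwise Mann-Whitney tests, which may differ from the multiple-comparison procedure of the original BASIC program, and the group data are invented.

```python
# Kruskal-Wallis omnibus test followed by Bonferroni-adjusted pairwise
# Mann-Whitney comparisons; example data are made up for illustration.
from itertools import combinations
from scipy.stats import kruskal, mannwhitneyu

groups = {
    "A": [3.1, 2.8, 3.5, 3.0, 2.9],
    "B": [4.0, 4.2, 3.9, 4.4, 4.1],
    "C": [3.2, 3.4, 3.1, 3.6, 3.3],
}

h, p = kruskal(*groups.values())
print(f"Kruskal-Wallis: H = {h:.3f}, p = {p:.4f}")

# Pairwise follow-up only if the omnibus test is significant
if p < 0.05:
    pairs = list(combinations(groups, 2))
    for g1, g2 in pairs:
        u, p_pair = mannwhitneyu(groups[g1], groups[g2], alternative="two-sided")
        p_adj = min(1.0, p_pair * len(pairs))   # Bonferroni adjustment
        print(f"{g1} vs {g2}: U = {u:.1f}, adjusted p = {p_adj:.4f}")
```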
Empirical Bayes estimation of undercount in the decennial census.
Cressie, N
1989-12-01
Empirical Bayes methods are used to estimate the extent of the undercount at the local level in the 1980 U.S. census. "Grouping of like subareas from areas such as states, counties, and so on into strata is a useful way of reducing the variance of undercount estimators. By modeling the subareas within a stratum to have a common mean and variances inversely proportional to their census counts, and by taking into account sampling of the areas (e.g., by dual-system estimation), empirical Bayes estimators that compromise between the (weighted) stratum average and the sample value can be constructed. The amount of compromise is shown to depend on the relative importance of stratum variance to sampling variance. These estimators are evaluated at the state level (51 states, including Washington, D.C.) and stratified on race/ethnicity (3 strata) using data from the 1980 postenumeration survey (PEP 3-8, for the noninstitutional population)." excerpt
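A hedged sketch of the shrinkage idea follows: each subarea value is pulled toward the stratum mean by a weight governed by the ratio of sampling variance to stratum-plus-sampling variance. The numbers and the crude method-of-moments stratum-variance estimate are illustrative assumptions, not the paper's estimator.

```python
# Empirical Bayes compromise between a stratum average and each subarea's
# sample value; all numbers are illustrative.
import numpy as np

y = np.array([1.02, 0.97, 1.10, 0.91, 1.05])          # subarea undercount factors
sampling_var = np.array([0.004, 0.006, 0.003, 0.008, 0.005])

stratum_mean = np.average(y, weights=1.0 / sampling_var)
# Crude method-of-moments estimate of between-subarea (stratum) variance
stratum_var = max(np.var(y, ddof=1) - sampling_var.mean(), 0.0)

# Shrinkage factor: large sampling variance -> lean on the stratum mean
B = sampling_var / (sampling_var + stratum_var)
eb_estimates = B * stratum_mean + (1.0 - B) * y
print(np.round(eb_estimates, 4))
```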
A method for minimum risk portfolio optimization under hybrid uncertainty
NASA Astrophysics Data System (ADS)
Egorova, Yu E.; Yazenin, A. V.
2018-03-01
In this paper, we investigate a minimum risk portfolio model under hybrid uncertainty when the profitability of financial assets is described by fuzzy random variables. According to Feng, the variance of a portfolio is defined as a crisp value. To aggregate fuzzy information the weakest (drastic) t-norm is used. We construct an equivalent stochastic problem of the minimum risk portfolio model and specify the stochastic penalty method for solving it.
Cohn, Timothy A.
2005-01-01
This paper presents an adjusted maximum likelihood estimator (AMLE) that can be used to estimate fluvial transport of contaminants, like phosphorus, that are subject to censoring because of analytical detection limits. The AMLE is a generalization of the widely accepted minimum variance unbiased estimator (MVUE), and Monte Carlo experiments confirm that it shares essentially all of the MVUE's desirable properties, including high efficiency and negligible bias. In particular, the AMLE exhibits substantially less bias than alternative censored-data estimators such as the MLE (Tobit) or the MLE followed by a jackknife. As with the MLE and the MVUE, the AMLE comes close to achieving the theoretical Fréchet-Cramér-Rao bounds on its variance. This paper also presents a statistical framework, applicable to both censored and complete data, for understanding and estimating the components of uncertainty associated with load estimates. This can serve to lower the cost and improve the efficiency of both traditional and real-time water quality monitoring.
Why weight? Modelling sample and observational level variability improves power in RNA-seq analyses
Liu, Ruijie; Holik, Aliaksei Z.; Su, Shian; Jansz, Natasha; Chen, Kelan; Leong, Huei San; Blewitt, Marnie E.; Asselin-Labat, Marie-Liesse; Smyth, Gordon K.; Ritchie, Matthew E.
2015-01-01
Variations in sample quality are frequently encountered in small RNA-sequencing experiments, and pose a major challenge in a differential expression analysis. Removal of high variation samples reduces noise, but at a cost of reducing power, thus limiting our ability to detect biologically meaningful changes. Similarly, retaining these samples in the analysis may not reveal any statistically significant changes due to the higher noise level. A compromise is to use all available data, but to down-weight the observations from more variable samples. We describe a statistical approach that facilitates this by modelling heterogeneity at both the sample and observational levels as part of the differential expression analysis. At the sample level this is achieved by fitting a log-linear variance model that includes common sample-specific or group-specific parameters that are shared between genes. The estimated sample variance factors are then converted to weights and combined with observational level weights obtained from the mean–variance relationship of the log-counts-per-million using ‘voom’. A comprehensive analysis involving both simulations and experimental RNA-sequencing data demonstrates that this strategy leads to a universally more powerful analysis and fewer false discoveries when compared to conventional approaches. This methodology has wide application and is implemented in the open-source ‘limma’ package. PMID:25925576
Kalman filter for statistical monitoring of forest cover across sub-continental regions [Symposium
Raymond L. Czaplewski
1991-01-01
The Kalman filter is a generalization of the composite estimator. The univariate composite estimate combines 2 prior estimates of population parameter with a weighted average where the scalar weight is inversely proportional to the variances. The composite estimator is a minimum variance estimator that requires no distributional assumptions other than estimates of the...
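The scalar composite estimate described here reduces to an inverse-variance weighted average; a minimal sketch, with arbitrary example numbers, is given below.

```python
# Univariate composite (minimum variance) combination of two unbiased,
# independent estimates, weighted inversely to their variances.
def composite(x1, v1, x2, v2):
    w1, w2 = 1.0 / v1, 1.0 / v2
    estimate = (w1 * x1 + w2 * x2) / (w1 + w2)
    variance = 1.0 / (w1 + w2)
    return estimate, variance

# e.g. a prior model prediction (x1) updated with a new survey estimate (x2)
est, var = composite(x1=42.0, v1=9.0, x2=47.0, v2=4.0)
print(f"composite estimate = {est:.2f}, variance = {var:.2f}")
```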
NASA Astrophysics Data System (ADS)
Setiawan, E. P.; Rosadi, D.
2017-01-01
Portfolio selection problems conventionally means ‘minimizing the risk, given the certain level of returns’ from some financial assets. This problem is frequently solved with quadratic or linear programming methods, depending on the risk measure that used in the objective function. However, the solutions obtained by these method are in real numbers, which may give some problem in real application because each asset usually has its minimum transaction lots. In the classical approach considering minimum transaction lots were developed based on linear Mean Absolute Deviation (MAD), variance (like Markowitz’s model), and semi-variance as risk measure. In this paper we investigated the portfolio selection methods with minimum transaction lots with conditional value at risk (CVaR) as risk measure. The mean-CVaR methodology only involves the part of the tail of the distribution that contributed to high losses. This approach looks better when we work with non-symmetric return probability distribution. Solution of this method can be found with Genetic Algorithm (GA) methods. We provide real examples using stocks from Indonesia stocks market.
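The sketch below illustrates only the risk measure, not the full model: the empirical CVaR of a portfolio whose holdings are integer lots. The returns, prices, lot sizes, and candidate holdings are invented, and the genetic-algorithm search over lots is omitted.

```python
# Empirical CVaR of a lot-constrained portfolio; all inputs are synthetic.
import numpy as np

rng = np.random.default_rng(0)
returns = rng.normal(0.001, 0.02, size=(250, 3))    # 250 days x 3 assets
prices = np.array([50.0, 120.0, 8.0])
lot_size = np.array([100, 100, 500])                 # minimum transaction lots
lots_held = np.array([3, 1, 4])                      # candidate solution (integers)

position_value = lots_held * lot_size * prices
weights = position_value / position_value.sum()

losses = -(returns @ weights)                        # daily portfolio losses
alpha = 0.95
var_alpha = np.quantile(losses, alpha)               # Value-at-Risk
cvar_alpha = losses[losses >= var_alpha].mean()      # mean loss beyond VaR
print(f"VaR(95%) = {var_alpha:.4f}, CVaR(95%) = {cvar_alpha:.4f}")
```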
Variance of discharge estimates sampled using acoustic Doppler current profilers from moving boats
Garcia, Carlos M.; Tarrab, Leticia; Oberg, Kevin; Szupiany, Ricardo; Cantero, Mariano I.
2012-01-01
This paper presents a model for quantifying the random errors (i.e., variance) of acoustic Doppler current profiler (ADCP) discharge measurements from moving boats for different sampling times. The model focuses on the random processes in the sampled flow field and has been developed using statistical methods currently available for uncertainty analysis of velocity time series. Analysis of field data collected using ADCP from moving boats from three natural rivers of varying sizes and flow conditions shows that, even though the estimate of the integral time scale of the actual turbulent flow field is larger than the sampling interval, the integral time scale of the sampled flow field is on the order of the sampling interval. Thus, an equation for computing the variance error in discharge measurements associated with different sampling times, assuming uncorrelated flow fields is appropriate. The approach is used to help define optimal sampling strategies by choosing the exposure time required for ADCPs to accurately measure flow discharge.
Lekking without a paradox in the buff-breasted sandpiper
Lanctot, Richard B.; Scribner, Kim T.; Kempenaers, Bart; Weatherhead, Patrick J.
1997-01-01
Females in lek-breeding species appear to copulate with a small subset of the available males. Such strong directional selection is predicted to decrease additive genetic variance in the preferred male traits, yet females continue to mate selectively, thus generating the lek paradox. In a study of buff-breasted sandpipers (Tryngites subruficollis), we combine detailed behavioral observations with paternity analyses using single-locus minisatellite DNA probes to provide the first evidence from a lek-breeding species that the variance in male reproductive success is much lower than expected. In 17 and 30 broods sampled in two consecutive years, a minimum of 20 and 39 males, respectively, sired offspring. This low variance in male reproductive success resulted from effective use of alternative reproductive tactics by males, females mating with solitary males off leks, and multiple mating by females. Thus, the results of this study suggest that sexual selection through female choice is weak in buff-breasted sandpipers. The behavior of other lek-breeding birds is sufficiently similar to that of buff-breasted sandpipers that paternity studies of those species should be conducted to determine whether leks generally are less paradoxical than they appear.
Iterative Minimum Variance Beamformer with Low Complexity for Medical Ultrasound Imaging.
Deylami, Ali Mohades; Asl, Babak Mohammadzadeh
2018-06-04
Minimum variance beamformer (MVB) improves the resolution and contrast of medical ultrasound images compared with delay and sum (DAS) beamformer. The weight vector of this beamformer should be calculated for each imaging point independently, with a cost of increasing computational complexity. The large number of necessary calculations limits this beamformer to application in real-time systems. A beamformer is proposed based on the MVB with lower computational complexity while preserving its advantages. This beamformer avoids matrix inversion, which is the most complex part of the MVB, by solving the optimization problem iteratively. The received signals from two imaging points close together do not vary much in medical ultrasound imaging. Therefore, using the previously optimized weight vector for one point as initial weight vector for the new neighboring point can improve the convergence speed and decrease the computational complexity. The proposed method was applied on several data sets, and it has been shown that the method can regenerate the results obtained by the MVB while the order of complexity is decreased from O(L 3 ) to O(L 2 ). Copyright © 2018 World Federation for Ultrasound in Medicine and Biology. Published by Elsevier Inc. All rights reserved.
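A hedged sketch of the general idea follows, not necessarily the authors' exact update rule: the MVDR problem, minimize w^H R w subject to a^H w = 1, is solved by projected gradient descent and warm-started from a neighboring imaging point's weights so only a few O(L^2) iterations are needed. The array size, covariance, and steering vector are illustrative.

```python
# Iterative (matrix-inversion-free) minimum variance weights via projected
# gradient descent with warm starting; a generic sketch, not the paper's
# specific algorithm.
import numpy as np

def mv_weights_iterative(R, a, w0=None, n_iter=20):
    a = a.astype(complex)
    w = a / (a.conj() @ a) if w0 is None else w0.astype(complex)
    step = 1.0 / np.real(np.trace(R))            # safe step size for PSD R
    for _ in range(n_iter):
        w = w - step * (R @ w)                   # gradient step on w^H R w
        w += a * (1.0 - a.conj() @ w) / (a.conj() @ a)   # re-impose a^H w = 1
    return w

L = 16
rng = np.random.default_rng(1)
snap = rng.standard_normal((L, 200)) + 1j * rng.standard_normal((L, 200))
R = snap @ snap.conj().T / 200 + 1e-3 * np.eye(L)    # sample covariance + loading
a = np.ones(L, dtype=complex)                        # steering vector (broadside)

w_prev = mv_weights_iterative(R, a)                      # first imaging point
w_next = mv_weights_iterative(R, a, w0=w_prev, n_iter=5) # neighbor, warm start
```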
Influence function based variance estimation and missing data issues in case-cohort studies.
Mark, S D; Katki, H
2001-12-01
Recognizing that the efficiency in relative risk estimation for the Cox proportional hazards model is largely constrained by the total number of cases, Prentice (1986) proposed the case-cohort design in which covariates are measured on all cases and on a random sample of the cohort. Subsequent to Prentice, other methods of estimation and sampling have been proposed for these designs. We formalize an approach to variance estimation suggested by Barlow (1994), and derive a robust variance estimator based on the influence function. We consider the applicability of the variance estimator to all the proposed case-cohort estimators, and derive the influence function when known sampling probabilities in the estimators are replaced by observed sampling fractions. We discuss the modifications required when cases are missing covariate information. The missingness may occur by chance, and be completely at random; or may occur as part of the sampling design, and depend upon other observed covariates. We provide an adaptation of S-plus code that allows estimating influence function variances in the presence of such missing covariates. Using examples from our current case-cohort studies on esophageal and gastric cancer, we illustrate how our results are useful in solving design and analytic issues that arise in practice.
The Three-Dimensional Power Spectrum Of Galaxies from the Sloan Digital Sky Survey
2004-05-10
aspects of the three-dimensional clustering of a much larger data set involving over 200,000 galaxies with redshifts. This paper is focused on measuring ... papers, we will constrain galaxy bias empirically by using clustering measurements on smaller scales (e.g., I. Zehavi et al. 2004, in preparation) ... minimum-variance measurements in 22 k-bands of both the clustering power and its anisotropy due to redshift-space distortions, with narrow and well ...
NASA Astrophysics Data System (ADS)
Wang, Feng; Yang, Dongkai; Zhang, Bo; Li, Weiqiang
2018-03-01
This paper explores two types of mathematical functions to fit the single- and full-frequency waveforms of spaceborne Global Navigation Satellite System-Reflectometry (GNSS-R), respectively. The metrics of the waveforms, such as the noise floor, peak magnitude, mid-point position of the leading edge, leading edge slope and trailing edge slope, can be derived from the parameters of the proposed models. Because the quality of the UK TDS-1 data is not at the level required by a remote sensing mission, waveforms buried in noise or originating from ice/land are removed, using the peak-to-mean ratio and the cosine similarity of the waveform, before wind speed is retrieved. The single-parameter retrieval models are developed by comparing the peak magnitude, leading edge slope and trailing edge slope derived from the parameters of the proposed models with in situ wind speed from the ASCAT scatterometer. To improve the retrieval accuracy, three types of multi-parameter observations based on principal component analysis (PCA), the minimum variance (MV) estimator and a Back Propagation (BP) network are implemented. The results indicate that, compared to the best results of the single-parameter observation, the approaches based on principal component analysis and minimum variance could not significantly improve the retrieval accuracy; however, the BP networks achieve an improvement, with RMSEs of 2.55 m/s and 2.53 m/s for the single- and full-frequency waveforms, respectively.
Martin, Lynn; Fries, Brant E; Hirdes, John P; James, Mary
2011-06-01
Since 1991, the Minimum Data Set 2.0 (MDS 2.0) has been the mandated assessment in US nursing homes. The Resource Utilization Groups III (RUG-III) case-mix system provides person-specific means of allocating resources based on the variable costs of caring for persons with different needs. Retrospective analyses of data collected on a sample of 9707 nursing home residents (2.4% had an intellectual disability) were used to examine the fit of the RUG-III case-mix system for determining the cost of supporting persons with intellectual disability. The RUG-III system explained 33.3% of the variance in age-weighted nursing time among persons with intellectual disability compared to 29.6% among other residents, making it a good fit among persons with intellectual disability in nursing homes. The RUG-III may also serve as the basis for the development of a classification system that describes the resource intensity of persons with intellectual disability in other settings that provide similar types of support.
Stratified sampling design based on data mining.
Kim, Yeonkook J; Oh, Yoonhwan; Park, Sunghoon; Cho, Sungzoon; Park, Hayoung
2013-09-01
To explore classification rules based on data mining methodologies which are to be used in defining strata in stratified sampling of healthcare providers with improved sampling efficiency. We performed k-means clustering to group providers with similar characteristics, then, constructed decision trees on cluster labels to generate stratification rules. We assessed the variance explained by the stratification proposed in this study and by conventional stratification to evaluate the performance of the sampling design. We constructed a study database from health insurance claims data and providers' profile data made available to this study by the Health Insurance Review and Assessment Service of South Korea, and population data from Statistics Korea. From our database, we used the data for single specialty clinics or hospitals in two specialties, general surgery and ophthalmology, for the year 2011 in this study. Data mining resulted in five strata in general surgery with two stratification variables, the number of inpatients per specialist and population density of provider location, and five strata in ophthalmology with two stratification variables, the number of inpatients per specialist and number of beds. The percentages of variance in annual changes in the productivity of specialists explained by the stratification in general surgery and ophthalmology were 22% and 8%, respectively, whereas conventional stratification by the type of provider location and number of beds explained 2% and 0.2% of variance, respectively. This study demonstrated that data mining methods can be used in designing efficient stratified sampling with variables readily available to the insurer and government; it offers an alternative to the existing stratification method that is widely used in healthcare provider surveys in South Korea.
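A minimal sketch of the same pipeline, assuming scikit-learn: k-means clustering of providers followed by a shallow decision tree fit on the cluster labels, so that the strata can be read off as rules. The synthetic features stand in for the study's stratification variables.

```python
# k-means clustering followed by a decision tree on the cluster labels,
# printed as human-readable stratification rules; features are synthetic.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(42)
n = 500
X = np.column_stack([
    rng.gamma(2.0, 30.0, n),      # inpatients per specialist (assumed feature)
    rng.gamma(1.5, 20.0, n),      # number of beds (assumed feature)
])

clusters = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(X)

tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, clusters)
print(export_text(tree, feature_names=["inpatients_per_specialist", "beds"]))
```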
Decker, Sheila A; Culp, Kennith R; Cacchione, Pamela Z
2009-06-01
Chronic pain, mainly associated with musculoskeletal diagnoses, is inadequately and often inappropriately treated in nursing home residents. The purpose of this descriptive study is to identify the musculoskeletal diagnoses associated with pain and to compare the pain management of a sample of nursing home residents with the 1998 evidence-based guideline proposed by the American Geriatrics Society (AGS). The sample consists of 215 residents from 13 rural Iowa nursing homes. The residents answered a series of face-to-face questions that addressed the presence/absence of pain and completed the Mini Mental State Examination (MMSE). Data on pain were abstracted from the Minimum Data Set (MDS). Analyses included descriptive statistics, cross tabulations, and one-way analysis of variance. Residents' responses to the face-to-face pain questions yielded higher rates of pain compared with the MDS pain data. Resident records showed that acetaminophen was the most frequently administered analgesic medication (30.9%). Propoxyphene, not an AGS-recommended opioid, was also prescribed for 23 residents (10.7%). Of the 70 residents (32.6%) expressing daily pain, 23 (32.9%) received no scheduled or pro re nata analgesics. There was no significant difference between MMSE scores and the number of scheduled analgesics. Additionally, residents' self-reported use of topical agents was not documented in the charts. The findings suggest that the 1998 AGS evidence-based guideline for the management of chronic pain is inconsistently implemented.
Poston, Brach; Van Gemmert, Arend W.A.; Sharma, Siddharth; Chakrabarti, Somesh; Zavaremi, Shahrzad H.; Stelmach, George
2013-01-01
The minimum variance theory proposes that motor commands are corrupted by signal-dependent noise and smooth trajectories with low noise levels are selected to minimize endpoint error and endpoint variability. The purpose of the study was to determine the contribution of trajectory smoothness to the endpoint accuracy and endpoint variability of rapid multi-joint arm movements. Young and older adults performed arm movements (4 blocks of 25 trials) as fast and as accurately as possible to a target with the right (dominant) arm. Endpoint accuracy and endpoint variability along with trajectory smoothness and error were quantified for each block of trials. Endpoint error and endpoint variance were greater in older adults compared with young adults, but decreased at a similar rate with practice for the two age groups. The greater endpoint error and endpoint variance exhibited by older adults were primarily due to impairments in movement extent control and not movement direction control. The normalized jerk was similar for the two age groups, but was not strongly associated with endpoint error or endpoint variance for either group. However, endpoint variance was strongly associated with endpoint error for both the young and older adults. Finally, trajectory error was similar for both groups and was weakly associated with endpoint error for the older adults. The findings are not consistent with the predictions of the minimum variance theory, but support and extend previous observations that movement trajectories and endpoints are planned independently. PMID:23584101
Uncertainty in Population Estimates for Endangered Animals and Improving the Recovery Process.
Haines, Aaron M; Zak, Matthew; Hammond, Katie; Scott, J Michael; Goble, Dale D; Rachlow, Janet L
2013-08-13
United States recovery plans contain biological information for a species listed under the Endangered Species Act and specify recovery criteria to provide a basis for species recovery. The objective of our study was to evaluate whether recovery plans provide uncertainty (e.g., variance) with estimates of population size. We reviewed all finalized recovery plans for listed terrestrial vertebrate species to record the following data: (1) whether a current population size was given, (2) whether a measure of uncertainty or variance was associated with current estimates of population size and (3) whether population size was stipulated for recovery. We found that 59% of completed recovery plans specified a current population size, 14.5% specified a variance for the current population size estimate and 43% specified population size as a recovery criterion. More recent recovery plans reported more estimates of current population size, uncertainty and population size as a recovery criterion. Also, bird and mammal recovery plans reported more estimates of population size and uncertainty compared to reptiles and amphibians. We suggest calculating minimum detectable differences to improve confidence when delisting endangered animals, and we identified incentives for individuals to get involved in recovery planning to improve access to quantitative data.
Holmes, Tyson H.; He, Xiao-Song
2016-01-01
Small, wide data sets are commonplace in human immunophenotyping research. As defined here, a small, wide data set is constructed by sampling a small to modest quantity n, 1 < n < 50, of human participants for the purpose of estimating many parameters p, such that n < p < 1,000. We offer a set of prescriptions that are designed to facilitate low-variance (i.e. stable), low-bias, interpretive regression modeling of small, wide data sets. These prescriptions are distinctive in their especially heavy emphasis on minimizing use of out-of-sample information for conducting statistical inference. That allows the working immunologist to proceed without being encumbered by imposed and often untestable statistical assumptions. Problems of unmeasured confounders, confidence-interval coverage, feature selection, and shrinkage/denoising are defined clearly and treated in detail. We propose an extension of an existing nonparametric technique for improved small-sample confidence-interval tail coverage from the univariate case (single immune feature) to the multivariate (many, possibly correlated immune features). An important role for derived features in the immunological interpretation of regression analyses is stressed. Areas of further research are discussed. Presented principles and methods are illustrated through application to a small, wide data set of adults spanning a wide range in ages and multiple immunophenotypes that were assayed before and after immunization with inactivated influenza vaccine (IIV). Our regression modeling prescriptions identify some potentially important topics for future immunological research. 1) Immunologists may wish to distinguish age-related differences in immune features from changes in immune features caused by aging. 2) A form of the bootstrap that employs linear extrapolation may prove to be an invaluable analytic tool because it allows the working immunologist to obtain accurate estimates of the stability of immune parameter estimates with a bare minimum of imposed assumptions. 3) Liberal inclusion of immune features in phenotyping panels can facilitate accurate separation of biological signal of interest from noise. In addition, through a combination of denoising and potentially improved confidence interval coverage, we identify some candidate immune correlates (frequency of cell subset and concentration of cytokine) with B cell response as measured by quantity of IIV-specific IgA antibody-secreting cells and quantity of IIV-specific IgG antibody-secreting cells. PMID:27196789
NASA Astrophysics Data System (ADS)
Mozaffarzadeh, Moein; Mahloojifar, Ali; Orooji, Mahdi; Kratkiewicz, Karl; Adabi, Saba; Nasiriavanaki, Mohammadreza
2018-02-01
In photoacoustic imaging, the delay-and-sum (DAS) beamformer is a common beamforming algorithm with a simple implementation. However, it results in poor resolution and high sidelobes. To address these challenges, a new algorithm, namely delay-multiply-and-sum (DMAS), was introduced, which has lower sidelobes compared to DAS. To improve the resolution of DMAS, a beamformer is introduced using minimum variance (MV) adaptive beamforming combined with DMAS, the so-called minimum variance-based DMAS (MVB-DMAS). It is shown that expanding the DMAS equation results in multiple terms representing a DAS algebra. It is proposed to use the MV adaptive beamformer instead of the existing DAS. MVB-DMAS is evaluated numerically and experimentally. In particular, at the depth of 45 mm MVB-DMAS results in about 31, 18, and 8 dB sidelobe reduction compared to DAS, MV, and DMAS, respectively. The quantitative results of the simulations show that MVB-DMAS leads to improvements in full-width-half-maximum of about 96%, 94%, and 45% and in signal-to-noise ratio of about 89%, 15%, and 35% compared to DAS, DMAS, and MV, respectively. In particular, at the depth of 33 mm in the experimental images, MVB-DMAS results in about 20 dB sidelobe reduction in comparison with the other beamformers.
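For context, here is a sketch of the plain DMAS combination that MVB-DMAS builds on, assuming the channel signals have already been delay-aligned; the replacement of the inner DAS-like sums with MV weights is not shown.

```python
# Plain delay-multiply-and-sum (DMAS): sign-preserving square roots of
# pairwise products of delay-aligned channel signals, summed per sample.
import numpy as np

def dmas(delayed):
    """delayed: (n_channels, n_samples) array of delay-aligned RF signals."""
    n_ch, n_s = delayed.shape
    out = np.zeros(n_s)
    for i in range(n_ch - 1):
        prod = delayed[i] * delayed[i + 1:]              # pairwise products
        out += np.sum(np.sign(prod) * np.sqrt(np.abs(prod)), axis=0)
    return out

rf = np.random.default_rng(3).standard_normal((64, 1024))   # toy aligned data
beamformed = dmas(rf)
```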
Method for Automatic Selection of Parameters in Normal Tissue Complication Probability Modeling.
Christophides, Damianos; Appelt, Ane L; Gusnanto, Arief; Lilley, John; Sebag-Montefiore, David
2018-07-01
To present a fully automatic method to generate multiparameter normal tissue complication probability (NTCP) models and compare its results with those of a published model, using the same patient cohort. Data were analyzed from 345 rectal cancer patients treated with external radiation therapy to predict the risk of patients developing grade 1 or ≥2 cystitis. In total, 23 clinical factors were included in the analysis as candidate predictors of cystitis. Principal component analysis was used to decompose the bladder dose-volume histogram into 8 principal components, explaining more than 95% of the variance. The data set of clinical factors and principal components was divided into training (70%) and test (30%) data sets, with the training data set used by the algorithm to compute an NTCP model. The first step of the algorithm was to obtain a bootstrap sample, followed by multicollinearity reduction using the variance inflation factor and genetic algorithm optimization to determine an ordinal logistic regression model that minimizes the Bayesian information criterion. The process was repeated 100 times, and the model with the minimum Bayesian information criterion was recorded on each iteration. The most frequent model was selected as the final "automatically generated model" (AGM). The published model and AGM were fitted on the training data sets, and the risk of cystitis was calculated. The 2 models had no significant differences in predictive performance, both for the training and test data sets (P value > .05) and found similar clinical and dosimetric factors as predictors. Both models exhibited good explanatory performance on the training data set (P values > .44), which was reduced on the test data sets (P values < .05). The predictive value of the AGM is equivalent to that of the expert-derived published model. It demonstrates potential in saving time, tackling problems with a large number of parameters, and standardizing variable selection in NTCP modeling. Crown Copyright © 2018. Published by Elsevier Inc. All rights reserved.
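One step of that pipeline, the variance-inflation-factor screen, is sketched below assuming statsmodels; the bootstrap loop, genetic-algorithm search, and BIC-scored ordinal regression are omitted, and the design matrix is synthetic.

```python
# Iteratively drop the predictor with the largest VIF until all VIFs fall
# below a threshold; statsmodels' VIF regresses each column on the others
# exactly as given (no extra intercept column is added in this sketch).
import numpy as np
import pandas as pd
from statsmodels.stats.outliers_influence import variance_inflation_factor

def reduce_multicollinearity(X: pd.DataFrame, threshold: float = 5.0):
    X = X.copy()
    while X.shape[1] > 1:
        vifs = [variance_inflation_factor(X.values, i) for i in range(X.shape[1])]
        worst = int(np.argmax(vifs))
        if vifs[worst] < threshold:
            break
        X = X.drop(columns=X.columns[worst])
    return X

rng = np.random.default_rng(0)
a = rng.standard_normal(200)
X = pd.DataFrame({"pc1": a, "pc2": a + 0.01 * rng.standard_normal(200),
                  "age": rng.normal(60, 10, 200)})
print(reduce_multicollinearity(X).columns.tolist())
```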
Simulation Study Using a New Type of Sample Variance
NASA Technical Reports Server (NTRS)
Howe, D. A.; Lainson, K. J.
1996-01-01
We evaluate with simulated data a new type of sample variance for the characterization of frequency stability. The new statistic (referred to as TOTALVAR and its square root TOTALDEV) is a better predictor of long-term frequency variations than the present sample Allan deviation. The statistical model uses the assumption that a time series of phase or frequency differences is wrapped (periodic) with overall frequency difference removed. We find that the variability at long averaging times is reduced considerably for the five models of power-law noise commonly encountered with frequency standards and oscillators.
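For orientation, the conventional overlapping Allan deviation computed from phase data is sketched below; TOTALVAR/TOTALDEV additionally removes the overall frequency difference and extends the series periodically, which this sketch does not reproduce.

```python
# Conventional overlapping Allan deviation from phase data x (seconds),
# sampled every tau0 seconds; toy white-FM data for illustration.
import numpy as np

def overlapping_adev(x, tau0, m):
    """Allan deviation at averaging time tau = m * tau0."""
    x = np.asarray(x, dtype=float)
    d2 = x[2 * m:] - 2.0 * x[m:-m] + x[:-2 * m]          # second differences
    avar = np.sum(d2 ** 2) / (2.0 * (m * tau0) ** 2 * d2.size)
    return np.sqrt(avar)

rng = np.random.default_rng(0)
phase = np.cumsum(1e-9 * rng.standard_normal(10000))     # toy phase record
for m in (1, 10, 100):
    print(m, overlapping_adev(phase, tau0=1.0, m=m))
```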
SU-D-218-05: Material Quantification in Spectral X-Ray Imaging: Optimization and Validation.
Nik, S J; Thing, R S; Watts, R; Meyer, J
2012-06-01
To develop and validate a multivariate statistical method to optimize scanning parameters for material quantification in spectral x-ray imaging. An optimization metric was constructed by extensively sampling the thickness space for the expected number of counts for m (two or three) materials. This resulted in an m-dimensional confidence region of material quantities, e.g. thicknesses. Minimization of the ellipsoidal confidence region leads to the optimization of energy bins. For the given spectrum, the minimum counts required for effective material separation can be determined by predicting the signal-to-noise ratio (SNR) of the quantification. A Monte Carlo (MC) simulation framework using BEAM was developed to validate the metric. Projection data of the m materials was generated and material decomposition was performed for combinations of iodine, calcium and water by minimizing the z-score between the expected spectrum and binned measurements. The mean square error (MSE) and variance were calculated to measure the accuracy and precision of this approach, respectively. The minimum MSE corresponds to the optimal energy bins in the BEAM simulations. In the optimization metric, this is equivalent to the smallest confidence region. The SNR of the simulated images was also compared to the predictions from the metric. The MSE was dominated by the variance for the given material combinations, which demonstrates accurate material quantifications. The BEAM simulations revealed that the optimization of energy bins was accurate to within 1 keV. The SNRs predicted by the optimization metric yielded satisfactory agreement but were expectedly higher for the BEAM simulations due to the inclusion of scattered radiation. The validation showed that the multivariate statistical method provides accurate material quantification, correct location of optimal energy bins and adequate prediction of image SNR. The BEAM code system is suitable for generating spectral x-ray imaging simulations. © 2012 American Association of Physicists in Medicine.
Analysis of Variance with Summary Statistics in Microsoft® Excel®
ERIC Educational Resources Information Center
Larson, David A.; Hsu, Ko-Cheng
2010-01-01
Students regularly are asked to solve Single Factor Analysis of Variance problems given only the sample summary statistics (number of observations per category, category means, and corresponding category standard deviations). Most undergraduate students today use Excel for data analysis of this type. However, Excel, like all other statistical…
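A Python counterpart to the Excel exercise is sketched below: a one-way ANOVA F statistic computed purely from per-group n, mean, and standard deviation, with invented numbers.

```python
# One-way ANOVA from summary statistics only (n, mean, sd per group).
import numpy as np
from scipy.stats import f as f_dist

n = np.array([12, 15, 10])
means = np.array([5.2, 6.1, 4.8])
sds = np.array([1.1, 1.3, 0.9])

grand_mean = np.sum(n * means) / n.sum()
ss_between = np.sum(n * (means - grand_mean) ** 2)
ss_within = np.sum((n - 1) * sds ** 2)
df_between, df_within = len(n) - 1, n.sum() - len(n)

F = (ss_between / df_between) / (ss_within / df_within)
p = f_dist.sf(F, df_between, df_within)
print(f"F({df_between}, {df_within}) = {F:.3f}, p = {p:.4f}")
```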
Health beliefs of blue collar workers. Increasing self efficacy and removing barriers.
Wilson, S; Sisk, R J; Baldwin, K A
1997-05-01
The study compared the health beliefs of participants and non-participants in a blood pressure and cholesterol screening held at the worksite. A cross sectional, ex-post facto design was used. Questionnaires measuring health beliefs related to cardiac screening and prevention of cardiac problems were distributed to a convenience sample of 200 blue-collar workers in a large manufacturing plant in the Midwest. One hundred fifty-one (75.5%) completed questionnaires were returned, of which 45 had participated in cardiac worksite screening in the past month. A multivariate analysis of variance was used to analyze data. Participants perceived significantly fewer barriers to cardiac screening and scored significantly higher on self efficacy than non-participants. These findings concur with other studies identifying barriers and self efficacy as important predictors of health behavior. Occupational health nurses' efforts are warranted to reduce barriers and improve self efficacy by advertising screenings, scheduling them at convenient times and locations, assuring privacy, and keeping time inconvenience to a minimum.
Demodulation of messages received with low signal to noise ratio
NASA Astrophysics Data System (ADS)
Marguinaud, A.; Quignon, T.; Romann, B.
The implementation of this all-digital demodulator is derived from maximum likelihood considerations applied to an analytical representation of the received signal. Traditional matched filters and phase lock loops are replaced by minimum variance estimators and hypothesis tests. These statistical tests become very simple when working on the phase signal. These methods, combined with rigorous control of the data representation, allow significant computation savings as compared to conventional realizations. Nominal operation has been verified down to an energy signal-to-noise ratio of -3 dB on a QPSK demodulator.
An adaptive technique for estimating the atmospheric density profile during the AE mission
NASA Technical Reports Server (NTRS)
Argentiero, P.
1973-01-01
A technique is presented for processing accelerometer data obtained during the AE missions in order to estimate the atmospheric density profile. A minimum variance, adaptive filter is utilized. The trajectory of the probe and probe parameters are in a consider mode where their estimates are unimproved but their associated uncertainties are permitted an impact on filter behavior. Simulations indicate that the technique is effective in estimating a density profile to within a few percentage points.
Statistical indicators of collective behavior and functional clusters in gene networks of yeast
NASA Astrophysics Data System (ADS)
Živković, J.; Tadić, B.; Wick, N.; Thurner, S.
2006-03-01
We analyze gene expression time-series data of yeast (S. cerevisiae) measured along two full cell-cycles. We quantify these data by using q-exponentials, gene expression ranking and a temporal mean-variance analysis. We construct gene interaction networks based on correlation coefficients and study the formation of the corresponding giant components and minimum spanning trees. By coloring genes according to their cell function we find functional clusters in the correlation networks and functional branches in the associated trees. Our results suggest that a percolation point of functional clusters can be identified on these gene expression correlation networks.
A new variance stabilizing transformation for gene expression data analysis.
Kelmansky, Diana M; Martínez, Elena J; Leiva, Víctor
2013-12-01
In this paper, we introduce a new family of power transformations, which has the generalized logarithm as one of its members, in the same manner as the usual logarithm belongs to the family of Box-Cox power transformations. Although the new family has been developed for analyzing gene expression data, it allows a wider scope of mean-variance related data to be reached. We study the analytical properties of the new family of transformations, as well as the mean-variance relationships that are stabilized by using its members. We propose a methodology based on this new family, which includes a simple strategy for selecting the family member adequate for a data set. We evaluate the finite sample behavior of different classical and robust estimators based on this strategy by Monte Carlo simulations. We analyze real genomic data by using the proposed transformation to empirically show how the new methodology allows the variance of these data to be stabilized.
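The generalized logarithm member of such a family can be sketched as follows; the constant c is data-dependent and the value used here is arbitrary.

```python
# Generalized log (glog): behaves like log2 for large intensities but stays
# finite and roughly linear near zero, stabilizing variance in expression data.
import numpy as np

def glog(y, c=1.0):
    """glog(y) = log2((y + sqrt(y**2 + c**2)) / 2)."""
    y = np.asarray(y, dtype=float)
    return np.log2((y + np.sqrt(y ** 2 + c ** 2)) / 2.0)

print(glog([0.0, 1.0, 10.0, 1000.0], c=10.0))
```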
Kiong, Tiong Sieh; Salem, S. Balasem; Paw, Johnny Koh Siaw; Sankar, K. Prajindra
2014-01-01
In smart antenna applications, the adaptive beamforming technique is used to cancel interfering signals (placing nulls) and produce or steer a strong beam toward the target signal according to the calculated weight vectors. Minimum variance distortionless response (MVDR) beamforming is capable of determining the weight vectors for beam steering; however, its nulling level on the interference sources remains unsatisfactory. Beamforming can be considered as an optimization problem, such that optimal weight vector should be obtained through computation. Hence, in this paper, a new dynamic mutated artificial immune system (DM-AIS) is proposed to enhance MVDR beamforming for controlling the null steering of interference and increase the signal to interference noise ratio (SINR) for wanted signals. PMID:25003136
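For reference, the closed-form MVDR weight computation that such optimizers are benchmarked against is sketched below, with a small diagonal load added for numerical stability; the array geometry and snapshots are illustrative.

```python
# Closed-form MVDR weights: w = R^{-1} a / (a^H R^{-1} a), with diagonal loading.
import numpy as np

def mvdr_weights(R, a, loading=1e-3):
    Rl = R + loading * np.trace(R).real / R.shape[0] * np.eye(R.shape[0])
    Ri_a = np.linalg.solve(Rl, a)
    return Ri_a / (a.conj() @ Ri_a)

M = 8                                            # uniform linear array elements
theta = np.deg2rad(20.0)                         # desired look direction
a = np.exp(1j * np.pi * np.arange(M) * np.sin(theta))

rng = np.random.default_rng(0)
snapshots = rng.standard_normal((M, 500)) + 1j * rng.standard_normal((M, 500))
R = snapshots @ snapshots.conj().T / 500

w = mvdr_weights(R, a)
print("distortionless response:", np.abs(w.conj() @ a))   # should be ~1
```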
Seabed mapping and characterization of sediment variability using the usSEABED data base
Goff, J.A.; Jenkins, C.J.; Williams, S. Jeffress
2008-01-01
We present a methodology for statistical analysis of randomly located marine sediment point data, and apply it to the US continental shelf portions of usSEABED mean grain size records. The usSEABED database, like many modern, large environmental datasets, is heterogeneous and interdisciplinary. We statistically test the database as a source of mean grain size data, and from it provide a first examination of regional seafloor sediment variability across the entire US continental shelf. Data derived from laboratory analyses ("extracted") and from word-based descriptions ("parsed") are treated separately, and they are compared statistically and deterministically. Data records are selected for spatial analysis by their location within sample regions: polygonal areas defined in ArcGIS chosen by geography, water depth, and data sufficiency. We derive isotropic, binned semivariograms from the data, and invert these for estimates of noise variance, field variance, and decorrelation distance. The highly erratic nature of the semivariograms is a result both of the random locations of the data and of the high level of data uncertainty (noise). This decorrelates the data covariance matrix for the inversion, and largely prevents robust estimation of the fractal dimension. Our comparison of the extracted and parsed mean grain size data demonstrates important differences between the two. In particular, extracted measurements generally produce finer mean grain sizes, lower noise variance, and lower field variance than parsed values. Such relationships can be used to derive a regionally dependent conversion factor between the two. Our analysis of sample regions on the US continental shelf revealed considerable geographic variability in the estimated statistical parameters of field variance and decorrelation distance. Some regional relationships are evident, and overall there is a tendency for field variance to be higher where the average mean grain size is finer grained. Surprisingly, parsed and extracted noise magnitudes correlate with each other, which may indicate that some portion of the data variability that we identify as "noise" is caused by real grain size variability at very short scales. Our analyses demonstrate that by applying a bias-correction proxy, usSEABED data can be used to generate reliable interpolated maps of regional mean grain size and sediment character.
25 CFR 542.18 - How does a gaming operation apply for a variance from the standards of the part?
Code of Federal Regulations, 2010 CFR
2010-04-01
Section 542.18 (Title 25, Indians; National Indian Gaming Commission, Department of the Interior; Minimum Internal Control Standards) sets out how a gaming operation applies for a variance from the standards of the part.
NASA Technical Reports Server (NTRS)
Wolf, Michael
2012-01-01
A document describes an algorithm created to estimate the mass placed on a sample verification sensor (SVS) designed for lunar or planetary robotic sample return missions. A novel SVS measures the capacitance between a rigid bottom plate and an elastic top membrane in seven locations. As additional sample material (soil and/or small rocks) is placed on the top membrane, the deformation of the membrane increases the capacitance. The mass estimation algorithm addresses both the calibration of each SVS channel, and also addresses how to combine the capacitances read from each of the seven channels into a single mass estimate. The probabilistic approach combines the channels according to the variance observed during the training phase, and provides not only the mass estimate, but also a value for the certainty of the estimate. SVS capacitance data is collected for known masses under a wide variety of possible loading scenarios, though in all cases, the distribution of sample within the canister is expected to be approximately uniform. A capacitance-vs-mass curve is fitted to this data, and is subsequently used to determine the mass estimate for the single channel's capacitance reading during the measurement phase. This results in seven different mass estimates, one for each SVS channel. Moreover, the variance of the calibration data is used to place a Gaussian probability distribution function (pdf) around this mass estimate. To blend these seven estimates, the seven pdfs are combined into a single Gaussian distribution function, providing the final mean and variance of the estimate. This blending technique essentially takes the final estimate as an average of the estimates of the seven channels, weighted by the inverse of the channel's variance.
NASA Astrophysics Data System (ADS)
Chen, Sang; Hoffmann, Sharon S.; Lund, David C.; Cobb, Kim M.; Emile-Geay, Julien; Adkins, Jess F.
2016-05-01
The El Niño-Southern Oscillation (ENSO) is the primary driver of interannual climate variability in the tropics and subtropics. Despite substantial progress in understanding ocean-atmosphere feedbacks that drive ENSO today, relatively little is known about its behavior on centennial and longer timescales. Paleoclimate records from lakes, corals, molluscs and deep-sea sediments generally suggest that ENSO variability was weaker during the mid-Holocene (4-6 kyr BP) than the late Holocene (0-4 kyr BP). However, discrepancies amongst the records preclude a clear timeline of Holocene ENSO evolution and therefore the attribution of ENSO variability to specific climate forcing mechanisms. Here we present δ18O results from a U-Th dated speleothem in Malaysian Borneo sampled at sub-annual resolution. The δ18O of Borneo rainfall is a robust proxy of regional convective intensity and precipitation amount, both of which are directly influenced by ENSO activity. Our estimates of stalagmite δ18O variance at ENSO periods (2-7 yr) show a significant reduction in interannual variability during the mid-Holocene (3240-3380 and 5160-5230 yr BP) relative to both the late Holocene (2390-2590 yr BP) and early Holocene (6590-6730 yr BP). The Borneo results are therefore inconsistent with lacustrine records of ENSO from the eastern equatorial Pacific that show little or no ENSO variance during the early Holocene. Instead, our results support coral, mollusc and foraminiferal records from the central and eastern equatorial Pacific that show a mid-Holocene minimum in ENSO variance. Reduced mid-Holocene interannual δ18O variability in Borneo coincides with an overall minimum in mean δ18O from 3.5 to 5.5 kyr BP. Persistent warm pool convection would tend to enhance the Walker circulation during the mid-Holocene, which likely contributed to reduced ENSO variance during this period. This finding implies that both convective intensity and interannual variability in Borneo are driven by coupled air-sea dynamics that are sensitive to precessional insolation forcing. Isolating the exact mechanisms that drive long-term ENSO evolution will require additional high-resolution paleoclimatic reconstructions and further investigation of Holocene tropical climate evolution using coupled climate models.
Technical and biological variance structure in mRNA-Seq data: life in the real world
2012-01-01
Background mRNA expression data from next generation sequencing platforms is obtained in the form of counts per gene or exon. Counts have classically been assumed to follow a Poisson distribution in which the variance is equal to the mean. The Negative Binomial distribution which allows for over-dispersion, i.e., for the variance to be greater than the mean, is commonly used to model count data as well. Results In mRNA-Seq data from 25 subjects, we found technical variation to generally follow a Poisson distribution as has been reported previously and biological variability was over-dispersed relative to the Poisson model. The mean-variance relationship across all genes was quadratic, in keeping with a Negative Binomial (NB) distribution. Over-dispersed Poisson and NB distributional assumptions demonstrated marked improvements in goodness-of-fit (GOF) over the standard Poisson model assumptions, but with evidence of over-fitting in some genes. Modeling of experimental effects improved GOF for high variance genes but increased the over-fitting problem. Conclusions These conclusions will guide development of analytical strategies for accurate modeling of variance structure in these data and sample size determination which in turn will aid in the identification of true biological signals that inform our understanding of biological systems. PMID:22769017
Xu, Chonggang; Gertner, George
2013-01-01
Fourier Amplitude Sensitivity Test (FAST) is one of the most popular uncertainty and sensitivity analysis techniques. It uses a periodic sampling approach and a Fourier transformation to decompose the variance of a model output into partial variances contributed by different model parameters. Until now, the FAST analysis is mainly confined to the estimation of partial variances contributed by the main effects of model parameters, but does not allow for those contributed by specific interactions among parameters. In this paper, we theoretically show that FAST analysis can be used to estimate partial variances contributed by both main effects and interaction effects of model parameters using different sampling approaches (i.e., traditional search-curve based sampling, simple random sampling and random balance design sampling). We also analytically calculate the potential errors and biases in the estimation of partial variances. Hypothesis tests are constructed to reduce the effect of sampling errors on the estimation of partial variances. Our results show that compared to simple random sampling and random balance design sampling, sensitivity indices (ratios of partial variances to variance of a specific model output) estimated by search-curve based sampling generally have higher precision but larger underestimations. Compared to simple random sampling, random balance design sampling generally provides higher estimation precision for partial variances contributed by the main effects of parameters. The theoretical derivation of partial variances contributed by higher-order interactions and the calculation of their corresponding estimation errors in different sampling schemes can help us better understand the FAST method and provide a fundamental basis for FAST applications and further improvements. PMID:24143037
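A hedged sketch of the classical FAST main-effect recipe on a toy model follows; the frequency set, harmonic order, and model are illustrative, and the interference checks, interaction-effect estimation, and random balance design variants discussed in the paper are not reproduced.

```python
# Classical FAST main-effect indices on a toy model: each parameter is driven
# by its own frequency along a search curve, and the partial variance is read
# off the Fourier amplitudes at that frequency and its harmonics.
import numpy as np

freqs = np.array([11, 35, 73])          # one distinct frequency per parameter
M = 4                                   # number of harmonics retained
N = 2001                                # samples along the search curve
s = np.pi * (2.0 * np.arange(N) - N + 1) / N       # s in (-pi, pi)

# Search-curve sampling: map s to uniform parameters on [0, 1]
X = 0.5 + np.arcsin(np.sin(np.outer(freqs, s))) / np.pi

y = X[0] + 2.0 * X[1] + 0.5 * X[0] * X[2]          # toy model output
total_var = np.var(y)

for i, w in enumerate(freqs):
    harmonics = w * np.arange(1, M + 1)
    A = np.array([np.mean(y * np.cos(h * s)) for h in harmonics])
    B = np.array([np.mean(y * np.sin(h * s)) for h in harmonics])
    partial_var = 2.0 * np.sum(A ** 2 + B ** 2)
    print(f"parameter {i}: first-order index ~ {partial_var / total_var:.3f}")
```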
Network Structure and Biased Variance Estimation in Respondent Driven Sampling
Verdery, Ashton M.; Mouw, Ted; Bauldry, Shawn; Mucha, Peter J.
2015-01-01
This paper explores bias in the estimation of sampling variance in Respondent Driven Sampling (RDS). Prior methodological work on RDS has focused on its problematic assumptions and the biases and inefficiencies of its estimators of the population mean. Nonetheless, researchers have given only slight attention to the topic of estimating sampling variance in RDS, despite the importance of variance estimation for the construction of confidence intervals and hypothesis tests. In this paper, we show that the estimators of RDS sampling variance rely on a critical assumption that the network is First Order Markov (FOM) with respect to the dependent variable of interest. We demonstrate, through intuitive examples, mathematical generalizations, and computational experiments that current RDS variance estimators will always underestimate the population sampling variance of RDS in empirical networks that do not conform to the FOM assumption. Analysis of 215 observed university and school networks from Facebook and Add Health indicates that the FOM assumption is violated in every empirical network we analyze, and that these violations lead to substantially biased RDS estimators of sampling variance. We propose and test two alternative variance estimators that show some promise for reducing biases, but which also illustrate the limits of estimating sampling variance with only partial information on the underlying population social network. PMID:26679927
Kappa statistic for clustered matched-pair data.
Yang, Zhao; Zhou, Ming
2014-07-10
Kappa statistic is widely used to assess the agreement between two procedures in the independent matched-pair data. For matched-pair data collected in clusters, on the basis of the delta method and sampling techniques, we propose a nonparametric variance estimator for the kappa statistic without within-cluster correlation structure or distributional assumptions. The results of an extensive Monte Carlo simulation study demonstrate that the proposed kappa statistic provides consistent estimation and the proposed variance estimator behaves reasonably well for at least a moderately large number of clusters (e.g., K ≥50). Compared with the variance estimator ignoring dependence within a cluster, the proposed variance estimator performs better in maintaining the nominal coverage probability when the intra-cluster correlation is fair (ρ ≥0.3), with more pronounced improvement when ρ is further increased. To illustrate the practical application of the proposed estimator, we analyze two real data examples of clustered matched-pair data. Copyright © 2014 John Wiley & Sons, Ltd.
A New Method for Estimating the Effective Population Size from Allele Frequency Changes
Pollak, Edward
1983-01-01
A new procedure is proposed for estimating the effective population size, given that information is available on changes in frequencies of the alleles at one or more independently segregating loci and the population is observed at two or more separate times. Approximate expressions are obtained for the variances of the new statistic, as well as others, also based on allele frequency changes, that have been discussed in the literature. This analysis indicates that the new statistic will generally have a smaller variance than the others. Estimates of effective population sizes and of the standard errors of the estimates are computed for data on two fly populations that have been discussed in earlier papers. In both cases, there is evidence that the effective population size is very much smaller than the minimum census size of the population. PMID:17246147
Estimating gene function with least squares nonnegative matrix factorization.
Wang, Guoli; Ochs, Michael F
2007-01-01
Nonnegative matrix factorization is a machine learning algorithm that has been used to extract information from data in a number of fields, including imaging and spectral analysis, text mining, and microarray data analysis. One limitation of the method for linking genes through microarray data in order to estimate gene function is the high variance observed in transcription levels between different genes. Least squares nonnegative matrix factorization uses estimates of the uncertainties in the mRNA levels for each gene in each condition to guide the algorithm to a local minimum in the normalized chi-square, rather than in a Euclidean distance or divergence between the reconstructed data and the data itself. Herein, application of this method to microarray data is demonstrated in order to predict gene function.
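The weighting idea can be sketched with generic multiplicative updates for a weighted-Euclidean objective, where each matrix element is weighted by the inverse of its estimated variance so that the factorization descends toward a local minimum of the normalized chi-square. The update rules, rank, and synthetic heteroscedastic data below are illustrative assumptions and not the authors' implementation.

```python
import numpy as np

def ls_nmf(V, sigma, k, n_iter=500, eps=1e-9, seed=0):
    """Uncertainty-weighted NMF: minimize sum_ij W_ij * (V_ij - (A H)_ij)^2
    with W = 1 / sigma^2, via multiplicative updates (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = 1.0 / (sigma ** 2 + eps)              # element-wise weights
    A = rng.random((n, k)) + eps
    H = rng.random((k, m)) + eps
    for _ in range(n_iter):
        H *= (A.T @ (W * V)) / (A.T @ (W * (A @ H)) + eps)
        A *= ((W * V) @ H.T) / ((W * (A @ H)) @ H.T + eps)
    return A, H

# Tiny synthetic "expression" matrix with gene/condition dependent noise.
rng = np.random.default_rng(1)
true_A, true_H = rng.random((30, 3)), rng.random((3, 10))
sigma = 0.05 + 0.1 * rng.random((30, 10))
V = np.clip(true_A @ true_H + sigma * rng.standard_normal((30, 10)), 0, None)
A, H = ls_nmf(V, sigma, k=3)
print(round(np.sum(((V - A @ H) / sigma) ** 2) / V.size, 3))   # normalized chi-square
```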
NASA Technical Reports Server (NTRS)
Deloach, Richard; Obara, Clifford J.; Goodman, Wesley L.
2012-01-01
This paper documents a check standard wind tunnel test conducted in the Langley 0.3-Meter Transonic Cryogenic Tunnel (0.3M TCT) that was designed and analyzed using the Modern Design of Experiments (MDOE). The test was designed to partition the unexplained variance of typical wind tunnel data samples into two constituent components, one attributable to ordinary random error, and one attributable to systematic error induced by covariate effects. Covariate effects in wind tunnel testing are discussed, with examples. The impact of systematic (non-random) unexplained variance on the statistical independence of sequential measurements is reviewed. The corresponding correlation among experimental errors is discussed, as is the impact of such correlation on experimental results generally. The specific experiment documented herein was organized as a formal test for the presence of unexplained variance in representative samples of wind tunnel data, in order to quantify the frequency with which such systematic error was detected, and its magnitude relative to ordinary random error. Levels of systematic and random error reported here are representative of those quantified in other facilities, as cited in the references.
Allan Variance Calculation for Nonuniformly Spaced Input Data
2015-01-01
… τ (tau). First, the set of gyro values is partitioned into bins of duration τ. For example, if the sampling duration τ is 2 sec and there are 4,000 … Variance Calculation: For each value of τ, the conventional AV calculation partitions the gyro data sets into bins with approximately τ/Δt … value of Δt. Therefore, a new way must be found to partition the gyro data sets into bins. The basic concept behind the modified AV calculation is …
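For orientation, the conventional Allan variance calculation for uniformly sampled data works exactly as the excerpt describes: the record is partitioned into bins of duration τ, bin averages are formed, and half the mean squared difference between successive bin averages gives AVAR(τ). The sketch below shows that conventional, non-overlapping calculation as a baseline; it is not the report's modified calculation for nonuniformly spaced samples, and the sampling rate and noise model are assumptions.

```python
import numpy as np

def allan_variance(rate, dt, taus):
    """Non-overlapping Allan variance of a uniformly sampled gyro rate signal.

    rate : rate samples (e.g., deg/s) taken every dt seconds
    taus : averaging times, each (approximately) a multiple of dt
    """
    avar = []
    for tau in taus:
        m = int(round(tau / dt))                  # samples per bin of duration tau
        n_bins = len(rate) // m
        if n_bins < 2:
            avar.append(np.nan)
            continue
        bins = rate[: n_bins * m].reshape(n_bins, m).mean(axis=1)   # bin averages
        # AVAR(tau) = 1 / (2 (M - 1)) * sum_k (ybar_{k+1} - ybar_k)^2
        avar.append(0.5 * np.mean(np.diff(bins) ** 2))
    return np.array(avar)

# Example: white rate noise sampled at 100 Hz; AVAR should fall off roughly as 1/tau.
dt = 0.01
rate = np.random.default_rng(0).standard_normal(200_000)
print(allan_variance(rate, dt, taus=[0.1, 0.5, 1.0, 2.0, 5.0]))
```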
[Theory, method and application of method R on estimation of (co)variance components].
Liu, Wen-Zhong
2004-07-01
The theory, method and application of Method R for the estimation of (co)variance components are reviewed so that the method can be used appropriately. Estimation requires R values, which are regressions of predicted random effects calculated from the complete dataset on predicted random effects calculated from random subsets of the same data. By using a multivariate iteration algorithm based on a transformation matrix, combined with the preconditioned conjugate gradient method to solve the mixed model equations, the computational efficiency of Method R is much improved. Method R is computationally inexpensive, and sampling errors and approximate credible intervals of the estimates can be obtained. Disadvantages of Method R include a larger sampling variance than other methods for the same data and biased estimates in small datasets. Method R can therefore serve as an alternative method in larger datasets; its theoretical properties and range of application warrant further study.
Mozaffarzadeh, Moein; Mahloojifar, Ali; Orooji, Mahdi; Kratkiewicz, Karl; Adabi, Saba; Nasiriavanaki, Mohammadreza
2018-02-01
In photoacoustic imaging, the delay-and-sum (DAS) beamformer is a common beamforming algorithm with a simple implementation. However, it results in poor resolution and high sidelobes. To address these challenges, a new algorithm, delay-multiply-and-sum (DMAS), was introduced, which has lower sidelobes than DAS. To improve the resolution of DMAS, a beamformer is introduced that combines minimum variance (MV) adaptive beamforming with DMAS, the so-called minimum variance-based DMAS (MVB-DMAS). It is shown that expanding the DMAS equation yields multiple terms representing a DAS algebra, and it is proposed to use the MV adaptive beamformer in place of the existing DAS terms. MVB-DMAS is evaluated numerically and experimentally. In particular, at a depth of 45 mm, MVB-DMAS yields about 31, 18, and 8 dB sidelobe reduction compared with DAS, MV, and DMAS, respectively. The quantitative simulation results show that MVB-DMAS improves the full width at half maximum by about 96%, 94%, and 45% and the signal-to-noise ratio by about 89%, 15%, and 35% compared with DAS, DMAS, and MV, respectively. At a depth of 33 mm in the experimental images, MVB-DMAS yields about 20 dB sidelobe reduction in comparison with the other beamformers. (2018) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE).
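The difference between DAS and DMAS combining is easiest to see on channel data that have already been delay-compensated: DAS simply sums the channels, whereas DMAS sums the sign-preserving square roots of all pairwise channel products, which suppresses uncorrelated noise between channels. The sketch below shows only these two building blocks on synthetic aligned channels; it is not the authors' MVB-DMAS, which in addition replaces the DAS-like terms inside the expanded DMAS with minimum variance beamformers.

```python
import numpy as np

def das(delayed):
    """Delay-and-sum: plain sum over already delay-compensated channels.
    delayed : array of shape (n_channels, n_samples)"""
    return delayed.sum(axis=0)

def dmas(delayed):
    """Delay-multiply-and-sum: sum of sign-preserving square roots of all
    pairwise products of delay-compensated channels."""
    n = delayed.shape[0]
    out = np.zeros(delayed.shape[1])
    for i in range(n - 1):
        prod = delayed[i] * delayed[i + 1:]                # products with channels j > i
        out += (np.sign(prod) * np.sqrt(np.abs(prod))).sum(axis=0)
    return out

# Toy example: 8 channels containing the same aligned pulse plus independent noise.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 500)
pulse = np.exp(-((t - 0.5) ** 2) / 1e-3) * np.sin(2 * np.pi * 40 * t)
channels = pulse + 0.3 * rng.standard_normal((8, t.size))
y_das, y_dmas = das(channels), dmas(channels)
```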
Wickenberg-Bolin, Ulrika; Göransson, Hanna; Fryknäs, Mårten; Gustafsson, Mats G; Isaksson, Anders
2006-03-13
Supervised learning for classification of cancer employs a set of design examples to learn how to discriminate between tumors. In practice it is crucial to confirm that the classifier is robust with good generalization performance to new examples, or at least that it performs better than random guessing. A suggested alternative is to obtain a confidence interval of the error rate using repeated design and test sets selected from available examples. However, it is known that even in the ideal situation of repeated designs and tests with completely novel samples in each cycle, a small test set size leads to a large bias in the estimate of the true variance between design sets. Therefore different methods for small sample performance estimation such as a recently proposed procedure called Repeated Random Sampling (RSS) is also expected to result in heavily biased estimates, which in turn translates into biased confidence intervals. Here we explore such biases and develop a refined algorithm called Repeated Independent Design and Test (RIDT). Our simulations reveal that repeated designs and tests based on resampling in a fixed bag of samples yield a biased variance estimate. We also demonstrate that it is possible to obtain an improved variance estimate by means of a procedure that explicitly models how this bias depends on the number of samples used for testing. For the special case of repeated designs and tests using new samples for each design and test, we present an exact analytical expression for how the expected value of the bias decreases with the size of the test set. We show that via modeling and subsequent reduction of the small sample bias, it is possible to obtain an improved estimate of the variance of classifier performance between design sets. However, the uncertainty of the variance estimate is large in the simulations performed indicating that the method in its present form cannot be directly applied to small data sets.
Vegetation greenness impacts on maximum and minimum temperatures in northeast Colorado
Hanamean, J. R.; Pielke, R.A.; Castro, C. L.; Ojima, D.S.; Reed, Bradley C.; Gao, Z.
2003-01-01
The impact of vegetation on the microclimate has not been adequately considered in the analysis of temperature forecasting and modelling. To fill part of this gap, the following study was undertaken. A daily 850–700 mb layer mean temperature, computed from the National Center for Environmental Prediction-National Center for Atmospheric Research (NCEP-NCAR) reanalysis, and satellite-derived greenness values, as defined by NDVI (Normalised Difference Vegetation Index), were correlated with surface maximum and minimum temperatures at six sites in northeast Colorado for the years 1989–98. The NDVI values, representing landscape greenness, act as a proxy for latent heat partitioning via transpiration. These sites encompass a wide array of environments, from irrigated-urban to short-grass prairie. The explained variance (r² value) of surface maximum and minimum temperature by only the 850–700 mb layer mean temperature was subtracted from the corresponding explained variance by the 850–700 mb layer mean temperature and NDVI values. The subtraction shows that by including NDVI values in the analysis, the r² values, and thus the degree of explanation of the surface temperatures, increase by a mean of 6% for the maxima and 8% for the minima over the period March–October. At most sites, there is a seasonal dependence in the explained variance of the maximum temperatures because of the seasonal cycle of plant growth and senescence. Between individual sites, the highest increase in explained variance occurred at the site with the least amount of anthropogenic influence. This work suggests the vegetation state needs to be included as a factor in surface temperature forecasting, numerical modeling, and climate change assessments.
NASA Astrophysics Data System (ADS)
Zhou, Xiangrong; Kano, Takuya; Cai, Yunliang; Li, Shuo; Zhou, Xinxin; Hara, Takeshi; Yokoyama, Ryujiro; Fujita, Hiroshi
2016-03-01
This paper describes a new automatic segmentation method for quantifying the volume and density of mammary gland regions on non-contrast CT images. The proposed method uses two processing steps: (1) breast region localization and (2) breast region decomposition, to accomplish a robust mammary gland segmentation task on CT images. The first step detects the minimum bounding boxes of the left and right breast regions based on a machine-learning approach that adapts to the large variance in breast appearance across age groups. The second step divides the whole breast region on each side into mammary gland, fat tissue, and other regions by using a spectral clustering technique that focuses on the intra-region similarities of each patient and aims to overcome the image variance caused by different scan parameters. The whole approach is designed with a simple structure and a very small number of parameters to gain robustness and computational efficiency in a real clinical setting. We applied this approach to a dataset of 300 CT scans, sampled in equal numbers from women aged 30 to 50 years. Compared with human annotations, the proposed approach successfully measures the volume and quantifies the distribution of CT numbers of mammary gland regions, and the experimental results are consistent with the manual annotations. Through the proposed framework, an efficient and effective low-cost clinical screening scheme may be easily implemented to predict breast cancer risk, especially from already acquired scans.
ERIC Educational Resources Information Center
Yu, Jiang; Williford, William R.
1991-01-01
Used sample from New York State Driver License File to mathematically extend dimension of file so that data purging procedure exerts minimum influence on calculation of drinking-driving recidivism. Examined impact of dimension of data on recidivism rate and mathematically extended file until impact of data dimension was minimum. Calculated New…
Zheng, Hanrong; Fang, Zujie; Wang, Zhaoyong; Lu, Bin; Cao, Yulong; Ye, Qing; Qu, Ronghui; Cai, Haiwen
2018-01-31
It is a basic task in Brillouin distributed fiber sensors to extract the peak frequency of the scattering spectrum, since the peak frequency shift gives information on the fiber temperature and strain changes. Because of high-level noise, quadratic fitting is often used in the data processing. Formulas of the dependence of the minimum detectable Brillouin frequency shift (BFS) on the signal-to-noise ratio (SNR) and frequency step have been presented in publications, but in different expressions. A detailed deduction of new formulas of BFS variance and its average is given in this paper, showing especially their dependences on the data range used in fitting, including its length and its center respective to the real spectral peak. The theoretical analyses are experimentally verified. It is shown that the center of the data range has a direct impact on the accuracy of the extracted BFS. We propose and demonstrate an iterative fitting method to mitigate such effects and improve the accuracy of BFS measurement. The different expressions of BFS variances presented in previous papers are explained and discussed.
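As a minimal illustration of the peak-extraction step and of the iterative re-centering idea described above, the sketch below fits a quadratic to the samples around the current peak estimate, takes the vertex as the BFS, and repeats with the fitting window re-centered on that estimate. The Lorentzian test spectrum, 2 MHz frequency step, window half-width, and noise level are illustrative assumptions.

```python
import numpy as np

def peak_from_quadratic(freq, gain):
    """Vertex of a quadratic fitted to (freq, gain): f_peak = -b / (2 a)."""
    a, b, _ = np.polyfit(freq, gain, 2)
    return -b / (2.0 * a)

def iterative_bfs(freq, gain, half_width, n_iter=5):
    """Re-center the fitting window on the current peak estimate at each pass,
    so the data range ends up roughly symmetric about the spectral peak."""
    center = freq[np.argmax(gain)]                 # initial guess: largest sample
    for _ in range(n_iter):
        sel = np.abs(freq - center) <= half_width
        center = peak_from_quadratic(freq[sel], gain[sel])
    return center

# Synthetic Brillouin gain spectrum (Lorentzian) with noise, 2 MHz frequency step.
rng = np.random.default_rng(0)
freq = np.arange(10_600.0, 11_000.0, 2.0)          # MHz
true_bfs, width = 10_812.0, 30.0                   # peak frequency and linewidth
gain = 1.0 / (1.0 + ((freq - true_bfs) / (width / 2.0)) ** 2)
gain += 0.05 * rng.standard_normal(freq.size)      # noise level sets the SNR
print(iterative_bfs(freq, gain, half_width=40.0))
```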
Change in mean temperature as a predictor of extreme temperature change in the Asia-Pacific region
NASA Astrophysics Data System (ADS)
Griffiths, G. M.; Chambers, L. E.; Haylock, M. R.; Manton, M. J.; Nicholls, N.; Baek, H.-J.; Choi, Y.; della-Marta, P. M.; Gosai, A.; Iga, N.; Lata, R.; Laurent, V.; Maitrepierre, L.; Nakamigawa, H.; Ouprasitwong, N.; Solofa, D.; Tahani, L.; Thuy, D. T.; Tibig, L.; Trewin, B.; Vediapan, K.; Zhai, P.
2005-08-01
Trends (1961-2003) in daily maximum and minimum temperatures, extremes and variance were found to be spatially coherent across the Asia-Pacific region. The majority of stations exhibited significant trends: increases in mean maximum and mean minimum temperature, decreases in cold nights and cool days, and increases in warm nights. No station showed a significant increase in cold days or cold nights, but a few sites showed significant decreases in hot days and warm nights. Significant decreases were observed in both maximum and minimum temperature standard deviation in China, Korea and some stations in Japan (probably reflecting urbanization effects), but also for some Thailand and coastal Australian sites. The South Pacific convergence zone (SPCZ) region between Fiji and the Solomon Islands showed a significant increase in maximum temperature variability. Correlations between mean temperature and the frequency of extreme temperatures were strongest in the tropical Pacific Ocean from French Polynesia to Papua New Guinea, Malaysia, the Philippines, Thailand and southern Japan. Correlations were weaker at continental or higher latitude locations, which may partly reflect urbanization. For non-urban stations, the dominant distribution change for both maximum and minimum temperature involved a change in the mean, impacting on one or both extremes, with no change in standard deviation. This occurred from French Polynesia to Papua New Guinea (except for maximum temperature changes near the SPCZ), in Malaysia, the Philippines, and several outlying Japanese islands. For urbanized stations the dominant change was a change in the mean and variance, impacting on one or both extremes. This result was particularly evident for minimum temperature. The results presented here, for non-urban tropical and maritime locations in the Asia-Pacific region, support the hypothesis that changes in mean temperature may be used to predict changes in extreme temperatures. At urbanized or higher latitude locations, changes in variance should be incorporated.
Wu, Wenzheng; Ye, Wenli; Wu, Zichao; Geng, Peng; Wang, Yulei; Zhao, Ji
2017-01-01
The success of the 3D-printing process depends upon the proper selection of process parameters. However, the majority of current related studies focus on the influence of process parameters on the mechanical properties of the parts. The influence of process parameters on the shape-memory effect has been little studied. This study used the orthogonal experimental design method to evaluate the influence of the layer thickness H, raster angle θ, deformation temperature Td and recovery temperature Tr on the shape-recovery ratio Rr and maximum shape-recovery rate Vm of 3D-printed polylactic acid (PLA). The order and contribution of every experimental factor on the target index were determined by range analysis and ANOVA, respectively. The experimental results indicated that the recovery temperature exerted the greatest effect with a variance ratio of 416.10, whereas the layer thickness exerted the smallest effect on the shape-recovery ratio with a variance ratio of 4.902. The recovery temperature exerted the most significant effect on the maximum shape-recovery rate with the highest variance ratio of 1049.50, whereas the raster angle exerted the minimum effect with a variance ratio of 27.163. The results showed that the shape-memory effect of 3D-printed PLA parts depended strongly on recovery temperature, and depended more weakly on the deformation temperature and 3D-printing parameters. PMID:28825617
The Impact of Truth Surrogate Variance on Quality Assessment/Assurance in Wind Tunnel Testing
NASA Technical Reports Server (NTRS)
DeLoach, Richard
2016-01-01
Minimum data volume requirements for wind tunnel testing are reviewed and shown to depend on error tolerance, response model complexity, random error variance in the measurement environment, and maximum acceptable levels of inference error risk. Distinctions are made between such related concepts as quality assurance and quality assessment in response surface modeling, as well as between precision and accuracy. Earlier research on the scaling of wind tunnel tests is extended to account for variance in the truth surrogates used at confirmation sites in the design space to validate proposed response models. A model adequacy metric is presented that represents the fraction of the design space within which model predictions can be expected to satisfy prescribed quality specifications. The impact of inference error on the assessment of response model residuals is reviewed. The number of sites where reasonably well-fitted response models actually predict inadequately is shown to be considerably less than the number of sites where residuals are out of tolerance. The significance of such inference error effects on common response model assessment strategies is examined.
Within-Tunnel Variations in Pressure Data for Three Transonic Wind Tunnels
NASA Technical Reports Server (NTRS)
DeLoach, Richard
2014-01-01
This paper compares the results of pressure measurements made on the same test article with the same test matrix in three transonic wind tunnels. A comparison is presented of the unexplained variance associated with polar replicates acquired in each tunnel. The impact of a significant component of systematic (not random) unexplained variance is reviewed, and the results of analyses of variance are presented to assess the degree of significant systematic error in these representative wind tunnel tests. Total uncertainty estimates are reported for 140 samples of pressure data, quantifying the effects of within-polar random errors and between-polar systematic bias errors.
Repeat sample intraocular pressure variance in induced and naturally ocular hypertensive monkeys.
Dawson, William W; Dawson, Judyth C; Hope, George M; Brooks, Dennis E; Percicot, Christine L
2005-12-01
To compare the repeat-sample mean variance of laser-induced ocular hypertension (OH) in rhesus monkeys with the repeat-sample mean variance of natural OH in age-range-matched monkeys of similar and dissimilar pedigrees. Multiple monocular, retrospective intraocular pressure (IOP) measures were recorded repeatedly during a short sampling interval (SSI, 1-5 months) and a long sampling interval (LSI, 6-36 months). There were 5-13 eyes in each SSI and LSI subgroup. Each interval contained subgroups of Florida monkeys with natural hypertension (NHT), Florida monkeys with induced hypertension (IHT1), unrelated (Strasbourg, France) induced hypertensives (IHT2), and Florida age-range-matched controls (C). Repeat-sample individual variance means and related IOPs were analyzed by parametric analysis of variance (ANOV), and the results were compared with a non-parametric Kruskal-Wallis ANOV. As designed, all group intraocular pressure distributions were significantly different (P < or = 0.009) except for the two (Florida/Strasbourg) induced OH groups. A parametric 2 x 4 design ANOV for mean variance showed large significant effects due to treatment group and sampling interval. Similar results were produced by the nonparametric ANOV. The induced OH sample variance mean (LSI) was 43 times the natural OH sample variance mean; the corresponding ratio for the SSI was 12 times. Laser-induced ocular hypertension in rhesus monkeys thus produces much larger IOP repeat-sample variance means than controls and natural OH.
Multiple sensitive estimation and optimal sample size allocation in the item sum technique.
Perri, Pier Francesco; Rueda García, María Del Mar; Cobo Rodríguez, Beatriz
2018-01-01
For surveys of sensitive issues in life sciences, statistical procedures can be used to reduce nonresponse and social desirability response bias. Both of these phenomena provoke nonsampling errors that are difficult to deal with and can seriously flaw the validity of the analyses. The item sum technique (IST) is a very recent indirect questioning method derived from the item count technique that seeks to procure more reliable responses on quantitative items than direct questioning while preserving respondents' anonymity. This article addresses two important questions concerning the IST: (i) its implementation when two or more sensitive variables are investigated and efficient estimates of their unknown population means are required; (ii) the determination of the optimal sample size to achieve minimum variance estimates. These aspects are of great relevance for survey practitioners engaged in sensitive research and, to the best of our knowledge, have not been studied so far. In this article, theoretical results for multiple estimation and optimal allocation are obtained under a generic sampling design and then particularized to simple random sampling and stratified sampling designs. Theoretical considerations are integrated with a number of simulation studies based on data from two real surveys and conducted to ascertain the efficiency gain derived from optimal allocation in different situations. One of the surveys concerns cannabis consumption among university students. Our findings highlight some methodological advances that can be obtained in life sciences IST surveys when optimal allocation is achieved. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
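The minimum-variance allocation the abstract refers to can be illustrated with the textbook Neyman allocation for a stratified design: stratum sample sizes proportional to the stratum size times the within-stratum standard deviation minimize the variance of the estimated mean for a fixed total sample size. The sketch below shows only that generic single-variable allocation; the paper's multivariate, IST-specific optimum is not reproduced, and the stratum figures are invented.

```python
import numpy as np

def neyman_allocation(N_h, S_h, n_total):
    """Neyman (minimum-variance) allocation of a fixed total sample size:
    n_h proportional to N_h * S_h for stratum sizes N_h and std devs S_h."""
    N_h, S_h = np.asarray(N_h, float), np.asarray(S_h, float)
    weights = N_h * S_h
    return np.round(n_total * weights / weights.sum()).astype(int)

# Illustrative strata (e.g., faculties in a student survey) and guessed std devs.
print(neyman_allocation(N_h=[5000, 3000, 2000], S_h=[12.0, 8.0, 20.0], n_total=500))
```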
Hu, Jianhua; Wright, Fred A
2007-03-01
The identification of the genes that are differentially expressed in two-sample microarray experiments remains a difficult problem when the number of arrays is very small. We discuss the implications of using ordinary t-statistics and examine other commonly used variants. For oligonucleotide arrays with multiple probes per gene, we introduce a simple model relating the mean and variance of expression, possibly with gene-specific random effects. Parameter estimates from the model have natural shrinkage properties that guard against inappropriately small variance estimates, and the model is used to obtain a differential expression statistic. A limiting value to the positive false discovery rate (pFDR) for ordinary t-tests provides motivation for our use of the data structure to improve variance estimates. Our approach performs well compared to other proposed approaches in terms of the false discovery rate.
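The paper's specific mean-variance model is not reproduced here, but the underlying shrinkage idea, guarding against inappropriately small gene-wise variance estimates by pulling them toward a common prior value before forming a t-like statistic, can be sketched as below (a limma-style moderated t). The prior variance, prior degrees of freedom, and toy data are assumptions used only for illustration.

```python
import numpy as np

def moderated_t(x, y, s0_sq=0.04, d0=4.0):
    """Two-sample t-like statistic with gene-wise variances shrunk toward a
    prior value s0_sq carrying d0 prior degrees of freedom (both assumed)."""
    n1, n2 = x.shape[1], y.shape[1]
    d_g = n1 + n2 - 2
    # Pooled gene-wise sample variances.
    s_sq = ((n1 - 1) * x.var(axis=1, ddof=1) + (n2 - 1) * y.var(axis=1, ddof=1)) / d_g
    # Shrunken variance: weighted average of prior and observed variance.
    s_tilde_sq = (d0 * s0_sq + d_g * s_sq) / (d0 + d_g)
    se = np.sqrt(s_tilde_sq * (1.0 / n1 + 1.0 / n2))
    return (x.mean(axis=1) - y.mean(axis=1)) / se

# Toy data: 1000 genes, 3 arrays per group, 50 genes truly differentially expressed.
rng = np.random.default_rng(0)
x = rng.normal(0.0, 0.2, size=(1000, 3))
y = rng.normal(0.0, 0.2, size=(1000, 3))
y[:50] += 0.5
print(np.sum(np.abs(moderated_t(x, y)) > 4))       # genes flagged at a crude cutoff
```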
Sampling design optimisation for rainfall prediction using a non-stationary geostatistical model
NASA Astrophysics Data System (ADS)
Wadoux, Alexandre M. J.-C.; Brus, Dick J.; Rico-Ramirez, Miguel A.; Heuvelink, Gerard B. M.
2017-09-01
The accuracy of spatial predictions of rainfall by merging rain-gauge and radar data is partly determined by the sampling design of the rain-gauge network. Optimising the locations of the rain-gauges may increase the accuracy of the predictions. Existing spatial sampling design optimisation methods are based on minimisation of the spatially averaged prediction error variance under the assumption of intrinsic stationarity. Over the past years, substantial progress has been made to deal with non-stationary spatial processes in kriging. Various well-documented geostatistical models relax the assumption of stationarity in the mean, while recent studies show the importance of considering non-stationarity in the variance for environmental processes occurring in complex landscapes. We optimised the sampling locations of rain-gauges using an extension of the Kriging with External Drift (KED) model for prediction of rainfall fields. The model incorporates both non-stationarity in the mean and in the variance, which are modelled as functions of external covariates such as radar imagery, distance to radar station and radar beam blockage. Spatial predictions are made repeatedly over time, each time recalibrating the model. The space-time averaged KED variance was minimised by Spatial Simulated Annealing (SSA). The methodology was tested using a case study predicting daily rainfall in the north of England for a one-year period. Results show that (i) the proposed non-stationary variance model outperforms the stationary variance model, and (ii) a small but significant decrease of the rainfall prediction error variance is obtained with the optimised rain-gauge network. In particular, it pays off to place rain-gauges at locations where the radar imagery is inaccurate, while keeping the distribution over the study area sufficiently uniform.
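The optimisation loop itself is ordinary spatial simulated annealing: perturb one gauge location at a time, always accept improvements, accept deteriorations with a temperature-controlled probability, and keep the best design visited. The sketch below shows that loop with a simple mean-squared-distance-to-the-nearest-gauge criterion standing in for the space-time averaged KED variance, which would require the full non-stationary geostatistical model; the grid, gauge count, and cooling schedule are assumptions.

```python
import numpy as np

def ssa_optimise(candidates, n_gauges, objective, n_iter=5000, t0=1.0, seed=0):
    """Spatial simulated annealing over a set of candidate gauge locations."""
    rng = np.random.default_rng(seed)
    design = rng.choice(len(candidates), n_gauges, replace=False)
    current_val = objective(candidates[design])
    best, best_val = design.copy(), current_val
    for it in range(n_iter):
        temp = t0 * (1.0 - it / n_iter)                                # linear cooling
        trial = design.copy()
        trial[rng.integers(n_gauges)] = rng.integers(len(candidates))  # move one gauge
        val = objective(candidates[trial])
        accept = val < current_val or rng.random() < np.exp(-(val - current_val) / max(temp, 1e-9))
        if accept:
            design, current_val = trial, val
            if val < best_val:
                best, best_val = trial.copy(), val
    return candidates[best], best_val

# Placeholder objective: mean squared distance from grid nodes to the nearest gauge.
grid = np.array([[x, y] for x in np.linspace(0, 100, 25) for y in np.linspace(0, 100, 25)])
def mean_sq_nearest_distance(gauges):
    d = np.linalg.norm(grid[:, None, :] - gauges[None, :, :], axis=2)
    return np.mean(d.min(axis=1) ** 2)

best_design, value = ssa_optimise(grid, n_gauges=15, objective=mean_sq_nearest_distance)
```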
A note on the kappa statistic for clustered dichotomous data.
Zhou, Ming; Yang, Zhao
2014-06-30
The kappa statistic is widely used to assess the agreement between two raters. Motivated by a simulation-based cluster bootstrap method to calculate the variance of the kappa statistic for clustered physician-patient dichotomous data, we investigate its special correlation structure and develop a new simple and efficient data generation algorithm. For the clustered physician-patient dichotomous data, based on the delta method and its special covariance structure, we propose a semi-parametric variance estimator for the kappa statistic. An extensive Monte Carlo simulation study is performed to evaluate the performance of the new proposal and five existing methods with respect to the empirical coverage probability, root-mean-square error, and average width of the 95% confidence interval for the kappa statistic. The variance estimator ignoring the dependence within a cluster is generally inappropriate, and the variance estimators from the new proposal, bootstrap-based methods, and the sampling-based delta method perform reasonably well for at least a moderately large number of clusters (e.g., the number of clusters K ⩾50). The new proposal and sampling-based delta method provide convenient tools for efficient computations and non-simulation-based alternatives to the existing bootstrap-based methods. Moreover, the new proposal has acceptable performance even when the number of clusters is as small as K = 25. To illustrate the practical application of all the methods, one psychiatric research dataset and two simulated clustered physician-patient dichotomous datasets are analyzed. Copyright © 2014 John Wiley & Sons, Ltd.
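The simulation-based cluster bootstrap that motivated the proposal is straightforward to sketch: resample whole clusters with replacement, recompute kappa on each resample, and take the variance across resamples. The code below shows that baseline for binary ratings; the semi-parametric delta-method estimator proposed in the paper is not reproduced, and the physician-patient data are synthetic.

```python
import numpy as np

def cohen_kappa(r1, r2):
    """Cohen's kappa for two dichotomous ratings (arrays of 0/1)."""
    r1, r2 = np.asarray(r1), np.asarray(r2)
    po = np.mean(r1 == r2)                          # observed agreement
    p1, p2 = r1.mean(), r2.mean()
    pe = p1 * p2 + (1 - p1) * (1 - p2)              # chance agreement
    return (po - pe) / (1 - pe)

def cluster_bootstrap_kappa_var(clusters, n_boot=2000, seed=0):
    """Variance of kappa obtained by resampling whole clusters with replacement.
    clusters : list of (rating1, rating2) arrays, one pair per cluster."""
    rng = np.random.default_rng(seed)
    stats = []
    for _ in range(n_boot):
        idx = rng.integers(len(clusters), size=len(clusters))
        r1 = np.concatenate([clusters[i][0] for i in idx])
        r2 = np.concatenate([clusters[i][1] for i in idx])
        stats.append(cohen_kappa(r1, r2))
    return np.var(stats, ddof=1)

# Toy data: 50 physicians (clusters), each with a handful of paired ratings.
rng = np.random.default_rng(1)
clusters = []
for _ in range(50):
    m = rng.integers(3, 8)
    first = rng.integers(0, 2, m)
    second = np.where(rng.random(m) < 0.8, first, 1 - first)   # ~80% agreement
    clusters.append((first, second))
print(cluster_bootstrap_kappa_var(clusters))
```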
Estimation of stable boundary-layer height using variance processing of backscatter lidar data
NASA Astrophysics Data System (ADS)
Saeed, Umar; Rocadenbosch, Francesc
2017-04-01
The stable boundary layer (SBL) is one of the most complex and least understood topics in atmospheric science. The type and height of the SBL are important parameters for several applications, such as understanding the formation of haze and fog and the accuracy of chemical and pollutant dispersion models [1]. This work addresses nocturnal Stable Boundary-Layer Height (SBLH) estimation using variance processing of attenuated backscatter lidar measurements, together with its principles and limitations. It is shown that temporal and spatial variance profiles of the attenuated backscatter signal are related to the stratification of aerosols in the SBL. A minimum variance SBLH estimator using local minima in the variance profiles of backscatter lidar signals is introduced. The method is validated using data from the HD(CP)2 Observational Prototype Experiment (HOPE) campaign at Jülich, Germany [2], under different atmospheric conditions. This work has received funding from the European Union Seventh Framework Programme, FP7 People, ITN Marie Curie Actions Programme (2012-2016) in the frame of ITaRS project (GA 289923), H2020 programme under ACTRIS-2 project (GA 654109), the Spanish Ministry of Economy and Competitiveness - European Regional Development Funds under TEC2015-63832-P project, and from the Generalitat de Catalunya (Grup de Recerca Consolidat) 2014-SGR-583. [1] R. B. Stull, An Introduction to Boundary Layer Meteorology, chapter 12, Stable Boundary Layer, pp. 499-543, Springer, Netherlands, 1988. [2] U. Löhnert, J. H. Schween, C. Acquistapace, K. Ebell, M. Maahn, M. Barrera-Verdejo, A. Hirsikko, B. Bohn, A. Knaps, E. O'Connor, C. Simmer, A. Wahner, and S. Crewell, "JOYCE: Jülich Observatory for Cloud Evolution," Bull. Amer. Meteor. Soc., vol. 96, no. 7, pp. 1157-1174, 2015.
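The minimum-variance idea can be sketched directly: over a short averaging window, compute the temporal variance of the attenuated backscatter at each range gate and take the most pronounced local minimum of that profile as the SBLH. The code below is such a sketch on synthetic profiles; the minimum-height cut, the local-minimum search width, and the noise structure are assumptions and do not reproduce the HOPE processing chain.

```python
import numpy as np
from scipy.signal import argrelmin

def sblh_from_variance(backscatter, heights, min_height=100.0):
    """SBLH as the height of the most pronounced local minimum in the temporal
    variance profile of attenuated backscatter.

    backscatter : array (n_times, n_range_gates) over a short averaging window
    heights     : range-gate heights in metres
    """
    var_profile = backscatter.var(axis=0)               # temporal variance per gate
    candidates = argrelmin(var_profile, order=3)[0]     # indices of local minima
    candidates = candidates[heights[candidates] >= min_height]
    if candidates.size == 0:
        return np.nan
    best = candidates[np.argmin(var_profile[candidates])]
    return heights[best]

# Synthetic example: variance dips near the top of the SBL (~350 m) and is larger
# inside the stratified SBL below and in the residual layer aloft.
rng = np.random.default_rng(0)
heights = np.arange(30.0, 1500.0, 15.0)
mean_profile = np.exp(-heights / 600.0)
fluct = 0.15 - 0.10 * np.exp(-((heights - 350.0) / 80.0) ** 2)
backscatter = mean_profile + fluct * rng.standard_normal((120, heights.size))
print(sblh_from_variance(backscatter, heights))
```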
ERIC Educational Resources Information Center
Thompson, Bruce
The relationship between analysis of variance (ANOVA) methods and their analogs (analysis of covariance and multiple analyses of variance and covariance--collectively referred to as OVA methods) and the more general analytic case is explored. A small heuristic data set is used, with a hypothetical sample of 20 subjects, randomly assigned to five…
Repeated measurements of mite and pet allergen levels in house dust over a time period of 8 years.
Antens, C J M; Oldenwening, M; Wolse, A; Gehring, U; Smit, H A; Aalberse, R C; Kerkhof, M; Gerritsen, J; de Jongste, J C; Brunekreef, B
2006-12-01
Studies of the association between indoor allergen exposure and the development of allergic diseases have often measured allergen exposure at one point in time. We investigated the variability of house dust mite (Der p 1, Der f 1) and cat (Fel d 1) allergen in Dutch homes over a period of 8 years. Data were obtained in the Dutch PIAMA birth cohort study. Dust from the child's mattress, the parents' mattress and the living room floor was collected at four points in time, when the child was 3 months, 4, 6 and 8 years old. Dust samples were analysed for Der p 1, Der f 1 and Fel d 1 by sandwich enzyme immuno assay. Mite allergen concentrations for the child's mattress, the parents' mattress and the living room floor were moderately correlated between time-points. Agreement was better for cat allergen. For Der p 1 and Der f 1 on the child's mattress, the within-home variance was close to or smaller than the between-home variance in most cases. For Fel d 1, the within-home variance was almost always smaller than the between-home variance. Results were similar for allergen levels expressed per gram of dust and allergen levels expressed per square metre of the sampled surface. Variance ratios were smaller when samples were taken at shorter time intervals than at longer time intervals. Over a period of 4 years, mite and cat allergens measured in house dust are sufficiently stable to use single measurements with confidence in epidemiological studies. The within-home variance was larger when samples were taken 8 years apart so that over such long periods, repetition of sampling is recommended.
ERIC Educational Resources Information Center
Luh, Wei-Ming; Guo, Jiin-Huarng
2011-01-01
Sample size determination is an important issue in planning research. In the context of one-way fixed-effect analysis of variance, the conventional sample size formula cannot be applied for the heterogeneous variance cases. This study discusses the sample size requirement for the Welch test in the one-way fixed-effect analysis of variance with…
Errors in radial velocity variance from Doppler wind lidar
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wang, H.; Barthelmie, R. J.; Doubrawa, P.
A high-fidelity lidar turbulence measurement technique relies on accurate estimates of radial velocity variance that are subject to both systematic and random errors determined by the autocorrelation function of radial velocity, the sampling rate, and the sampling duration. Our paper quantifies the effect of the volumetric averaging in lidar radial velocity measurements on the autocorrelation function and the dependence of the systematic and random errors on the sampling duration, using both statistically simulated and observed data. For current-generation scanning lidars and sampling durations of about 30 min and longer, during which the stationarity assumption is valid for atmospheric flows, the systematic error is negligible but the random error exceeds about 10%.
Schroeder, A A; Ford, N L; Coil, J M
2017-03-01
To determine whether post space preparation deviated from the root canal preparation in canals filled with Thermafil, GuttaCore or warm vertically compacted gutta-percha. Forty-two extracted human permanent maxillary lateral incisors were decoronated, and their root canals instrumented using a standardized protocol. Samples were divided into three groups and filled with Thermafil (Dentsply Tulsa Dental Specialties, Johnson City, TN, USA), GuttaCore (Dentsply Tulsa Dental Specialties) or warm vertically compacted gutta-percha, before post space preparation was performed with a GT Post drill (Dentsply Tulsa Dental Specialties). Teeth were scanned using micro-computed tomography after root filling and again after post space preparation. Scans were examined for number of samples with post space deviation, linear deviation of post space preparation and minimum root thickness before and after post space preparation. Parametric data were analysed with one-way analysis of variance (anova) or one-tailed paired Student's t-tests, whilst nonparametric data were analysed with Fisher's exact test. Deviation occurred in eight of forty-two teeth (19%), seven of fourteen from the Thermafil group (50%), one of fourteen from the GuttaCore group (7%), and none from the gutta-percha group. Deviation occurred significantly more often in the Thermafil group than in each of the other two groups (P < 0.05). Linear deviation of post space preparation was greater in the Thermafil group than in both of the other groups and was significantly greater than that of the gutta-percha group (P < 0.05). Minimum root thickness before post space preparation was significantly greater than it was after post space preparation for all groups (P < 0.01). The differences between the Thermafil, GuttaCore and gutta-percha groups in the number of samples with post space deviation and in linear deviation of post space preparation were associated with the presence or absence of a carrier as well as the different carrier materials. © 2016 International Endodontic Journal. Published by John Wiley & Sons Ltd.
Berger, Philip; Messner, Michael J; Crosby, Jake; Vacs Renwick, Deborah; Heinrich, Austin
2018-05-01
Spore reduction can be used as a surrogate measure of Cryptosporidium natural filtration efficiency. Estimates of log10 (log) reduction were derived from spore measurements in paired surface and well water samples in Casper, Wyoming, and Kearney, Nebraska. We found that these data were suitable for testing the hypothesis (H0) that the average reduction at each site was 2 log or less, using a one-sided Student's t-test. After establishing data quality objectives for the test (expressed as tolerable Type I and Type II error rates), we evaluated the test's performance as a function of the (a) true log reduction, (b) number of paired samples assayed, and (c) variance of observed log reductions. We found that 36 paired spore samples are sufficient to achieve the objectives over a wide range of variance, including the variances observed in the two data sets. We also explored the feasibility of using smaller numbers of paired spore samples to supplement bioparticle counts for screening purposes in alluvial aquifers, to differentiate wells with large volume surface water induced recharge from wells with negligible surface water induced recharge. With key assumptions, we propose a normal statistical test of the same hypothesis (H0), but with different performance objectives. As few as six paired spore samples appear adequate as a screening metric to supplement bioparticle counts to differentiate wells in alluvial aquifers with large volume surface water induced recharge. For the case when all available information (including failure to reject H0 based on the limited paired spore data) leads to the conclusion that wells have large surface water induced recharge, we recommend further evaluation using additional paired biweekly spore samples. Published by Elsevier GmbH.
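The core test is a one-sided, one-sample Student's t-test of H0: mean log reduction ≤ 2 applied to the paired-sample log reductions. A minimal sketch is below (scipy ≥ 1.6 for the alternative argument); the synthetic surface and well concentrations are assumptions, and the data quality objective machinery behind the 36-sample and six-sample recommendations is not reproduced.

```python
import numpy as np
from scipy import stats

def test_log_reduction(log_reductions, threshold=2.0, alpha=0.05):
    """One-sided one-sample t-test of H0: mean log reduction <= threshold
    against H1: mean log reduction > threshold."""
    t_stat, p_value = stats.ttest_1samp(log_reductions, popmean=threshold,
                                        alternative='greater')
    return t_stat, p_value, p_value < alpha

# Synthetic example: 36 paired samples, true mean reduction 2.5 log, sd 1.0 log.
rng = np.random.default_rng(0)
surface = rng.normal(3.0, 0.6, 36)                  # log10 spores/L, surface water
well = surface - rng.normal(2.5, 1.0, 36)           # log10 spores/L, paired wells
print(test_log_reduction(surface - well))
```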
NASA Astrophysics Data System (ADS)
Almosallam, Ibrahim A.; Jarvis, Matt J.; Roberts, Stephen J.
2016-10-01
The next generation of cosmology experiments will be required to use photometric redshifts rather than spectroscopic redshifts. Obtaining accurate and well-characterized photometric redshift distributions is therefore critical for Euclid, the Large Synoptic Survey Telescope and the Square Kilometre Array. However, determining accurate variance predictions alongside single point estimates is crucial, as they can be used to optimize the sample of galaxies for the specific experiment (e.g. weak lensing, baryon acoustic oscillations, supernovae), trading off between completeness and reliability in the galaxy sample. The various sources of uncertainty in measurements of the photometry and redshifts put a lower bound on the accuracy that any model can hope to achieve. The intrinsic uncertainty associated with estimates is often non-uniform and input-dependent, commonly known in statistics as heteroscedastic noise. However, existing approaches are susceptible to outliers and do not take into account variance induced by non-uniform data density and in most cases require manual tuning of many parameters. In this paper, we present a Bayesian machine learning approach that jointly optimizes the model with respect to both the predictive mean and variance, which we refer to as Gaussian processes for photometric redshifts (GPz). The predictive variance of the model takes into account both the variance due to data density and photometric noise. Using the Sloan Digital Sky Survey (SDSS) DR12 data, we show that our approach substantially outperforms other machine learning methods for photo-z estimation and their associated variance, such as TPZ and ANNZ2. We provide MATLAB and Python implementations that are available to download at https://github.com/OxfordML/GPz.
The evaluation of alternate methodologies for land cover classification in an urbanizing area
NASA Technical Reports Server (NTRS)
Smekofski, R. M.
1981-01-01
The usefulness of LANDSAT in classifying land cover and in identifying and classifying land use change was investigated using an urbanizing area as the study area. The question of which technique was best for classification was the primary focus of the study. The many computer-assisted techniques available to analyze LANDSAT data were evaluated. Techniques of statistical training (polygons from CRT, unsupervised clustering, polygons from digitizer and binary masks) were tested with minimum distance to the mean, maximum likelihood and canonical analysis with minimum distance to the mean classifiers. The twelve output images were compared to photointerpreted samples, ground verified samples and a current land use data base. Results indicate that for a reconnaissance inventory, the unsupervised training with the canonical analysis-minimum distance classifier is the most efficient. If more detailed ground truth and ground verification are available, the polygons-from-digitizer training with the canonical analysis-minimum distance classifier is more accurate.
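The minimum distance to the mean classifier at the heart of these comparisons is simple to sketch: compute a mean spectral signature per class from the training pixels, then assign every scene pixel to the class with the nearest mean. The code below shows only that step on synthetic four-band data; the canonical analysis transformation, maximum likelihood classifier, and LANDSAT-specific preprocessing are not included.

```python
import numpy as np

def train_class_means(training_pixels, labels):
    """Mean spectral signature per class from training samples.
    training_pixels : (n_pixels, n_bands); labels : (n_pixels,)"""
    classes = np.unique(labels)
    means = np.vstack([training_pixels[labels == c].mean(axis=0) for c in classes])
    return classes, means

def classify_min_distance(pixels, classes, means):
    """Assign each pixel to the class whose mean is nearest (Euclidean distance)."""
    d = np.linalg.norm(pixels[:, None, :] - means[None, :, :], axis=2)
    return classes[np.argmin(d, axis=1)]

# Toy 4-band example with three land-cover classes.
rng = np.random.default_rng(0)
class_centres = np.array([[40, 60, 30, 90], [70, 50, 80, 20], [20, 30, 25, 40]], float)
train_labels = rng.integers(0, 3, 300)
train_pixels = class_centres[train_labels] + rng.normal(0, 5, (300, 4))
classes, means = train_class_means(train_pixels, train_labels)
scene = class_centres[rng.integers(0, 3, 1000)] + rng.normal(0, 5, (1000, 4))
predicted = classify_min_distance(scene, classes, means)
```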
Santin-Janin, Hugues; Hugueny, Bernard; Aubry, Philippe; Fouchet, David; Gimenez, Olivier; Pontier, Dominique
2014-01-01
Data collected to inform time variations in natural population size are tainted by sampling error. Ignoring sampling error in population dynamics models induces bias in parameter estimators, e.g., density-dependence. In particular, when sampling errors are independent among populations, the classical estimator of the synchrony strength (zero-lag correlation) is biased downward. However, this bias is rarely taken into account in synchrony studies although it may lead to overemphasizing the role of intrinsic factors (e.g., dispersal) with respect to extrinsic factors (the Moran effect) in generating population synchrony as well as to underestimating the extinction risk of a metapopulation. The aim of this paper was first to illustrate the extent of the bias that can be encountered in empirical studies when sampling error is neglected. Second, we presented a space-state modelling approach that explicitly accounts for sampling error when quantifying population synchrony. Third, we exemplify our approach with datasets for which sampling variance (i) has been previously estimated, and (ii) has to be jointly estimated with population synchrony. Finally, we compared our results to those of a standard approach neglecting sampling variance. We showed that ignoring sampling variance can mask a synchrony pattern whatever its true value and that the common practice of averaging few replicates of population size estimates poorly performed at decreasing the bias of the classical estimator of the synchrony strength. The state-space model used in this study provides a flexible way of accurately quantifying the strength of synchrony patterns from most population size data encountered in field studies, including over-dispersed count data. We provided a user-friendly R-program and a tutorial example to encourage further studies aiming at quantifying the strength of population synchrony to account for uncertainty in population size estimates.
Pain Behavior in Rheumatoid Arthritis Patients: Identification of Pain Behavior Subgroups
Waters, Sandra J.; Riordan, Paul A.; Keefe, Francis J.; Lefebvre, John C.
2008-01-01
This study used Ward’s minimum variance hierarchical cluster analysis to identify homogeneous subgroups of rheumatoid arthritis patients suffering from chronic pain who exhibited similar pain behavior patterns during a videotaped behavior sample. Ninety-two rheumatoid arthritis patients were divided into two samples. Six motor pain behaviors were examined: guarding, bracing, active rubbing, rigidity, grimacing, and sighing. The cluster analysis procedure identified four similar subgroups in Sample 1 and Sample 2. The first subgroup exhibited low levels of all pain behaviors. The second subgroup exhibited a high level of guarding and low levels of other pain behaviors. The third subgroup exhibited high levels of guarding and rigidity and low levels of other pain behaviors. The fourth subgroup exhibited high levels of guarding and active rubbing and low levels of other pain behaviors. Sample 1 contained a fifth subgroup that exhibited a high level of active rubbing and low levels of other pain measures. The results of this study suggest that there are homogeneous subgroups within rheumatoid arthritis patient populations who differ in the motor pain behaviors they exhibit. PMID:18358682
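Ward's minimum variance method merges, at every step, the two clusters whose union causes the smallest increase in the total within-cluster sum of squares. A generic sketch with scipy on synthetic standardized behavior scores is given below; the number of patients, the four-cluster cut, and the scores themselves are illustrative assumptions rather than the study data.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Six standardized motor pain behavior scores per patient (synthetic).
rng = np.random.default_rng(0)
behaviors = ["guarding", "bracing", "active_rubbing", "rigidity", "grimacing", "sighing"]
scores = rng.standard_normal((46, len(behaviors)))       # one row per patient

# Ward linkage minimizes the increase in within-cluster variance at each merge.
Z = linkage(scores, method="ward")
subgroup = fcluster(Z, t=4, criterion="maxclust")        # cut the tree into 4 subgroups

# Mean behavior profile per subgroup, for interpretation of the clusters.
for g in np.unique(subgroup):
    print(g, np.round(scores[subgroup == g].mean(axis=0), 2))
```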
Visscher, Peter M; Goddard, Michael E
2015-01-01
Heritability is a population parameter of importance in evolution, plant and animal breeding, and human medical genetics. It can be estimated using pedigree designs and, more recently, using relationships estimated from markers. We derive the sampling variance of the estimate of heritability for a wide range of experimental designs, assuming that estimation is by maximum likelihood and that the resemblance between relatives is solely due to additive genetic variation. We show that well-known results for balanced designs are special cases of a more general unified framework. For pedigree designs, the sampling variance is inversely proportional to the variance of relationship in the pedigree and it is proportional to 1/N, whereas for population samples it is approximately proportional to 1/N², where N is the sample size. Variation in relatedness is a key parameter in the quantification of the sampling variance of heritability. Consequently, the sampling variance is high for populations with large recent effective population size (e.g., humans) because this causes low variation in relationship. However, even using human population samples, low sampling variance is possible with high N. Copyright © 2015 by the Genetics Society of America.
Sample size calculation for studies with grouped survival data.
Li, Zhiguo; Wang, Xiaofei; Wu, Yuan; Owzar, Kouros
2018-06-10
Grouped survival data arise often in studies where the disease status is assessed at regular visits to clinic. The time to the event of interest can only be determined to be between two adjacent visits or is right censored at one visit. In data analysis, replacing the survival time with the endpoint or midpoint of the grouping interval leads to biased estimators of the effect size in group comparisons. Prentice and Gloeckler developed a maximum likelihood estimator for the proportional hazards model with grouped survival data and the method has been widely applied. Previous work on sample size calculation for designing studies with grouped data is based on either the exponential distribution assumption or the approximation of variance under the alternative with variance under the null. Motivated by studies in HIV trials, cancer trials and in vitro experiments to study drug toxicity, we develop a sample size formula for studies with grouped survival endpoints that use the method of Prentice and Gloeckler for comparing two arms under the proportional hazards assumption. We do not impose any distributional assumptions, nor do we use any approximation of variance of the test statistic. The sample size formula only requires estimates of the hazard ratio and survival probabilities of the event time of interest and the censoring time at the endpoints of the grouping intervals for one of the two arms. The formula is shown to perform well in a simulation study and its application is illustrated in the three motivating examples. Copyright © 2018 John Wiley & Sons, Ltd.
Hansen, John P
2003-01-01
Healthcare quality improvement professionals need to understand and use inferential statistics to interpret sample data from their organizations. In quality improvement and healthcare research studies all the data from a population often are not available, so investigators take samples and make inferences about the population by using inferential statistics. This three-part series will give readers an understanding of the concepts of inferential statistics as well as the specific tools for calculating confidence intervals for samples of data. This article, Part 1, presents basic information about data including a classification system that describes the four major types of variables: continuous quantitative variable, discrete quantitative variable, ordinal categorical variable (including the binomial variable), and nominal categorical variable. A histogram is a graph that displays the frequency distribution for a continuous variable. The article also demonstrates how to calculate the mean, median, standard deviation, and variance for a continuous variable.
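A minimal example of the summary statistics the article walks through for a continuous quantitative variable is given below; the wait-time values are invented for illustration.

```python
import numpy as np

# A continuous quantitative variable, e.g., 25 clinic wait times in minutes.
wait_times = np.array([12.0, 15.5, 9.8, 22.1, 18.4, 11.2, 14.9, 30.3, 16.7, 13.5,
                       20.0, 10.4, 17.8, 25.6, 12.9, 19.3, 15.0, 28.7, 11.8, 14.2,
                       21.5, 13.1, 16.0, 23.9, 18.0])

print("mean:", wait_times.mean())
print("median:", np.median(wait_times))
print("sample variance:", wait_times.var(ddof=1))        # divisor n - 1
print("standard deviation:", wait_times.std(ddof=1))
counts, bin_edges = np.histogram(wait_times, bins=5)     # frequency distribution
print(dict(zip(np.round(bin_edges[:-1], 1), counts)))    # histogram bin counts
```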
NASA Astrophysics Data System (ADS)
Bongale, Arunkumar M.; Kumar, Satish; Sachit, T. S.; Jadhav, Priya
2018-03-01
Studies on the wear properties of aluminium-based hybrid nano composite materials, processed through the powder metallurgy technique, are reported in the present study. Silicon carbide nano particles and E-glass fibre are reinforced in a pure aluminium matrix to fabricate hybrid nano composite material samples. Pin-on-disc wear testing equipment is used to evaluate the dry sliding wear properties of the composite samples. The tests were conducted following Taguchi's Design of Experiments method. Signal-to-noise ratio analysis and analysis of variance are carried out on the test data to determine the influence of the test parameters on the wear rate. Scanning electron microscopic analysis and energy dispersive X-ray analysis are conducted on the worn surfaces to identify the wear mechanisms responsible for wear of the composites. Multiple linear regression analysis and genetic algorithm techniques are employed to optimize the wear test parameters for minimum wear of the composite samples. Finally, a wear model is built by applying artificial neural networks to predict the wear rate of the composite material under different testing conditions. The predicted values of wear rate are found to be very close to the experimental values, with deviations in the range of 0.15% to 8.09%.
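For the Taguchi-style analysis described, the smaller-the-better signal-to-noise ratio and the range (delta) analysis can be sketched as below; the two-factor layout, the factor levels, and the wear values are invented for illustration, and pandas is used only for the level-wise grouping.

```python
import numpy as np
import pandas as pd

def sn_smaller_the_better(y):
    """Taguchi smaller-the-better S/N ratio: -10 log10(mean(y^2))."""
    y = np.asarray(y, float)
    return -10.0 * np.log10(np.mean(y ** 2))

# Illustrative orthogonal-array results: factor levels and measured wear (mm^3/m).
runs = pd.DataFrame({
    "load":  [10, 10, 10, 20, 20, 20, 30, 30, 30],
    "speed": [1, 2, 3, 1, 2, 3, 1, 2, 3],
    "wear":  [0.012, 0.015, 0.019, 0.018, 0.022, 0.027, 0.025, 0.031, 0.038],
})
runs["sn"] = [sn_smaller_the_better([w]) for w in runs["wear"]]   # one replicate per run

# Range (delta) analysis: spread of mean S/N across levels ranks factor influence.
for factor in ["load", "speed"]:
    level_means = runs.groupby(factor)["sn"].mean()
    print(factor, "delta =", round(level_means.max() - level_means.min(), 2))
```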
Applying the Hájek Approach in Formula-Based Variance Estimation. Research Report. ETS RR-17-24
ERIC Educational Resources Information Center
Qian, Jiahe
2017-01-01
The variance formula derived for a two-stage sampling design without replacement employs the joint inclusion probabilities in the first-stage selection of clusters. One of the difficulties encountered in data analysis is the lack of information about such joint inclusion probabilities. One way to solve this issue is by applying Hájek's…
ERIC Educational Resources Information Center
Trumpower, David L.
2015-01-01
Making inferences about population differences based on samples of data, that is, performing intuitive analysis of variance (IANOVA), is common in everyday life. However, the intuitive reasoning of individuals when making such inferences (even following statistics instruction), often differs from the normative logic of formal statistics. The…
Rivera, D; Perrin, P B; Stevens, L F; Garza, M T; Weil, C; Saracho, C P; Rodríguez, W; Rodríguez-Agudelo, Y; Rábago, B; Weiler, G; García de la Cadena, C; Longoni, M; Martínez, C; Ocampo-Barba, N; Aliaga, A; Galarza-Del-Angel, J; Guerra, A; Esenarro, L; Arango-Lasprilla, J C
2015-01-01
To generate normative data on the Stroop Test across 11 countries in Latin America, with country-specific adjustments for gender, age, and education, where appropriate. The sample consisted of 3,977 healthy adults who were recruited from Argentina, Bolivia, Chile, Cuba, El Salvador, Guatemala, Honduras, Mexico, Paraguay, Peru, and Puerto Rico. Each subject was administered the Stroop Test as part of a larger neuropsychological battery. A standardized five-step statistical procedure was used to generate the norms. The final multiple linear regression models explained 14-36% of the variance in Stroop Word scores, 12-41% of the variance in Stroop Color scores, 14-36% of the variance in Stroop Word-Color scores, and 4-15% of the variance in Stroop Interference scores. Although t-tests showed significant differences between men and women on the Stroop Test, none of the countries had an effect size larger than 0.3. As a result, gender-adjusted norms were not generated. This is the first normative multicenter study conducted in Latin America to create norms for the Stroop Test in a Spanish-speaking sample. This study will therefore have important implications for the future of neuropsychology research and practice throughout the region.
The use of NOAA AVHRR data for assessment of the urban heat island effect
Gallo, K.P.; McNab, A. L.; Karl, Thomas R.; Brown, Jesslyn F.; Hood, J. J.; Tarpley, J.D.
1993-01-01
A vegetation index and a radiative surface temperature were derived from satellite data acquired at approximately 1330 LST for each of 37 cities and for their respective nearby rural regions from 28 June through 8 August 1991. Urban-rural differences for the vegetation index and the surface temperatures were computed and then compared to observed urban-rural differences in minimum air temperatures. The purpose of these comparisons was to evaluate the use of satellite data to assess the influence of the urban environment on observed minimum air temperatures (the urban heat island effect). The temporal consistency of the data, from daily data to weekly, biweekly, and monthly intervals, was also evaluated. The satellite-derived normalized difference (ND) vegetation-index data, sampled over urban and rural regions composed of a variety of land surface environments, were linearly related to the difference in observed urban and rural minimum temperatures. The relationship between the ND index and observed differences in minimum temperature was improved when analyses were restricted by elevation differences between the sample locations and when biweekly or monthly intervals were utilized. The difference in the ND index between urban and rural regions appears to be an indicator of the difference in surface properties (evaporation and heat storage capacity) between the two environments that are responsible for differences in urban and rural minimum temperatures. The urban and rural differences in the ND index explain a greater amount of the variation observed in minimum temperature differences than past analyses that utilized urban population data. The use of satellite data may contribute to a globally consistent method for analysis of urban heat island bias.
Minimum number of measurements for evaluating Bertholletia excelsa.
Baldoni, A B; Tonini, H; Tardin, F D; Botelho, S C C; Teodoro, P E
2017-09-27
Repeatability studies on fruit species are of great importance to identify the minimum number of measurements necessary to accurately select superior genotypes. This study aimed to identify the most efficient method to estimate the repeatability coefficient (r) and predict the minimum number of measurements needed for a more accurate evaluation of Brazil nut tree (Bertholletia excelsa) genotypes based on fruit yield. For this, we assessed the number of fruits and dry mass of seeds of 75 Brazil nut genotypes, from native forest, located in the municipality of Itaúba, MT, for 5 years. To better estimate r, four procedures were used: analysis of variance (ANOVA), principal component analysis based on the correlation matrix (CPCOR), principal component analysis based on the phenotypic variance and covariance matrix (CPCOV), and structural analysis based on the correlation matrix (mean r - AECOR). There was a significant effect of genotypes and measurements, which reveals the need to study the minimum number of measurements for selecting superior Brazil nut genotypes for a production increase. Estimates of r by ANOVA were lower than those observed with the principal component methodology and close to AECOR. The CPCOV methodology provided the highest estimate of r, which resulted in a lower number of measurements needed to identify superior Brazil nut genotypes for the number of fruits and dry mass of seeds. Based on this methodology, three measurements are necessary to predict the true value of the Brazil nut genotypes with a minimum accuracy of 85%.
Estimation of Variance in the Case of Complex Samples.
ERIC Educational Resources Information Center
Groenewald, A. C.; Stoker, D. J.
In a complex sampling scheme it is desirable to select the primary sampling units (PSUs) without replacement to prevent duplications in the sample. Since the estimation of the sampling variances is more complicated when the PSUs are selected without replacement, L. Kish (1965) recommends that the variance be calculated using the formulas…
A Comparison of the Fit of Empirical Data to Two Latent Trait Models. Report No. 92.
ERIC Educational Resources Information Center
Hutten, Leah R.
Goodness of fit of raw test score data was compared using two latent trait models: the Rasch model and the Birnbaum three-parameter logistic model. Data were taken from various achievement tests and the Scholastic Aptitude Test (Verbal). A minimum sample size of 1,000 was required, and the minimum test length was 40 items. Results indicated that…
Isopycnal diffusivity in the tropical North Atlantic oxygen minimum zone
NASA Astrophysics Data System (ADS)
Köllner, Manuela; Visbeck, Martin; Tanhua, Toste; Fischer, Tim
2017-04-01
Isopycnal diffusivity plays an important role in the ventilation of the Eastern Tropical North Atlantic (ETNA) Oxygen Minimum Zone (OMZ). Lateral tracer transport is described by isopycnal diffusivity and mean advection of the tracer (e.g. oxygen); together they account for up to 70% of the oxygen supply for the OMZ. One of the big challenges is to separate diffusivity from advection. Isopycnal diffusivity was estimated to be Ky=(500 ± 200) m2 s-1 and Kx=(1200 ± 600) m2 s-1 by Banyte et al. (2013) from a Tracer Release Experiment (TRE). Hahn et al. (2014) estimated a meridional eddy diffusivity of 1350 m2 s-1 at 100 m depth decaying to less than 300 m2 s-1 below 800 m depth from repeated ship sections of CTD and ADCP data together with hydrographic mooring data. Uncertainties of the estimated diffusivities were still large; thus the Oxygen Supply Tracer Release Experiment (OSTRE) was set up to estimate isopycnal diffusivity in the OMZ using a newly developed sampling strategy based on a control volume. The tracer was released in 2012 in the core of the OMZ at approximately 410 m depth and mapped after 6, 15 and 29 months in a regular grid. In addition to the calculation of tracer column integrals from vertical tracer profiles, a new sampling method was developed and tested during two of the mapping cruises. The mean eddy diffusivity during OSTRE was found to be about (300 ± 130) m2 s-1. Additionally, the tracer has been advected further to the east and west by zonal jets. We compare different analysis methods to estimate isopycnal diffusivity from tracer spreading and show the advantage of the control volume surveys and control box approach. From the control box approach we estimate the strength of the zonal jets within the OMZ core, integrated over the TRE time period. References: Banyte, D., Visbeck, M., Tanhua, T., Fischer, T., Krahmann, G., Karstensen, J., 2013. Lateral Diffusivity from Tracer Release Experiments in the Tropical North Atlantic Thermocline. Journal of Geophysical Research 118. Hahn, J., Brandt, P., Greatbatch, R., Krahmann, G., Körtzinger, A., 2014. Oxygen variance and meridional oxygen supply in the Tropical North East Atlantic oxygen minimum zone. Climate Dynamics 43, 2999-3024.
Excoffier, L; Smouse, P E; Quattro, J M
1992-06-01
We present here a framework for the study of molecular variation within a single species. Information on DNA haplotype divergence is incorporated into an analysis of variance format, derived from a matrix of squared-distances among all pairs of haplotypes. This analysis of molecular variance (AMOVA) produces estimates of variance components and F-statistic analogs, designated here as phi-statistics, reflecting the correlation of haplotypic diversity at different levels of hierarchical subdivision. The method is flexible enough to accommodate several alternative input matrices, corresponding to different types of molecular data, as well as different types of evolutionary assumptions, without modifying the basic structure of the analysis. The significance of the variance components and phi-statistics is tested using a permutational approach, eliminating the normality assumption that is conventional for analysis of variance but inappropriate for molecular data. Application of AMOVA to human mitochondrial DNA haplotype data shows that population subdivisions are better resolved when some measure of molecular differences among haplotypes is introduced into the analysis. At the intraspecific level, however, the additional information provided by knowing the exact phylogenetic relations among haplotypes or by a nonlinear translation of restriction-site change into nucleotide diversity does not significantly modify the inferred population genetic structure. Monte Carlo studies show that site sampling does not fundamentally affect the significance of the molecular variance components. The AMOVA treatment is easily extended in several different directions and it constitutes a coherent and flexible framework for the statistical analysis of molecular data.
Credit Building in IDA Programs: Early Findings of a Longitudinal Study
ERIC Educational Resources Information Center
Birkenmaier, Julie; Curley, Jami; Kelly, Patrick
2012-01-01
Objective: This article reports on the impact of the Individual Development Account (IDA) program on credit. Method: Using a convenience sample of IDA participants (N = 165), data were analyzed using paired sample "t" tests, independent sample "t" test, one-way analysis of variance, Mann-Whitney "U" Tests, and…
Robust versus consistent variance estimators in marginal structural Cox models.
Enders, Dirk; Engel, Susanne; Linder, Roland; Pigeot, Iris
2018-06-11
In survival analyses, inverse-probability-of-treatment (IPT) and inverse-probability-of-censoring (IPC) weighted estimators of parameters in marginal structural Cox models are often used to estimate treatment effects in the presence of time-dependent confounding and censoring. In most applications, a robust variance estimator of the IPT and IPC weighted estimator is calculated leading to conservative confidence intervals. This estimator assumes that the weights are known rather than estimated from the data. Although a consistent estimator of the asymptotic variance of the IPT and IPC weighted estimator is generally available, applications and thus information on the performance of the consistent estimator are lacking. Reasons might be a cumbersome implementation in statistical software, which is further complicated by missing details on the variance formula. In this paper, we therefore provide a detailed derivation of the variance of the asymptotic distribution of the IPT and IPC weighted estimator and explicitly state the necessary terms to calculate a consistent estimator of this variance. We compare the performance of the robust and consistent variance estimators in an application based on routine health care data and in a simulation study. The simulation reveals no substantial differences between the 2 estimators in medium and large data sets with no unmeasured confounding, but the consistent variance estimator performs poorly in small samples or under unmeasured confounding, if the number of confounders is large. We thus conclude that the robust estimator is more appropriate for all practical purposes. Copyright © 2018 John Wiley & Sons, Ltd.
NASA Astrophysics Data System (ADS)
Sigro, J.; Brunet, M.; Aguilar, E.; Stoll, H.; Jimenez, M.
2009-04-01
The Spanish-funded research project Rapid Climate Changes in the Iberian Peninsula (IP) Based on Proxy Calibration, Long Term Instrumental Series and High Resolution Analyses of Terrestrial and Marine Records (CALIBRE: ref. CGL2006-13327-C04/CLI) has as main objective to analyse climate dynamics during periods of rapid climate change by means of developing high-resolution paleoclimate proxy records from marine and terrestrial (lakes and caves) deposits over the IP and calibrating them with long-term and high-quality instrumental climate time series. Under CALIBRE, the coordinated project Developing and Enhancing a Climate Instrumental Dataset for Calibrating Climate Proxy Data and Analysing Low-Frequency Climate Variability over the Iberian Peninsula (CLICAL: CGL2006-13327-C04-03/CLI) is devoted to the development of homogenised climate records and sub-regional time series which can be confidently used in the calibration of the lacustrine, marine and speleothem time series generated under CALIBRE. Here we present the procedures followed in order to homogenise a dataset of maximum and minimum temperature and precipitation data on a monthly basis over the Spanish northern coast. The dataset is composed of thirty (twenty) precipitation (temperature) long monthly records. The data are quality controlled following the procedures recommended by Aguilar et al. (2003) and tested for homogeneity and adjusted by following the approach adopted by Brunet et al. (2008). Sub-regional time series of precipitation, maximum and minimum temperatures for the period 1853-2007 have been generated by averaging monthly anomalies and then adding back the base-period mean, according to the method of Jones and Hulme (1996). Also, a method to adjust the variance bias present in regional time series associated over time with varying sample size has been applied (Osborn et al., 1997). The results of this homogenisation exercise and the development of the associated sub-regional time series will be widely discussed. Initial comparisons with rapidly growing speleothems in two different caves indicate that speleothem trace element ratios like Ba/Ca are recording the decrease in littoral precipitation in the last several decades. References Aguilar, E., Auer, I., Brunet, M., Peterson, T. C. and Weringa, J. 2003. Guidelines on Climate Metadata and Homogenization, World Meteorological Organization (WMO)-TD no. 1186 / World Climate Data and Monitoring Program (WCDMP) no. 53, Geneva: 51 pp. Brunet M, Saladié O, Jones P, Sigró J, Aguilar E, Moberg A, Lister D, Walther A, Almarza C. 2008. A case-study/guidance on the development of long-term daily adjusted temperature datasets, WMO-TD-1425/WCDMP-66, Geneva: 43 pp. Jones, P D, and Hulme M, 1996, Calculating regional climatic time series for temperature and precipitation: Methods and illustrations, Int. J. Climatol., 16, 361- 377. Osborn, T. J., Briffa K. R., and Jones P. D., 1997, Adjusting variance for sample-size in tree-ring chronologies and other regional mean time series, Dendrochronologia, 15, 89- 99.
Bed load transport over a broad range of timescales: Determination of three regimes of fluctuations
NASA Astrophysics Data System (ADS)
Ma, Hongbo; Heyman, Joris; Fu, Xudong; Mettra, Francois; Ancey, Christophe; Parker, Gary
2014-12-01
This paper describes the relationship between the statistics of bed load transport flux and the timescale over which it is sampled. A stochastic formulation is developed for the probability distribution function of bed load transport flux, based on the Ancey et al. (2008) theory. An analytical solution for the variance of bed load transport flux over differing sampling timescales is presented. The solution demonstrates that the timescale dependence of the variance of bed load transport flux reduces to a three-regime relation demarcated by an intermittency timescale (tI) and a memory timescale (tc). As the sampling timescale increases, this variance passes through an intermittent stage (≪tI), an invariant stage (tI < t < tc), and a memoryless stage (≫ tc). We propose a dimensionless number (Ra) to represent the relative strength of fluctuation, which provides a common ground for comparison of fluctuation strength among different experiments, as well as different sampling timescales for each experiment. Our analysis indicates that correlated motion and the discrete nature of bed load particles are responsible for this three-regime behavior. We use the data from three experiments with high temporal resolution of bed load transport flux to validate the proposed three-regime behavior. The theoretical solution for the variance agrees well with all three sets of experimental data. Our findings contribute to the understanding of the observed fluctuations of bed load transport flux over monosize/multiple-size grain beds, to the characterization of an inherent connection between short-term measurements and long-term statistics, and to the design of appropriate sampling strategies for bed load transport flux.
Herrera, Carlos M
2012-01-01
Methods for estimating quantitative trait heritability in wild populations have been developed in recent years which take advantage of the increased availability of genetic markers to reconstruct pedigrees or estimate relatedness between individuals, but their application to real-world data is not exempt from difficulties. This chapter describes a recent marker-based technique which, by adopting a genomic scan approach and focusing on the relationship between phenotypes and genotypes at the individual level, avoids the problems inherent to marker-based estimators of relatedness. This method allows the quantification of the genetic component of phenotypic variance ("degree of genetic determination" or "heritability in the broad sense") in wild populations and is applicable whenever phenotypic trait values and multilocus data for a large number of genetic markers (e.g., amplified fragment length polymorphisms, AFLPs) are simultaneously available for a sample of individuals from the same population. The method proceeds by first identifying those markers whose variation across individuals is significantly correlated with individual phenotypic differences ("adaptive loci"). The proportion of phenotypic variance in the sample that is statistically accounted for by individual differences in adaptive loci is then estimated by fitting a linear model to the data, with trait value as the dependent variable and scores of adaptive loci as independent ones. The method can be easily extended to accommodate quantitative or qualitative information on biologically relevant features of the environment experienced by each sampled individual, in which case estimates of the environmental and genotype × environment components of phenotypic variance can also be obtained.
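The final step of the approach described above reduces to a linear model of trait values on the scores of the "adaptive" markers, with the model R² estimating the genetic component of phenotypic variance. The following is a minimal sketch of that idea on synthetic data, not the author's code; the array names and the simple per-marker screening threshold are illustrative assumptions.

```python
# Minimal sketch (not the author's code): estimate the share of phenotypic
# variance statistically accounted for by "adaptive" marker loci.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, m = 200, 50
aflp = rng.integers(0, 2, size=(n, m))                 # 0/1 marker scores (e.g., AFLP bands)
trait = aflp[:, :5].sum(axis=1) + rng.normal(size=n)   # toy phenotype

# Step 1: screen markers individually for association with the phenotype.
pvals = np.array([stats.pearsonr(aflp[:, j], trait)[1] for j in range(m)])
adaptive = aflp[:, pvals < 0.05]                        # "adaptive loci"

# Step 2: linear model trait ~ adaptive loci; R^2 approximates the genetic
# (broad-sense) component of phenotypic variance in the sample.
X = np.column_stack([np.ones(n), adaptive])
beta, *_ = np.linalg.lstsq(X, trait, rcond=None)
resid = trait - X @ beta
r2 = 1 - resid.var() / trait.var()
print(f"proportion of phenotypic variance explained: {r2:.2f}")
```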
The relationship between seasonal mood change and personality: more apparent than real?
Jang, K L; Lam, R W; Livesley, W J; Vernon, P A
1997-06-01
A number of recent studies have reported significant relationships between seasonal mood change (seasonality) and personality. However, some of the results are difficult to interpret because of inherent methodological problems, the most important of which is the use of samples drawn from the southern as opposed to the northern hemisphere, where the phenomenon of seasonality may be quite different. The present study examined the relationship between personality and seasonality in a sample from the northern hemisphere (minimum latitude = 49 degrees N). A total of 297 adults drawn from the general population (112 male and 185 female subjects) completed the Seasonal Pattern Assessment Questionnaire, and the results obtained confirmed most of the previously reported relationships and showed that these are reliable across (i) different hemispheres, (ii) different measures of personality and (iii) clinical and general population samples. However, the impact of the relationship seems to be more apparent than real, with personality accounting for just under 15% of the total variance.
Stream-temperature patterns of the Muddy Creek basin, Anne Arundel County, Maryland
Pluhowski, E.J.
1981-01-01
Using a water-balance equation based on a 4.25-year gaging-station record on North Fork Muddy Creek, the following mean annual values were obtained for the Muddy Creek basin: precipitation, 49.0 inches; evapotranspiration, 28.0 inches; runoff, 18.5 inches; and underflow, 2.5 inches. Average freshwater outflow from the Muddy Creek basin to the Rhode River estuary was 12.2 cfs during the period October 1, 1971, to December 31, 1975. Harmonic equations were used to describe seasonal maximum and minimum stream-temperature patterns at 12 sites in the basin. These equations were fitted to continuous water-temperature data obtained periodically at each site between November 1970 and June 1978. The harmonic equations explain at least 78 percent of the variance in maximum stream temperatures and 81 percent of the variance in minimum temperatures. Standard errors of estimate averaged 2.3°C (Celsius) for daily maximum water temperatures and 2.1°C for daily minimum temperatures. Mean annual water temperatures developed for a 5.4-year base period ranged from 11.9°C at Muddy Creek to 13.1°C at Many Fork Branch. The largest variations in stream temperatures were detected at thermograph sites below ponded reaches and where forest coverage was sparse or missing. At most sites the largest variations in daily water temperatures were recorded in April whereas the smallest were in September and October. The low thermal inertia of streams in the Muddy Creek basin tends to amplify the impact of surface energy-exchange processes on short-period stream-temperature patterns. Thus, in response to meteorologic events, wide ranging stream-temperature perturbations of as much as 6°C have been documented in the basin. (USGS)
On the error in crop acreage estimation using satellite (LANDSAT) data
NASA Technical Reports Server (NTRS)
Chhikara, R. (Principal Investigator)
1983-01-01
The problem of crop acreage estimation using satellite data is discussed. Bias and variance of a crop proportion estimate in an area segment obtained from the classification of its multispectral sensor data are derived as functions of the means, variances, and covariance of error rates. The linear discriminant analysis and the class proportion estimation for the two class case are extended to include a third class of measurement units, where these units are mixed on ground. Special attention is given to the investigation of mislabeling in training samples and its effect on crop proportion estimation. It is shown that the bias and variance of the estimate of a specific crop acreage proportion increase as the disparity in mislabeling rates between two classes increases. Some interaction is shown to take place, causing the bias and the variance to decrease at first and then to increase, as the mixed unit class varies in size from 0 to 50 percent of the total area segment.
Hassenbusch, S J; Colvin, O M; Anderson, J H
1995-07-01
A relatively simple, high-sensitivity gas chromatographic assay is described for nitrosourea compounds, such as BCNU [1,3-bis(2-chloroethyl)-1-nitrosourea] and MeCCNU [1-(2-chloroethyl)-3-(trans-4-methylcyclohexyl)-1-nitrosourea], in small biopsy samples of brain and other tissues. After extraction with ethyl acetate, secondary amines in BCNU and MeCCNU are derivatized with trifluoroacetic anhydride. Compounds are separated and quantitated by gas chromatography using a capillary column with temperature programming and an electron capture detector. Standard curves of BCNU indicate a coefficient of variance of 0.066 +/- 0.018, a correlation coefficient of 0.929, and an extraction efficiency from whole brain of 68% with a minimum detectable amount of 20 ng in 5-10 mg samples. The assay has been facile and sensitive in over 1000 brain biopsy specimens after intravenous and intraarterial infusions of BCNU.
Ionic strength and DOC determinations from various freshwater sources to the San Francisco Bay
Hunter, Y.R.; Kuwabara, J.S.
1994-01-01
An accurate estimate of dissolved organic carbon (DOC) across the salinity gradient is important for understanding the extent to which DOC can influence the speciation of metals such as zinc and copper. A low-temperature persulfate/oxygen/ultraviolet wet oxidation procedure was used to analyze DOC samples, adjusted for ionic strength, from major freshwater sources of the northern and southern regions of San Francisco Bay. The ionic strength of samples was modified with a chemically defined seawater medium up to 0.7 M. The results indicate a minimal effect of ionic strength on oxidation efficiency for DOC sources to the Bay over an ionic strength gradient of 0.0 to 0.7 M. There were no major impacts of ionic strength on two Suwannee River fulvic acids. In general, the noted effects associated with ionic strength were smaller than the differences observed between high- and low-temperature methods for samples from the aquatic environment.
Liu, Xiaofeng Steven
2011-05-01
The use of covariates is commonly believed to reduce the unexplained error variance and the standard error for the comparison of treatment means, but the reduction in the standard error is neither guaranteed nor uniform over different sample sizes. The covariate mean differences between the treatment conditions can inflate the standard error of the covariate-adjusted mean difference and can actually produce a larger standard error for the adjusted mean difference than that for the unadjusted mean difference. When the covariate observations are conceived of as randomly varying from one study to another, the covariate mean differences can be related to a Hotelling's T(2) . Using this Hotelling's T(2) statistic, one can always find a minimum sample size to achieve a high probability of reducing the standard error and confidence interval width for the adjusted mean difference. ©2010 The British Psychological Society.
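The inflation described here can be seen directly in the standard one-covariate ANCOVA standard-error formula, which contains a term driven by the covariate mean difference between groups. The sketch below is an illustration based on that textbook formula, not the paper's derivation; all numbers are hypothetical.

```python
# Sketch (standard ANCOVA formulas, not the paper's derivation): the SE of the
# covariate-adjusted mean difference contains a term that grows with the
# covariate mean difference between groups, so adjustment can inflate the SE.
import numpy as np

def se_adjusted(mse, n1, n2, xbar1, xbar2, ssx_within):
    """SE of the adjusted mean difference in a one-covariate ANCOVA."""
    return np.sqrt(mse * (1 / n1 + 1 / n2 + (xbar1 - xbar2) ** 2 / ssx_within))

def se_unadjusted(var_y, n1, n2):
    return np.sqrt(var_y * (1 / n1 + 1 / n2))

# Toy numbers: same residual variance, increasing covariate imbalance.
for imbalance in [0.0, 0.5, 1.0, 2.0]:
    print("covariate mean diff", imbalance,
          "-> adjusted SE", round(se_adjusted(1.0, 20, 20, 0.0, imbalance, 30.0), 3))
print("unadjusted SE:", round(se_unadjusted(1.2, 20, 20), 3))
```

For small covariate imbalance the adjusted SE is smaller (the covariate soaks up residual variance), but with a large enough imbalance it exceeds the unadjusted SE, which is the phenomenon the abstract describes.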
NASA Technical Reports Server (NTRS)
Melbourne, William G.
1986-01-01
In double differencing a regression system obtained from concurrent Global Positioning System (GPS) observation sequences, one either undersamples the system to avoid introducing colored measurement statistics, or one fully samples the system incurring the resulting non-diagonal covariance matrix for the differenced measurement errors. A suboptimal estimation result will be obtained in the undersampling case and will also be obtained in the fully sampled case unless the color noise statistics are taken into account. The latter approach requires a least squares weighting matrix derived from inversion of a non-diagonal covariance matrix for the differenced measurement errors instead of inversion of the customary diagonal one associated with white noise processes. Presented is the so-called fully redundant double differencing algorithm for generating a weighted double differenced regression system that yields equivalent estimation results, but features for certain cases a diagonal weighting matrix even though the differenced measurement error statistics are highly colored.
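The weighting described above is generalized least squares with a full (non-diagonal) error covariance matrix. The following is a generic GLS sketch, not the specific fully redundant double-differencing algorithm; the design matrix and covariance are toy values.

```python
# Sketch (generic GLS, not the specific double-differencing algorithm): when
# differenced measurement errors are correlated, the weight matrix is the
# inverse of the full (non-diagonal) error covariance C rather than the
# diagonal weight appropriate for white noise.
import numpy as np

def gls(A, y, C):
    """Generalized least squares: minimize (y - A x)^T C^{-1} (y - A x)."""
    W = np.linalg.inv(C)
    x_hat = np.linalg.solve(A.T @ W @ A, A.T @ W @ y)
    cov_x = np.linalg.inv(A.T @ W @ A)
    return x_hat, cov_x

# Toy example: 4 differenced observations of 2 parameters with correlated errors.
A = np.array([[1.0, 0.0], [1.0, 1.0], [0.0, 1.0], [1.0, 2.0]])
C = 0.5 * np.eye(4) + 0.2 * np.ones((4, 4))          # non-diagonal covariance
x_true = np.array([2.0, -1.0])
y = A @ x_true + np.random.default_rng(2).multivariate_normal(np.zeros(4), C)
print(gls(A, y, C)[0])
```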
The Minimum Wage and the Employment of Teenagers. Recent Research.
ERIC Educational Resources Information Center
Fallick, Bruce; Currie, Janet
A study used individual-level data from the National Longitudinal Study of Youth to examine the effects of changes in the federal minimum wage on teenage employment. Individuals in the sample were classified as either likely or unlikely to be affected by these increases in the federal minimum wage on the basis of their wage rates and industry of…
Estimating the encounter rate variance in distance sampling
Fewster, R.M.; Buckland, S.T.; Burnham, K.P.; Borchers, D.L.; Jupp, P.E.; Laake, J.L.; Thomas, L.
2009-01-01
The dominant source of variance in line transect sampling is usually the encounter rate variance. Systematic survey designs are often used to reduce the true variability among different realizations of the design, but estimating the variance is difficult and estimators typically approximate the variance by treating the design as a simple random sample of lines. We explore the properties of different encounter rate variance estimators under random and systematic designs. We show that a design-based variance estimator improves upon the model-based estimator of Buckland et al. (2001, Introduction to Distance Sampling. Oxford: Oxford University Press, p. 79) when transects are positioned at random. However, if populations exhibit strong spatial trends, both estimators can have substantial positive bias under systematic designs. We show that poststratification is effective in reducing this bias. © 2008, The International Biometric Society.
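For orientation, a sketch of one commonly used design-based encounter-rate variance estimator is given below, with lines treated as the sampling units and weighted by line length. The exact estimators compared in this work differ in details, so treat the formula and the data as illustrative only.

```python
# Sketch of a commonly used design-based estimator of encounter rate variance
# (lines as sampling units, weighted by line length). Illustrative only; the
# paper compares several variants that differ in detail.
import numpy as np

def encounter_rate_var(n_k, l_k):
    """n_k: detections per transect line, l_k: line lengths."""
    n_k, l_k = np.asarray(n_k, float), np.asarray(l_k, float)
    K, L, n = len(l_k), l_k.sum(), n_k.sum()
    er = n / L                                         # overall encounter rate
    return K / (L ** 2 * (K - 1)) * np.sum(l_k ** 2 * (n_k / l_k - er) ** 2)

# Hypothetical survey with six lines.
print(encounter_rate_var([3, 0, 5, 2, 7, 1], [1.0, 1.2, 0.8, 1.5, 1.0, 1.1]))
```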
Brooks, M.H.; Schroder, L.J.; Malo, B.A.
1985-01-01
Four laboratories were evaluated in their analysis of identical natural and simulated precipitation water samples. Interlaboratory comparability was evaluated using analysis of variance coupled with Duncan's multiple range test, and linear-regression models describing the relations between individual laboratory analytical results for natural precipitation samples. Results of the statistical analyses indicate that certain pairs of laboratories produce different results when analyzing identical samples. Analyte bias for each laboratory was examined using analysis of variance coupled with Duncan's multiple range test on data produced by the laboratories from the analysis of identical simulated precipitation samples. Bias for a given analyte produced by a single laboratory has been indicated when the laboratory mean for that analyte is shown to be significantly different from the mean for the most-probable analyte concentrations in the simulated precipitation samples. Ion-chromatographic methods for the determination of chloride, nitrate, and sulfate have been compared with the colorimetric methods that were also in use during the study period. Comparisons were made using analysis of variance coupled with Duncan's multiple range test for means produced by the two methods. Analyte precision for each laboratory has been estimated by calculating a pooled variance for each analyte. Analyte estimated precisions have been compared using F-tests and differences in analyte precisions for laboratory pairs have been reported. (USGS)
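The precision-comparison step amounts to pooling replicate variances per analyte within each laboratory and then comparing two laboratories with an F-test. A minimal sketch on hypothetical replicate data follows; it is not the study's code and the numbers are invented.

```python
# Sketch (hypothetical data): pool replicate variances for one analyte within a
# laboratory, then compare two laboratories' precisions with an F-test.
import numpy as np
from scipy import stats

def pooled_variance(groups):
    """groups: list of replicate-measurement arrays for one analyte."""
    ss = sum((len(g) - 1) * np.var(g, ddof=1) for g in groups)
    df = sum(len(g) - 1 for g in groups)
    return ss / df, df

lab_a = [np.array([1.02, 0.98, 1.01]), np.array([2.10, 2.05, 2.12])]
lab_b = [np.array([1.05, 0.90, 1.10]), np.array([2.30, 2.00, 2.15])]

va, dfa = pooled_variance(lab_a)
vb, dfb = pooled_variance(lab_b)
if va >= vb:
    F, dfn, dfd = va / vb, dfa, dfb
else:
    F, dfn, dfd = vb / va, dfb, dfa
p = 2 * stats.f.sf(F, dfn, dfd)        # two-sided F-test on the precision ratio
print(round(F, 2), round(p, 3))
```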
Lehrer, Paul; Karavidas, Maria; Lu, Shou-En; Vaschillo, Evgeny; Vaschillo, Bronya; Cheng, Andrew
2010-05-01
Seven professional airplane pilots participated in a one-session test in a Boeing 737-800 simulator. Mental workload for 18 flight tasks was rated by experienced test pilots (hereinafter called "expert ratings") and by study participants' self-report on NASA's Task Load Index (TLX) scale. Pilot performance was rated by a check pilot. The standard deviation of R-R intervals (SDNN) significantly added 3.7% improvement over the TLX in distinguishing high from moderate-load tasks and 2.3% improvement in distinguishing high from combined moderate and low-load tasks. Minimum RRI in the task significantly discriminated high- from medium- and low-load tasks, but did not add significant predictive variance to the TLX. The low-frequency/high-frequency (LF:HF) RRI ratio based on spectral analysis of R-R intervals, and ventricular relaxation time were each negatively related to pilot performance ratings independently of TLX values, while minimum and average RRI were positively related, showing added contribution of these cardiac measures for predicting performance. Cardiac results were not affected by controlling either for respiration rate or motor activity assessed by accelerometry. The results suggest that cardiac assessment can be a useful addition to self-report measures for determining flight task mental workload and risk for performance decrements. Replication on a larger sample is needed to confirm and extend the results. Copyright 2010 Elsevier B.V. All rights reserved.
Pesticide data for selected Wyoming streams, 1976-78
Butler, David L.
1987-01-01
In 1976, the U.S. Geological Survey, in cooperation with the Wyoming Department of Agriculture, started a monitoring program to determine pesticide concentrations in Wyoming streams. This program was incorporated into the water-quality data-collection system already in operation. Samples were collected at 20 sites for analysis of various insecticides, herbicides, polychlorinated biphenyls, and polychlorinated naphthalenes. The results through 1978 revealed small concentrations of pesticides; the compounds most commonly found in water and bottom-material samples were DDE (39 percent of the concentrations equal to or greater than the minimum reported concentrations of the analytical methods), DDD (20 percent), dieldrin (21 percent), and polychlorinated biphenyls (29 percent). The herbicides most commonly found in water samples were 2,4-D (29 percent of the concentrations equal to or greater than the minimum reported concentrations of the analytical method) and picloram (23 percent). Most concentrations were significantly less than concentrations thought to be harmful to freshwater aquatic life based on available toxicity data. However, for some pesticides, U.S. Environmental Protection Agency water-quality criteria for freshwater aquatic life are based on bioaccumulation factors that result in criteria concentrations less than the minimum reported concentrations of the analytical methods. It is not known if certain pesticides were present at concentrations less than the minimum reported concentrations that exceeded these criteria.
Ozay, Guner; Seyhan, Ferda; Yilmaz, Aysun; Whitaker, Thomas B; Slate, Andrew B; Giesbrecht, Francis
2006-01-01
The variability associated with the aflatoxin test procedure used to estimate aflatoxin levels in bulk shipments of hazelnuts was investigated. Sixteen 10 kg samples of shelled hazelnuts were taken from each of 20 lots that were suspected of aflatoxin contamination. The total variance associated with testing shelled hazelnuts was estimated and partitioned into sampling, sample preparation, and analytical variance components. Each variance component increased as aflatoxin concentration (either B1 or total) increased. With the use of regression analysis, mathematical expressions were developed to model the relationship between aflatoxin concentration and the total, sampling, sample preparation, and analytical variances. The expressions for these relationships were used to estimate the variance for any sample size, subsample size, and number of analyses for a specific aflatoxin concentration. The sampling, sample preparation, and analytical variances associated with estimating aflatoxin in a hazelnut lot at a total aflatoxin level of 10 ng/g and using a 10 kg sample, a 50 g subsample, dry comminution with a Robot Coupe mill, and a high-performance liquid chromatographic analytical method are 174.40, 0.74, and 0.27, respectively. The sampling, sample preparation, and analytical steps of the aflatoxin test procedure accounted for 99.4, 0.4, and 0.2% of the total variability, respectively.
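The percentage contributions quoted above follow directly from the reported variance components, since the total testing variance is their sum. A small arithmetic check using only the numbers stated in the abstract:

```python
# Reproduce the percentage contributions from the variance components reported
# for a 10 kg sample, 50 g subsample, and HPLC analysis at 10 ng/g total aflatoxin.
sampling, preparation, analysis = 174.40, 0.74, 0.27
total = sampling + preparation + analysis
for name, v in [("sampling", sampling), ("preparation", preparation), ("analysis", analysis)]:
    print(f"{name:12s} {100 * v / total:5.1f}% of total variance")
# -> roughly 99.4%, 0.4%, 0.2%, matching the reported shares.
```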
Demographics of an ornate box turtle population experiencing minimal human-induced disturbances
Converse, S.J.; Iverson, J.B.; Savidge, J.A.
2005-01-01
Human-induced disturbances may threaten the viability of many turtle populations, including populations of North American box turtles. Evaluation of the potential impacts of these disturbances can be aided by long-term studies of populations subject to minimal human activity. In such a population of ornate box turtles (Terrapene ornata ornata) in western Nebraska, we examined survival rates and population growth rates from 1981-2000 based on mark-recapture data. The average annual apparent survival rate of adult males was 0.883 (SE = 0.021) and of adult females was 0.932 (SE = 0.014). Minimum winter temperature was the best of five climate variables as a predictor of adult survival. Survival rates were highest in years with low minimum winter temperatures, suggesting that global warming may result in declining survival. We estimated an average adult population growth rate (λ) of 1.006 (SE = 0.065), with an estimated temporal process variance (σ²) of 0.029 (95% CI = 0.005-0.176). Stochastic simulations suggest that this mean and temporal process variance would result in a 58% probability of a population decrease over a 20-year period. This research provides evidence that, unless unknown density-dependent mechanisms are operating in the adult age class, significant human disturbances, such as commercial harvest or turtle mortality on roads, represent a potential risk to box turtle populations. © 2005 by the Ecological Society of America.
Perdikaris, Paris; Karniadakis, George Em
2016-05-01
We present a computational framework for model inversion based on multi-fidelity information fusion and Bayesian optimization. The proposed methodology targets the accurate construction of response surfaces in parameter space, and the efficient pursuit to identify global optima while keeping the number of expensive function evaluations at a minimum. We train families of correlated surrogates on available data using Gaussian processes and auto-regressive stochastic schemes, and exploit the resulting predictive posterior distributions within a Bayesian optimization setting. This enables a smart adaptive sampling procedure that uses the predictive posterior variance to balance the exploration versus exploitation trade-off, and is a key enabler for practical computations under limited budgets. The effectiveness of the proposed framework is tested on three parameter estimation problems. The first two involve the calibration of outflow boundary conditions of blood flow simulations in arterial bifurcations using multi-fidelity realizations of one- and three-dimensional models, whereas the last one aims to identify the forcing term that generated a particular solution to an elliptic partial differential equation. © 2016 The Author(s).
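The core loop described here, a Gaussian-process surrogate whose predictive variance drives an exploration/exploitation trade-off under a limited evaluation budget, can be sketched in a single-fidelity form as below. This is not the authors' multi-fidelity code; the objective function, kernel, and lower-confidence-bound acquisition rule are illustrative assumptions.

```python
# Single-fidelity sketch of a surrogate-based optimization loop: a Gaussian
# process surrogate plus a lower-confidence-bound acquisition that uses the
# predictive standard deviation to balance exploration and exploitation.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def objective(x):                       # hypothetical "expensive" model
    return np.sin(3 * x) + 0.5 * x ** 2

rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=(4, 1))     # small initial design
y = objective(X).ravel()
grid = np.linspace(-2, 2, 400).reshape(-1, 1)

for _ in range(10):                     # limited evaluation budget
    gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.5), alpha=1e-6)
    gp.fit(X, y)
    mu, sd = gp.predict(grid, return_std=True)
    lcb = mu - 2.0 * sd                 # exploration weight on predictive std
    x_next = grid[np.argmin(lcb)].reshape(1, 1)
    X = np.vstack([X, x_next])
    y = np.append(y, objective(x_next).ravel())

print("best x, f(x):", X[np.argmin(y)].ravel(), y.min())
```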
Use of the Internet for Health Information: United States, 2009
... as accidents or dental care. Data source and methods Data from the 2009 NHIS were used for ... sample design of NHIS. The Taylor series linearization method was chosen for variance estimation. Differences between percentages ...
Strategies Used by Adults to Reduce Their Prescription Drug Costs
... on their 2010 income ( 5 ). Data source and methods Data from the 2011 NHIS were used for ... sample design of NHIS. The Taylor series linearization method was chosen for variance estimation. All estimates shown ...
Moss, Marshall E.; Gilroy, Edward J.
1980-01-01
This report describes the theoretical developments and illustrates the applications of techniques that recently have been assembled to analyze the cost-effectiveness of federally funded stream-gaging activities in support of the Colorado River compact and subsequent adjudications. The cost effectiveness of 19 stream gages in terms of minimizing the sum of the variances of the errors of estimation of annual mean discharge is explored by means of a sequential-search optimization scheme. The search is conducted over a set of decision variables that describes the number of times that each gaging route is traveled in a year. A gage route is defined as the most expeditious circuit that is made from a field office to visit one or more stream gages and return to the office. The error variance is defined as a function of the frequency of visits to a gage by using optimal estimation theory. Currently a minimum of 12 visits per year is made to any gage. By changing to a six-visit minimum, the same total error variance can be attained for the 19 stations with a budget of 10% less than the current one. Other strategies are also explored. (USGS)
River meanders - Theory of minimum variance
Langbein, Walter Basil; Leopold, Luna Bergere
1966-01-01
Meanders are the result of erosion-deposition processes tending toward the most stable form in which the variability of certain essential properties is minimized. This minimization involves the adjustment of the planimetric geometry and the hydraulic factors of depth, velocity, and local slope. The planimetric geometry of a meander is that of a random walk whose most frequent form minimizes the sum of the squares of the changes in direction in each successive unit length. The direction angles are then sine functions of channel distance. This yields a meander shape typically present in meandering rivers and has the characteristic that the ratio of meander length to average radius of curvature in the bend is 4.7. Depth, velocity, and slope are shown by field observations to be adjusted so as to decrease the variance of shear and the friction factor in a meander curve over that in an otherwise comparable straight reach of the same river. Since theory and observation indicate meanders achieve the minimum variance postulated, it follows that for channels in which alternating pools and riffles occur, meandering is the most probable form of channel geometry and thus is more stable geometry than a straight or nonmeandering alinement.
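The planform described above, with direction angle a sine function of distance along the channel, is easy to trace numerically. The sketch below integrates such a sine-generated curve; the maximum deflection angle and wavelength are illustrative choices, not values from the paper.

```python
# Sketch of a sine-generated curve: the channel direction angle varies as a sine
# of distance along the channel. Omega (maximum deflection) and the wavelength M
# are illustrative values only.
import numpy as np

omega = np.deg2rad(110.0)       # maximum angle relative to the mean down-valley direction
M = 100.0                       # meander wavelength measured along the channel
s = np.linspace(0.0, 2 * M, 2000)
theta = omega * np.sin(2 * np.pi * s / M)     # direction angle along the path

# Integrate the direction angle to obtain planform coordinates.
ds = s[1] - s[0]
x = np.cumsum(np.cos(theta)) * ds
y = np.cumsum(np.sin(theta)) * ds
print(x[-1], y[-1])             # down-valley advance after two wavelengths
```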
Point focusing using loudspeaker arrays from the perspective of optimal beamforming.
Bai, Mingsian R; Hsieh, Yu-Hao
2015-06-01
Sound focusing aims to create a concentrated acoustic field in the region surrounded by a loudspeaker array. This problem was tackled in previous research via the Helmholtz integral approach, brightness control, acoustic contrast control, etc. In this paper, the same problem was revisited from the perspective of beamforming. A source array model is reformulated in terms of the steering matrix between the source and the field points, which lends itself to the use of beamforming algorithms such as minimum variance distortionless response (MVDR) and linearly constrained minimum variance (LCMV), originally intended for sensor arrays. The beamforming methods are compared with the conventional methods in terms of beam pattern, directional index, and control effort. Objective tests are conducted to assess the audio quality by using perceptual evaluation of audio quality (PEAQ). Experiments on the produced sound field and listening tests are conducted in a listening room, with results processed using analysis of variance and regression analysis. In contrast to the conventional energy-based methods, the results have shown that the proposed methods are phase-sensitive in light of the distortionless constraint in formulating the array filters, which helps enhance audio quality and focusing performance.
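For reference, the standard MVDR weight computation the abstract relies on is w = R⁻¹a / (aᴴR⁻¹a), where a is the steering vector and R the covariance matrix. The sketch below shows that computation for a generic uniform linear array; the geometry, frequency, placeholder covariance, and diagonal loading are illustrative assumptions rather than the paper's setup.

```python
# Sketch of the standard MVDR (Capon) weight computation for a generic steering
# vector; the array geometry, frequency, and loading factor are illustrative.
import numpy as np

def mvdr_weights(R, a, loading=1e-3):
    """w = R^{-1} a / (a^H R^{-1} a), with diagonal loading for stability."""
    Rl = R + loading * np.trace(R).real / len(a) * np.eye(len(a))
    Ri_a = np.linalg.solve(Rl, a)
    return Ri_a / (a.conj() @ Ri_a)

# Uniform linear array of 8 elements, steering toward 20 degrees at ~1 kHz in air.
M, d, wavelength = 8, 0.1, 0.343
theta = np.deg2rad(20.0)
a = np.exp(-2j * np.pi * d * np.arange(M) * np.sin(theta) / wavelength)
R = np.eye(M, dtype=complex)                 # placeholder covariance (white noise)
w = mvdr_weights(R, a)
print(np.abs(w.conj() @ a))                  # distortionless constraint ~ 1
```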
Optimization of data analysis for the in vivo neutron activation analysis of aluminum in bone.
Mohseni, H K; Matysiak, W; Chettle, D R; Byun, S H; Priest, N; Atanackovic, J; Prestwich, W V
2016-10-01
An existing system at McMaster University has been used for the in vivo measurement of aluminum in human bone. Precise and detailed analysis approaches are necessary to determine the aluminum concentration because of the low levels of aluminum found in the bone and the challenges associated with its detection. Phantoms resembling the composition of the human hand with varying concentrations of aluminum were made for testing the system prior to the application to human studies. A spectral decomposition model and a photopeak fitting model involving the inverse-variance weighted mean and a time-dependent analysis were explored to analyze the results and determine the model with the best performance and lowest minimum detection limit. The results showed that the spectral decomposition and the photopeak fitting model with the inverse-variance weighted mean both provided better results compared to the other methods tested. The spectral decomposition method resulted in a marginally lower detection limit (5μg Al/g Ca) compared to the inverse-variance weighted mean (5.2μg Al/g Ca), rendering both equally applicable to human measurements. Copyright © 2016 Elsevier Ltd. All rights reserved.
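The inverse-variance weighted mean used in the photopeak-fitting model is a standard combination rule: each repeated estimate is weighted by the reciprocal of its variance, and the combined uncertainty is the reciprocal square root of the summed weights. A minimal sketch with hypothetical values:

```python
# Sketch of an inverse-variance weighted mean; the aluminum estimates and
# uncertainties below are hypothetical, not measured values from the study.
import numpy as np

def ivw_mean(values, sigmas):
    w = 1.0 / np.asarray(sigmas, float) ** 2
    mean = np.sum(w * np.asarray(values, float)) / np.sum(w)
    return mean, 1.0 / np.sqrt(np.sum(w))

aluminum_estimates = [4.8, 6.1, 5.3]      # e.g. ug Al / g Ca from repeated spectra
uncertainties = [1.5, 2.0, 1.2]
print(ivw_mean(aluminum_estimates, uncertainties))
```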
Sadoul, Bastien C; Schuring, Ewoud A H; Mela, David J; Peters, Harry P F
2014-12-01
Several studies have assessed relationships of self-reported appetite (eating motivations, mainly by Visual Analogue Scales, VAS) with subsequent energy intake (EI), though usually in small data sets with limited power and variable designs. The objectives were therefore to better quantify the relationships of self-reports (incorporating subject characteristics) to subsequent EI, and to estimate the quantitative differences in VAS corresponding to consistent, significant differences in EI. Data were derived from an opportunity sample of 23 randomized controlled studies involving 549 subjects, testing the effects of various food ingredients in meal replacers or 100-150 ml mini-drinks. In all studies, scores on several VAS were recorded for 30 min to 5 h post-meal, when EI was assessed by ad libitum meal consumption. The relationships between pre-meal VAS scores and EI were examined using correlation, linear models (including subject characteristics) and a cross-validation procedure. VAS correlations with subsequent EI were statistically significant, but of low magnitude, up to r = 0.26. Hunger, age, gender, body weight and estimated basal metabolic rate explained 25% of the total variance in EI. Without hunger the prediction of EI was modestly but significantly lower (19%, P < 0.001). A change of ≥15-25 mm on a 100 mm VAS was the minimum effect consistently corresponding to a significant change in subsequent EI, depending on the starting VAS level. Eating motivations add in a small but consistently significant way to other known predictors of acute EI. Differences of about 15 mm on a 100 mm VAS appear to be the minimum effect expected to result in consistent, significant differences in subsequent EI. Copyright © 2014 Elsevier Ltd. All rights reserved.
Object aggregation using Neyman-Pearson analysis
NASA Astrophysics Data System (ADS)
Bai, Li; Hinman, Michael L.
2003-04-01
This paper presents a novel approach to: 1) distinguish military vehicle groups, and 2) identify names of military vehicle convoys in the level-2 fusion process. The data is generated from a generic Ground Moving Target Indication (GMTI) simulator that utilizes Matlab and Microsoft Access. This data is processed to identify the convoys and the number of vehicles in each convoy, using the minimum timed distance variance (MTDV) measurement. Once the vehicle groups are formed, convoy association is done using hypothesis-testing techniques based on the Neyman-Pearson (NP) criterion. One characteristic of NP is the low error probability when a priori information is unknown. The NP approach was demonstrated with this advantage over a Bayesian technique.
NASA Astrophysics Data System (ADS)
Mozaffarzadeh, Moein; Mahloojifar, Ali; Nasiriavanaki, Mohammadreza; Orooji, Mahdi
2018-02-01
Delay and sum (DAS) is the most common beamforming algorithm in linear-array photoacoustic imaging (PAI) as a result of its simple implementation. However, it leads to low resolution and high sidelobes. Delay multiply and sum (DMAS) was used to address the shortcomings of DAS, providing higher image quality. However, the resolution improvement is not sufficient compared to eigenspace-based minimum variance (EIBMV). In this paper, the EIBMV beamformer has been combined with DMAS algebra, called EIBMV-DMAS, using the expansion of the DMAS algorithm. The proposed method is used as the reconstruction algorithm in linear-array PAI. EIBMV-DMAS is experimentally evaluated, and the quantitative and qualitative results show that it outperforms DAS, DMAS and EIBMV. The proposed method reduces the sidelobes by about 365%, 221% and 40% compared to DAS, DMAS and EIBMV, respectively. Moreover, EIBMV-DMAS improves the SNR by about 158%, 63% and 20%, respectively.
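For context, the DMAS combination step that the proposed method expands takes the time-aligned channel samples, multiplies them pairwise, applies a signed square root, and sums the products. The sketch below shows only that step; the delay calculation and the eigenspace-based minimum-variance weighting of EIBMV-DMAS are omitted, and the sample values are hypothetical.

```python
# Sketch of the delay-multiply-and-sum (DMAS) combination step on already
# time-aligned channel samples; delays and EIBMV weighting are not shown.
import numpy as np

def dmas(delayed_samples):
    """delayed_samples: 1-D array of time-aligned channel samples for one pixel."""
    s = np.asarray(delayed_samples, float)
    y = 0.0
    for i in range(len(s) - 1):
        prod = s[i] * s[i + 1:]
        y += np.sum(np.sign(prod) * np.sqrt(np.abs(prod)))
    return y

print(dmas([0.9, 1.1, 0.8, 1.0, -0.1]))   # hypothetical aligned samples
```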
Xiao, Mengli; Zhang, Yongbo; Fu, Huimin; Wang, Zhihua
2018-05-01
High-precision navigation algorithm is essential for the future Mars pinpoint landing mission. The unknown inputs caused by large uncertainties of atmospheric density and aerodynamic coefficients as well as unknown measurement biases may cause large estimation errors of conventional Kalman filters. This paper proposes a derivative-free version of nonlinear unbiased minimum variance filter for Mars entry navigation. This filter has been designed to solve this problem by estimating the state and unknown measurement biases simultaneously with derivative-free character, leading to a high-precision algorithm for the Mars entry navigation. IMU/radio beacons integrated navigation is introduced in the simulation, and the result shows that with or without radio blackout, our proposed filter could achieve an accurate state estimation, much better than the conventional unscented Kalman filter, showing the ability of high-precision Mars entry navigation algorithm. Copyright © 2018 ISA. Published by Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Sohn, G.; Jung, J.; Jwa, Y.; Armenakis, C.
2013-05-01
This paper presents a sequential rooftop modelling method to refine initial rooftop models derived from airborne LiDAR data by integrating it with linear cues retrieved from single imagery. A cue integration between two datasets is facilitated by creating new topological features connecting between the initial model and image lines, with which new model hypotheses (variances to the initial model) are produced. We adopt Minimum Description Length (MDL) principle for competing the model candidates and selecting the optimal model by considering the balanced trade-off between the model closeness and the model complexity. Our preliminary results, combined with the Vaihingen data provided by ISPRS WGIII/4 demonstrate the image-driven modelling cues can compensate the limitations posed by LiDAR data in rooftop modelling.
Ichthyoplankton abundance and variance in a large river system: concerns for long-term monitoring
Holland-Bartels, Leslie E.; Dewey, Michael R.; Zigler, Steven J.
1995-01-01
System-wide spatial patterns of ichthyoplankton abundance and variability were assessed in the upper Mississippi and lower Illinois rivers to address the experimental design and statistical confidence in density estimates. Ichthyoplankton was sampled from June to August 1989 in primary milieus (vegetated and non-vegetated backwaters and impounded areas, main channels and main channel borders) in three navigation pools (8, 13 and 26) of the upper Mississippi River and in a downstream reach of the Illinois River. Ichthyoplankton densities varied among stations of similar aquatic landscapes (milieus) more than among subsamples within a station. An analysis of sampling effort indicated that the collection of single samples at many stations in a given milieu type is statistically and economically preferable to the collection of multiple subsamples at fewer stations. Cluster analyses also revealed that stations only generally grouped by their preassigned milieu types. Pilot studies such as this can define station groupings and sources of variation beyond an a priori habitat classification. Thus the minimum intensity of sampling required to achieve a desired statistical confidence can be identified before implementing monitoring efforts.
Struwe, Weston B; Agravat, Sanjay; Aoki-Kinoshita, Kiyoko F; Campbell, Matthew P; Costello, Catherine E; Dell, Anne; Ten Feizi; Haslam, Stuart M; Karlsson, Niclas G; Khoo, Kay-Hooi; Kolarich, Daniel; Liu, Yan; McBride, Ryan; Novotny, Milos V; Packer, Nicolle H; Paulson, James C; Rapp, Erdmann; Ranzinger, Rene; Rudd, Pauline M; Smith, David F; Tiemeyer, Michael; Wells, Lance; York, William S; Zaia, Joseph; Kettner, Carsten
2016-09-01
The minimum information required for a glycomics experiment (MIRAGE) project was established in 2011 to provide guidelines to aid in data reporting from all types of experiments in glycomics research including mass spectrometry (MS), liquid chromatography, glycan arrays, data handling and sample preparation. MIRAGE is a concerted effort of the wider glycomics community that considers the adaptation of reporting guidelines as an important step towards critical evaluation and dissemination of datasets as well as broadening of experimental techniques worldwide. The MIRAGE Commission published reporting guidelines for MS data and here we outline guidelines for sample preparation. The sample preparation guidelines include all aspects of sample generation, purification and modification from biological and/or synthetic carbohydrate material. The application of MIRAGE sample preparation guidelines will lead to improved recording of experimental protocols and reporting of understandable and reproducible glycomics datasets. © The Author 2016. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Comparison of the efficiency between two sampling plans for aflatoxins analysis in maize
Mallmann, Adriano Olnei; Marchioro, Alexandro; Oliveira, Maurício Schneider; Rauber, Ricardo Hummes; Dilkin, Paulo; Mallmann, Carlos Augusto
2014-01-01
Variance and performance of two sampling plans for aflatoxins quantification in maize were evaluated. Eight lots of maize were sampled using two plans: manual, using sampling spear for kernels; and automatic, using a continuous flow to collect milled maize. Total variance and sampling, preparation, and analysis variance were determined and compared between plans through multifactor analysis of variance. Four theoretical distribution models were used to compare aflatoxins quantification distributions in eight maize lots. The acceptance and rejection probabilities for a lot under certain aflatoxin concentration were determined using variance and the information on the selected distribution model to build the operational characteristic curves (OC). Sampling and total variance were lower at the automatic plan. The OC curve from the automatic plan reduced both consumer and producer risks in comparison to the manual plan. The automatic plan is more efficient than the manual one because it expresses more accurately the real aflatoxin contamination in maize. PMID:24948911
Poisson denoising on the sphere
NASA Astrophysics Data System (ADS)
Schmitt, J.; Starck, J. L.; Fadili, J.; Grenier, I.; Casandjian, J. M.
2009-08-01
In the scope of the Fermi mission, Poisson noise removal should improve data quality and make source detection easier. This paper presents a method for Poisson data denoising on the sphere, called Multi-Scale Variance Stabilizing Transform on Sphere (MS-VSTS). This method is based on a Variance Stabilizing Transform (VST), a transform which aims to stabilize a Poisson data set such that each stabilized sample has an (asymptotically) constant variance. In addition, for the VST used in the method, the transformed data are asymptotically Gaussian. Thus, MS-VSTS consists of decomposing the data into a sparse multi-scale dictionary (wavelets, curvelets, ridgelets...), and then applying a VST to the coefficients in order to obtain quasi-Gaussian stabilized coefficients. In the present article, the multi-scale transform used is the Isotropic Undecimated Wavelet Transform. Then, hypothesis tests are made to detect significant coefficients, and the denoised image is reconstructed with an iterative method based on Hybrid Steepest Descent (HST). The method is tested on simulated Fermi data.
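As a conceptual stand-in for what "variance stabilization" means here, the classical single-scale Anscombe transform 2·sqrt(x + 3/8) maps Poisson counts to values whose variance is approximately 1 regardless of the count level. MS-VSTS uses its own multiscale VST, so the sketch below is only an illustration of the stabilization property, not the paper's method.

```python
# Illustration of variance stabilization for Poisson data using the classical
# Anscombe transform 2*sqrt(x + 3/8); MS-VSTS uses a different, multiscale VST.
import numpy as np

rng = np.random.default_rng(0)
for lam in [2, 10, 50, 200]:
    x = rng.poisson(lam, size=100_000)
    z = 2.0 * np.sqrt(x + 3.0 / 8.0)
    print(f"lambda={lam:4d}  var(x)={x.var():7.1f}  var(z)={z.var():5.2f}")
# var(z) approaches 1 for moderate-to-large lambda, i.e. (asymptotically) constant.
```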
Xu, Henglong; Yong, Jiang; Xu, Guangjian
2015-12-30
Sampling frequency is important to obtain sufficient information for temporal research of microfauna. To determine an optimal strategy for exploring the seasonal variation in ciliated protozoa, a dataset from the Yellow Sea, northern China was studied. Samples were collected with 24 (biweekly), 12 (monthly), 8 (bimonthly per season) and 4 (seasonally) sampling events. Compared to the 24 samplings (100%), the 12-, 8- and 4-samplings recovered 94%, 94%, and 78% of the total species, respectively. To reveal the seasonal distribution, the 8-sampling regime may result in >75% information of the seasonal variance, while the traditional 4-sampling may only explain <65% of the total variance. With the increase of the sampling frequency, the biotic data showed stronger correlations with seasonal variables (e.g., temperature, salinity) in combination with nutrients. It is suggested that the 8-sampling events per year may be an optimal sampling strategy for ciliated protozoan seasonal research in marine ecosystems. Copyright © 2015 Elsevier Ltd. All rights reserved.
Re-estimating sample size in cluster randomised trials with active recruitment within clusters.
van Schie, S; Moerbeek, M
2014-08-30
Often only a limited number of clusters can be obtained in cluster randomised trials, although many potential participants can be recruited within each cluster. Thus, active recruitment is feasible within the clusters. To obtain an efficient sample size in a cluster randomised trial, the cluster level and individual level variance should be known before the study starts, but this is often not the case. We suggest using an internal pilot study design to address this problem of unknown variances. A pilot can be useful to re-estimate the variances and re-calculate the sample size during the trial. Using simulated data, it is shown that an initially low or high power can be adjusted using an internal pilot with the type I error rate remaining within an acceptable range. The intracluster correlation coefficient can be re-estimated with more precision, which has a positive effect on the sample size. We conclude that an internal pilot study design may be used if active recruitment is feasible within a limited number of clusters. Copyright © 2014 John Wiley & Sons, Ltd.
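The re-estimation idea can be sketched with the standard cluster-trial formulas: the simple-random-sample size is inflated by the design effect 1 + (m − 1)·ICC, and both the outcome variance and the ICC are updated from the internal pilot. The sketch below uses textbook formulas and hypothetical planning values, not the paper's simulation code.

```python
# Sketch (standard formulas, hypothetical numbers): re-compute the required
# cluster-trial sample size once an internal pilot yields updated estimates of
# the outcome variance and the intracluster correlation coefficient (ICC).
from scipy import stats

def n_per_arm(delta, sigma2, icc, m, alpha=0.05, power=0.8):
    """Individuals per arm for a two-arm cluster trial with cluster size m."""
    z = stats.norm.ppf(1 - alpha / 2) + stats.norm.ppf(power)
    n_srs = 2 * sigma2 * z ** 2 / delta ** 2     # simple-random-sample size
    deff = 1 + (m - 1) * icc                     # design effect
    return n_srs * deff

print("planned     :", round(n_per_arm(delta=0.5, sigma2=1.0, icc=0.02, m=20)))
print("re-estimated:", round(n_per_arm(delta=0.5, sigma2=1.4, icc=0.05, m=20)))
```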
A Sparse Matrix Approach for Simultaneous Quantification of Nystagmus and Saccade
NASA Technical Reports Server (NTRS)
Kukreja, Sunil L.; Stone, Lee; Boyle, Richard D.
2012-01-01
The vestibulo-ocular reflex (VOR) consists of two intermingled non-linear subsystems; namely, nystagmus and saccade. Typically, nystagmus is analysed using a single sufficiently long signal or a concatenation of them. Saccade information is not analysed and discarded due to insufficient data length to provide consistent and minimum variance estimates. This paper presents a novel sparse matrix approach to system identification of the VOR. It allows for the simultaneous estimation of both nystagmus and saccade signals. We show via simulation of the VOR that our technique provides consistent and unbiased estimates in the presence of output additive noise.
Handling nonnormality and variance heterogeneity for quantitative sublethal toxicity tests.
Ritz, Christian; Van der Vliet, Leana
2009-09-01
The advantages of using regression-based techniques to derive endpoints from environmental toxicity data are clear, and slowly, this superior analytical technique is gaining acceptance. As use of regression-based analysis becomes more widespread, some of the associated nuances and potential problems come into sharper focus. Looking at data sets that cover a broad spectrum of standard test species, we noticed that some model fits to data failed to meet two key assumptions, variance homogeneity and normality, that are necessary for correct statistical analysis via regression-based techniques. Failure to meet these assumptions often is caused by reduced variance at the concentrations showing severe adverse effects. Although commonly used with linear regression analysis, transformation of the response variable alone is not appropriate when fitting data using nonlinear regression techniques. Through analysis of sample data sets, including Lemna minor, Eisenia andrei (terrestrial earthworm), and algae, we show that both the so-called Box-Cox transformation and use of the Poisson distribution can help to correct variance heterogeneity and nonnormality and so allow nonlinear regression analysis to be implemented. Both the Box-Cox transformation and the Poisson distribution can be readily implemented into existing protocols for statistical analysis. By correcting for nonnormality and variance heterogeneity, these two statistical tools can be used to encourage the transition to regression-based analysis and the depreciation of less-desirable and less-flexible analytical techniques, such as linear interpolation.
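A small Python sketch of the Box-Cox step on hypothetical dose-response data whose spread shrinks at high doses (mimicking reduced variance at severely affected concentrations); it uses scipy's maximum-likelihood Box-Cox and is not the authors' protocol.

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Hypothetical concentration-response data with variance heterogeneity:
# responses are positive and their spread shrinks as the effect becomes severe.
dose = np.repeat([0.0, 1.0, 2.0, 4.0, 8.0], 20)
response = np.exp(3.0 - 0.4 * dose + rng.normal(scale=0.3, size=dose.size))

transformed, lam = stats.boxcox(response)   # lambda chosen by maximum likelihood
print(f"estimated Box-Cox lambda = {lam:.2f}")
print("group variances before:", [round(float(np.var(response[dose == d])), 2) for d in np.unique(dose)])
print("group variances after :", [round(float(np.var(transformed[dose == d])), 3) for d in np.unique(dose)])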
The Effect of Cluster Sampling Design in Survey Research on the Standard Error Statistic.
ERIC Educational Resources Information Center
Wang, Lin; Fan, Xitao
Standard statistical methods are used to analyze data that is assumed to be collected using a simple random sampling scheme. These methods, however, tend to underestimate variance when the data is collected with a cluster design, which is often found in educational survey research. The purposes of this paper are to demonstrate how a cluster design…
ERIC Educational Resources Information Center
Dykiert, Dominika; Gale, Catharine R.; Deary, Ian J.
2009-01-01
This study investigated the possibility that apparent sex differences in IQ are at least partly created by the degree of sample restriction from the baseline population. We used a nationally representative sample, the 1970 British Cohort Study. Sample sizes varied from 6518 to 11,389 between data-collection sweeps. Principal components analysis of…
NASA Astrophysics Data System (ADS)
Cortesi, Nicola; Peña-Angulo, Dhais; Simolo, Claudia; Stepanek, Peter; Brunetti, Michele; Gonzalez-Hidalgo, José Carlos
2014-05-01
One of the key points in the development of the MOTEDAS dataset (see Poster 1 MOTEDAS) in the framework of the HIDROCAES Project (Impactos Hidrológicos del Calentamiento Global en España, Spanish Ministry of Research CGL2011-27574-C02-01) is the set of reference series, for which no generalized metadata exist. In this poster we present an analysis of the spatial variability of monthly minimum and maximum temperatures in the conterminous land of Spain (Iberian Peninsula, IP), using the Correlation Decay Distance function (CDD), with the aim of evaluating, at sub-regional level, the optimal threshold distance between neighbouring stations for producing the set of reference series used in the quality control (see MOTEDAS Poster 1) and the reconstruction (see MOREDAS Poster 3). The CDD analysis for Tmax and Tmin was performed by calculating a monthly-scale correlation matrix for 1981-2010 among the monthly mean values of the maximum (Tmax) and minimum (Tmin) temperature series (with at least 90% of data), free of anomalous data and homogenized (see MOTEDAS Poster 1), obtained from the AEMET archives (National Spanish Meteorological Agency). Monthly anomalies (differences between data and the 1981-2010 mean) were used to prevent the dominant effect of the annual cycle in the annual CDD estimation. For each station and time scale, the common variance r² (the square of Pearson's correlation coefficient) was calculated with all neighbouring temperature series, and the relation between r² and distance was modelled according to equation (1): log(r²ij) = b · dij (1), where log(r²ij) is the logarithm of the common variance between the target (i) and neighbouring (j) series, dij is the distance between them, and b is the slope of the ordinary least-squares linear regression model, fitted using only the surrounding stations within a starting radius of 50 km and with a minimum of 5 stations required. Finally, monthly, seasonal and annual CDD values were interpolated using ordinary kriging with a spherical variogram over the conterminous land of Spain and converted onto a regular 10 km² grid (a resolution similar to the mean distance between stations) to map the results. In the conterminous land of Spain, the distance at which pairs of stations share a common variance in temperature (both maximum, Tmax, and minimum, Tmin) above the selected threshold (50%, Pearson r ~0.70) on average does not exceed 400 km, with relevant spatial and temporal differences. The spatial distribution of the CDD shows a clear coastland-to-inland gradient at annual, seasonal and monthly scales, with the highest spatial variability along coastland areas and lower variability inland. The highest spatial variability coincides particularly with coastland areas surrounded by mountain chains, suggesting that orography is one of the main factors driving higher interstation variability. Moreover, there are some differences between the behaviour of Tmax and Tmin: Tmin is spatially more homogeneous than Tmax, but its lower CDD values indicate that night-time temperature is more variable than daytime temperature.
The results suggest that, in general, local factors affect the spatial variability of monthly Tmin more than that of Tmax, so a higher network density would be necessary to capture the higher spatial variability of minimum temperature compared with maximum temperature. A conservative distance for the reference series can be estimated at 200 km, which we propose for the conterminous land of Spain and use in the development of MOTEDAS.
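A minimal sketch of the fit in equation (1) for a single target station, assuming the pairwise common variances and distances are already available; the 50 km radius and 5-station minimum follow the description above, while the function names and the example numbers are hypothetical.

import numpy as np

def cdd_slope(r2, dist, max_radius=50.0, min_stations=5):
    # Ordinary least-squares fit of log(r2_ij) = b * d_ij through the origin,
    # using only neighbours within max_radius kilometres.
    mask = (dist > 0) & (dist <= max_radius)
    if mask.sum() < min_stations:
        return np.nan
    d, y = dist[mask], np.log(r2[mask])
    return float(np.sum(d * y) / np.sum(d * d))

def cdd_distance(b, threshold=0.5):
    # Distance at which the modelled common variance decays to the threshold (e.g. r2 = 0.5).
    return np.log(threshold) / b

dist = np.array([8.0, 15.0, 22.0, 31.0, 44.0])   # distances to neighbours (km), hypothetical
r2 = np.array([0.95, 0.90, 0.82, 0.74, 0.60])    # common variances with those neighbours
b = cdd_slope(r2, dist)
print(f"b = {b:.4f} per km, CDD at 50% common variance ≈ {cdd_distance(b):.0f} km")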
Convenience samples and caregiving research: how generalizable are the findings?
Pruchno, Rachel A; Brill, Jonathan E; Shands, Yvonne; Gordon, Judith R; Genderson, Maureen Wilson; Rose, Miriam; Cartwright, Francine
2008-12-01
We contrast characteristics of respondents recruited using convenience strategies with those of respondents recruited by random digit dial (RDD) methods. We compare sample variances, means, and interrelationships among variables generated from the convenience and RDD samples. Women aged 50 to 64 who work full time and provide care to a community-dwelling older person were recruited using either RDD (N = 55) or convenience methods (N = 87). Telephone interviews were conducted using reliable, valid measures of demographics, characteristics of the care recipient, help provided to the care recipient, evaluations of caregiver-care recipient relationship, and outcomes common to caregiving research. Convenience and RDD samples had similar variances on 68.4% of the examined variables. We found significant mean differences for 63% of the variables examined. Bivariate correlations suggest that one would reach different conclusions using the convenience and RDD sample data sets. Researchers should use convenience samples cautiously, as they may have limited generalizability.
Vista, Alvin; Care, Esther
2011-06-01
Research on gender differences in intelligence has focused mostly on samples from Western countries and empirical evidence on gender differences from Southeast Asia is relatively sparse. This article presents results on gender differences in variance and means on a non-verbal intelligence test using a national sample of public school students from the Philippines. More than 2,700 sixth graders from public schools across the country were tested with the Naglieri Non-verbal Ability Test (NNAT). Variance ratios (VRs) and log-transformed VRs were computed. Proportion ratios for each of the ability levels were also calculated and a chi-square goodness-of-fit test was performed. An analysis of variance was performed to determine the overall gender difference in mean scores as well as within each of three age subgroups. Our data show non-existent or trivial gender difference in mean scores. However, the tails of the distributions show differences between the males and females, with greater variability among males in the upper half of the distribution and greater variability among females in the lower half of the distribution. Descriptions of the results and their implications are discussed. Results on mean score differences support the hypothesis that there are no significant gender differences in cognitive ability. The unusual results regarding differences in variance and the male-female proportion in the tails require more complex investigations. ©2010 The British Psychological Society.
Gini estimation under infinite variance
NASA Astrophysics Data System (ADS)
Fontanari, Andrea; Taleb, Nassim Nicholas; Cirillo, Pasquale
2018-07-01
We study the problems related to the estimation of the Gini index in presence of a fat-tailed data generating process, i.e. one in the stable distribution class with finite mean but infinite variance (i.e. with tail index α ∈(1 , 2)). We show that, in such a case, the Gini coefficient cannot be reliably estimated using conventional nonparametric methods, because of a downward bias that emerges under fat tails. This has important implications for the ongoing discussion about economic inequality. We start by discussing how the nonparametric estimator of the Gini index undergoes a phase transition in the symmetry structure of its asymptotic distribution, as the data distribution shifts from the domain of attraction of a light-tailed distribution to that of a fat-tailed one, especially in the case of infinite variance. We also show how the nonparametric Gini bias increases with lower values of α. We then prove that maximum likelihood estimation outperforms nonparametric methods, requiring a much smaller sample size to reach efficiency. Finally, for fat-tailed data, we provide a simple correction mechanism to the small sample bias of the nonparametric estimator based on the distance between the mode and the mean of its asymptotic distribution.
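The downward bias described above can be reproduced with a few lines of Python: the plain nonparametric Gini estimator is applied to Pareto samples with tail index alpha = 1.5 (finite mean, infinite variance), for which the true Gini equals 1/(2*alpha - 1); the sample sizes and seed are arbitrary.

import numpy as np

def gini_nonparametric(x):
    # Standard nonparametric Gini estimator based on order statistics.
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    weights = 2 * np.arange(1, n + 1) - n - 1
    return weights @ x / (n * x.sum())

rng = np.random.default_rng(0)
alpha = 1.5                        # tail index in (1, 2): finite mean, infinite variance
true_gini = 1 / (2 * alpha - 1)    # Gini of a Pareto(alpha) distribution with unit scale

estimates = [gini_nonparametric(rng.pareto(alpha, size=1_000) + 1.0) for _ in range(500)]
print(f"true Gini = {true_gini:.3f}, mean nonparametric estimate = {np.mean(estimates):.3f}")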
Estimating means and variances: The comparative efficiency of composite and grab samples.
Brumelle, S; Nemetz, P; Casey, D
1984-03-01
This paper compares the efficiencies of two sampling techniques for estimating a population mean and variance. One procedure, called grab sampling, consists of collecting and analyzing one sample per period. The second procedure, called composite sampling, collects n samples per period which are then pooled and analyzed as a single sample. We review the well known fact that composite sampling provides a superior estimate of the mean. However, it is somewhat surprising that composite sampling does not always generate a more efficient estimate of the variance. For populations with platykurtic distributions, grab sampling gives a more efficient estimate of the variance, whereas composite sampling is better for leptokurtic distributions. These conditions on kurtosis can be related to peakedness and skewness. For example, a necessary condition for composite sampling to provide a more efficient estimate of the variance is that the population density function evaluated at the mean (i.e. f(μ)) be greater than [Formula: see text]. If [Formula: see text], then a grab sample is more efficient. In spite of this result, however, composite sampling does provide a smaller estimate of standard error than does grab sampling in the context of estimating population means.
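The contrast between the two designs can be checked with a small Monte Carlo sketch under simple i.i.d. assumptions: composite values are treated as the mean of n pooled samples and rescaled by n to estimate the population variance. The numbers of samples, periods and replicates are arbitrary, and this is not the paper's analytical derivation.

import numpy as np

rng = np.random.default_rng(1)
n, periods, reps = 4, 50, 2000       # hypothetical: 4 samples pooled per period, 50 periods

def efficiency(draw):
    grab_vars, comp_vars = [], []
    for _ in range(reps):
        x = draw((periods, n))
        grab_vars.append(x[:, 0].var(ddof=1))            # one analysed sample per period
        comp_vars.append(n * x.mean(axis=1).var(ddof=1)) # pooled sample, rescaled by n
    # Monte Carlo variance of each variance estimator (smaller = more efficient)
    return np.var(grab_vars), np.var(comp_vars)

lepto = efficiency(lambda s: rng.laplace(size=s))        # leptokurtic population
platy = efficiency(lambda s: rng.uniform(-1, 1, size=s)) # platykurtic population
print("variance of the variance estimate (grab, composite):")
print(f"  leptokurtic : {lepto[0]:.4f}, {lepto[1]:.4f}")   # composite expected to be better
print(f"  platykurtic : {platy[0]:.4f}, {platy[1]:.4f}")   # grab expected to be better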
Data mining on long-term barometric data within the ARISE2 project
NASA Astrophysics Data System (ADS)
Hupe, Patrick; Ceranna, Lars; Pilger, Christoph
2016-04-01
The Comprehensive Nuclear-Test-Ban Treaty (CTBT) led to the implementation of an international infrasound array network. The International Monitoring System (IMS) network includes 48 certified stations, each providing data for up to 15 years. As part of work package 3 of the ARISE2 project (Atmospheric dynamics Research InfraStructure in Europe, phase 2) the data sets will be statistically evaluated with regard to atmospheric dynamics. The current study focusses on fluctuations of absolute air pressure. Time series have been analysed for 17 monitoring stations which are located all over the world between Greenland and Antarctica along the latitudes to represent different climate zones and characteristic atmospheric conditions. Hence this enables quantitative comparisons between those regions. Analyses are shown including wavelet power spectra, multi-annual time series of average variances with regard to long-wave scales, and spectral densities to derive characteristics and special events. Evaluations reveal periodicities in average variances on the 2 to 20 day scale with a maximum in the winter months and a minimum in summer of the respective hemisphere. This basically applies to time series of IMS stations beyond the tropics where the dominance of cyclones and anticyclones changes with seasons. Furthermore, spectral density analyses illustrate striking signals for several dynamic activities within one day, e.g., the semidiurnal tide.
Electron Pitch-Angle Distribution in Pressure Balance Structures Measured by Ulysses/SWOOPS
NASA Technical Reports Server (NTRS)
Yamauchi, Yohei; Suess, Steven T.; Sakurai, Takashi; Six, N. Frank (Technical Monitor)
2002-01-01
Pressure balance structures (PBSs) are a common feature in the high-latitude solar wind near solar minimum. From previous studies, PBSs are believed to be remnants of coronal plumes. Yamauchi et al [2002] investigated the magnetic structures of the PBSs, applying a minimum variance analysis to Ulysses/Magnetometer data. They found that PBSs contain structures like current sheets or plasmoids, and suggested that PBSs are associated with network activity such as magnetic reconnection in the photosphere at the base of polar plumes. We have investigated energetic electron data from Ulysses/SWOOPS to see whether bi-directional electron flow exists, and we have found evidence supporting the earlier conclusions. We find that 45 out of 53 PBSs show local bi-directional or isotropic electron flux or flux associated with current-sheet structure. Only five events show the pitch-angle distribution expected for Alfvenic fluctuations. We conclude that PBSs do contain magnetic structures such as current sheets or plasmoids that are expected as a result of network activity at the base of polar plumes.
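For readers unfamiliar with the minimum variance analysis mentioned above, the following Python sketch shows the standard eigen-decomposition of the magnetic covariance matrix on synthetic current-sheet-like data; the field model and noise level are hypothetical, and the snippet does not reproduce the Ulysses analysis.

import numpy as np

def minimum_variance_analysis(B):
    # Eigen-decompose the covariance of a magnetic field time series B (N x 3).
    # The eigenvector belonging to the smallest eigenvalue approximates the boundary normal.
    M = np.cov(np.asarray(B, dtype=float), rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(M)        # eigenvalues in ascending order
    return eigvals, eigvecs, eigvecs[:, 0]

rng = np.random.default_rng(2)
t = np.linspace(-1, 1, 400)
B = np.column_stack([np.tanh(t / 0.2),          # rotating component across the sheet
                     0.5 * np.ones_like(t),     # intermediate-variance component
                     0.1 * np.ones_like(t)])    # nearly constant along the normal
B += 0.05 * rng.standard_normal(B.shape)

eigvals, eigvecs, normal = minimum_variance_analysis(B)
print("eigenvalues (min to max):", np.round(eigvals, 4))
print("estimated normal direction:", np.round(normal, 3))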
Normative morphometric data for cerebral cortical areas over the lifetime of the adult human brain.
Potvin, Olivier; Dieumegarde, Louis; Duchesne, Simon
2017-08-01
Proper normative data of anatomical measurements of cortical regions, allowing quantification of brain abnormalities, are lacking. We developed norms for regional cortical surface areas, thicknesses, and volumes based on cross-sectional MRI scans from 2713 healthy individuals aged 18 to 94 years using 23 samples provided by 21 independent research groups. The segmentation was conducted using FreeSurfer, a widely used and freely available automated segmentation software. Models predicting regional cortical estimates of each hemisphere were produced using age, sex, estimated total intracranial volume (eTIV), scanner manufacturer, magnetic field strength, and interactions as predictors. The explained variance for the left/right cortex was 76%/76% for surface area, 43%/42% for thickness, and 80%/80% for volume. The mean explained variance for all regions was 41% for surface areas, 27% for thicknesses, and 46% for volumes. Age, sex and eTIV predicted most of the explained variance for surface areas and volumes while age was the main predictor for thicknesses. Scanner characteristics generally predicted a limited amount of variance, but this effect was stronger for thicknesses than surface areas and volumes. For new individuals, estimates of their expected surface area, thickness and volume based on their characteristics and the scanner characteristics can be obtained using the derived formulas, as well as Z score effect sizes denoting the extent of the deviation from the normative sample. Models predicting normative values were validated in independent samples of healthy adults, showing satisfactory validation R². Deviations from the normative sample were measured in individuals with mild Alzheimer's disease and schizophrenia and expected patterns of deviations were observed. Crown Copyright © 2017. Published by Elsevier Inc. All rights reserved.
A new statistic to express the uncertainty of kriging predictions for purposes of survey planning.
NASA Astrophysics Data System (ADS)
Lark, R. M.; Lapworth, D. J.
2014-05-01
It is well-known that one advantage of kriging for spatial prediction is that, given the random effects model, the prediction error variance can be computed a priori for alternative sampling designs. This allows one to compare sampling schemes, in particular sampling at different densities, and so to decide on one which meets requirements in terms of the uncertainty of the resulting predictions. However, the planning of sampling schemes must account not only for statistical considerations, but also logistics and cost. This requires effective communication between statisticians, soil scientists and data users/sponsors such as managers, regulators or civil servants. In our experience the latter parties are not necessarily able to interpret the prediction error variance as a measure of uncertainty for decision making. In some contexts (particularly the solution of very specific problems at large cartographic scales, e.g. site remediation and precision farming) it is possible to translate uncertainty of predictions into a loss function directly comparable with the cost incurred in increasing precision. Often, however, sampling must be planned for more generic purposes (e.g. baseline or exploratory geochemical surveys). In this latter context the prediction error variance may be of limited value to a non-statistician who has to make a decision on sample intensity and associated cost. We propose an alternative criterion for these circumstances to aid communication between statisticians and data users about the uncertainty of geostatistical surveys based on different sampling intensities. The criterion is the consistency of estimates made from two non-coincident instantiations of a proposed sample design. We consider square sample grids; one instantiation is offset from the second by half the grid spacing along the rows and along the columns. If a sample grid is coarse relative to the important scales of variation in the target property then the consistency of predictions from two instantiations is expected to be small, and can be increased by reducing the grid spacing. The measure of consistency is the correlation between estimates from the two instantiations of the sample grid, averaged over a grid cell. We call this the offset correlation; it can be calculated from the variogram. We propose that this measure is easier to grasp intuitively than the prediction error variance, and has the advantage of having an upper bound (1.0) which will aid its interpretation. This quality measure is illustrated for some hypothetical examples, considering both ordinary kriging and factorial kriging of the variable of interest. It is also illustrated using data on metal concentrations in the soil of north-east England.
The relationship between observational scale and explained variance in benthic communities
Flood, Roger D.; Frisk, Michael G.; Garza, Corey D.; Lopez, Glenn R.; Maher, Nicole P.
2018-01-01
This study addresses the impact of spatial scale on explaining variance in benthic communities. In particular, the analysis estimated the fraction of community variation that occurred at a spatial scale smaller than the sampling interval (i.e., the geographic distance between samples). This estimate is important because it sets a limit on the amount of community variation that can be explained based on the spatial configuration of a study area and sampling design. Six benthic data sets were examined that consisted of faunal abundances, common environmental variables (water depth, grain size, and surficial percent cover), and sonar backscatter treated as a habitat proxy (categorical acoustic provinces). Redundancy analysis was coupled with spatial variograms generated by multiscale ordination to quantify the explained and residual variance at different spatial scales and within and between acoustic provinces. The amount of community variation below the sampling interval of the surveys (< 100 m) was estimated to be 36–59% of the total. Once adjusted for this small-scale variation, > 71% of the remaining variance was explained by the environmental and province variables. Furthermore, these variables effectively explained the spatial structure present in the infaunal community. Overall, no scale problems remained to compromise inferences, and unexplained infaunal community variation had no apparent spatial structure within the observational scale of the surveys (> 100 m), although small-scale gradients (< 100 m) below the observational scale may be present. PMID:29324746
Johnson, Jacqueline L; Kreidler, Sarah M; Catellier, Diane J; Murray, David M; Muller, Keith E; Glueck, Deborah H
2015-11-30
We used theoretical and simulation-based approaches to study Type I error rates for one-stage and two-stage analytic methods for cluster-randomized designs. The one-stage approach uses the observed data as outcomes and accounts for within-cluster correlation using a general linear mixed model. The two-stage model uses the cluster specific means as the outcomes in a general linear univariate model. We demonstrate analytically that both one-stage and two-stage models achieve exact Type I error rates when cluster sizes are equal. With unbalanced data, an exact size α test does not exist, and Type I error inflation may occur. Via simulation, we compare the Type I error rates for four one-stage and six two-stage hypothesis testing approaches for unbalanced data. With unbalanced data, the two-stage model, weighted by the inverse of the estimated theoretical variance of the cluster means, and with variance constrained to be positive, provided the best Type I error control for studies having at least six clusters per arm. The one-stage model with Kenward-Roger degrees of freedom and unconstrained variance performed well for studies having at least 14 clusters per arm. The popular analytic method of using a one-stage model with denominator degrees of freedom appropriate for balanced data performed poorly for small sample sizes and low intracluster correlation. Because small sample sizes and low intracluster correlation are common features of cluster-randomized trials, the Kenward-Roger method is the preferred one-stage approach. Copyright © 2015 John Wiley & Sons, Ltd.
Gray, Brian R.; Gitzen, Robert A.; Millspaugh, Joshua J.; Cooper, Andrew B.; Licht, Daniel S.
2012-01-01
Variance components may play multiple roles (cf. Cox and Solomon 2003). First, magnitudes and relative magnitudes of the variances of random factors may have important scientific and management value in their own right. For example, variation in levels of invasive vegetation among and within lakes may suggest causal agents that operate at both spatial scales – a finding that may be important for scientific and management reasons. Second, variance components may also be of interest when they affect precision of means and covariate coefficients. For example, variation in the effect of water depth on the probability of aquatic plant presence in a study of multiple lakes may vary by lake. This variation will affect the precision of the average depth-presence association. Third, variance component estimates may be used when designing studies, including monitoring programs. For example, to estimate the numbers of years and of samples per year required to meet long-term monitoring goals, investigators need estimates of within and among-year variances. Other chapters in this volume (Chapters 7, 8, and 10) as well as extensive external literature outline a framework for applying estimates of variance components to the design of monitoring efforts. For example, a series of papers with an ecological monitoring theme examined the relative importance of multiple sources of variation, including variation in means among sites, years, and site-years, for the purposes of temporal trend detection and estimation (Larsen et al. 2004, and references therein).
Information-Theoretic Assessment of Sampled Imaging Systems
NASA Technical Reports Server (NTRS)
Huck, Friedrich O.; Alter-Gartenberg, Rachel; Park, Stephen K.; Rahman, Zia-ur
1999-01-01
By rigorously extending modern communication theory to the assessment of sampled imaging systems, we develop the formulations that are required to optimize the performance of these systems within the critical constraints of image gathering, data transmission, and image display. The goal of this optimization is to produce images with the best possible visual quality for the wide range of statistical properties of the radiance field of natural scenes that one normally encounters. Extensive computational results are presented to assess the performance of sampled imaging systems in terms of information rate, theoretical minimum data rate, and fidelity. Comparisons of this assessment with perceptual and measurable performance demonstrate that (1) the information rate that a sampled imaging system conveys from the captured radiance field to the observer is closely correlated with the fidelity, sharpness and clarity with which the observed images can be restored and (2) the associated theoretical minimum data rate is closely correlated with the lowest data rate with which the acquired signal can be encoded for efficient transmission.
Shrot, Yoav; Frydman, Lucio
2011-04-01
A topic of active investigation in 2D NMR relates to the minimum number of scans required for acquiring this kind of spectra, particularly when these are dictated by sampling rather than by sensitivity considerations. Reductions in this minimum number of scans have been achieved by departing from the regular sampling used to monitor the indirect domain, and relying instead on non-uniform sampling and iterative reconstruction algorithms. Alternatively, so-called "ultrafast" methods can compress the minimum number of scans involved in 2D NMR all the way to a minimum number of one, by spatially encoding the indirect domain information and subsequently recovering it via oscillating field gradients. Given ultrafast NMR's simultaneous recording of the indirect- and direct-domain data, this experiment couples the spectral constraints of these orthogonal domains - often calling for the use of strong acquisition gradients and large filter widths to fulfill the desired bandwidth and resolution demands along all spectral dimensions. This study discusses a way to alleviate these demands, and thereby enhance the method's performance and applicability, by combining spatial encoding with iterative reconstruction approaches. Examples of these new principles are given based on the compressed-sensed reconstruction of biomolecular 2D HSQC ultrafast NMR data, an approach that we show enables a decrease of the gradient strengths demanded in this type of experiments by up to 80%. Copyright © 2011 Elsevier Inc. All rights reserved.
Estimating the mass variance in neutron multiplicity counting - A comparison of approaches
NASA Astrophysics Data System (ADS)
Dubi, C.; Croft, S.; Favalli, A.; Ocherashvili, A.; Pedersen, B.
2017-12-01
In the standard practice of neutron multiplicity counting, the first three sampled factorial moments of the event triggered neutron count distribution are used to quantify the three main neutron source terms: the spontaneous fissile material effective mass, the relative (α, n) production and the induced fission source responsible for multiplication. This study compares three methods to quantify the statistical uncertainty of the estimated mass: the bootstrap method, propagation of variance through moments, and statistical analysis of cycle data method. Each of the three methods was implemented on a set of four different NMC measurements, held at the JRC-laboratory in Ispra, Italy, sampling four different Pu samples in a standard Plutonium Scrap Multiplicity Counter (PSMC) well counter.
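Of the three methods compared, the bootstrap is the easiest to sketch. The snippet below resamples hypothetical multiplicity counts (drawn here from a Poisson distribution purely for illustration; measured multiplicity distributions are not Poisson) and propagates the resampling into the first three factorial moments; the function names and sample sizes are placeholders.

import numpy as np

def factorial_moments(counts, k_max=3):
    # First k_max factorial moments <n>, <n(n-1)>, <n(n-1)(n-2)>, ...
    counts = np.asarray(counts, dtype=float)
    term = np.ones_like(counts)
    moments = []
    for k in range(k_max):
        term = term * (counts - k)
        moments.append(term.mean())
    return np.array(moments)

def bootstrap_moments(counts, n_boot=2000, seed=0):
    # Nonparametric bootstrap of the sampled factorial moments.
    rng = np.random.default_rng(seed)
    boot = np.array([factorial_moments(rng.choice(counts, size=len(counts), replace=True))
                     for _ in range(n_boot)])
    return boot.mean(axis=0), boot.std(axis=0, ddof=1)

counts = np.random.default_rng(3).poisson(2.0, size=5_000)   # hypothetical counts per gate
means, sds = bootstrap_moments(counts)
print("factorial moments:", np.round(means, 3))
print("bootstrap standard deviations:", np.round(sds, 3))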
Estimating the mass variance in neutron multiplicity counting - A comparison of approaches
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dubi, C.; Croft, S.; Favalli, A.
In the standard practice of neutron multiplicity counting, the first three sampled factorial moments of the event triggered neutron count distribution are used to quantify the three main neutron source terms: the spontaneous fissile material effective mass, the relative (α,n) production and the induced fission source responsible for multiplication. This study compares three methods to quantify the statistical uncertainty of the estimated mass: the bootstrap method, propagation of variance through moments, and statistical analysis of cycle data method. Each of the three methods was implemented on a set of four different NMC measurements, held at the JRC-laboratory in Ispra, Italy, sampling four different Pu samples in a standard Plutonium Scrap Multiplicity Counter (PSMC) well counter.
New trends in gender and mathematics performance: a meta-analysis.
Lindberg, Sara M; Hyde, Janet Shibley; Petersen, Jennifer L; Linn, Marcia C
2010-11-01
In this article, we use meta-analysis to analyze gender differences in recent studies of mathematics performance. First, we meta-analyzed data from 242 studies published between 1990 and 2007, representing the testing of 1,286,350 people. Overall, d = 0.05, indicating no gender difference, and variance ratio = 1.08, indicating nearly equal male and female variances. Second, we analyzed data from large data sets based on probability sampling of U.S. adolescents over the past 20 years: the National Longitudinal Surveys of Youth, the National Education Longitudinal Study of 1988, the Longitudinal Study of American Youth, and the National Assessment of Educational Progress. Effect sizes for the gender difference ranged between -0.15 and +0.22. Variance ratios ranged from 0.88 to 1.34. Taken together, these findings support the view that males and females perform similarly in mathematics.
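The two summary statistics used throughout the meta-analysis are straightforward to compute from study-level summaries; the sketch below uses hypothetical summary values, not figures from the studies analysed.

import numpy as np

def cohens_d(mean_m, mean_f, sd_m, sd_f, n_m, n_f):
    # Standardised mean difference (male minus female) with a pooled standard deviation.
    sd_pooled = np.sqrt(((n_m - 1) * sd_m**2 + (n_f - 1) * sd_f**2) / (n_m + n_f - 2))
    return (mean_m - mean_f) / sd_pooled

def variance_ratio(sd_m, sd_f):
    # Male-to-female variance ratio; 1.0 indicates equal variances.
    return sd_m**2 / sd_f**2

# Hypothetical summary statistics for a single study
print(round(cohens_d(100.4, 100.1, 15.2, 14.6, 5200, 5100), 3))
print(round(variance_ratio(15.2, 14.6), 3))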
Brühl, Albert; Planer, Katarina; Hagel, Anja
2018-01-01
A validity test was conducted to determine how care level-based nurse-to-resident ratios compare with actual daily care times per resident in Germany. Stability across different long-term care facilities was tested. Care level-based nurse-to-resident ratios were compared with the standard minimum nurse-to-resident ratios. Levels of care are determined by classification authorities in long-term care insurance programs and are used to distribute resources. Care levels are a powerful tool for classifying authorities in long-term care insurance. We used observer-based measurement of assignable direct and indirect care time in 68 nursing units for 2028 residents across 2 working days. Organizational data were collected at the end of the quarter in which the observation was made. Data were collected from January to March, 2012. We used a null multilevel model with random intercepts and multilevel models with fixed and random slopes to analyze data at both the organization and resident levels. A total of 14% of the variance in total care time per day was explained by membership in nursing units. The impact of care levels on care time differed significantly between nursing units. Forty percent of residents at the lowest care level received less than the standard minimum registered nursing time per day. For facilities that have been significantly disadvantaged in the current staffing system, a higher minimum standard will function more effectively than a complex classification system without scientific controls.
40 CFR 63.8 - Monitoring requirements.
Code of Federal Regulations, 2011 CFR
2011-07-01
... operation requirements as follows: (i) All COMS shall complete a minimum of one cycle of sampling and analyzing for each successive 10-second period and one cycle of data recording for each successive 6-minute period. (ii) All CEMS for measuring emissions other than opacity shall complete a minimum of one cycle of...
40 CFR 1065.546 - Validation of minimum dilution ratio for PM batch sampling.
Code of Federal Regulations, 2010 CFR
2010-07-01
... flows and/or tracer gas concentrations for transient and ramped modal cycles to validate the minimum... mode-average values instead of continuous measurements for discrete mode steady-state duty cycles... molar flow data. This involves determination of at least two of the following three quantities: Raw...
Range and azimuth resolution enhancement for 94 GHz real-beam radar
NASA Astrophysics Data System (ADS)
Liu, Guoqing; Yang, Ken; Sykora, Brian; Salha, Imad
2008-04-01
In this paper, two-dimensional (2D) (range and azimuth) resolution enhancement is investigated for millimeter wave (mmW) real-beam radar (RBR) with linear or non-linear antenna scan in the azimuth dimension. We design a new architecture of super resolution processing, in which a dual-mode approach is used for defining region of interest for 2D resolution enhancement and a combined approach is deployed for obtaining accurate location and amplitude estimations of targets within the region of interest. To achieve 2D resolution enhancement, we first adopt the Capon Beamformer (CB) approach (also known as the minimum variance method (MVM)) to enhance range resolution. A generalized CB (GCB) approach is then applied to azimuth dimension for azimuth resolution enhancement. The GCB approach does not rely on whether the azimuth sampling is even or not and thus can be used in both linear and non-linear antenna scanning modes. The effectiveness of the resolution enhancement is demonstrated by using both simulation and test data. The results of using a 94 GHz real-beam frequency modulation continuous wave (FMCW) radar data show that the overall image quality is significantly improved per visual evaluation and comparison with respect to the original real-beam radar image.
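A compact Python sketch of the Capon (minimum variance) spectrum on a synthetic uniform linear array illustrates the beamforming idea referenced above; the array geometry, source angles, noise level and diagonal loading are all hypothetical, and nothing here reproduces the 94 GHz radar processing chain.

import numpy as np

def capon_spectrum(X, A):
    # X: sensors x snapshots; A: sensors x candidate angles (steering vectors).
    R = X @ X.conj().T / X.shape[1]                                      # sample covariance
    R += 1e-3 * np.real(np.trace(R)) / R.shape[0] * np.eye(R.shape[0])   # diagonal loading
    Rinv = np.linalg.inv(R)
    return 1.0 / np.real(np.einsum('sa,st,ta->a', A.conj(), Rinv, A))

n_sensors, n_snap = 16, 200
rng = np.random.default_rng(4)

def steer(theta):   # half-wavelength uniform linear array
    return np.exp(1j * np.pi * np.arange(n_sensors)[:, None]
                  * np.sin(np.atleast_1d(theta))[None, :])

src = np.deg2rad([-6.0, 6.0])                                # two nearby sources
S = np.exp(1j * 2 * np.pi * rng.random((2, n_snap)))         # random-phase source signals
N = 0.1 * (rng.standard_normal((n_sensors, n_snap))
           + 1j * rng.standard_normal((n_sensors, n_snap)))
X = steer(src) @ S + N

grid = np.deg2rad(np.linspace(-30, 30, 601))
P = capon_spectrum(X, steer(grid))
peaks = np.where((P[1:-1] > P[:-2]) & (P[1:-1] > P[2:]))[0] + 1
best = peaks[np.argsort(P[peaks])[-2:]]
print("estimated source angles (deg):", np.sort(np.round(np.rad2deg(grid[best]), 1)))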
Effective Analysis of Reaction Time Data
ERIC Educational Resources Information Center
Whelan, Robert
2008-01-01
Most analyses of reaction time (RT) data are conducted by using the statistical techniques with which psychologists are most familiar, such as analysis of variance on the sample mean. Unfortunately, these methods are usually inappropriate for RT data, because they have little power to detect genuine differences in RT between conditions. In…
Robertson, David S; Prevost, A Toby; Bowden, Jack
2016-09-30
Seamless phase II/III clinical trials offer an efficient way to select an experimental treatment and perform confirmatory analysis within a single trial. However, combining the data from both stages in the final analysis can induce bias into the estimates of treatment effects. Methods for bias adjustment developed thus far have made restrictive assumptions about the design and selection rules followed. In order to address these shortcomings, we apply recent methodological advances to derive the uniformly minimum variance conditionally unbiased estimator for two-stage seamless phase II/III trials. Our framework allows for the precision of the treatment arm estimates to take arbitrary values, can be utilised for all treatments that are taken forward to phase III and is applicable when the decision to select or drop treatment arms is driven by a multiplicity-adjusted hypothesis testing procedure. © 2016 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd.
NASA Astrophysics Data System (ADS)
Basu, Nandita B.; Fure, Adrian D.; Jawitz, James W.
2008-07-01
Simulations of nonpartitioning and partitioning tracer tests were used to parameterize the equilibrium stream tube model (ESM) that predicts the dissolution dynamics of dense nonaqueous phase liquids (DNAPLs) as a function of the Lagrangian properties of DNAPL source zones. Lagrangian, or stream-tube-based, approaches characterize source zones with as few as two trajectory-integrated parameters, in contrast to the potentially thousands of parameters required to describe the point-by-point variability in permeability and DNAPL in traditional Eulerian modeling approaches. The spill and subsequent dissolution of DNAPLs were simulated in two-dimensional domains having different hydrologic characteristics (variance of the log conductivity field = 0.2, 1, and 3) using the multiphase flow and transport simulator UTCHEM. Nonpartitioning and partitioning tracers were used to characterize the Lagrangian properties (travel time and trajectory-integrated DNAPL content statistics) of DNAPL source zones, which were in turn shown to be sufficient for accurate prediction of source dissolution behavior using the ESM throughout the relatively broad range of hydraulic conductivity variances tested here. The results were found to be relatively insensitive to travel time variability, suggesting that dissolution could be accurately predicted even if the travel time variance was only coarsely estimated. Estimation of the ESM parameters was also demonstrated using an approximate technique based on Eulerian data in the absence of tracer data; however, determining the minimum amount of such data required remains for future work. Finally, the stream tube model was shown to be a more unique predictor of dissolution behavior than approaches based on the ganglia-to-pool model for source zone characterization.
Hossain, Md Golam; Saw, Aik; Alam, Rashidul; Ohtsuki, Fumio; Kamarul, Tunku
2013-09-01
Cephalic index (CI), the ratio of head breadth to head length, is widely used to categorise human populations. The aim of this study was to assess the impact of anthropometric measurements on the CI of male Japanese university students. This study included 1,215 male university students from Tokyo and Kyoto, selected using convenience sampling. Multiple regression analysis was used to determine the effect of anthropometric measurements on CI. The variance inflation factor (VIF) showed no evidence of a multicollinearity problem among independent variables. The coefficients of the regression line demonstrated a significant positive relationship between CI and minimum frontal breadth (p < 0.01), bizygomatic breadth (p < 0.01) and head height (p < 0.05), and a negative relationship between CI and morphological facial height (p < 0.01) and head circumference (p < 0.01). Moreover, the coefficient and odds ratio of logistic regression analysis showed a greater likelihood for minimum frontal breadth (p < 0.01) and bizygomatic breadth (p < 0.01) to predict round-headedness, and morphological facial height (p < 0.05) and head circumference (p < 0.01) to predict long-headedness. Stepwise regression analysis revealed bizygomatic breadth, head circumference, minimum frontal breadth, head height and morphological facial height to be the best predictor craniofacial measurements with respect to CI. The results suggest that most of the variables considered in this study appear to influence the CI of adult male Japanese students.
An analytic technique for statistically modeling random atomic clock errors in estimation
NASA Technical Reports Server (NTRS)
Fell, P. J.
1981-01-01
Minimum variance estimation requires that the statistics of random observation errors be modeled properly. If measurements are derived through the use of atomic frequency standards, then one source of error affecting the observable is random fluctuation in frequency. This is the case, for example, with range and integrated Doppler measurements from satellites of the Global Positioning System, and with baseline determination for geodynamic applications. An analytic method is presented which approximates the statistics of this random process. The procedure starts with a model of the Allan variance for a particular oscillator and develops the statistics of range and integrated Doppler measurements. A series of five first order Markov processes is used to approximate the power spectral density obtained from the Allan variance.
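The starting point of the procedure, the Allan variance, is easy to compute from fractional frequency data; the sketch below uses synthetic white frequency noise (for which the Allan variance should fall off as 1/tau) and does not reproduce the Markov-process approximation itself.

import numpy as np

def allan_variance(y, m):
    # Non-overlapped Allan variance of fractional frequency data y at averaging factor m:
    # one half the mean squared difference of adjacent m-sample averages.
    n = len(y) // m
    y_bar = np.asarray(y[: n * m], dtype=float).reshape(n, m).mean(axis=1)
    return 0.5 * np.mean(np.diff(y_bar) ** 2)

rng = np.random.default_rng(5)
y = 1e-11 * rng.standard_normal(200_000)     # hypothetical white-frequency-noise clock
tau0 = 1.0                                   # basic sampling interval in seconds
for m in (1, 10, 100, 1000):
    print(f"tau = {m * tau0:6.0f} s   Allan deviation = {np.sqrt(allan_variance(y, m)):.2e}")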
SPA- STATISTICAL PACKAGE FOR TIME AND FREQUENCY DOMAIN ANALYSIS
NASA Technical Reports Server (NTRS)
Brownlow, J. D.
1994-01-01
The need for statistical analysis often arises when data is in the form of a time series. This type of data is usually a collection of numerical observations made at specified time intervals. Two kinds of analysis may be performed on the data. First, the time series may be treated as a set of independent observations using a time domain analysis to derive the usual statistical properties including the mean, variance, and distribution form. Secondly, the order and time intervals of the observations may be used in a frequency domain analysis to examine the time series for periodicities. In almost all practical applications, the collected data is actually a mixture of the desired signal and a noise signal which is collected over a finite time period with a finite precision. Therefore, any statistical calculations and analyses are actually estimates. The Spectrum Analysis (SPA) program was developed to perform a wide range of statistical estimation functions. SPA can provide the data analyst with a rigorous tool for performing time and frequency domain studies. In a time domain statistical analysis the SPA program will compute the mean, variance, standard deviation, mean square, and root mean square. It also lists the data maximum, data minimum, and the number of observations included in the sample. In addition, a histogram of the time domain data is generated, a normal curve is fit to the histogram, and a goodness-of-fit test is performed. These time domain calculations may be performed on both raw and filtered data. For a frequency domain statistical analysis the SPA program computes the power spectrum, cross spectrum, coherence, phase angle, amplitude ratio, and transfer function. The estimates of the frequency domain parameters may be smoothed with the use of Hann-Tukey, Hamming, Bartlett, or moving average windows. Various digital filters are available to isolate data frequency components. Frequency components with periods longer than the data collection interval are removed by least-squares detrending. As many as ten channels of data may be analyzed at one time. Both tabular and plotted output may be generated by the SPA program. This program is written in FORTRAN IV and has been implemented on a CDC 6000 series computer with a central memory requirement of approximately 142K (octal) of 60 bit words. This core requirement can be reduced by segmentation of the program. The SPA program was developed in 1978.
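SPA itself is a FORTRAN IV package, but the core of its time and frequency domain estimates can be sketched with modern Python tools; the signal below (a 5 Hz tone in noise) and all parameter choices are illustrative only.

import numpy as np
from scipy import signal

fs = 100.0
t = np.arange(0, 60, 1 / fs)
x = np.sin(2 * np.pi * 5 * t) + np.random.default_rng(6).standard_normal(t.size)

# Time-domain statistics of the kind listed above
print(f"mean={x.mean():.3f} variance={x.var(ddof=1):.3f} "
      f"rms={np.sqrt(np.mean(x**2)):.3f} min={x.min():.3f} max={x.max():.3f} n={x.size}")

# Frequency-domain estimate: power spectral density with a Hamming window (Welch's method)
f, pxx = signal.welch(x, fs=fs, window='hamming', nperseg=1024)
print(f"dominant frequency ≈ {f[np.argmax(pxx)]:.2f} Hz")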
Why you cannot transform your way out of trouble for small counts.
Warton, David I
2018-03-01
While data transformation is a common strategy to satisfy linear modeling assumptions, a theoretical result is used to show that transformation cannot reasonably be expected to stabilize variances for small counts. Under broad assumptions, as counts get smaller, it is shown that the variance becomes proportional to the mean under monotonic transformations g(·) that satisfy g(0)=0, excepting a few pathological cases. A suggested rule-of-thumb is that if many predicted counts are less than one then data transformation cannot reasonably be expected to stabilize variances, even for a well-chosen transformation. This result has clear implications for the analysis of counts as often implemented in the applied sciences, but particularly for multivariate analysis in ecology. Multivariate discrete data are often collected in ecology, typically with a large proportion of zeros, and it is currently widespread to use methods of analysis that do not account for differences in variance across observations nor across responses. Simulations demonstrate that failure to account for the mean-variance relationship can have particularly severe consequences in this context, and also in the univariate context if the sampling design is unbalanced. © 2017 The Authors. Biometrics published by Wiley Periodicals, Inc. on behalf of International Biometric Society.
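The failure of transformations for small counts is easy to see in simulation; the sketch below applies two common transformations satisfying g(0) = 0 to Poisson counts and prints the transformed variances, which are far from constant when the means are small. The means and sample size are arbitrary.

import numpy as np

rng = np.random.default_rng(7)
means = np.array([0.2, 0.5, 1.0, 5.0, 25.0])
counts = rng.poisson(means, size=(200_000, means.size))

for g, name in [(np.sqrt, "sqrt"), (np.log1p, "log1p")]:
    v = g(counts).var(axis=0)
    # A successful variance-stabilizing transform would give near-constant values here.
    print(f"{name:5s} transformed variances:", np.round(v, 3))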
Li, Peng; Redden, David T.
2014-01-01
The sandwich estimator in generalized estimating equations (GEE) approach underestimates the true variance in small samples and consequently results in inflated type I error rates in hypothesis testing. This fact limits the application of the GEE in cluster-randomized trials (CRTs) with few clusters. Under various CRT scenarios with correlated binary outcomes, we evaluate the small sample properties of the GEE Wald tests using bias-corrected sandwich estimators. Our results suggest that the GEE Wald z test should be avoided in the analyses of CRTs with few clusters even when bias-corrected sandwich estimators are used. With t-distribution approximation, the Kauermann and Carroll (KC)-correction can keep the test size to nominal levels even when the number of clusters is as low as 10, and is robust to the moderate variation of the cluster sizes. However, in cases with large variations in cluster sizes, the Fay and Graubard (FG)-correction should be used instead. Furthermore, we derive a formula to calculate the power and minimum total number of clusters one needs using the t test and KC-correction for the CRTs with binary outcomes. The power levels as predicted by the proposed formula agree well with the empirical powers from the simulations. The proposed methods are illustrated using real CRT data. We conclude that with appropriate control of type I error rates under small sample sizes, we recommend the use of GEE approach in CRTs with binary outcomes due to fewer assumptions and robustness to the misspecification of the covariance structure. PMID:25345738
What's in a Day? A Guide to Decomposing the Variance in Intensive Longitudinal Data.
de Haan-Rietdijk, Silvia; Kuppens, Peter; Hamaker, Ellen L
2016-01-01
In recent years there has been a growing interest in the use of intensive longitudinal research designs to study within-person processes. Examples are studies that use experience sampling data and autoregressive modeling to investigate emotion dynamics and between-person differences therein. Such designs often involve multiple measurements per day and multiple days per person, and it is not clear how this nesting of the data should be accounted for: That is, should such data be considered as two-level data (which is common practice at this point), with occasions nested in persons, or as three-level data with beeps nested in days which are nested in persons. We show that a significance test of the day-level variance in an empty three-level model is not reliable when there is autocorrelation. Furthermore, we show that misspecifying the number of levels can lead to spurious or misleading findings, such as inflated variance or autoregression estimates. Throughout the paper we present instructions and R code for the implementation of the proposed models, which includes a novel three-level AR(1) model that estimates moment-to-moment inertia and day-to-day inertia. Based on our simulations we recommend model selection using autoregressive multilevel models in combination with the AIC. We illustrate this method using empirical emotion data from two independent samples, and discuss the implications and the relevance of the existence of a day level for the field.
NASA Technical Reports Server (NTRS)
Kashlinsky, A.
1992-01-01
It is shown here that, by using galaxy catalog correlation data as input, measurements of microwave background radiation (MBR) anisotropies should soon be able to test two of the inflationary scenario's most basic predictions: (1) that the primordial density fluctuations produced were scale-invariant and (2) that the universe is flat. They should also be able to detect anisotropies of large-scale structure formed by gravitational evolution of density fluctuations present at the last scattering epoch. Computations of MBR anisotropies corresponding to the minimum of the large-scale variance of the MBR anisotropy are presented which favor an open universe with P(k) significantly different from the Harrison-Zeldovich spectrum predicted by most inflationary models.
Intermediate energy proton-deuteron elastic scattering
NASA Technical Reports Server (NTRS)
Wilson, J. W.
1973-01-01
A fully symmetrized multiple scattering series is considered for the description of proton-deuteron elastic scattering. An off-shell continuation of the experimentally known two-body amplitudes that retains the exchange symmetries required for the calculation is presented. The one boson exchange terms of the two-body amplitudes are evaluated exactly in this off-shell prescription. The first two terms of the multiple scattering series are calculated explicitly, whereas multiple scattering effects are obtained as minimum variance estimates from the 146-MeV data of Postma and Wilson. The multiple scattering corrections indeed consist of low order partial waves as suggested by Sloan based on model studies with separable interactions. The Hamada-Johnston wave function is shown consistent with the data for internucleon distances greater than about 0.84 fm.
An improved state-parameter analysis of ecosystem models using data assimilation
Chen, M.; Liu, S.; Tieszen, L.L.; Hollinger, D.Y.
2008-01-01
Much of the effort spent in developing data assimilation methods for carbon dynamics analysis has focused on estimating optimal values for either model parameters or state variables. The main weakness of estimating parameter values alone (i.e., without considering state variables) is that all errors from input, output, and model structure are attributed to model parameter uncertainties. On the other hand, the accuracy of estimating state variables may be lowered if the temporal evolution of parameter values is not incorporated. This research develops a smoothed ensemble Kalman filter (SEnKF) by combining an ensemble Kalman filter with a kernel smoothing technique. The SEnKF has the following characteristics: (1) it simultaneously estimates the model states and parameters by concatenating unknown parameters and state variables into a joint state vector; (2) it mitigates dramatic, sudden changes of parameter values in the parameter sampling and evolution process, and controls the narrowing of parameter variance, which results in filter divergence, by adjusting the smoothing factor in the kernel smoothing algorithm; (3) it assimilates data recursively into the model and thus detects possible time variation of parameters; and (4) it properly addresses various sources of uncertainty stemming from input, output and parameter uncertainties. The SEnKF is tested by assimilating observed fluxes of carbon dioxide and environmental driving factor data from an AmeriFlux forest station located near Howland, Maine, USA, into a partition eddy flux model. Our analysis demonstrates that model parameters, such as light use efficiency, respiration coefficients, minimum and optimum temperatures for photosynthetic activity, and others, are highly constrained by eddy flux data at daily-to-seasonal time scales. The SEnKF stabilizes parameter values quickly regardless of the initial values of the parameters. Potential ecosystem light use efficiency demonstrates a strong seasonality. Results show that the simultaneous parameter estimation procedure significantly improves model predictions. Results also show that the SEnKF can dramatically reduce the variance in state variables stemming from the uncertainty of parameters and driving variables. The SEnKF is a robust and effective algorithm in evaluating and developing ecosystem models and in improving the understanding and quantification of carbon cycle parameters and processes. © 2008 Elsevier B.V.
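A toy joint state-parameter ensemble Kalman filter conveys the augmentation idea (though not the kernel-smoothing step that defines the SEnKF): a scalar AR(1) "ecosystem" state with an unknown parameter is filtered from noisy observations. All model and noise settings below are hypothetical.

import numpy as np

rng = np.random.default_rng(8)
a_true, q, r, T, n_ens = 0.8, 0.05, 0.1, 200, 100

# Synthetic truth and noisy observations of a scalar AR(1) state
x, obs = 0.0, []
for _ in range(T):
    x = a_true * x + rng.normal(scale=np.sqrt(q))
    obs.append(x + rng.normal(scale=np.sqrt(r)))

# Joint vector z = (state, parameter); the parameter has no dynamics of its own
Z = np.column_stack([rng.normal(0.0, 1.0, n_ens), rng.uniform(0.0, 1.0, n_ens)])
H = np.array([1.0, 0.0])                                          # only the state is observed
for y in obs:
    Z[:, 0] = Z[:, 1] * Z[:, 0] + rng.normal(scale=np.sqrt(q), size=n_ens)   # forecast step
    P = np.cov(Z, rowvar=False)                                   # ensemble covariance (2 x 2)
    K = P @ H / (H @ P @ H + r)                                   # Kalman gain
    innov = y + rng.normal(scale=np.sqrt(r), size=n_ens) - Z @ H  # perturbed observations
    Z += np.outer(innov, K)                                       # update state and parameter

print(f"true a = {a_true}, estimated a = {Z[:, 1].mean():.3f} ± {Z[:, 1].std(ddof=1):.3f}")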
On the impact of relatedness on SNP association analysis.
Gross, Arnd; Tönjes, Anke; Scholz, Markus
2017-12-06
When testing for SNP (single nucleotide polymorphism) associations in related individuals, observations are not independent. Simple linear regression assuming independent normally distributed residuals results in an increased type I error, and the power of the test is also affected in a more complicated manner. Inflation of type I error is often successfully corrected by genomic control. However, this reduces the power of the test when relatedness is of concern. In the present paper, we derive explicit formulae to investigate how heritability and strength of relatedness contribute to variance inflation of the effect estimate of the linear model. Further, we study the consequences of variance inflation on hypothesis testing and compare the results with those of genomic control correction. We apply the developed theory to the publicly available HapMap trio data (N=129), the Sorbs (a self-contained population with N=977 characterised by a cryptic relatedness structure) and synthetic family studies with different sample sizes (ranging from N=129 to N=999) and different degrees of relatedness. We derive explicit and easy-to-apply approximation formulae to estimate the impact of relatedness on the variance of the effect estimate of the linear regression model. Variance inflation increases with increasing heritability. Relatedness structure also impacts the degree of variance inflation, as shown for the example family structures. Variance inflation is smallest for the HapMap trios, followed by a synthetic family study corresponding to the trio data but with larger sample size than HapMap. The next strongest inflation is observed for the Sorbs, and finally, for a synthetic family study with a more extreme relatedness structure but with similar sample size as the Sorbs. Type I error increases rapidly with increasing inflation. However, for smaller significance levels, power increases with increasing inflation, while the opposite holds for larger significance levels. When genomic control is applied, type I error is preserved while power decreases rapidly with increasing variance inflation. Stronger relatedness as well as higher heritability result in increased variance of the effect estimate of simple linear regression analysis. While type I error rates are generally inflated, the behaviour of power is more complex, since power can be increased or reduced depending on relatedness and the heritability of the phenotype. Genomic control cannot be recommended to deal with inflation due to relatedness. Although it preserves type I error, the loss in power can be considerable. We provide a simple formula for estimating variance inflation given the relatedness structure and the heritability of a trait of interest. As a rule of thumb, variance inflation below 1.05 does not require correction and simple linear regression analysis is still appropriate.
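The following sketch illustrates the kind of variance inflation discussed above under an assumed polygenic covariance V = h2*2K + (1-h2)*I; this generic model and the sib-pair example are assumptions for illustration, not the authors' exact formulae.

```python
import numpy as np

def variance_inflation(x, kinship, h2):
    """Ratio of the true variance of the OLS SNP effect estimate under an
    assumed polygenic covariance V = h2*2K + (1-h2)*I to the naive i.i.d. variance."""
    n = len(x)
    xc = x - x.mean()                              # centred genotype vector
    V = h2 * 2.0 * kinship + (1.0 - h2) * np.eye(n)
    true_var = xc @ V @ xc / (xc @ xc) ** 2        # Var(beta_hat) up to sigma^2
    naive_var = 1.0 / (xc @ xc)                    # i.i.d. assumption, same sigma^2
    return true_var / naive_var

# Example: 200 sib pairs (self-kinship 0.5, sib kinship 0.25, unrelated across pairs)
rng = np.random.default_rng(1)
n_pairs = 200
K = np.kron(np.eye(n_pairs), np.array([[0.5, 0.25], [0.25, 0.5]]))

# Sibs share alleles, so their genotypes are correlated (correlation ~0.5 here)
g_shared = rng.binomial(1, 0.3, n_pairs)
x = np.empty(2 * n_pairs)
x[0::2] = g_shared + rng.binomial(1, 0.3, n_pairs)
x[1::2] = g_shared + rng.binomial(1, 0.3, n_pairs)

print("variance inflation:", variance_inflation(x, K, h2=0.6))   # > 1
```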
Non-stationary internal tides observed with satellite altimetry
NASA Astrophysics Data System (ADS)
Ray, R. D.; Zaron, E. D.
2011-09-01
Temporal variability of the internal tide is inferred from a 17-year combined record of Topex/Poseidon and Jason satellite altimeters. A global sampling of along-track sea-surface height wavenumber spectra finds that non-stationary variance is generally 25% or less of the average variance at wavenumbers characteristic of mode-1 tidal internal waves. With some exceptions the non-stationary variance does not exceed 0.25 cm2. The mode-2 signal, where detectable, contains a larger fraction of non-stationary variance, typically 50% or more. Temporal subsetting of the data reveals interannual variability barely significant compared with tidal estimation error from 3-year records. Comparison of summer vs. winter conditions shows only one region of noteworthy seasonal changes, the northern South China Sea. Implications for the anticipated SWOT altimeter mission are briefly discussed.
Non-Stationary Internal Tides Observed with Satellite Altimetry
NASA Technical Reports Server (NTRS)
Ray, Richard D.; Zaron, E. D.
2011-01-01
Temporal variability of the internal tide is inferred from a 17-year combined record of Topex/Poseidon and Jason satellite altimeters. A global sampling of along-track sea-surface height wavenumber spectra finds that non-stationary variance is generally 25% or less of the average variance at wavenumbers characteristic of mode-1 tidal internal waves. With some exceptions the non-stationary variance does not exceed 0.25 sq cm. The mode-2 signal, where detectable, contains a larger fraction of non-stationary variance, typically 50% or more. Temporal subsetting of the data reveals interannual variability barely significant compared with tidal estimation error from 3-year records. Comparison of summer vs. winter conditions shows only one region of noteworthy seasonal changes, the northern South China Sea. Implications for the anticipated SWOT altimeter mission are briefly discussed.
Measuring the Power Spectrum with Peculiar Velocities
NASA Astrophysics Data System (ADS)
Macaulay, Edward; Feldman, H. A.; Ferreira, P. G.; Jaffe, A. H.; Agarwal, S.; Hudson, M. J.; Watkins, R.
2012-01-01
The peculiar velocities of galaxies are an inherently valuable cosmological probe, providing an unbiased estimate of the distribution of matter on scales much larger than the depth of the survey. Much research interest has been motivated by the high dipole moment of our local peculiar velocity field, which suggests a large scale excess in the matter power spectrum, and can appear to be in some tension with the LCDM model. We use a composite catalogue of 4,537 peculiar velocity measurements with a characteristic depth of 33 h-1 Mpc to estimate the matter power spectrum. We compare the constraints with this method, directly studying the full peculiar velocity catalogue, to results from Macaulay et al. (2011), studying minimum variance moments of the velocity field, as calculated by Watkins, Feldman & Hudson (2009) and Feldman, Watkins & Hudson (2010). We find good agreement with the LCDM model on scales of k > 0.01 h Mpc-1. We find an excess of power on scales of k < 0.01 h Mpc-1, although with a 1 sigma uncertainty which includes the LCDM model. We find that the uncertainty in the excess at these scales is larger than an alternative result studying only moments of the velocity field, which is due to the minimum variance weights used to calculate the moments. At small scales, we are able to clearly discriminate between linear and nonlinear clustering in simulated peculiar velocity catalogues, and find some evidence (although less clear) for linear clustering in the real peculiar velocity data.
Power spectrum estimation from peculiar velocity catalogues
NASA Astrophysics Data System (ADS)
Macaulay, E.; Feldman, H. A.; Ferreira, P. G.; Jaffe, A. H.; Agarwal, S.; Hudson, M. J.; Watkins, R.
2012-09-01
The peculiar velocities of galaxies are an inherently valuable cosmological probe, providing an unbiased estimate of the distribution of matter on scales much larger than the depth of the survey. Much research interest has been motivated by the high dipole moment of our local peculiar velocity field, which suggests a large-scale excess in the matter power spectrum and can appear to be in some tension with the Λ cold dark matter (ΛCDM) model. We use a composite catalogue of 4537 peculiar velocity measurements with a characteristic depth of 33 h-1 Mpc to estimate the matter power spectrum. We compare the constraints with this method, directly studying the full peculiar velocity catalogue, to results by Macaulay et al., studying minimum variance moments of the velocity field, as calculated by Feldman, Watkins & Hudson. We find good agreement with the ΛCDM model on scales of k > 0.01 h Mpc-1. We find an excess of power on scales of k < 0.01 h Mpc-1 with a 1σ uncertainty which includes the ΛCDM model. We find that the uncertainty in excess at these scales is larger than an alternative result studying only moments of the velocity field, which is due to the minimum variance weights used to calculate the moments. At small scales, we are able to clearly discriminate between linear and non-linear clustering in simulated peculiar velocity catalogues and find some evidence (although less clear) for linear clustering in the real peculiar velocity data.
... to exercise or physical activity. Data source and methods National Health Interview Survey (NHIS) data are collected ... sample design of NHIS. The Taylor series linearization method was used for variance estimation. All estimates shown ...
... were considered to be uninsured. Data source and methods NHIS data were used to estimate the percentage ... sample design of NHIS. The Taylor series linearization method was chosen for variance estimation. Differences between percentages ...
... of center); or g) other.” Data sources and methods This analysis used data from the 2010–2012 ... sample design of NHIS. The Taylor series linearization method was chosen for variance estimation. All estimates shown ...
Source-space ICA for MEG source imaging.
Jonmohamadi, Yaqub; Jones, Richard D
2016-02-01
One of the most widely used approaches in electroencephalography/magnetoencephalography (MEG) source imaging is application of an inverse technique (such as dipole modelling or sLORETA) on the component extracted by independent component analysis (ICA) (sensor-space ICA + inverse technique). The advantage of this approach over an inverse technique alone is that it can identify and localize multiple concurrent sources. Among inverse techniques, the minimum-variance beamformers offer a high spatial resolution. However, in order to have both high spatial resolution of beamformer and be able to take on multiple concurrent sources, sensor-space ICA + beamformer is not an ideal combination. We propose source-space ICA for MEG as a powerful alternative approach which can provide the high spatial resolution of the beamformer and handle multiple concurrent sources. The concept of source-space ICA for MEG is to apply the beamformer first and then singular value decomposition + ICA. In this paper we have compared source-space ICA with sensor-space ICA both in simulation and real MEG. The simulations included two challenging scenarios of correlated/concurrent cluster sources. Source-space ICA provided superior performance in spatial reconstruction of source maps, even though both techniques performed equally from a temporal perspective. Real MEG from two healthy subjects with visual stimuli were also used to compare performance of sensor-space ICA and source-space ICA. We have also proposed a new variant of minimum-variance beamformer called weight-normalized linearly-constrained minimum-variance with orthonormal lead-field. As sensor-space ICA-based source reconstruction is popular in EEG and MEG imaging, and given that source-space ICA has superior spatial performance, it is expected that source-space ICA will supersede its predecessor in many applications.
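A minimal sketch of the unit-gain minimum-variance (LCMV) beamformer that both pipelines above build on, applied to a synthetic lead field; the diagonal regularization and toy data are assumptions, and the weight-normalized orthonormal-lead-field variant proposed in the paper is not reproduced.

```python
import numpy as np

def lcmv_weights(cov, leadfield, reg=0.05):
    """Unit-gain LCMV beamformer weights w = C^-1 l / (l^T C^-1 l) for one
    source location/orientation; `reg` adds diagonal loading for stability."""
    C = cov + reg * np.trace(cov) / cov.shape[0] * np.eye(cov.shape[0])
    Cinv_l = np.linalg.solve(C, leadfield)
    return Cinv_l / (leadfield @ Cinv_l)

# Synthetic example: 64 sensors, one active source plus sensor noise
rng = np.random.default_rng(2)
n_sensors, n_samples = 64, 2000
l = rng.standard_normal(n_sensors); l /= np.linalg.norm(l)     # toy lead field
src = np.sin(2 * np.pi * 10 * np.arange(n_samples) / 1000.0)   # 10 Hz source
data = np.outer(l, src) + 0.5 * rng.standard_normal((n_sensors, n_samples))

w = lcmv_weights(np.cov(data), l)
reconstructed = w @ data        # estimated source time course (unit gain on l)
print("correlation with true source:", np.corrcoef(reconstructed, src)[0, 1])
```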
Switzer, P.; Harden, J.W.; Mark, R.K.
1988-01-01
A statistical method for estimating rates of soil development in a given region based on calibration from a series of dated soils is used to estimate ages of soils in the same region that are not dated directly. The method is designed specifically to account for sampling procedures and uncertainties that are inherent in soil studies. Soil variation and measurement error, uncertainties in calibration dates and their relation to the age of the soil, and the limited number of dated soils are all considered. Maximum likelihood (ML) is employed to estimate a parametric linear calibration curve, relating soil development to time or age on suitably transformed scales. Soil variation on a geomorphic surface of a certain age is characterized by replicate sampling of soils on each surface; such variation is assumed to have a Gaussian distribution. The age of a geomorphic surface is described by older and younger bounds. This technique allows age uncertainty to be characterized by either a Gaussian distribution or by a triangular distribution using minimum, best-estimate, and maximum ages. The calibration curve is taken to be linear after suitable (in certain cases logarithmic) transformations, if required, of the soil parameter and age variables. Soil variability, measurement error, and departures from linearity are described in a combined fashion using Gaussian distributions with variances particular to each sampled geomorphic surface and the number of sample replicates. Uncertainty in age of a geomorphic surface used for calibration is described using three parameters by one of two methods. In the first method, upper and lower ages are specified together with a coverage probability; this specification is converted to a Gaussian distribution with the appropriate mean and variance. In the second method, "absolute" older and younger ages are specified together with a most probable age; this specification is converted to an asymmetric triangular distribution with mode at the most probable age. The statistical variability of the ML-estimated calibration curve is assessed by a Monte Carlo method in which simulated data sets repeatedly are drawn from the distributional specification; calibration parameters are reestimated for each such simulation in order to assess their statistical variability. Several examples are used for illustration. The age of undated soils in a related setting may be estimated from the soil data using the fitted calibration curve. A second simulation to assess age estimate variability is described and applied to the examples. © 1988 International Association for Mathematical Geology.
A CLT on the SNR of Diagonally Loaded MVDR Filters
NASA Astrophysics Data System (ADS)
Rubio, Francisco; Mestre, Xavier; Hachem, Walid
2012-08-01
This paper studies the fluctuations of the signal-to-noise ratio (SNR) of minimum variance distortionless response (MVDR) filters implementing diagonal loading in the estimation of the covariance matrix. Previous results in the signal processing literature are generalized and extended by considering both spatially as well as temporally correlated samples. Specifically, a central limit theorem (CLT) is established for the fluctuations of the SNR of the diagonally loaded MVDR filter, under both supervised and unsupervised training settings in adaptive filtering applications. Our second-order analysis is based on the Nash-Poincaré inequality and the integration by parts formula for Gaussian functionals, as well as classical tools from statistical asymptotic theory. Numerical evaluations validating the accuracy of the CLT confirm the asymptotic Gaussianity of the fluctuations of the SNR of the MVDR filter.
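A small simulation sketch of the quantity studied above: the output SNR of a diagonally loaded MVDR filter fluctuating across training realizations. Real-valued white noise, the array size and the loading factor are simplifying assumptions; the paper's CLT derivation itself is not reproduced.

```python
import numpy as np

rng = np.random.default_rng(3)
M, N, gamma = 16, 40, 0.1          # sensors, training snapshots, loading factor
a = np.ones(M) / np.sqrt(M)        # steering vector of the desired signal
Rn = np.eye(M)                     # true noise covariance (white for simplicity)

def output_snr(w, sigma_s2=1.0):
    """Output SNR of filter w for a unit-power signal along a and noise Rn."""
    return sigma_s2 * np.abs(w @ a) ** 2 / (w @ Rn @ w)

snrs = []
for _ in range(2000):
    noise = rng.standard_normal((M, N))                 # training snapshots (noise only)
    R_hat = noise @ noise.T / N                         # sample covariance
    w = np.linalg.solve(R_hat + gamma * np.eye(M), a)   # diagonally loaded MVDR weights
    snrs.append(output_snr(w))
snrs = np.array(snrs)
print("mean SNR:", snrs.mean(), " std of fluctuations:", snrs.std())
```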
ERIC Educational Resources Information Center
Oranje, Andreas
2006-01-01
A multitude of methods has been proposed to estimate the sampling variance of ratio estimates in complex samples (Wolter, 1985). Hansen and Tepping (1985) studied some of those variance estimators and found that a high coefficient of variation (CV) of the denominator of a ratio estimate is indicative of a biased estimate of the standard error of a…
Understanding the Degrees of Freedom of Sample Variance by Using Microsoft Excel
ERIC Educational Resources Information Center
Ding, Jian-Hua; Jin, Xian-Wen; Shuai, Ling-Ying
2017-01-01
In this article, the degrees of freedom of the sample variance are simulated by using the Visual Basic for Applications of Microsoft Excel 2010. The simulation file dynamically displays why the sample variance should be calculated by dividing the sum of squared deviations by n-1 rather than n, which is helpful for students to grasp the meaning of…
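A Python analogue of the simulation described above (not the authors' Excel/VBA file), showing empirically why dividing the sum of squared deviations by n-1 gives an unbiased estimate of the variance.

```python
import numpy as np

rng = np.random.default_rng(4)
sigma2, n, reps = 4.0, 5, 200_000
samples = rng.normal(0.0, np.sqrt(sigma2), (reps, n))
ss = ((samples - samples.mean(axis=1, keepdims=True)) ** 2).sum(axis=1)
print("divide by n-1:", (ss / (n - 1)).mean())   # ~= 4.0 (unbiased)
print("divide by n  :", (ss / n).mean())         # ~= 3.2 (biased low by (n-1)/n)
```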
Measuring kinetics of complex single ion channel data using mean-variance histograms.
Patlak, J B
1993-07-01
The measurement of single ion channel kinetics is difficult when those channels exhibit subconductance events. When the kinetics are fast, and when the current magnitudes are small, as is the case for Na+, Ca2+, and some K+ channels, these difficulties can lead to serious errors in the estimation of channel kinetics. I present here a method, based on the construction and analysis of mean-variance histograms, that can overcome these problems. A mean-variance histogram is constructed by calculating the mean current and the current variance within a brief "window" (a set of N consecutive data samples) superimposed on the digitized raw channel data. Systematic movement of this window over the data produces large numbers of mean-variance pairs which can be assembled into a two-dimensional histogram. Defined current levels (open, closed, or sublevel) appear in such plots as low variance regions. The total number of events in such low variance regions is estimated by curve fitting and plotted as a function of window width. This function decreases with the same time constants as the original dwell time probability distribution for each of the regions. The method can therefore be used: 1) to present a qualitative summary of the single channel data from which the signal-to-noise ratio, open channel noise, steadiness of the baseline, and number of conductance levels can be quickly determined; 2) to quantify the dwell time distribution in each of the levels exhibited. In this paper I present the analysis of a Na+ channel recording that had a number of complexities. The signal-to-noise ratio was only about 8 for the main open state, open channel noise, and fast flickers to other states were present, as were a substantial number of subconductance states. "Standard" half-amplitude threshold analysis of these data produce open and closed time histograms that were well fitted by the sum of two exponentials, but with apparently erroneous time constants, whereas the mean-variance histogram technique provided a more credible analysis of the open, closed, and subconductance times for the patch. I also show that the method produces accurate results on simulated data in a wide variety of conditions, whereas the half-amplitude method, when applied to complex simulated data shows the same errors as were apparent in the real data. The utility and the limitations of this new method are discussed.
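A minimal sketch of the mean-variance pair construction described above, using a toy two-level trace and an arbitrary window of 10 samples; the dwell-time extraction by curve fitting over a range of window widths is not reproduced.

```python
import numpy as np

def mean_variance_pairs(current, window):
    """Mean and variance of the current within every window of `window`
    consecutive samples (the raw material of a mean-variance histogram)."""
    win = np.lib.stride_tricks.sliding_window_view(current, window)
    return win.mean(axis=1), win.var(axis=1)

# Toy single-channel record: closed (0 pA) / open (-2 pA) levels plus noise
rng = np.random.default_rng(5)
levels = np.repeat(rng.choice([0.0, -2.0], 200), 50)        # dwell segments
trace = levels + 0.3 * rng.standard_normal(levels.size)

m, v = mean_variance_pairs(trace, window=10)
hist, xedges, yedges = np.histogram2d(m, v, bins=(60, 60))  # 2-D mean-variance histogram
print("fraction of windows in the lowest-variance bins:", hist[:, 0].sum() / hist.sum())
```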
Statistical considerations for grain-size analyses of tills
Jacobs, A.M.
1971-01-01
Relative percentages of sand, silt, and clay from samples of the same till unit are not identical because of different lithologies in the source areas, sorting in transport, random variation, and experimental error. Random variation and experimental error can be isolated from the other two as follows. For each particle-size class of each till unit, a standard population is determined by using a normally distributed, representative group of data. New measurements are compared with the standard population and, if they compare satisfactorily, the experimental error is not significant and random variation is within the expected range for the population. The outcome of the comparison depends on numerical criteria derived from a graphical method rather than on a more commonly used one-way analysis of variance with two treatments. If the number of samples and the standard deviation of the standard population are substituted in a t-test equation, a family of hyperbolas is generated, each of which corresponds to a specific number of subsamples taken from each new sample. The axes of the graphs of the hyperbolas are the standard deviation of new measurements (horizontal axis) and the difference between the means of the new measurements and the standard population (vertical axis). The area between the two branches of each hyperbola corresponds to a satisfactory comparison between the new measurements and the standard population. Measurements from a new sample can be tested by plotting their standard deviation vs. difference in means on axes containing a hyperbola corresponding to the specific number of subsamples used. If the point lies between the branches of the hyperbola, the measurements are considered reliable. But if the point lies outside this region, the measurements are repeated. Because the critical segment of the hyperbola is approximately a straight line parallel to the horizontal axis, the test is simplified to a comparison between the means of the standard population and the means of the subsample. The minimum number of subsamples required to prove significant variation between samples caused by different lithologies in the source areas and sorting in transport can be determined directly from the graphical method. The minimum number of subsamples required is the maximum number to be run for economy of effort. © 1971 Plenum Publishing Corporation.
New polymorphs of 9-nitro-camptothecin prepared using a supercritical anti-solvent process.
Huang, Yinxia; Wang, Hongdi; Liu, Guijin; Jiang, Yanbin
2015-12-30
Recrystallization and micronization of 9-nitro-camptothecin (9-NC) has been investigated using the supercritical anti-solvent (SAS) technology in this study. Five operating factors, i.e., the type of organic solvent, the concentration of 9-NC in the solution, the flow rate of 9-NC solution, the precipitation pressure and the temperature, were optimized using a selected OA16(4^5) orthogonal array design, and a series of characterizations were performed for all samples. The results showed that the processed 9-NC particles exhibited smaller particle size and narrower particle size distribution as compared with the 9-NC raw material (Form I), and the optimum micronization conditions for preparing 9-NC with minimum particle size were determined by variance analysis, where the solvent plays the most important role in the formation and transformation of polymorphs. Three new polymorphic forms (Form II, III and IV) of 9-NC, which present different physicochemical properties, were generated after the SAS process. The structures of the 9-NC crystals were predicted from their experimental XRD data by the direct space approach using the Reflex module of Materials Studio and were consistent with the experiments. Meanwhile, the optimal sample (Form III) was shown to have higher cytotoxicity against the cancer cells, which suggested that the therapeutic efficacy of 9-NC is polymorph-dependent. Copyright © 2015 Elsevier B.V. All rights reserved.
Arrieta-Bolaños, Esteban; Maldonado-Torres, Hazael; Dimitriu, Oana; Hoddinott, Michael A; Fowles, Finnuala; Shah, Anila; Orlich-Pérez, Priscilla; McWhinnie, Alasdair J; Alfaro-Bourrouet, Wilbert; Buján-Boza, Willem; Little, Ann-Margaret; Salazar-Sánchez, Lizbeth; Madrigal, J Alejandro
2011-01-01
The human leukocyte antigen (HLA) system is the most polymorphic in humans. Its allele, genotype, and haplotype frequencies vary significantly among different populations. Molecular typing data on HLA are necessary for the development of stem cell donor registries, cord blood banks, HLA-disease association studies, and anthropology studies. The Costa Rica Central Valley Population (CCVP) is the major population in this country. No previous study has characterized HLA frequencies in this population. Allele group and haplotype frequencies of HLA genes in the CCVP were determined by means of molecular typing in a sample of 130 unrelated blood donors from one of the country's major hospitals. A comparison between these frequencies and those of 126 populations worldwide was also carried out. A minimum variance dendrogram based on squared Euclidean distances was constructed to assess the relationship between the CCVP sample and populations from all over the world. Allele group and haplotype frequencies observed in this study are consistent with a profile of a dynamic and diverse population, with a hybrid ethnic origin, predominantly Caucasian-Amerindian. Results showed that populations genetically closest to the CCVP are a Mestizo urban population from Venezuela, and another one from Guadalajara, Mexico. Copyright © 2011 American Society for Histocompatibility and Immunogenetics. All rights reserved.
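A short sketch of building a minimum variance (Ward) dendrogram from population frequency vectors with SciPy; the frequency matrix and population labels are invented placeholders, not the CCVP or the comparison-population data.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, dendrogram

rng = np.random.default_rng(6)
# Rows: populations; columns: HLA allele-group frequencies (toy numbers)
freqs = rng.dirichlet(np.ones(20), size=8)
labels = [f"pop_{i}" for i in range(8)]          # hypothetical population labels

# Ward's method merges, at each step, the pair of clusters giving the smallest
# increase in within-cluster (squared Euclidean) variance.
Z = linkage(freqs, method="ward", metric="euclidean")
tree = dendrogram(Z, labels=labels, no_plot=True)
print(tree["ivl"])                               # leaf order of the dendrogram
```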
Documentation for the 2003-04 Schools and Staffing Survey. NCES 2007-337
ERIC Educational Resources Information Center
Tourkin, Steven C.; Warner, Toni; Parmer, Randall; Cole, Cornette; Jackson, Betty; Zukerberg, Andrew; Cox, Shawna; Soderberg, Andrew
2007-01-01
This report serves as the survey documentation for the design and implementation of the 2003-04 Schools and Staffing Survey. Topics covered include the sample design, survey methodology, data collection procedures, data processing, response rates, imputation procedures, weighting and variance estimation, review of the quality of data, the types of…
Psychometric Evaluation of Data from the Race-Related Events Scale
ERIC Educational Resources Information Center
Crusto, Cindy A.; Dantzler, John; Roberts, Yvonne Humenay; Hooper, Lisa M.
2015-01-01
Using exploratory factor analysis, we examined the factor structure of data collected from the Race-Related Events Scale, which assesses perceived exposure to race-related stress. Our sample (N = 201) consisted of diverse caregivers of Head Start preschoolers. Three factors explained 81% of the variance in the data and showed sound reliability.
Design of a compensation for an ARMA model of a discrete time system. M.S. Thesis
NASA Technical Reports Server (NTRS)
Mainemer, C. I.
1978-01-01
The design of an optimal dynamic compensator for a multivariable discrete time system is studied. Also the design of compensators to achieve minimum variance control strategies for single input single output systems is analyzed. In the first problem the initial conditions of the plant are random variables with known first and second order moments, and the cost is the expected value of the standard cost, quadratic in the states and controls. The compensator is based on the minimum order Luenberger observer and it is found optimally by minimizing a performance index. Necessary and sufficient conditions for optimality of the compensator are derived. The second problem is solved in three different ways; two of them working directly in the frequency domain and one working in the time domain. The first and second order moments of the initial conditions are irrelevant to the solution. Necessary and sufficient conditions are derived for the compensator to minimize the variance of the output.
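A toy illustration of the minimum variance control idea treated in the second problem: for a simple first-order ARX plant the one-step-ahead minimum variance law cancels the predictable part of the output, leaving only the innovation variance. The plant and numbers are assumptions, and the Luenberger-observer compensator of the first problem is not reproduced.

```python
import numpy as np

# Plant: y(t+1) = a*y(t) + b*u(t) + e(t+1), e ~ N(0, sigma^2)
a, b, sigma = 0.8, 0.5, 1.0
rng = np.random.default_rng(7)

def output_variance(control, steps=20_000):
    """Simulate the closed loop under `control` and return the output variance."""
    y, history = 0.0, []
    for _ in range(steps):
        u = control(y)
        y = a * y + b * u + rng.normal(0.0, sigma)
        history.append(y)
    return np.var(history)

# Minimum variance law cancels the predictable part: u(t) = -a*y(t)/b,
# so the closed-loop output variance approaches the innovation variance sigma^2.
print("open loop variance  :", output_variance(lambda y: 0.0))         # ~ sigma^2/(1-a^2) = 2.78
print("min-variance control:", output_variance(lambda y: -a * y / b))  # ~ 1.0
```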
Ifoulis, A A; Savopoulou-Soultani, M
2006-10-01
The purpose of this research was to quantify the spatial pattern and develop a sampling program for larvae of Lobesia botrana Denis and Schiffermüller (Lepidoptera: Tortricidae), an important vineyard pest in northern Greece. Taylor's power law and Iwao's patchiness regression were used to model the relationship between the mean and the variance of larval counts. Analysis of covariance was carried out, separately for infestation and injury, with combined second and third generation data, for vine and half-vine sample units. Common regression coefficients were estimated to permit use of the sampling plan over a wide range of conditions. Optimum sample sizes for infestation and injury, at three levels of precision, were developed. An investigation of a multistage sampling plan with a nested analysis of variance showed that if the goal of sampling is focusing on larval infestation, three grape clusters should be sampled in a half-vine; if the goal of sampling is focusing on injury, then two grape clusters per half-vine are recommended.
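A sketch of the generic workflow implied above: fit Taylor's power law by regressing log variance on log mean, then plug the coefficients into the standard enumerative sample-size formula n = a*m^(b-2)/D^2 for a chosen relative precision D. The simulated counts and the exact role of this formula in the published plan are assumptions.

```python
import numpy as np

# Toy larval counts: each row = counts from sample units in one vineyard block
rng = np.random.default_rng(8)
blocks = [rng.negative_binomial(2, 2 / (2 + mu), 30) for mu in (0.5, 1, 2, 4, 8)]
means = np.array([b.mean() for b in blocks])
variances = np.array([b.var(ddof=1) for b in blocks])

# Taylor's power law: log(s^2) = log(a) + b*log(m)
b_slope, log_a = np.polyfit(np.log(means), np.log(variances), 1)
a_coef = np.exp(log_a)
print(f"Taylor's a = {a_coef:.2f}, b = {b_slope:.2f}")   # b > 1 indicates aggregation

def n_required(m, D=0.25):
    """Sample units needed for relative precision D (SE/mean) at mean density m."""
    return a_coef * m ** (b_slope - 2.0) / D ** 2

print("sample units needed at m = 2 larvae per cluster:", round(n_required(2.0)))
```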
No-migration variance petition. Appendices C--J: Volume 5, Revision 1
DOE Office of Scientific and Technical Information (OSTI.GOV)
Not Available
1990-03-01
Volume V contains the appendices for: closure and post-closure plans; RCRA ground water monitoring waiver; Waste Isolation Division Quality Program Manual; water quality sampling plan; WIPP Environmental Procedures Manual; sample handling and laboratory procedures; data analysis; and the Annual Site Environmental Monitoring Report for the Waste Isolation Pilot Plant.
Kottmann, Renzo; Gray, Tanya; Murphy, Sean; Kagan, Leonid; Kravitz, Saul; Lombardot, Thierry; Field, Dawn; Glöckner, Frank Oliver
2008-06-01
The Genomic Contextual Data Markup Language (GCDML) is a core project of the Genomic Standards Consortium (GSC) that implements the "Minimum Information about a Genome Sequence" (MIGS) specification and its extension, the "Minimum Information about a Metagenome Sequence" (MIMS). GCDML is an XML Schema for generating MIGS/MIMS compliant reports for data entry, exchange, and storage. When mature, this sample-centric, strongly-typed schema will provide a diverse set of descriptors for describing the exact origin and processing of a biological sample, from sampling to sequencing, and subsequent analysis. Here we describe the need for such a project, outline design principles required to support the project, and make an open call for participation in defining the future content of GCDML. GCDML is freely available, and can be downloaded, along with documentation, from the GSC Web site (http://gensc.org).
Manju, Md Abu; Candel, Math J J M; Berger, Martijn P F
2014-07-10
In this paper, the optimal sample sizes at the cluster and person levels for each of two treatment arms are obtained for cluster randomized trials where the cost-effectiveness of treatments on a continuous scale is studied. The optimal sample sizes maximize the efficiency or power for a given budget or minimize the budget for a given efficiency or power. Optimal sample sizes require information on the intra-cluster correlations (ICCs) for effects and costs, the correlations between costs and effects at individual and cluster levels, the ratio of the variance of effects translated into costs to the variance of the costs (the variance ratio), sampling and measuring costs, and the budget. When planning a study, information on the model parameters is usually not available. To overcome this local optimality problem, the current paper also presents maximin sample sizes. The maximin sample sizes turn out to be rather robust against misspecifying the correlation between costs and effects at the cluster and individual levels but may lose much efficiency when misspecifying the variance ratio. The robustness of the maximin sample sizes against misspecifying the ICCs depends on the variance ratio. The maximin sample sizes are robust under misspecification of the ICC for costs for realistic values of the variance ratio greater than one but not robust under misspecification of the ICC for effects. Finally, we show how to calculate optimal or maximin sample sizes that yield sufficient power for a test on the cost-effectiveness of an intervention.
[A Review on the Use of Effect Size in Nursing Research].
Kang, Hyuncheol; Yeon, Kyupil; Han, Sang Tae
2015-10-01
The purpose of this study was to introduce the main concepts of statistical testing and effect size and to provide researchers in nursing science with guidance on how to calculate the effect size for the statistical analysis methods mainly used in nursing. For t-test, analysis of variance, correlation analysis, regression analysis which are used frequently in nursing research, the generally accepted definitions of the effect size were explained. Some formulae for calculating the effect size are described with several examples in nursing research. Furthermore, the authors present the required minimum sample size for each example utilizing G*Power 3 software that is the most widely used program for calculating sample size. It is noted that statistical significance testing and effect size measurement serve different purposes, and the reliance on only one side may be misleading. Some practical guidelines are recommended for combining statistical significance testing and effect size measure in order to make more balanced decisions in quantitative analyses.
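A brief illustration, under a normal approximation, of the two quantities discussed above: Cohen's d for two independent groups and the corresponding approximate per-group sample size. G*Power uses exact noncentral distributions, so its numbers will differ slightly; the data here are invented.

```python
import numpy as np
from scipy.stats import norm

def cohens_d(x, y):
    """Cohen's d for two independent groups using the pooled SD."""
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * np.var(x, ddof=1) + (ny - 1) * np.var(y, ddof=1)) / (nx + ny - 2)
    return (np.mean(x) - np.mean(y)) / np.sqrt(pooled_var)

def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate n per group for a two-sided two-sample t-test (normal approximation)."""
    z = norm.ppf(1 - alpha / 2) + norm.ppf(power)
    return int(np.ceil(2 * (z / d) ** 2))

rng = np.random.default_rng(9)
x = rng.normal(52, 10, 40)   # e.g. intervention group scores (toy data)
y = rng.normal(47, 10, 40)   # control group scores
d = cohens_d(x, y)
print(f"d = {d:.2f}, n per group for 80% power: {n_per_group(d)}")
```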
McGowan, Ian; Janocko, Laura; Burneisen, Shaun; Bhat, Anand; Richardson-Harman, Nicola
2015-01-01
To determine the intra- and inter-subject variability of mucosal cytokine gene expression in rectal biopsies from healthy volunteers and to screen cytokine and chemokine mRNA as potential biomarkers of mucosal inflammation. Rectal biopsies were collected from 8 participants (3 biopsies per participant) and 1 additional participant (10 biopsies). Quantitative reverse transcription polymerase chain reaction (RT-qPCR) was used to quantify IL-1β, IL-6, IL-12p40, IL-8, IFN-γ, MIP-1α, MIP-1β, RANTES, and TNF-α gene expression in the rectal tissue. The intra-assay, inter-biopsy and inter-subject variance was measured in the eight participants. Bootstrap re-sampling of the biopsy measurements was performed to determine the accuracy of gene expression data obtained for 10 biopsies obtained from one participant. Cytokines were both non-normalized and normalized using four reference genes (GAPDH, β-actin, β2 microglobulin, and CD45). Cytokine measurement accuracy was increased with the number of biopsy samples, per person; four biopsies were typically needed to produce a mean result within a 95% confidence interval of the subject's cytokine level approximately 80% of the time. Intra-assay precision (% geometric standard deviation) ranged between 8.2 and 96.9 with high variance between patients and even between different biopsies from the same patient. Variability was not greatly reduced with the use of reference genes to normalize data. The number of biopsy samples required to provide an accurate result varied by target although 4 biopsy samples per subject and timepoint, provided for >77% accuracy across all targets tested. Biopsies within the same subjects and between subjects had similar levels of variance while variance within a biopsy (intra-assay) was generally lower. Normalization of inflammatory cytokines against reference genes failed to consistently reduce variance. The accuracy and reliability of mRNA expression of inflammatory cytokines will set a ceiling on the ability of these measures to predict mucosal inflammation. Techniques to reduce variability should be developed within a larger cohort of individuals before normative reference values can be validated. Copyright © 2014 Elsevier Ltd. All rights reserved.
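A sketch of the resampling logic described above: how often the mean of k resampled biopsies falls inside the 95% confidence interval of the subject's overall mean. The values are invented stand-ins, not the study's normalized cytokine measurements.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(10)
# Ten biopsy measurements from one participant (toy normalized expression values)
biopsies = np.array([2.1, 1.8, 2.9, 2.4, 3.3, 1.9, 2.6, 2.2, 3.0, 2.5])

m, s, n = biopsies.mean(), biopsies.std(ddof=1), biopsies.size
half = stats.t.ppf(0.975, n - 1) * s / np.sqrt(n)
lo, hi = m - half, m + half         # 95% CI of the subject's mean level

def coverage(k, n_boot=10_000):
    """Fraction of bootstrap means of k biopsies falling inside [lo, hi]."""
    means = rng.choice(biopsies, size=(n_boot, k), replace=True).mean(axis=1)
    return np.mean((means >= lo) & (means <= hi))

for k in (1, 2, 4, 6):
    print(f"{k} biopsies: {coverage(k):.0%} of resampled means inside the 95% CI")
```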
Zhang, Bao; Yao, Yibin; Fok, Hok Sum; Hu, Yufeng; Chen, Qiang
2016-09-19
This study uses the observed vertical displacements of Global Positioning System (GPS) time series obtained from the Crustal Movement Observation Network of China (CMONOC) with careful pre- and post-processing to estimate the seasonal crustal deformation in response to the hydrological loading in lower three-rivers headwater region of southwest China, followed by inferring the annual EWH changes through geodetic inversion methods. The Helmert Variance Component Estimation (HVCE) and the Minimum Mean Square Error (MMSE) criterion were successfully employed. The GPS inferred EWH changes agree well qualitatively with the Gravity Recovery and Climate Experiment (GRACE)-inferred and the Global Land Data Assimilation System (GLDAS)-inferred EWH changes, with a discrepancy of 3.2-3.9 cm and 4.8-5.2 cm, respectively. In the research areas, the EWH changes in the Lancang basin is larger than in the other regions, with a maximum of 21.8-24.7 cm and a minimum of 3.1-6.9 cm.
Structural changes and out-of-sample prediction of realized range-based variance in the stock market
NASA Astrophysics Data System (ADS)
Gong, Xu; Lin, Boqiang
2018-03-01
This paper aims to examine the effects of structural changes on forecasting the realized range-based variance in the stock market. Considering structural changes in variance in the stock market, we develop the HAR-RRV-SC model on the basis of the HAR-RRV model. Subsequently, the HAR-RRV and HAR-RRV-SC models are used to forecast the realized range-based variance of S&P 500 Index. We find that there are many structural changes in variance in the U.S. stock market, and the period after the financial crisis contains more structural change points than the period before the financial crisis. The out-of-sample results show that the HAR-RRV-SC model significantly outperforms the HAR-BV model when they are employed to forecast the 1-day, 1-week, and 1-month realized range-based variances, which means that structural changes can improve out-of-sample prediction of realized range-based variance. The out-of-sample results remain robust across the alternative rolling fixed-window, the alternative threshold value in ICSS algorithm, and the alternative benchmark models. More importantly, we believe that considering structural changes can help improve the out-of-sample performances of most of other existing HAR-RRV-type models in addition to the models used in this paper.
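A minimal sketch of the plain HAR regression underlying the models above, with daily, weekly and monthly averages of past realized range-based variance as predictors; the simulated series is a placeholder, and the structural-change terms of HAR-RRV-SC are only indicated in a comment.

```python
import numpy as np

def har_design(rv):
    """Daily, weekly (5-day) and monthly (22-day) averages of past RRV."""
    d = rv[21:-1]
    w = np.array([rv[i - 4:i + 1].mean() for i in range(21, len(rv) - 1)])
    m = np.array([rv[i - 21:i + 1].mean() for i in range(21, len(rv) - 1)])
    X = np.column_stack([np.ones_like(d), d, w, m])
    y = rv[22:]
    return X, y

rng = np.random.default_rng(11)
rv = np.abs(rng.standard_normal(1000).cumsum() * 0.01) + 0.1   # toy RRV series
X, y = har_design(rv)
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print("HAR-RRV coefficients (const, daily, weekly, monthly):", np.round(beta, 3))
# A structural-change variant (HAR-RRV-SC) would additionally encode the change
# points detected by the ICSS algorithm, e.g. as dummy regressors (not reproduced here).
```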
Geostatistical modeling of riparian forest microclimate and its implications for sampling
Eskelson, B.N.I.; Anderson, P.D.; Hagar, J.C.; Temesgen, H.
2011-01-01
Predictive models of microclimate under various site conditions in forested headwater stream - riparian areas are poorly developed, and sampling designs for characterizing underlying riparian microclimate gradients are sparse. We used riparian microclimate data collected at eight headwater streams in the Oregon Coast Range to compare ordinary kriging (OK), universal kriging (UK), and kriging with external drift (KED) for point prediction of mean maximum air temperature (Tair). Several topographic and forest structure characteristics were considered as site-specific parameters. Height above stream and distance to stream were the most important covariates in the KED models, which outperformed OK and UK in terms of root mean square error. Sample patterns were optimized based on the kriging variance and the weighted means of shortest distance criterion using the simulated annealing algorithm. The optimized sample patterns outperformed systematic sample patterns in terms of mean kriging variance mainly for small sample sizes. These findings suggest methods for increasing efficiency of microclimate monitoring in riparian areas.
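A compact ordinary kriging sketch of the kind compared above, with an assumed exponential variogram and toy air-temperature observations; the external-drift covariates (height above stream, distance to stream) of the better-performing KED models are not included.

```python
import numpy as np

def variogram(h, nugget=0.1, sill=1.0, range_par=50.0):
    """Exponential variogram model (assumed here, not fitted)."""
    return nugget + (sill - nugget) * (1.0 - np.exp(-h / range_par))

def ordinary_kriging(xy, z, x0):
    """Ordinary kriging prediction and kriging variance at location x0."""
    n = len(z)
    d = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=2)
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = variogram(d)
    np.fill_diagonal(A[:n, :n], 0.0)       # gamma(0) = 0 by convention
    A[n, n] = 0.0
    b = np.ones(n + 1)
    b[:n] = variogram(np.linalg.norm(xy - x0, axis=1))
    sol = np.linalg.solve(A, b)
    lam, mu = sol[:n], sol[n]
    return lam @ z, lam @ b[:n] + mu       # prediction, kriging variance

# Toy Tair observations over a 200 m x 200 m riparian patch (metres, deg C)
rng = np.random.default_rng(12)
xy = rng.uniform(0, 200, (30, 2))
z = 22.0 + 0.02 * xy[:, 0] + rng.normal(0, 0.3, 30)
pred, kvar = ordinary_kriging(xy, z, np.array([100.0, 50.0]))
print(f"predicted Tair = {pred:.2f} C, kriging variance = {kvar:.3f}")
```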
Effects of sample size on estimates of population growth rates calculated with matrix models.
Fiske, Ian J; Bruna, Emilio M; Bolker, Benjamin M
2008-08-28
Matrix models are widely used to study the dynamics and demography of populations. An important but overlooked issue is how the number of individuals sampled influences estimates of the population growth rate (lambda) calculated with matrix models. Even unbiased estimates of vital rates do not ensure unbiased estimates of lambda: Jensen's inequality implies that even when the estimates of the vital rates are accurate, small sample sizes lead to biased estimates of lambda due to increased sampling variance. We investigated if sampling variability and the distribution of sampling effort among size classes lead to biases in estimates of lambda. Using data from a long-term field study of plant demography, we simulated the effects of sampling variance by drawing vital rates and calculating lambda for increasingly larger populations drawn from a total population of 3842 plants. We then compared these estimates of lambda with those based on the entire population and calculated the resulting bias. Finally, we conducted a review of the literature to determine the sample sizes typically used when parameterizing matrix models used to study plant demography. We found significant bias at small sample sizes when survival was low (survival = 0.5), and that sampling with a more realistic inverse J-shaped population structure exacerbated this bias. However, our simulations also demonstrate that these biases rapidly become negligible with increasing sample sizes or as survival increases. For many of the sample sizes used in demographic studies, matrix models are probably robust to the biases resulting from sampling variance of vital rates. However, this conclusion may depend on the structure of populations or the distribution of sampling effort in ways that are unexplored. We suggest more intensive sampling of populations when individual survival is low and greater sampling of stages with high elasticities.
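A small sketch of the mechanism discussed above: lambda as the dominant eigenvalue of a projection matrix, and the sampling variability (and the Jensen's-inequality bias) that appears when survival entries are re-estimated from small binomial samples. The three-stage matrix is a toy example, not the study's plant data.

```python
import numpy as np

def growth_rate(A):
    """Dominant (Perron) eigenvalue, lambda, of a projection matrix."""
    return np.max(np.real(np.linalg.eigvals(A)))

# Toy 3-stage matrix: seedlings, juveniles, adults
A_true = np.array([[0.0, 0.0, 2.0],     # fecundity of adults
                   [0.3, 0.4, 0.0],     # survival/transition rates
                   [0.0, 0.3, 0.8]])
print("true lambda:", round(growth_rate(A_true), 3))

# Re-estimate the survival entries from binomial samples of n individuals each
rng = np.random.default_rng(13)
def sampled_lambda(n):
    A = A_true.copy()
    for (i, j) in [(1, 0), (1, 1), (2, 1), (2, 2)]:
        A[i, j] = rng.binomial(n, A_true[i, j]) / n
    return growth_rate(A)

for n in (10, 50, 500):
    lams = [sampled_lambda(n) for _ in range(2000)]
    print(f"n = {n:4d}: mean lambda = {np.mean(lams):.3f}, SD = {np.std(lams):.3f}")
```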
ERIC Educational Resources Information Center
Vardeman, Stephen B.; Wendelberger, Joanne R.
2005-01-01
There is a little-known but very simple generalization of the standard result that for uncorrelated random variables with common mean [mu] and variance [sigma][superscript 2], the expected value of the sample variance is [sigma][superscript 2]. The generalization justifies the use of the usual standard error of the sample mean in possibly…
Lahanas, M; Baltas, D; Giannouli, S; Milickovic, N; Zamboglou, N
2000-05-01
We have studied the accuracy of statistical parameters of dose distributions in brachytherapy using actual clinical implants. These include the mean, minimum and maximum dose values and the variance of the dose distribution inside the PTV (planning target volume), and on the surface of the PTV. These properties have been studied as a function of the number of uniformly distributed sampling points. These parameters, or the variants of these parameters, are used directly or indirectly in optimization procedures or for a description of the dose distribution. The accurate determination of these parameters depends on the sampling point distribution from which they have been obtained. Some optimization methods ignore catheters and critical structures surrounded by the PTV or alternatively consider as surface dose points only those on the contour lines of the PTV. D(min) and D(max) are extreme dose values which are either on the PTV surface or within the PTV. They must be avoided for specification and optimization purposes in brachytherapy. Using D(mean) and the variance of D which we have shown to be stable parameters, achieves a more reliable description of the dose distribution on the PTV surface and within the PTV volume than does D(min) and D(max). Generation of dose points on the real surface of the PTV is obligatory and the consideration of catheter volumes results in a realistic description of anatomical dose distributions.
Blinded sample size re-estimation in three-arm trials with 'gold standard' design.
Mütze, Tobias; Friede, Tim
2017-10-15
In this article, we study blinded sample size re-estimation in the 'gold standard' design with internal pilot study for normally distributed outcomes. The 'gold standard' design is a three-arm clinical trial design that includes an active and a placebo control in addition to an experimental treatment. We focus on the absolute margin approach to hypothesis testing in three-arm trials at which the non-inferiority of the experimental treatment and the assay sensitivity are assessed by pairwise comparisons. We compare several blinded sample size re-estimation procedures in a simulation study assessing operating characteristics including power and type I error. We find that sample size re-estimation based on the popular one-sample variance estimator results in overpowered trials. Moreover, sample size re-estimation based on unbiased variance estimators such as the Xing-Ganju variance estimator results in underpowered trials, as it is expected because an overestimation of the variance and thus the sample size is in general required for the re-estimation procedure to eventually meet the target power. To overcome this problem, we propose an inflation factor for the sample size re-estimation with the Xing-Ganju variance estimator and show that this approach results in adequately powered trials. Because of favorable features of the Xing-Ganju variance estimator such as unbiasedness and a distribution independent of the group means, the inflation factor does not depend on the nuisance parameter and, therefore, can be calculated prior to a trial. Moreover, we prove that the sample size re-estimation based on the Xing-Ganju variance estimator does not bias the effect estimate. Copyright © 2017 John Wiley & Sons, Ltd. Copyright © 2017 John Wiley & Sons, Ltd.
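A sketch of the generic blinded re-estimation step with the one-sample (pooled, labels ignored) variance estimator and a normal-approximation sample-size formula; the inflation factor is only a placeholder, and the Xing-Ganju estimator and the three-arm 'gold standard' testing procedure are not reproduced. Note that the pooled variance absorbs part of the treatment difference, which is consistent with the overpowering reported above for this estimator.

```python
import numpy as np
from scipy.stats import norm

def reestimated_n_per_arm(pooled_data, delta, alpha=0.025, power=0.80, inflation=1.0):
    """Blinded sample size re-estimation: variance from the pooled (blinded)
    internal pilot data, normal-approximation formula for a one-sided test."""
    s2 = np.var(pooled_data, ddof=1) * inflation     # optional inflation factor
    z = norm.ppf(1 - alpha) + norm.ppf(power)
    return int(np.ceil(2 * s2 * z ** 2 / delta ** 2))

rng = np.random.default_rng(14)
pilot = np.concatenate([rng.normal(0.0, 4.0, 30), rng.normal(1.5, 4.0, 30)])  # blinded pool
print("re-estimated n per arm:", reestimated_n_per_arm(pilot, delta=1.5))
```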
Designing occupancy studies: general advice and allocating survey effort
MacKenzie, D.I.; Royle, J. Andrew
2005-01-01
1. The fraction of sampling units in a landscape where a target species is present (occupancy) is an extensively used concept in ecology. Yet in many applications the species will not always be detected in a sampling unit even when present, resulting in biased estimates of occupancy. Given that sampling units are surveyed repeatedly within a relatively short timeframe, a number of similar methods have now been developed to provide unbiased occupancy estimates. However, practical guidance on the efficient design of occupancy studies has been lacking. 2. In this paper we comment on a number of general issues related to designing occupancy studies, including the need for clear objectives that are explicitly linked to science or management, selection of sampling units, timing of repeat surveys and allocation of survey effort. Advice on the number of repeat surveys per sampling unit is considered in terms of the variance of the occupancy estimator, for three possible study designs. 3. We recommend that sampling units should be surveyed a minimum of three times when detection probability is high (> 0.5 survey-1), unless a removal design is used. 4. We found that an optimal removal design will generally be the most efficient, but we suggest it may be less robust to assumption violations than a standard design. 5. Our results suggest that for a rare species it is more efficient to survey more sampling units less intensively, while for a common species fewer sampling units should be surveyed more intensively. 6. Synthesis and applications. Reliable inferences can only result from quality data. To make the best use of logistical resources, study objectives must be clearly defined; sampling units must be selected, and repeated surveys timed appropriately; and a sufficient number of repeated surveys must be conducted. Failure to do so may compromise the integrity of the study. The guidance given here on study design issues is particularly applicable to studies of species occurrence and distribution, habitat selection and modelling, metapopulation studies and monitoring programmes.
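A minimal sketch of the single-season occupancy likelihood that such designs feed, fitted to simulated detection histories from K repeat surveys (the all-zero history mixes non-occupancy and non-detection); the simulated psi and p are arbitrary, and the paper's design-specific variance expressions are not reproduced.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit

rng = np.random.default_rng(15)
S, K, psi_true, p_true = 200, 3, 0.4, 0.6    # sites, repeat surveys, occupancy, detection
occupied = rng.random(S) < psi_true
detections = (rng.random((S, K)) < p_true) & occupied[:, None]
d = detections.sum(axis=1)                   # number of detections per site

def negloglik(theta):
    psi, p = expit(theta)                    # keep parameters in (0, 1)
    # Binomial coefficient omitted; it does not affect the maximum likelihood estimates
    like_detected = psi * p ** d * (1 - p) ** (K - d)
    like_never = psi * (1 - p) ** K + (1 - psi)
    lik = np.where(d > 0, like_detected, like_never)
    return -np.log(lik).sum()

fit = minimize(negloglik, x0=[0.0, 0.0], method="Nelder-Mead")
psi_hat, p_hat = expit(fit.x)
print(f"psi_hat = {psi_hat:.2f}, p_hat = {p_hat:.2f}")
```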
Herbicide Orange Site Characterization Study Naval Construction Battalion Center
1987-01-01
Samples were sent to U.S. Testing Laboratories for analysis, and over 200 additional analyses were performed for a variety of quality assurance criteria. [Table 9. NCBC Performance Audit Sample Analysis Summary (Series 1): reported TCDD concentration (ppb), detection limit, and relative difference by sample number.] ... detection limit rather than estimating the variance of the results. The sample results were transformed using the natural logarithm. The Shapiro-Wilk W test ...
MODFLOW 2000 Head Uncertainty, a First-Order Second Moment Method
Glasgow, H.S.; Fortney, M.D.; Lee, J.; Graettinger, A.J.; Reeves, H.W.
2003-01-01
A computationally efficient method to estimate the variance and covariance in piezometric head results computed through MODFLOW 2000 using a first-order second moment (FOSM) approach is presented. This methodology employs a first-order Taylor series expansion to combine model sensitivity with uncertainty in geologic data. MODFLOW 2000 is used to calculate both the ground water head and the sensitivity of head to changes in input data. From a limited number of samples, geologic data are extrapolated and their associated uncertainties are computed through a conditional probability calculation. Combining the spatially related sensitivity and input uncertainty produces the variance-covariance matrix, the diagonal of which is used to yield the standard deviation in MODFLOW 2000 head. The variance in piezometric head can be used for calibrating the model, estimating confidence intervals, directing exploration, and evaluating the reliability of a design. A case study illustrates the approach, where aquifer transmissivity is the spatially related uncertain geologic input data. The FOSM methodology is shown to be applicable for calculating output uncertainty for (1) spatially related input and output data, and (2) multiple input parameters (transmissivity and recharge).
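A bare-bones illustration of the FOSM propagation step described above, Cov(h) ~= J Cov(K) J^T; the sensitivity and input-covariance matrices are invented stand-ins for the MODFLOW 2000 sensitivity output and the conditional geostatistical calculation.

```python
import numpy as np

# J[i, j]: sensitivity of head at node i to the geologic parameter in zone j
# (in practice taken from the MODFLOW 2000 sensitivity process output)
J = np.array([[0.8, 0.2, 0.0],
              [0.4, 0.5, 0.1],
              [0.1, 0.3, 0.6]])

# Covariance of the uncertain geologic input (e.g. log-transmissivity),
# extrapolated from a limited number of samples
cov_k = np.array([[0.30, 0.10, 0.02],
                  [0.10, 0.25, 0.05],
                  [0.02, 0.05, 0.40]])

cov_h = J @ cov_k @ J.T                 # first-order second moment propagation
head_sd = np.sqrt(np.diag(cov_h))       # standard deviation of simulated heads
print("head standard deviations:", np.round(head_sd, 3))
```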
Further study of terrain effects on the mesoscale spectrum of atmospheric motions
NASA Technical Reports Server (NTRS)
Jasperson, W. H.; Nastrom, G. D.; Fritts, D. C.
1990-01-01
Wind and temperature data collected on commercial airliners are used to investigate the effects of underlying terrain on mesoscale variability. These results expand upon those of Nastrom et al., by including all available data from the Global Atmospheric Sampling Program (GASP) and by more closely focusing on the coupling of variance with the roughness of the underlying terrain over mountainous regions. The earlier results, showing that variances are larger over mountains than over oceans or plains, with greatest increases at wavelengths below about 80 km, are confirmed. Statistical tests are used to confirm that these differences are highly significant. Over mountainous regions the roughness of the underlying terrain was parameterized from topographic data and it was found that variances are highly correlated with roughness and, in the troposphere, with background windspeed. Average variances over the roughest terrain areas range up to about ten times larger than those over the oceans. These results are found to follow the scaling with stability predicted in the framework of linear gravity wave theory. The implications of these results for vertical transports of momentum and energy, assuming they are due to gravity waves and considering the effects of intermittency and anisotropy, are also discussed.
Seidel, Clemens; Lautenschläger, Christine; Dunst, Jürgen; Müller, Arndt-Christian
2012-04-20
To investigate whether different conditions of DNA structure and radiation treatment could modify heterogeneity of response. Additionally to study variance as a potential parameter of heterogeneity for radiosensitivity testing. Two-hundred leukocytes per sample of healthy donors were split into four groups. I: Intact chromatin structure; II: Nucleoids of histone-depleted DNA; III: Nucleoids of histone-depleted DNA with 90 mM DMSO as antioxidant. Response to single (I-III) and twice (IV) irradiation with 4 Gy and repair kinetics were evaluated using %Tail-DNA. Heterogeneity of DNA damage was determined by calculation of variance of DNA-damage (V) and mean variance (Mvar), mutual comparisons were done by one-way analysis of variance (ANOVA). Heterogeneity of initial DNA-damage (I, 0 min repair) increased without histones (II). Absence of histones was balanced by addition of antioxidants (III). Repair reduced heterogeneity of all samples (with and without irradiation). However double irradiation plus repair led to a higher level of heterogeneity distinguishable from single irradiation and repair in intact cells. Increase of mean DNA damage was associated with a similarly elevated variance of DNA damage (r = +0.88). Heterogeneity of DNA-damage can be modified by histone level, antioxidant concentration, repair and radiation dose and was positively correlated with DNA damage. Experimental conditions might be optimized by reducing scatter of comet assay data by repair and antioxidants, potentially allowing better discrimination of small differences. Amount of heterogeneity measured by variance might be an additional useful parameter to characterize radiosensitivity.
Dynamic association rules for gene expression data analysis.
Chen, Shu-Chuan; Tsai, Tsung-Hsien; Chung, Cheng-Han; Li, Wen-Hsiung
2015-10-14
The purpose of gene expression analysis is to look for the association between regulation of gene expression levels and phenotypic variations. This association based on gene expression profile has been used to determine whether the induction/repression of genes corresponds to phenotypic variations including cell regulations, clinical diagnoses and drug development. Statistical analyses on microarray data have been developed to resolve gene selection issues. However, these methods do not inform us of causality between genes and phenotypes. In this paper, we propose the dynamic association rule algorithm (DAR algorithm), which helps one to efficiently select a subset of significant genes for subsequent analysis. The DAR algorithm is based on association rules from market basket analysis in marketing. We first propose a statistical way, based on constructing a one-sided confidence interval and hypothesis testing, to determine if an association rule is meaningful. Based on the proposed statistical method, we then developed the DAR algorithm for gene expression data analysis. The method was applied to analyze four microarray datasets and one Next Generation Sequencing (NGS) dataset: the Mice Apo A1 dataset, the whole genome expression dataset of mouse embryonic stem cells, expression profiling of the bone marrow of leukemia patients, the Microarray Quality Control (MAQC) dataset and the RNA-seq dataset of a mouse genomic imprinting study. A comparison of the proposed method with the t-test on the expression profiling of the bone marrow of leukemia patients was conducted. We developed a statistical way, based on the concept of confidence interval, to determine the minimum support and minimum confidence for mining association relationships among items. With the minimum support and minimum confidence, one can find significant rules in one single step. The DAR algorithm was then developed for gene expression data analysis. Four gene expression datasets showed that the proposed DAR algorithm not only was able to identify a set of differentially expressed genes that largely agreed with that of other methods, but also provided an efficient and accurate way to find influential genes of a disease. In this paper, the well-established association rule mining technique from marketing has been successfully modified to determine the minimum support and minimum confidence based on the concept of confidence interval and hypothesis testing. It can be applied to gene expression data to mine significant association rules between gene regulation and phenotype. The proposed DAR algorithm provides an efficient way to find influential genes that underlie the phenotypic variance.
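A sketch of the support/confidence bookkeeping with a one-sided lower confidence bound on rule confidence via a normal approximation; this mirrors the confidence-interval idea described above rather than the exact DAR statistics, and the gene "transactions" are hypothetical.

```python
import numpy as np
from scipy.stats import norm

def rule_stats(transactions, antecedent, consequent, alpha=0.05):
    """Support, confidence, and a one-sided lower bound on the confidence."""
    n = len(transactions)
    n_ante = sum(antecedent <= t for t in transactions)
    n_both = sum((antecedent | consequent) <= t for t in transactions)
    support = n_both / n
    confidence = n_both / n_ante if n_ante else 0.0
    se = np.sqrt(confidence * (1 - confidence) / n_ante) if n_ante else np.inf
    lower = confidence - norm.ppf(1 - alpha) * se
    return support, confidence, lower

# Toy "transactions": sets of up-regulated genes and labels per sample (hypothetical IDs)
transactions = [{"geneA_up", "geneB_up", "tumor"},
                {"geneA_up", "geneB_up", "tumor"},
                {"geneA_up", "tumor"},
                {"geneB_up"},
                {"geneA_up", "geneB_up", "tumor"}]
s, c, lo = rule_stats(transactions, {"geneA_up", "geneB_up"}, {"tumor"})
print(f"support = {s:.2f}, confidence = {c:.2f}, lower bound = {lo:.2f}")
# A rule would be kept if its lower bound exceeds the chosen minimum confidence.
```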
Li, Yan; Hughes, Jan N.; Kwok, Oi-man; Hsu, Hsien-Yuan
2012-01-01
This study investigated the construct validity of measures of teacher-student support in a sample of 709 ethnically diverse second and third grade academically at-risk students. Confirmatory factor analysis investigated the convergent and discriminant validities of teacher, child, and peer reports of teacher-student support and child conduct problems. Results supported the convergent and discriminant validity of scores on the measures. Peer reports accounted for the largest proportion of trait variance and non-significant method variance. Child reports accounted for the smallest proportion of trait variance and the largest method variance. A model with two latent factors provided a better fit to the data than a model with one factor, providing further evidence of the discriminant validity of measures of teacher-student support. Implications for research, policy, and practice are discussed. PMID:21767024
Code of Federal Regulations, 2014 CFR
2014-04-01
... available upon demand for each day, shift, and drop cycle (this is not required if the system does not track..., beverage containers, etc., into and out of the cage. (j) Variances. The operation must establish, as...
Code of Federal Regulations, 2013 CFR
2013-04-01
... available upon demand for each day, shift, and drop cycle (this is not required if the system does not track..., beverage containers, etc., into and out of the cage. (j) Variances. The operation must establish, as...
Spatio-temporal Reconstruction of Neural Sources Using Indirect Dominant Mode Rejection.
Jafadideh, Alireza Talesh; Asl, Babak Mohammadzadeh
2018-04-27
Adaptive minimum variance based beamformers (MVB) have been successfully applied to magnetoencephalogram (MEG) and electroencephalogram (EEG) data to localize brain activities. However, the performance of these beamformers degrades in situations where correlated or interference sources exist. To overcome this problem, we propose applying the indirect dominant mode rejection (iDMR) beamformer to brain source localization. By modifying the measurement covariance matrix, this method makes the MVB applicable to source localization in the presence of correlated and interference sources. Numerical results on both EEG and MEG data demonstrate that the presented approach accurately reconstructs the time courses of active sources and localizes those sources with high spatial resolution. In addition, results on real AEF data show the good performance of iDMR in empirical situations. Hence, iDMR can be reliably used for brain source localization, especially when there are correlated and interference sources.
2005-07-01
as an access graft is addressed using statistical methods below. Graft consistency can be defined statistically as the variance associated with the sample of grafts tested in...measured using a refractometer (Brix % method). The equilibration data are shown in Graph 1. The results suggest the following equilibration scheme: 40% v/v
Methods for presentation and display of multivariate data
NASA Technical Reports Server (NTRS)
Myers, R. H.
1981-01-01
Methods for the presentation and display of multivariate data are discussed with emphasis placed on the multivariate analysis of variance problems and the Hotelling T² solution in the two-sample case. The methods utilize the concepts of stepwise discriminant analysis and the computation of partial correlation coefficients.
NASA Astrophysics Data System (ADS)
Yun, Wanying; Lu, Zhenzhou; Jiang, Xian
2018-06-01
To efficiently execute variance-based global sensitivity analysis, the law of total variance over successive non-overlapping intervals is first proved, and an efficient space-partition sampling-based approach is subsequently proposed on that basis in this paper. Through partitioning the sample points of the output into different subsets according to different inputs, the proposed approach can efficiently evaluate all the main effects concurrently from one group of sample points. In addition, there is no need to optimize the partition scheme in the proposed approach. The maximum length of the subintervals is decreased by increasing the number of sample points of the model input variables, which guarantees the convergence condition of the space-partition approach well. Furthermore, a new interpretation of the partition idea is given from the perspective of the variance ratio function. Finally, three test examples and one engineering application are employed to demonstrate the accuracy, efficiency and robustness of the proposed approach.
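The partition idea can be illustrated with the common "binning" estimator of a first-order (main effect) index: group the output samples by the value of one input and apply the law of total variance. The sketch below is that generic estimator, not the paper's specific scheme; the bin count and test function are arbitrary choices.

```python
import numpy as np

def main_effect_by_partition(x_i, y, n_bins=20):
    """First-order (main effect) sensitivity index of input x_i via sample partitioning.

    Partitions the samples into equal-probability bins of x_i and applies the
    law of total variance: Var(E[Y | X_i]) / Var(Y).
    """
    edges = np.quantile(x_i, np.linspace(0, 1, n_bins + 1))
    idx = np.clip(np.searchsorted(edges, x_i, side="right") - 1, 0, n_bins - 1)
    cond_means = np.array([y[idx == b].mean() for b in range(n_bins)])
    weights = np.array([(idx == b).mean() for b in range(n_bins)])
    var_cond_mean = np.sum(weights * (cond_means - y.mean()) ** 2)
    return var_cond_mean / y.var()

# Example: Y = X1 + 0.5*X2 with independent uniform inputs (true indices 0.8 and 0.2).
rng = np.random.default_rng(0)
x = rng.uniform(size=(10000, 2))
y = x[:, 0] + 0.5 * x[:, 1]
print(main_effect_by_partition(x[:, 0], y), main_effect_by_partition(x[:, 1], y))
```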
Increasing point-count duration increases standard error
Smith, W.P.; Twedt, D.J.; Hamel, P.B.; Ford, R.P.; Wiedenfeld, D.A.; Cooper, R.J.
1998-01-01
We examined data from point counts of varying duration in bottomland forests of west Tennessee and the Mississippi Alluvial Valley to determine if counting interval influenced sampling efficiency. Estimates of standard error increased as point count duration increased both for cumulative number of individuals and species in both locations. Although point counts appear to yield data with standard errors proportional to means, a square root transformation of the data may stabilize the variance. Using long (>10 min) point counts may reduce sample size and increase sampling error, both of which diminish statistical power and thereby the ability to detect meaningful changes in avian populations.
Global-scale high-resolution (~1 km) modelling of mean, maximum and minimum annual streamflow
NASA Astrophysics Data System (ADS)
Barbarossa, Valerio; Huijbregts, Mark; Hendriks, Jan; Beusen, Arthur; Clavreul, Julie; King, Henry; Schipper, Aafke
2017-04-01
Quantifying mean, maximum and minimum annual flow (AF) of rivers at ungauged sites is essential for a number of applications, including assessments of global water supply, ecosystem integrity and water footprints. AF metrics can be quantified with spatially explicit process-based models, which might be overly time-consuming and data-intensive for this purpose, or with empirical regression models that predict AF metrics based on climate and catchment characteristics. Yet, so far, regression models have mostly been developed at a regional scale and the extent to which they can be extrapolated to other regions is not known. We developed global-scale regression models that quantify mean, maximum and minimum AF as a function of catchment area and catchment-averaged slope, elevation, and mean, maximum and minimum annual precipitation and air temperature. We then used these models to obtain global 30 arc-seconds (~1 km) maps of mean, maximum and minimum AF for each year from 1960 through 2015, based on a newly developed hydrologically conditioned digital elevation model. We calibrated our regression models based on observations of discharge and catchment characteristics from about 4,000 catchments worldwide, ranging from 100 to 10⁶ km² in size, and validated them against independent measurements as well as the output of a number of process-based global hydrological models (GHMs). The variance explained by our regression models ranged up to 90% and the performance of the models compared well with the performance of existing GHMs. Yet, our AF maps provide a level of spatial detail that cannot yet be achieved by current GHMs.
Gorban, Alexander N; Pokidysheva, Lyudmila I; Smirnova, Elena V; Tyukina, Tatiana A
2011-09-01
The "Law of the Minimum" states that growth is controlled by the scarcest resource (limiting factor). This concept was originally applied to plant or crop growth (Justus von Liebig, 1840, Salisbury, Plant physiology, 4th edn., Wadsworth, Belmont, 1992) and quantitatively supported by many experiments. Some generalizations based on more complicated "dose-response" curves were proposed. Violations of this law in natural and experimental ecosystems were also reported. We study models of adaptation in ensembles of similar organisms under load of environmental factors and prove that violation of Liebig's law follows from adaptation effects. If the fitness of an organism in a fixed environment satisfies the Law of the Minimum then adaptation equalizes the pressure of essential factors and, therefore, acts against the Liebig's law. This is the the Law of the Minimum paradox: if for a randomly chosen pair "organism-environment" the Law of the Minimum typically holds, then in a well-adapted system, we have to expect violations of this law.For the opposite interaction of factors (a synergistic system of factors which amplify each other), adaptation leads from factor equivalence to limitations by a smaller number of factors.For analysis of adaptation, we develop a system of models based on Selye's idea of the universal adaptation resource (adaptation energy). These models predict that under the load of an environmental factor a population separates into two groups (phases): a less correlated, well adapted group and a highly correlated group with a larger variance of attributes, which experiences problems with adaptation. Some empirical data are presented and evidences of interdisciplinary applications to econometrics are discussed. © Society for Mathematical Biology 2010
Integrating mean and variance heterogeneities to identify differentially expressed genes.
Ouyang, Weiwei; An, Qiang; Zhao, Jinying; Qin, Huaizhen
2016-12-06
In functional genomics studies, tests on mean heterogeneity have been widely employed to identify differentially expressed genes with distinct mean expression levels under different experimental conditions. Variance heterogeneity (aka, the difference between condition-specific variances) of gene expression levels is simply neglected or calibrated for as an impediment. The mean heterogeneity in the expression level of a gene reflects one aspect of its distribution alteration; and variance heterogeneity induced by condition change may reflect another aspect. Change in condition may alter both mean and some higher-order characteristics of the distributions of expression levels of susceptible genes. In this report, we put forth a conception of mean-variance differentially expressed (MVDE) genes, whose expression means and variances are sensitive to the change in experimental condition. We mathematically proved the null independence of existent mean heterogeneity tests and variance heterogeneity tests. Based on the independence, we proposed an integrative mean-variance test (IMVT) to combine gene-wise mean heterogeneity and variance heterogeneity induced by condition change. The IMVT outperformed its competitors under comprehensive simulations of normality and Laplace settings. For moderate samples, the IMVT well controlled type I error rates, and so did the existent mean heterogeneity tests (i.e., the Welch t test (WT) and the moderated Welch t test (MWT)) and the procedure of separate tests on mean and variance heterogeneities (SMVT), but the likelihood ratio test (LRT) severely inflated type I error rates. In the presence of variance heterogeneity, the IMVT appeared noticeably more powerful than all the valid mean heterogeneity tests. Application to the gene profiles of peripheral circulating B cells raised solid evidence of informative variance heterogeneity. After adjusting for background data structure, the IMVT replicated previous discoveries and identified novel experiment-wide significant MVDE genes. Our results indicate tremendous potential gain from integrating informative variance heterogeneity after adjusting for global confounders and background data structure. The proposed informative integration test better summarizes the impacts of condition change on expression distributions of susceptible genes than do the existent competitors. Therefore, particular attention should be paid to explicitly exploiting the variance heterogeneity induced by condition change in functional genomics analysis.
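One simple way to see the mean-variance integration idea is to combine a Welch t test (mean heterogeneity) with a Brown-Forsythe/Levene test (variance heterogeneity) using Fisher's method, exploiting their null independence. This is an illustrative stand-in, not the paper's exact IMVT statistic.

```python
import numpy as np
from scipy import stats

def combined_mean_variance_test(x, y):
    """Combine mean- and variance-heterogeneity evidence for one gene across two conditions.

    Uses Welch's t test and the Brown-Forsythe (median-centered Levene) test, whose
    p-values are combined with Fisher's method (an illustrative choice, not the IMVT).
    """
    _, p_mean = stats.ttest_ind(x, y, equal_var=False)    # Welch t test
    _, p_var = stats.levene(x, y, center="median")        # Brown-Forsythe scale test
    _, p_combined = stats.combine_pvalues([p_mean, p_var], method="fisher")
    return p_mean, p_var, p_combined

rng = np.random.default_rng(1)
ctrl = rng.normal(0.0, 1.0, 40)
case = rng.normal(0.3, 2.0, 40)   # shifted mean and inflated variance
print(combined_mean_variance_test(ctrl, case))
```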
Sampling and Data Analysis for Environmental Microbiology
DOE Office of Scientific and Technical Information (OSTI.GOV)
Murray, Christopher J.
2001-06-01
A brief review of the literature indicates the importance of statistical analysis in applied and environmental microbiology. Sampling designs are particularly important for successful studies, and it is highly recommended that researchers review their sampling design before heading to the laboratory or the field. Most statisticians have numerous stories of scientists who approached them after their study was complete only to have to tell them that the data they gathered could not be used to test the hypothesis they wanted to address. Once the data are gathered, a large and complex body of statistical techniques is available for analysis of the data. Those methods include both numerical and graphical techniques for exploratory characterization of the data. Hypothesis testing and analysis of variance (ANOVA) are techniques that can be used to compare the mean and variance of two or more groups of samples. Regression can be used to examine the relationships between sets of variables and is often used to examine the dependence of microbiological populations on microbiological parameters. Multivariate statistics provides several methods that can be used for interpretation of datasets with a large number of variables and to partition samples into similar groups, a task that is very common in taxonomy, but also has applications in other fields of microbiology. Geostatistics and other techniques have been used to examine the spatial distribution of microorganisms. The objectives of this chapter are to provide a brief survey of some of the statistical techniques that can be used for sample design and data analysis of microbiological data in environmental studies, and to provide some examples of their use from the literature.
Zheng, Li; Silliman, Stephen E.
2000-01-01
A modification of previously published solutions regarding the spatial variation of hydraulic heads is discussed whereby the semivariogram of increments of head residuals (termed head residual increments, HRIs) is related to the variance and integral scale of the transmissivity field. A first-order solution is developed for the case of a transmissivity field which is isotropic and whose second-order behavior can be characterized by an exponential covariance structure. The estimates of the variance σ_Y² and the integral scale λ of the log transmissivity field are then obtained by fitting a theoretical semivariogram for the HRI to its sample semivariogram. This approach is applied to head data sampled from a series of two-dimensional, simulated aquifers with isotropic, exponential covariance structures and varying degrees of heterogeneity (σ_Y² = 0.25, 0.5, 1.0, 2.0, and 5.0). The results show that this method provided reliable estimates for both λ and σ_Y² in aquifers with values of σ_Y² up to 2.0, but the errors in those estimates were higher for σ_Y² equal to 5.0. It is also demonstrated through numerical experiments and theoretical arguments that the head residual increments will provide a sample semivariogram with a lower variance than will the use of the head residuals without calculation of increments.
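A minimal sketch of the fitting step: compute a sample semivariogram from a series of head residual increments and fit an exponential model for the sill and integral scale by least squares. The evenly spaced one-dimensional series, the AR(1) stand-in data, and the use of scipy's curve_fit are assumptions for illustration; the paper works with two-dimensional simulated aquifers.

```python
import numpy as np
from scipy.optimize import curve_fit

def sample_semivariogram(z, max_lag):
    """Empirical semivariogram: gamma(h) = mean((z(x+h) - z(x))^2) / 2 for lags 1..max_lag."""
    lags = np.arange(1, max_lag + 1)
    gamma = np.array([0.5 * np.mean((z[h:] - z[:-h]) ** 2) for h in lags])
    return lags, gamma

def exponential_model(h, sill, integral_scale):
    """Theoretical semivariogram for an exponential covariance structure."""
    return sill * (1.0 - np.exp(-h / integral_scale))

# Toy stand-in series with exponential covariance (an AR(1) process), assumed evenly spaced.
rng = np.random.default_rng(2)
z = np.zeros(500)
for t in range(1, 500):
    z[t] = 0.9 * z[t - 1] + rng.normal(scale=0.5)

lags, gamma = sample_semivariogram(z, max_lag=50)
(sill_hat, lambda_hat), _ = curve_fit(exponential_model, lags, gamma, p0=[gamma[-1], 10.0])
print(sill_hat, lambda_hat)   # estimates of the variance (sill) and the integral scale
```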
Anthropometry as a predictor of vertical jump heights derived from an instrumented platform.
Caruso, John F; Daily, Jeremy S; Mason, Melissa L; Shepherd, Catherine M; McLagan, Jessica R; Marshall, Mallory R; Walker, Ron H; West, Jason O
2012-01-01
The purpose of the current study was to examine the vertical jump height-anthropometry relationship with jump data obtained from an instrumented platform. Our methods required college-aged (n = 177) subjects to make 3 visits to our laboratory to measure the following anthropometric variables: height, body mass, upper arm length (UAL), lower arm length, upper leg length, and lower leg length. Per jump, maximum height was measured in 3 ways: from the subjects' takeoff, hang times, and as they landed on the platform. Standard multivariate regression assessed how well anthropometry predicted the criterion variance per gender (men, women, pooled) and jump height method (takeoff, hang time, landing) combination. Z-scores indicated that small amounts of the total data were outliers. The results showed that the majority of outliers were from jump heights calculated as women landed on the platform. With the genders pooled, anthropometry predicted a significant (p < 0.05) amount of variance from jump heights calculated from both takeoff and hang time. The anthropometry-vertical jump relationship was not significant from heights calculated as subjects landed on the platform, likely due to the female outliers. Yet anthropometric data of men did predict a significant amount of variance from heights calculated when they landed on the platform; univariate correlations of men's data revealed that UAL was the best predictor. It was concluded that the large sample of men's data led to greater data heterogeneity and a higher univariate correlation. Because of our sample size and data heterogeneity, practical applications suggest that coaches may find our results best predict performance for a variety of college-aged athletes and vertical jump enthusiasts.
Bassani, Diego G; Corsi, Daniel J; Gaffey, Michelle F; Barros, Aluisio J D
2014-01-01
Worse health outcomes including higher morbidity and mortality are most often observed among the poorest fractions of a population. In this paper we present and validate national, regional and state-level distributions of national wealth index scores, for urban and rural populations, derived from household asset data collected in six survey rounds in India between 1992-3 and 2007-8. These new indices and their sub-national distributions allow for comparative analyses of a standardized measure of wealth across time and at various levels of population aggregation in India. Indices were derived through principal components analysis (PCA) performed using standardized variables from a correlation matrix to minimize differences in variance. Valid and simple indices were constructed with the minimum number of assets needed to produce scores with enough variability to allow definition of unique decile cut-off points in each urban and rural area of all states. For all indices, the first PCA components explained between 36% and 43% of the variance in household assets. Using sub-national distributions of national wealth index scores, mean height-for-age z-scores increased from the poorest to the richest wealth quintiles for all surveys, and stunting prevalence was higher among the poorest and lower among the wealthiest. Urban and rural decile cut-off values for India, for the six regions and for the 24 major states revealed large variability in wealth by geographical area and level, and rural wealth score gaps exceeded those observed in urban areas. The large variability in sub-national distributions of national wealth index scores indicates the importance of accounting for such variation when constructing wealth indices and deriving score distribution cut-off points. Such an approach allows for proper within-sample economic classification, resulting in scores that are valid indicators of wealth and correlate well with health outcomes, and enables wealth-related analyses at whichever geographical area and level may be most informative for policy-making processes.
Noise and analyzer-crystal angular position analysis for analyzer-based phase-contrast imaging
NASA Astrophysics Data System (ADS)
Majidi, Keivan; Li, Jun; Muehleman, Carol; Brankov, Jovan G.
2014-04-01
The analyzer-based phase-contrast x-ray imaging (ABI) method is emerging as a potential alternative to conventional radiography. Like many of the modern imaging techniques, ABI is a computed imaging method (meaning that images are calculated from raw data). ABI can simultaneously generate a number of planar parametric images containing information about absorption, refraction, and scattering properties of an object. These images are estimated from raw data acquired by measuring (sampling) the angular intensity profile of the x-ray beam passed through the object at different angular positions of the analyzer crystal. The noise in the estimated ABI parametric images depends upon imaging conditions like the source intensity (flux), measurements angular positions, object properties, and the estimation method. In this paper, we use the Cramér-Rao lower bound (CRLB) to quantify the noise properties in parametric images and to investigate the effect of source intensity, different analyzer-crystal angular positions and object properties on this bound, assuming a fixed radiation dose delivered to an object. The CRLB is the minimum bound for the variance of an unbiased estimator and defines the best noise performance that one can obtain regardless of which estimation method is used to estimate ABI parametric images. The main result of this paper is that the variance (hence the noise) in parametric images is directly proportional to the source intensity and only a limited number of analyzer-crystal angular measurements (eleven for uniform and three for optimal non-uniform) are required to get the best parametric images. The following angular measurements only spread the total dose to the measurements without improving or worsening CRLB, but the added measurements may improve parametric images by reducing estimation bias. Next, using CRLB we evaluate the multiple-image radiography, diffraction enhanced imaging and scatter diffraction enhanced imaging estimation techniques, though the proposed methodology can be used to evaluate any other ABI parametric image estimation technique.
McGarvey, Richard; Burch, Paul; Matthews, Janet M
2016-01-01
Natural populations of plants and animals spatially cluster because (1) suitable habitat is patchy, and (2) within suitable habitat, individuals aggregate further into clusters of higher density. We compare the precision of random and systematic field sampling survey designs under these two processes of species clustering. Second, we evaluate the performance of 13 estimators for the variance of the sample mean from a systematic survey. Replicated simulated surveys, as counts from 100 transects, allocated either randomly or systematically within the study region, were used to estimate population density in six spatial point populations including habitat patches and Matérn circular clustered aggregations of organisms, together and in combination. The standard one-start aligned systematic survey design, a uniform 10 x 10 grid of transects, was much more precise. Variances of the 10 000 replicated systematic survey mean densities were one-third to one-fifth of those from randomly allocated transects, implying transect sample sizes giving equivalent precision by random survey would need to be three to five times larger. Organisms being restricted to patches of habitat was alone sufficient to yield this precision advantage for the systematic design. But this improved precision for systematic sampling in clustered populations is underestimated by standard variance estimators used to compute confidence intervals. True variance for the survey sample mean was computed from the variance of 10 000 simulated survey mean estimates. Testing 10 published and three newly proposed variance estimators, the two variance estimators (v) that corrected for inter-transect correlation (ν₈ and ν(W)) were the most accurate and also the most precise in clustered populations. These greatly outperformed the two "post-stratification" variance estimators (ν₂ and ν₃) that are now more commonly applied in systematic surveys. Similar variance estimator performance rankings were found with a second differently generated set of spatial point populations, ν₈ and ν(W) again being the best performers in the longer-range autocorrelated populations. However, no systematic variance estimators tested were free from bias. On balance, systematic designs bring narrower confidence intervals in clustered populations, while random designs permit unbiased estimates of (often wider) confidence intervals. The search continues for better estimators of sampling variance for the systematic survey mean.
Seguchi, Noriko; Quintyn, Conrad B; Yonemoto, Shiori; Takamuku, Hirofumi
2017-09-10
We explore variations in body and limb proportions of the Jomon hunter-gatherers (14,000-2500 BP), the Yayoi agriculturalists (2500-1700 BP) of Japan, and the Kumejima Islanders of the Ryukyus (1600-1800 AD) with 11 geographically diverse skeletal postcranial samples from Africa, Europe, Asia, Australia, and North America using brachial-crural indices, femur head-breadth-to-femur length ratio, femur head-breadth-to-lower-limb-length ratio, and body mass as indicators of phenotypic climatic adaptation. Specifically, we test the hypothesis that variation in limb proportions seen in Jomon, Yayoi, and Kumejima is a complex interaction of genetic adaptation; development and allometric constraints; selection, gene flow and genetic drift with changing cultural factors (i.e., nutrition) and climate. The skeletal data (1127 individuals) were subjected to principal components analysis, Manly's permutation multiple regression tests, and Relethford-Blangero analysis. The results of Manly's tests indicate that body proportions and body mass are significantly correlated with latitude, and minimum and maximum temperatures while limb proportions were not significantly correlated with these climatic variables. Principal components plots separated "climatic zones": tropical, temperate, and arctic populations. The indigenous Jomon showed cold-adapted body proportions and warm-adapted limb proportions. Kumejima showed cold-adapted body proportions and limbs. The Yayoi adhered to the Allen-Bergmann expectation of cold-adapted body and limb proportions. Relethford-Blangero analysis showed that Kumejima experienced gene flow indicated by high observed variances while Jomon experienced genetic drift indicated by low observed variances. The complex interaction of evolutionary forces and development/nutritional constraints are implicated in the mismatch of limb and body proportions. © 2017 Wiley Periodicals, Inc.
Electron Heat Flux in Pressure Balance Structures at Ulysses
NASA Technical Reports Server (NTRS)
Yamauchi, Yohei; Suess, Steven T.; Sakurai, Takashi; Whitaker, Ann F. (Technical Monitor)
2001-01-01
Pressure balance structures (PBSs) are a common feature in the high-latitude solar wind near solar minimum. From previous studies, PBSs are believed to be remnants of coronal plumes and to be related to network activity such as magnetic reconnection in the photosphere. We investigated the magnetic structures of the PBSs, applying a minimum variance analysis to Ulysses/Magnetometer data. At the 2001 AGU Spring Meeting, we reported that PBSs have structures like current sheets or plasmoids, and suggested that they are associated with network activity at the base of polar plumes. In this paper, we have analyzed high-energy electron data from Ulysses/SWOOPS to see whether bi-directional electron flow exists and to confirm the conclusions more precisely. As a result, although most events show a typical flux directed away from the Sun, we have obtained evidence that some PBSs show bi-directional electron flux and others show an isotropic distribution of electron pitch angles. The evidence shows that plasmoids are flowing away from the Sun, changing their flow direction dynamically in a way not caused by Alfven waves. From this, we have concluded that PBSs are generated due to network activity at the base of polar plumes and their magnetic structures are current sheets or plasmoids.
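Minimum variance analysis of magnetometer data has a standard recipe: eigen-decompose the covariance of the field components and take the eigenvector of the smallest eigenvalue as the estimated boundary normal. The sketch below is that generic recipe, not the authors' specific processing chain.

```python
import numpy as np

def minimum_variance_analysis(B):
    """Classic minimum variance analysis of a magnetic field time series.

    B : (n_samples, 3) array of field vectors (e.g., magnetometer data over an interval).
    Returns the eigenvalues (ascending) and eigenvectors of the field covariance; the
    eigenvector belonging to the smallest eigenvalue is the minimum variance direction,
    used as the estimated normal of a current-sheet-like boundary.
    """
    M = np.cov(B, rowvar=False)            # 3x3 magnetic variance matrix
    eigvals, eigvecs = np.linalg.eigh(M)   # eigh returns eigenvalues in ascending order
    normal_estimate = eigvecs[:, 0]
    return eigvals, eigvecs, normal_estimate
```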
Trends in Allergic Conditions among Children: United States, 1997-2011
... and imputed family income ( 13 ). Data source and methods Prevalence estimates for allergic conditions were obtained from ... sample design of NHIS. The Taylor series linearization method was chosen for variance estimation. Differences between percentages ...
Dependability of Data Derived from Time Sampling Methods with Multiple Observation Targets
ERIC Educational Resources Information Center
Johnson, Austin H.; Chafouleas, Sandra M.; Briesch, Amy M.
2017-01-01
In this study, generalizability theory was used to examine the extent to which (a) time-sampling methodology, (b) number of simultaneous behavior targets, and (c) individual raters influenced variance in ratings of academic engagement for an elementary-aged student. Ten graduate-student raters, with an average of 7.20 hr of previous training in…
Appropriate Statistical Analysis for Two Independent Groups of Likert-Type Data
ERIC Educational Resources Information Center
Warachan, Boonyasit
2011-01-01
The objective of this research was to determine the robustness and statistical power of three different methods for testing the hypothesis that ordinal samples of five and seven Likert categories come from equal populations. The three methods are the two sample t-test with equal variances, the Mann-Whitney test, and the Kolmogorov-Smirnov test. In…
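A minimal sketch of running the three compared procedures on two independent samples of Likert responses, using their scipy implementations; the simulated responses and sample sizes are arbitrary.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
group1 = rng.integers(1, 6, size=30)      # 5-category Likert responses for group 1
group2 = rng.integers(1, 6, size=30)      # 5-category Likert responses for group 2

t_stat, p_t = stats.ttest_ind(group1, group2, equal_var=True)        # two-sample t, equal variances
u_stat, p_u = stats.mannwhitneyu(group1, group2, alternative="two-sided")
ks_stat, p_ks = stats.ks_2samp(group1, group2)

print(f"t-test p={p_t:.3f}, Mann-Whitney p={p_u:.3f}, Kolmogorov-Smirnov p={p_ks:.3f}")
```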
Harris, Alexandre M.; DeGiorgio, Michael
2016-01-01
Gene diversity, or expected heterozygosity (H), is a common statistic for assessing genetic variation within populations. Estimation of this statistic decreases in accuracy and precision when individuals are related or inbred, due to increased dependence among allele copies in the sample. The original unbiased estimator of expected heterozygosity underestimates true population diversity in samples containing relatives, as it only accounts for sample size. More recently, a general unbiased estimator of expected heterozygosity was developed that explicitly accounts for related and inbred individuals in samples. Though unbiased, this estimator's variance is greater than that of the original estimator. To address this issue, we introduce a general unbiased estimator of gene diversity for samples containing related or inbred individuals, which employs the best linear unbiased estimator of allele frequencies, rather than the commonly used sample proportion. We examine the properties of this estimator, H̃_BLUE, relative to alternative estimators using simulations and theoretical predictions, and show that it predominantly has the smallest mean squared error relative to others. Further, we empirically assess the performance of H̃_BLUE on a global human microsatellite dataset of 5795 individuals, from 267 populations, genotyped at 645 loci. Additionally, we show that the improved variance of H̃_BLUE leads to improved estimates of the population differentiation statistic, F_ST, which employs measures of gene diversity within its calculation. Finally, we provide an R script, BestHet, to compute this estimator from genomic and pedigree data. PMID:28040781
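For reference, the "original" unbiased estimator mentioned above corrects only for sample size; a minimal sketch follows. The BLUE-based estimator additionally requires the sample's kinship and inbreeding structure and is not reproduced here.

```python
import numpy as np
from collections import Counter

def expected_heterozygosity(alleles):
    """Classical unbiased gene diversity from n sampled allele copies at one locus:
    H = n/(n-1) * (1 - sum_i p_i^2), using sample allele proportions p_i.
    The BLUE-based version would instead weight allele frequencies by the
    sample's kinship/inbreeding structure."""
    n = len(alleles)
    counts = np.array(list(Counter(alleles).values()), dtype=float)
    p = counts / n
    return n / (n - 1) * (1.0 - np.sum(p ** 2))

print(expected_heterozygosity(["A", "A", "B", "C", "B", "A"]))
```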
A Review on Sensor, Signal, and Information Processing Algorithms (PREPRINT)
2010-01-01
processing [214], ambiguity surface averaging [215], optimum uncertain field tracking, and optimal minimum variance track-before-detect [216]. In [217, 218...2) (2001) 739–746. [216] S. L. Tantum, L. W. Nolte, J. L. Krolik, K. Harmanci, The performance of matched-field track-before-detect methods using
A Comparison of Item Selection Techniques for Testlets
ERIC Educational Resources Information Center
Murphy, Daniel L.; Dodd, Barbara G.; Vaughn, Brandon K.
2010-01-01
This study examined the performance of the maximum Fisher's information, the maximum posterior weighted information, and the minimum expected posterior variance methods for selecting items in a computerized adaptive testing system when the items were grouped in testlets. A simulation study compared the efficiency of ability estimation among the…
African-American adolescents’ stress responses after the 9/11/01 terrorist attacks
Barnes, Vernon A.; Treiber, Frank A.; Ludwig, David A.
2012-01-01
Purpose To examine the impact of indirect exposure to the 9/11/01 attacks upon physical and emotional stress-related responses in a community sample of African-American (AA) adolescents. Methods Three months after the 9/11/01 terrorist attacks, 406 AA adolescents (mean age [SD] of 16.1 ± 1.3 years) from an inner-city high school in Augusta, GA were evaluated with a 12-item 5-point Likert scale measuring loss of psychosocial resources (PRS) such as control, hope, optimism, and perceived support, a 17-item 5-point Likert scale measuring post-traumatic stress symptomatology (PCL), and measures of state and trait anger, anger expression, and hostility. Given the observational nature of the study, statistical differences and correlations were evaluated for effect size before statistical testing (5% minimum variance explained). Bootstrapping was used for testing mean differences and differences between correlations. Results PCL scores indicated that approximately 10% of the sample was experiencing probable clinically significant levels of post-traumatic distress (PCL score > 50). The PCL and PRS were moderately correlated with a r = .59. Gender differences for the PCL and PRS were small, accounting for 1% of the total variance. Higher PCL scores were associated with higher state anger (r = .47), as well as measures of anger-out (r = .32) and trait anger (r = .34). Higher PRS scores were associated only with higher state anger (r = .27). Scores on the two 9/11/01-related scales were not statistically associated (i.e., less than 5% of the variance explained) with traits of anger control, anger-in, or hostility. Conclusions The majority of students were not overly stressed by indirect exposure to the events of 9/11/01, perhaps owing to the temporal, social, and/or geographical distance from the event. Those who reported greater negative impact appeared to also be experiencing higher levels of current anger and exhibited a characterologic style of higher overt anger expression. PMID:15737775
Husby, Arild; Gustafsson, Lars; Qvarnström, Anna
2012-01-01
The avian incubation period is associated with high energetic costs and mortality risks suggesting that there should be strong selection to reduce the duration to the minimum required for normal offspring development. Although there is much variation in the duration of the incubation period across species, there is also variation within species. It is necessary to estimate to what extent this variation is genetically determined if we want to predict the evolutionary potential of this trait. Here we use a long-term study of collared flycatchers to examine the genetic basis of variation in incubation duration. We demonstrate limited genetic variance as reflected in the low and nonsignificant additive genetic variance, with a corresponding heritability of 0.04 and coefficient of additive genetic variance of 2.16. Any selection acting on incubation duration will therefore be inefficient. To our knowledge, this is the first time heritability of incubation duration has been estimated in a natural bird population. © 2011 by The University of Chicago.
USDA-ARS?s Scientific Manuscript database
We proposed a method to estimate the error variance among non-replicated genotypes, thus to estimate the genetic parameters by using replicated controls. We derived formulas to estimate sampling variances of the genetic parameters. Computer simulation indicated that the proposed methods of estimatin...
Soave, David; Sun, Lei
2017-09-01
We generalize Levene's test for variance (scale) heterogeneity between k groups for more complex data, when there are sample correlation and group membership uncertainty. Following a two-stage regression framework, we show that least absolute deviation regression must be used in the stage 1 analysis to ensure a correct asymptotic χ²_{k-1}/(k-1) distribution of the generalized scale (gS) test statistic. We then show that the proposed gS test is independent of the generalized location test, under the joint null hypothesis of no mean and no variance heterogeneity. Consequently, we generalize the recently proposed joint location-scale (gJLS) test, valuable in settings where there is an interaction effect but one interacting variable is not available. We evaluate the proposed method via an extensive simulation study and two genetic association application studies. © 2017 The Authors Biometrics published by Wiley Periodicals, Inc. on behalf of International Biometric Society.
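The two-stage construction can be sketched as follows: stage 1 fits a least absolute deviation (median) regression and takes absolute residuals; stage 2 tests whether those absolute residuals differ across groups. The statsmodels QuantReg model and the simple one-way F test in stage 2 are illustrative choices; the published gS test additionally handles sample correlation and group membership uncertainty.

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.regression.quantile_regression import QuantReg
from scipy import stats

rng = np.random.default_rng(4)
n = 300
group = rng.integers(0, 3, n)                                       # k = 3 groups
x = rng.normal(size=n)                                              # a covariate
y = 1.0 + 0.5 * x + rng.normal(scale=1.0 + 0.5 * group, size=n)     # variance differs by group

# Stage 1: least absolute deviation (median) regression of y on the covariates.
X = sm.add_constant(x)
stage1 = QuantReg(y, X).fit(q=0.5)
d = np.abs(y - stage1.predict(X))                                   # absolute residuals

# Stage 2: Levene-type test of whether E|residual| differs across the k groups.
stat, p_value = stats.f_oneway(*[d[group == g] for g in range(3)])
print(p_value)
```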
Ozone data and mission sampling analysis
NASA Technical Reports Server (NTRS)
Robbins, J. L.
1980-01-01
A methodology was developed to analyze discrete data obtained from the global distribution of ozone. Statistical analysis techniques were applied to describe the distribution of data variance in terms of empirical orthogonal functions and components of spherical harmonic models. The effects of uneven data distribution and missing data were considered. Data fill based on the autocorrelation structure of the data is described. Computer coding of the analysis techniques is included.
Khandoker, Ahsan H; Karmakar, Chandan K; Begg, Rezaul K; Palaniswami, Marimuthu
2007-01-01
As humans age or are influenced by pathology of the neuromuscular system, gait patterns are known to adjust, accommodating for reduced function in the balance control system. The aim of this study was to investigate the effectiveness of a wavelet based multiscale analysis of a gait variable [minimum toe clearance (MTC)] in deriving indexes for understanding age-related declines in gait performance and screening of balance impairments in the elderly. MTC during walking on a treadmill for 30 healthy young, 27 healthy elderly and 10 falls risk elderly subjects with a history of tripping falls was analyzed. The MTC signal from each subject was decomposed into eight detailed signals at different wavelet scales by using the discrete wavelet transform. The variances of the detailed signals at scales 8 to 1 were calculated. The multiscale exponent (beta) was then estimated from the slope of the variance progression at successive scales. The variance at scale 5 was significantly (p<0.01) different between the young and healthy elderly groups. Results also suggest that the beta between scales 1 and 2 is effective for recognizing falls risk gait patterns. Results have implications for quantifying gait dynamics in normal, ageing and pathological conditions. Early detection of gait pattern changes due to ageing and balance impairments using wavelet-based multiscale analysis might provide the opportunity to initiate preemptive measures to avoid injurious falls.
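A sketch of the multiscale variance computation using PyWavelets: decompose the series, take the variance of the detail coefficients at each scale, and estimate the exponent from the slope of log2(variance) against scale. The db4 wavelet, the decomposition depth, the fitting range and the synthetic series are assumptions, not the study's exact settings.

```python
import numpy as np
import pywt  # PyWavelets

def multiscale_variance_exponent(signal, wavelet="db4", max_level=8):
    """Variance of detail coefficients at scales 1..max_level and the slope (beta)
    of log2(variance) versus scale, fitted over all scales."""
    coeffs = pywt.wavedec(signal, wavelet, level=max_level)   # [cA_L, cD_L, ..., cD_1]
    details = coeffs[1:][::-1]                                # reorder to scales 1..L
    variances = np.array([np.var(d) for d in details])
    scales = np.arange(1, max_level + 1)
    beta = np.polyfit(scales, np.log2(variances), 1)[0]
    return variances, beta

# Toy usage on a noisy stand-in for a minimum-toe-clearance series.
rng = np.random.default_rng(5)
mtc = 1.5 + 0.1 * np.sin(np.arange(2048) / 10.0) + 0.05 * rng.normal(size=2048)
print(multiscale_variance_exponent(mtc))
```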
NASA Astrophysics Data System (ADS)
Lehmkuhl, John F.
1984-03-01
The concept of minimum populations of wildlife and plants has only recently been discussed in the literature. Population genetics has emerged as a basic underlying criterion for determining minimum population size. This paper presents a genetic framework and procedure for determining minimum viable population size and dispersion strategies in the context of multiple-use land management planning. A procedure is presented for determining minimum population size based on maintenance of genetic heterozygosity and reduction of inbreeding. A minimum effective population size (N_e) of 50 breeding animals is taken from the literature as the minimum short-term size to keep inbreeding below 1% per generation. Steps in the procedure adjust N_e to account for variance in progeny number, unequal sex ratios, overlapping generations, population fluctuations, and period of habitat/population constraint. The result is an approximate census number that falls within a range of effective population size of 50-500 individuals. This population range defines the time range of short- to long-term population fitness and evolutionary potential. The length of the term is a relative function of the species generation time. Two population dispersion strategies are proposed: core population and dispersed population.
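Two of the textbook adjustments referred to above have simple closed forms, sketched here: the correction for an unequal sex ratio and the harmonic-mean correction for fluctuating population size. These are standard population-genetics formulas, not the paper's full worksheet.

```python
def ne_unequal_sex_ratio(n_males, n_females):
    """Effective size with unequal numbers of breeding males and females."""
    return 4.0 * n_males * n_females / (n_males + n_females)

def ne_fluctuating(census_sizes):
    """Effective size over generations with fluctuating numbers: the harmonic mean."""
    t = len(census_sizes)
    return t / sum(1.0 / n for n in census_sizes)

# Example: 10 breeding males and 40 females give Ne = 32, not 50;
# census sizes of 100, 20 and 100 give Ne of roughly 43.
print(ne_unequal_sex_ratio(10, 40), ne_fluctuating([100, 20, 100]))
```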
ERIC Educational Resources Information Center
Hofer, Scott M.; Flaherty, Brian P.; Hoffman, Lesa
2006-01-01
The effect of time-related mean differences on estimates of association in cross-sectional studies has not been widely recognized in developmental and aging research. Cross-sectional studies of samples varying in age have found moderate to high levels of shared age-related variance among diverse age-related measures. These findings may be…
Accident prediction model for public highway-rail grade crossings.
Lu, Pan; Tolliver, Denver
2016-05-01
Considerable research has focused on roadway accident frequency analysis, but relatively little research has examined safety evaluation at highway-rail grade crossings. Highway-rail grade crossings are critical spatial locations of utmost importance for transportation safety because traffic crashes at highway-rail grade crossings are often catastrophic with serious consequences. The Poisson regression model has been employed as a good starting point for analyzing vehicle accident frequency for many years. The most commonly applied variations of Poisson include the negative binomial and zero-inflated Poisson models. These models are used to deal with common crash data issues such as over-dispersion (sample variance is larger than the sample mean) and a preponderance of zeros (low sample mean and small sample size). On rare occasions traffic crash data have been shown to be under-dispersed (sample variance is smaller than the sample mean), and traditional distributions such as Poisson or negative binomial cannot handle under-dispersion well. The objective of this study is to investigate and compare various alternative highway-rail grade crossing accident frequency models that can handle the under-dispersion issue. The contributions of the paper are two-fold: (1) application of probability models to deal with under-dispersion issues and (2) insights regarding vehicle crashes at public highway-rail grade crossings. Copyright © 2016 Elsevier Ltd. All rights reserved.
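A quick way to see the dispersion issue the paper targets: fit a Poisson GLM and compare the Pearson chi-square statistic to its residual degrees of freedom; a ratio well below 1 indicates under-dispersion. The statsmodels-based sketch below uses simulated under-dispersed counts, not the crossing inventory data.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(6)
n_obs = 500
traffic = rng.normal(size=n_obs)                           # stand-in covariate (e.g., exposure)
mu = np.exp(0.2 + 0.3 * traffic)                           # mean crash count per crossing
crashes = rng.binomial(5, np.clip(mu / 5.0, 0.0, 1.0))     # binomial counts are under-dispersed

X = sm.add_constant(traffic)
poisson_fit = sm.GLM(crashes, X, family=sm.families.Poisson()).fit()
dispersion = poisson_fit.pearson_chi2 / poisson_fit.df_resid
print(f"Pearson chi2 / df = {dispersion:.2f}  (values well below 1 suggest under-dispersion)")
```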
NASA Technical Reports Server (NTRS)
Chhikara, R. S.; Perry, C. R., Jr. (Principal Investigator)
1980-01-01
The problem of determining the stratum variances required for an optimum sample allocation for remotely sensed crop surveys is investigated with emphasis on an approach based on the concept of stratum variance as a function of the sampling unit size. A methodology using existing and easily available historical statistics is developed for obtaining initial estimates of stratum variances. The procedure is applied to stratum variances for wheat in the U.S. Great Plains and is evaluated based on the numerical results obtained. It is shown that the proposed technique is viable and performs satisfactorily with the use of a conservative value (smaller than the expected value) for the field size and with the use of crop statistics from the small political division level.
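The stratum variances feed into optimum (Neyman) allocation, in which the sample size for each stratum is proportional to the product of the stratum size and its standard deviation. The sketch below applies that generic formula to illustrative numbers, not the Great Plains survey figures.

```python
import numpy as np

def neyman_allocation(stratum_sizes, stratum_sds, total_n):
    """Optimum allocation of total_n sample units: n_h proportional to N_h * S_h."""
    weights = np.asarray(stratum_sizes, dtype=float) * np.asarray(stratum_sds, dtype=float)
    return np.round(total_n * weights / weights.sum()).astype(int)

# Illustrative strata: counts of sampling units and assumed stratum SDs of crop proportion.
print(neyman_allocation([1200, 800, 500], [0.15, 0.25, 0.40], total_n=100))
```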
Assessment of metabolic phenotypic variability in children’s urine using 1H NMR spectroscopy
NASA Astrophysics Data System (ADS)
Maitre, Léa; Lau, Chung-Ho E.; Vizcaino, Esther; Robinson, Oliver; Casas, Maribel; Siskos, Alexandros P.; Want, Elizabeth J.; Athersuch, Toby; Slama, Remy; Vrijheid, Martine; Keun, Hector C.; Coen, Muireann
2017-04-01
The application of metabolic phenotyping in clinical and epidemiological studies is limited by a poor understanding of inter-individual, intra-individual and temporal variability in metabolic phenotypes. Using 1H NMR spectroscopy we characterised short-term variability in urinary metabolites measured from 20 children aged 8-9 years old. Daily spot morning, night-time and pooled (50:50 morning and night-time) urine samples across six days (18 samples per child) were analysed, and 44 metabolites quantified. Intraclass correlation coefficients (ICC) and mixed effect models were applied to assess the reproducibility and biological variance of metabolic phenotypes. Excellent analytical reproducibility and precision was demonstrated for the 1H NMR spectroscopic platform (median CV 7.2%). Pooled samples captured the best inter-individual variability with an ICC of 0.40 (median). Trimethylamine, N-acetyl neuraminic acid, 3-hydroxyisobutyrate, 3-hydroxybutyrate/3-aminoisobutyrate, tyrosine, valine and 3-hydroxyisovalerate exhibited the highest stability with over 50% of variance specific to the child. The pooled sample was shown to capture the most inter-individual variance in the metabolic phenotype, which is of importance for molecular epidemiology study design. A substantial proportion of the variation in the urinary metabolome of children is specific to the individual, underlining the potential of such data to inform clinical and exposome studies conducted early in life.
Intra-class correlation estimates for assessment of vitamin A intake in children.
Agarwal, Girdhar G; Awasthi, Shally; Walter, Stephen D
2005-03-01
In many community-based surveys, multi-level sampling is inherent in the design. In the design of these studies, especially to calculate the appropriate sample size, investigators need good estimates of intra-class correlation coefficient (ICC), along with the cluster size, to adjust for variation inflation due to clustering at each level. The present study used data on the assessment of clinical vitamin A deficiency and intake of vitamin A-rich food in children in a district in India. For the survey, 16 households were sampled from 200 villages nested within eight randomly-selected blocks of the district. ICCs and components of variances were estimated from a three-level hierarchical random effects analysis of variance model. Estimates of ICCs and variance components were obtained at village and block levels. Between-cluster variation was evident at each level of clustering. In these estimates, ICCs were inversely related to cluster size, but the design effect could be substantial for large clusters. At the block level, most ICC estimates were below 0.07. At the village level, many ICC estimates ranged from 0.014 to 0.45. These estimates may provide useful information for the design of epidemiological studies in which the sampled (or allocated) units range in size from households to large administrative zones.
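For a single level of clustering, the ICC can be read off a one-way random-effects ANOVA; the sketch below shows that computation for a balanced toy example. The three-level design in the study would normally be fitted with a mixed-effects model instead.

```python
import numpy as np

def icc_oneway(groups):
    """ICC(1) from a one-way random-effects ANOVA.

    groups: list of 1-D arrays, one per cluster (e.g., children within a village).
    Assumes roughly equal cluster sizes; n_bar is used as the common size.
    """
    k = len(groups)
    n_bar = np.mean([len(g) for g in groups])           # average cluster size
    grand = np.concatenate(groups).mean()
    msb = n_bar * np.sum([(g.mean() - grand) ** 2 for g in groups]) / (k - 1)
    msw = np.sum([np.sum((g - g.mean()) ** 2) for g in groups]) / (sum(len(g) for g in groups) - k)
    return (msb - msw) / (msb + (n_bar - 1) * msw)

# Toy binary outcome (e.g., vitamin A-rich food intake) for three clusters of four children.
clusters = [np.array([1, 0, 1, 1]), np.array([0, 0, 1, 0]), np.array([1, 1, 1, 0])]
print(icc_oneway(clusters))
```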
Image correlation and sampling study
NASA Technical Reports Server (NTRS)
Popp, D. J.; Mccormack, D. S.; Sedwick, J. L.
1972-01-01
The development of analytical approaches for solving image correlation and image sampling of multispectral data is discussed. Relevant multispectral image statistics which are applicable to image correlation and sampling are identified. The general image statistics include intensity mean, variance, amplitude histogram, power spectral density function, and autocorrelation function. The translation problem associated with digital image registration and the analytical means for comparing commonly used correlation techniques are considered. General expressions for determining the reconstruction error for specific image sampling strategies are developed.
New Trends in Gender and Mathematics Performance: A Meta-Analysis
Lindberg, Sara M.; Hyde, Janet Shibley; Petersen, Jennifer L.; Linn, Marcia C.
2010-01-01
In this paper, we use meta-analysis to analyze gender differences in recent studies of mathematics performance. First, we meta-analyzed data from 242 studies published between 1990 and 2007, representing the testing of 1,286,350 people. Overall, d = .05, indicating no gender difference, and VR = 1.08, indicating nearly equal male and female variances. Second, we analyzed data from large data sets based on probability sampling of U.S. adolescents over the past 20 years: the NLSY, NELS88, LSAY, and NAEP. Effect sizes for the gender difference ranged between −0.15 and +0.22. Variance ratios ranged from 0.88 to 1.34. Taken together these findings support the view that males and females perform similarly in mathematics. PMID:21038941
Wang, Xuefeng; Lee, Seunggeun; Zhu, Xiaofeng; Redline, Susan; Lin, Xihong
2013-12-01
Family-based genetic association studies of related individuals provide opportunities to detect genetic variants that complement studies of unrelated individuals. Most statistical methods for family association studies for common variants are single marker based, which test one SNP at a time. In this paper, we consider testing the effect of an SNP set, e.g., SNPs in a gene, in family studies, for both continuous and discrete traits. Specifically, we propose a generalized estimating equations (GEE) based kernel association test, a variance component based testing method, to test for the association between a phenotype and multiple variants in an SNP set jointly using family samples. The proposed approach allows for both continuous and discrete traits, where the correlation among family members is taken into account through the use of an empirical covariance estimator. We derive the theoretical distribution of the proposed statistic under the null and develop analytical methods to calculate the P-values. We also propose an efficient resampling method for correcting for small sample size bias in family studies. The proposed method allows for easily incorporating covariates and SNP-SNP interactions. Simulation studies show that the proposed method properly controls for type I error rates under both random and ascertained sampling schemes in family studies. We demonstrate through simulation studies that our approach has superior performance for association mapping compared to the single marker based minimum P-value GEE test for an SNP-set effect over a range of scenarios. We illustrate the application of the proposed method using data from the Cleveland Family GWAS Study. © 2013 WILEY PERIODICALS, INC.
Prediction-error variance in Bayesian model updating: a comparative study
NASA Astrophysics Data System (ADS)
Asadollahi, Parisa; Li, Jian; Huang, Yong
2017-04-01
In Bayesian model updating, the likelihood function is commonly formulated by stochastic embedding, in which the maximum information entropy probability model of prediction error variances plays an important role; it is a Gaussian distribution subject to the first two moments as constraints. The selection of prediction error variances can be formulated as a model class selection problem, which automatically involves a trade-off between the average data-fit of the model class and the information it extracts from the data. Therefore, it is critical for robustness in the updating of the structural model, especially in the presence of modeling errors. To date, three ways of considering prediction error variances have been seen in the literature: 1) setting constant values empirically, 2) estimating them based on the goodness-of-fit of the measured data, and 3) updating them as uncertain parameters by applying Bayes' Theorem at the model class level. In this paper, the effect of different strategies to deal with the prediction error variances on the model updating performance is investigated explicitly. A six-story shear building model with six uncertain stiffness parameters is employed as an illustrative example. Transitional Markov Chain Monte Carlo is used to draw samples of the posterior probability density function of the structure model parameters as well as the uncertain prediction variances. The different levels of modeling uncertainty and complexity are modeled through three FE models, including a true model, a model with more complexity, and a model with modeling error. Bayesian updating is performed for the three FE models considering the three aforementioned treatments of the prediction error variances. The effect of the number of measurements on the model updating performance is also examined in the study. The results are compared based on model class assessment and indicate that updating the prediction error variances as uncertain parameters at the model class level produces more robust results, especially when the number of measurements is small.
Benthic macroinvertebrate field sampling effort required to ...
This multi-year pilot study evaluated a proposed field method for its effectiveness in the collection of a benthic macroinvertebrate sample adequate for use in the condition assessment of streams and rivers in the Neuquén Province, Argentina. A total of 13 sites, distributed across three rivers, were sampled. At each site, benthic macroinvertebrates were collected at 11 transects. Each sample was processed independently in the field and laboratory. Based on a literature review and resource considerations, the collection of 300 organisms (minimum) at each site was determined to be necessary to support a robust condition assessment, and therefore, selected as the criterion for judging the adequacy of the method. This targeted number of organisms was collected at all sites, at a minimum, when collections from all 11 transects were combined. Subsequent bootstrapping analysis of the data was used to estimate whether collecting at fewer transects would reach the minimum target number of organisms for all sites. In a subset of sites, the total number of organisms frequently fell below the target when fewer than 11 transect collections were combined. Site conditions where <300 organisms might be collected are discussed. These preliminary results suggest that the proposed field method results in a sample that is adequate for robust condition assessment of the rivers and streams of interest. When data become available from a broader range of sites, the adequacy of the field
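The bootstrapping step can be sketched as resampling transect counts with replacement and computing how often a reduced number of transects would still reach the 300-organism target; the per-transect counts below are invented for illustration.

```python
import numpy as np

def prob_reaching_target(transect_counts, n_transects, target=300, n_boot=10000, seed=0):
    """Bootstrap the probability that n_transects (resampled with replacement from the
    observed per-transect counts) yield at least `target` organisms in total."""
    rng = np.random.default_rng(seed)
    counts = np.asarray(transect_counts)
    totals = rng.choice(counts, size=(n_boot, n_transects), replace=True).sum(axis=1)
    return np.mean(totals >= target)

# Hypothetical per-transect counts from one site (11 transects).
site_counts = [42, 18, 35, 27, 51, 22, 30, 44, 19, 38, 29]
for k in (5, 7, 9, 11):
    print(k, prob_reaching_target(site_counts, k))
```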
Genetic and environmental variance in content dimensions of the MMPI.
Rose, R J
1988-08-01
To evaluate genetic and environmental variance in the Minnesota Multiphasic Personality Inventory (MMPI), I studied nine factor scales identified in the first item factor analysis of normal adult MMPIs in a sample of 820 adolescent and young adult co-twins. Conventional twin comparisons documented heritable variance in six of the nine MMPI factors (Neuroticism, Psychoticism, Extraversion, Somatic Complaints, Inadequacy, and Cynicism), whereas significant influence from shared environmental experience was found for four factors (Masculinity versus Femininity, Extraversion, Religious Orthodoxy, and Intellectual Interests). Genetic variance in the nine factors was more evident in results from twin sisters than those of twin brothers, and a developmental-genetic analysis, using hierarchical multiple regressions of double-entry matrixes of the twins' raw data, revealed that in four MMPI factor scales, genetic effects were significantly modulated by age or gender or their interaction during the developmental period from early adolescence to early adulthood.
Measuring kinetics of complex single ion channel data using mean-variance histograms.
Patlak, J B
1993-01-01
The measurement of single ion channel kinetics is difficult when those channels exhibit subconductance events. When the kinetics are fast, and when the current magnitudes are small, as is the case for Na+, Ca2+, and some K+ channels, these difficulties can lead to serious errors in the estimation of channel kinetics. I present here a method, based on the construction and analysis of mean-variance histograms, that can overcome these problems. A mean-variance histogram is constructed by calculating the mean current and the current variance within a brief "window" (a set of N consecutive data samples) superimposed on the digitized raw channel data. Systematic movement of this window over the data produces large numbers of mean-variance pairs which can be assembled into a two-dimensional histogram. Defined current levels (open, closed, or sublevel) appear in such plots as low variance regions. The total number of events in such low variance regions is estimated by curve fitting and plotted as a function of window width. This function decreases with the same time constants as the original dwell time probability distribution for each of the regions. The method can therefore be used: 1) to present a qualitative summary of the single channel data from which the signal-to-noise ratio, open channel noise, steadiness of the baseline, and number of conductance levels can be quickly determined; 2) to quantify the dwell time distribution in each of the levels exhibited. In this paper I present the analysis of a Na+ channel recording that had a number of complexities. The signal-to-noise ratio was only about 8 for the main open state, open channel noise, and fast flickers to other states were present, as were a substantial number of subconductance states. "Standard" half-amplitude threshold analysis of these data produced open and closed time histograms that were well fitted by the sum of two exponentials, but with apparently erroneous time constants, whereas the mean-variance histogram technique provided a more credible analysis of the open, closed, and subconductance times for the patch. I also show that the method produces accurate results on simulated data in a wide variety of conditions, whereas the half-amplitude method, when applied to complex simulated data shows the same errors as were apparent in the real data. The utility and the limitations of this new method are discussed. PMID:7690261
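The core construction, sliding an N-sample window over the digitized current record and accumulating (mean, variance) pairs into a two-dimensional histogram, can be sketched in a few lines. The window width, bin count and synthetic two-level trace below are arbitrary illustrations, and the dwell-time curve-fitting step is omitted.

```python
import numpy as np

def mean_variance_histogram(current, window=10, bins=100):
    """Accumulate windowed (mean, variance) pairs from a single-channel current trace."""
    current = np.asarray(current, dtype=float)
    # Rolling mean and variance via cumulative sums (equivalent to looping over windows).
    c = np.cumsum(np.insert(current, 0, 0.0))
    c2 = np.cumsum(np.insert(current ** 2, 0, 0.0))
    means = (c[window:] - c[:-window]) / window
    variances = (c2[window:] - c2[:-window]) / window - means ** 2
    hist, mean_edges, var_edges = np.histogram2d(means, variances, bins=bins)
    return hist, mean_edges, var_edges

# Toy trace: a channel flickering between 0 pA (closed) and -2 pA (open) with noise.
rng = np.random.default_rng(7)
levels = np.repeat(rng.choice([-2.0, 0.0], size=200), 50)
trace = levels + 0.2 * rng.normal(size=levels.size)
hist, _, _ = mean_variance_histogram(trace)
```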
Olivoto, T; Nardino, M; Carvalho, I R; Follmann, D N; Ferrari, M; Szareski, V J; de Pelegrin, A J; de Souza, V Q
2017-03-22
Methodologies using restricted maximum likelihood/best linear unbiased prediction (REML/BLUP) in combination with sequential path analysis in maize are still limited in the literature. Therefore, the aims of this study were: i) to use REML/BLUP-based procedures in order to estimate variance components, genetic parameters, and genotypic values of simple maize hybrids, and ii) to fit stepwise regressions considering genotypic values to form a path diagram with multi-order predictors and minimum multicollinearity that explains the relationships of cause and effect among grain yield-related traits. Fifteen commercial simple maize hybrids were evaluated in multi-environment trials in a randomized complete block design with four replications. The environmental variance (78.80%) and genotype-by-environment variance (20.83%) accounted for more than 99% of the phenotypic variance of grain yield, which makes direct selection for this trait by breeders difficult. The sequential path analysis model allowed the selection of traits with high explanatory power and minimum multicollinearity, resulting in models with elevated fit (R2 > 0.9 and ε < 0.3). The number of kernels per ear (NKE) and thousand-kernel weight (TKW) are the traits with the largest direct effects on grain yield (r = 0.66 and 0.73, respectively). The high accuracy of selection (0.86 and 0.89) associated with the high heritability of the average (0.732 and 0.794) for NKE and TKW, respectively, indicated good reliability and prospects of success in the indirect selection of hybrids with high-yield potential through these traits. The negative direct effect of NKE on TKW (r = -0.856), however, must be considered. The joint use of mixed models and sequential path analysis is effective in the evaluation of maize-breeding trials.
Claw length recommendations for dairy cow foot trimming
Archer, S. C.; Newsome, R.; Dibble, H.; Sturrock, C. J.; Chagunda, M. G. G.; Mason, C. S.; Huxley, J. N.
2015-01-01
The aim was to describe variation in length of the dorsal hoof wall in contact with the dermis for cows on a single farm, and hence, derive minimum appropriate claw lengths for routine foot trimming. The hind feet of 68 Holstein-Friesian dairy cows were collected post mortem, and the internal structures were visualised using x-ray µCT. The internal distance from the proximal limit of the wall horn to the distal tip of the dermis was measured from cross-sectional sagittal images. A constant was added to allow for a minimum sole thickness of 5 mm and an average wall thickness of 8 mm. Data were evaluated using descriptive statistics and two-level linear regression models with claw nested within cow. Based on 219 claws, the recommended dorsal wall length from the proximal limit of hoof horn was up to 90 mm for 96 per cent of claws, and the median value was 83 mm. Dorsal wall length increased by 1 mm per year of age, yet 85 per cent of the null model variance remained unexplained. Overtrimming can have severe consequences; the authors propose that the minimum recommended claw length stated in training materials for all Holstein-Friesian cows should be increased to 90 mm. PMID:26220848
NASA Astrophysics Data System (ADS)
Musa, Rosliza; Ali, Zalila; Baharum, Adam; Nor, Norlida Mohd
2017-08-01
The linear regression model assumes that all random error components are identically and independently distributed with constant variance. Hence, each data point provides equally precise information about the deterministic part of the total variation. In other words, the standard deviations of the error terms are constant over all values of the predictor variables. When the assumption of constant variance is violated, the ordinary least squares estimator of the regression coefficients loses its property of minimum variance in the class of linear and unbiased estimators. Weighted least squares estimation is often used to maximize the efficiency of parameter estimation. A procedure that treats all of the data equally would give less precisely measured points more influence than they should have and would give highly precise points too little influence. Optimizing the weighted fitting criterion to find the parameter estimates allows the weights to determine the contribution of each observation to the final parameter estimates. This study used a polynomial model with weighted least squares estimation to investigate paddy production of different paddy lots based on paddy cultivation characteristics and environmental characteristics in the area of Kedah and Perlis. The results indicated that factors affecting paddy production are mixture fertilizer application cycle, average temperature, the squared effect of average rainfall, the squared effect of pest and disease, the interaction between acreage and amount of mixture fertilizer, the interaction between paddy variety and NPK fertilizer application cycle and the interaction between pest and disease and NPK fertilizer application cycle.
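A minimal sketch of weighted least squares versus ordinary least squares on a synthetic polynomial model with non-constant error variance; the variable names, weights (inverse error variance), and data are assumptions for illustration, not the paddy data analysed in the study.

```python
import numpy as np
import statsmodels.api as sm

# Synthetic example: the response variance grows with the predictor,
# violating the constant-variance assumption behind OLS.
rng = np.random.default_rng(1)
x = np.linspace(1.0, 10.0, 80)
sigma = 0.2 * x                               # non-constant error std dev (assumed)
y = 2.0 + 1.5 * x - 0.1 * x**2 + sigma * rng.standard_normal(x.size)

# Second-degree polynomial design matrix with intercept.
X = sm.add_constant(np.column_stack([x, x**2]))

ols_fit = sm.OLS(y, X).fit()
wls_fit = sm.WLS(y, X, weights=1.0 / sigma**2).fit()   # weights proportional to 1/variance

print(ols_fit.params, wls_fit.params)
print(wls_fit.bse)   # WLS standard errors reflect the unequal precision of the points
```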
Cheung, Chris C P; Yu, Alfred C H; Salimi, Nazila; Yiu, Billy Y S; Tsang, Ivan K H; Kerby, Benjamin; Azar, Reza Zahiri; Dickie, Kris
2012-02-01
The lack of open access to the pre-beamformed data of an ultrasound scanner has limited research on novel imaging methods to a few privileged laboratories. To address this need, we have developed a pre-beamformed data acquisition (DAQ) system that can collect data over 128 array elements in parallel from the Ultrasonix series of research-purpose ultrasound scanners. Our DAQ system comprises three system-level blocks: 1) a connector board that interfaces with the array probe and the scanner through a probe connector port; 2) a main board that triggers DAQ and controls data transfer to a computer; and 3) four receiver boards that are each responsible for acquiring 32 channels of digitized raw data and storing them to the on-board memory. This system can acquire pre-beamformed data with 12-bit resolution when using a 40-MHz sampling rate. It houses a 16 GB RAM buffer that is sufficient to store 128 channels of pre-beamformed data for 8000 to 25 000 transmit firings, depending on imaging depth, corresponding to nearly a 2-s period in typical imaging setups. Following the acquisition, the data can be transferred through a USB 2.0 link to a computer for offline processing and analysis. To evaluate the feasibility of using the DAQ system for advanced imaging research, two proof-of-concept investigations have been conducted on beamforming and plane-wave B-flow imaging. Results show that adaptive beamforming algorithms such as the minimum variance approach can generate sharper images of a wire cross-section whose diameter is equal to the imaging wavelength (150 μm in our example). Also, plane-wave B-flow imaging can provide more consistent visualization of blood speckle movement given the higher temporal resolution of this imaging approach (2500 fps in our example).
Location tests for biomarker studies: a comparison using simulations for the two-sample case.
Scheinhardt, M O; Ziegler, A
2013-01-01
Gene, protein, or metabolite expression levels are often non-normally distributed, heavy-tailed and contain outliers. Standard statistical approaches may fail as location tests in this situation. In three Monte-Carlo simulation studies, we aimed at comparing the type I error levels and empirical power of standard location tests and three adaptive tests [O'Gorman, Can J Stat 1997; 25: 269-279; Keselman et al., Brit J Math Stat Psychol 2007; 60: 267-293; Szymczak et al., Stat Med 2013; 32: 524-537] for a wide range of distributions. We simulated two-sample scenarios using the g-and-k-distribution family to systematically vary tail length and skewness with identical and varying variability between groups. All tests kept the type I error level when groups did not vary in their variability. The standard non-parametric U-test performed well in all simulated scenarios. It was outperformed by the two non-parametric adaptive methods in the case of heavy tails or large skewness. Most tests did not keep the type I error level for skewed data in the case of heterogeneous variances. The standard U-test was a powerful and robust location test for most of the simulated scenarios except for very heavy-tailed or heavily skewed data, and it is thus to be recommended except for these cases. The non-parametric adaptive tests were powerful for both normal and non-normal distributions under sample variance homogeneity. But when sample variances differed, they did not keep the type I error level. The parametric adaptive test lacks power for skewed and heavy-tailed distributions.
ERIC Educational Resources Information Center
Kistner, Emily O.; Muller, Keith E.
2004-01-01
Intraclass correlation and Cronbach's alpha are widely used to describe reliability of tests and measurements. Even with Gaussian data, exact distributions are known only for compound symmetric covariance (equal variances and equal correlations). Recently, large sample Gaussian approximations were derived for the distribution functions. New exact…
Estimating the theoretical semivariogram from finite numbers of measurements
Zheng, Li; Silliman, Stephen E.
2000-01-01
We investigate from a theoretical basis the impacts of the number, location, and correlation among measurement points on the quality of an estimate of the semivariogram. The unbiased nature of the semivariogram estimator γ̂(r) is first established for a general random process Z(x). The variance of γ̂Z(r) is then derived as a function of the sampling parameters (the number of measurements and their locations). In applying this function to the case of estimating the semivariograms of the transmissivity and the hydraulic head field, it is shown that the estimation error depends on the number of the data pairs, the correlation among the data pairs (which, in turn, are determined by the form of the underlying semivariogram γ(r)), the relative locations of the data pairs, and the separation distance at which the semivariogram is to be estimated. Thus design of an optimal sampling program for semivariogram estimation should include consideration of each of these factors. Further, the function derived for the variance of γ̂Z(r) is useful in determining the reliability of a semivariogram developed from a previously established sampling design.
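For reference, a minimal sketch of the classical (Matheron) empirical semivariogram estimator whose sampling properties are analysed above; the function name, lag binning, and synthetic coordinates are illustrative assumptions, not the transmissivity or head data discussed in the paper.

```python
import numpy as np

def empirical_semivariogram(coords, values, bin_edges):
    """Classical semivariogram estimator: for each lag bin h,
    gamma_hat(h) = (1 / (2 N(h))) * sum over pairs with separation in h of (z_i - z_j)^2.
    `coords` has shape (n, d); returns (gamma per bin, pair count per bin)."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    # Pairwise separation distances and squared value differences.
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff**2).sum(axis=-1))
    sqdiff = (values[:, None] - values[None, :])**2
    iu = np.triu_indices(len(values), k=1)        # each pair counted once
    dist, sqdiff = dist[iu], sqdiff[iu]
    gamma, counts = [], []
    for lo, hi in zip(bin_edges[:-1], bin_edges[1:]):
        mask = (dist >= lo) & (dist < hi)
        counts.append(mask.sum())
        gamma.append(0.5 * sqdiff[mask].mean() if mask.any() else np.nan)
    return np.array(gamma), np.array(counts)

# Illustrative use on a small synthetic 2-D field.
rng = np.random.default_rng(0)
pts = rng.uniform(0, 100, size=(60, 2))
z = np.sin(pts[:, 0] / 20.0) + 0.1 * rng.standard_normal(60)
gamma, n_pairs = empirical_semivariogram(pts, z, bin_edges=np.arange(0, 80, 10))
```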
Gender differences in psychosocial predictors of texting while driving.
Struckman-Johnson, Cindy; Gaster, Samuel; Struckman-Johnson, Dave; Johnson, Melissa; May-Shinagle, Gabby
2015-01-01
A sample of 158 male and 357 female college students at a midwestern university participated in an on-line study of psychosocial motives for texting while driving. Men and women did not differ in self-reported ratings of how often they texted while driving. However, more women sent texts of less than a sentence while more men sent texts of 1-5 sentences. More women than men said they would quit texting while driving due to police warnings, receiving information about texting dangers, being shown graphic pictures of texting accidents, and being in a car accident. A hierarchical regression for men's data revealed that lower levels of feeling distracted by texting while driving (20% of the variance), higher levels of cell phone dependence (11.5% of the variance), risky behavioral tendencies (6.5% of the variance) and impulsivity (2.3% of the variance) were significantly associated with more texting while driving (total model variance=42%). A separate regression for women revealed that higher levels of cell phone dependence (10.4% of the variance), risky behavioral tendencies (9.9% of the variance), texting distractibility (6.2%), crash risk estimates (2.2% of the variance) and driving confidence (1.3% of the variance) were significantly associated with more texting while driving (total model variance=31%). Friendship potential and need for intimacy were not related to men's or women's texting while driving. Implications of the results for gender-specific prevention strategies are discussed. Copyright © 2014 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Salvucci, G.; Rigden, A. J.; Gentine, P.; Lintner, B. R.
2013-12-01
A new method was recently proposed for estimating evapotranspiration (ET) from weather station data without requiring measurements of surface limiting factors (e.g. soil moisture, leaf area, canopy conductance) [Salvucci and Gentine, 2013, PNAS, 110(16): 6287-6291]. Required measurements include diurnal air temperature, specific humidity, wind speed, net shortwave radiation, and either measured or estimated incoming longwave radiation and ground heat flux. The approach is built around the idea that the key, rate-limiting, parameter of typical ET models, the land-surface resistance to water vapor transport, can be estimated from an emergent relationship between the diurnal cycle of the relative humidity profile and ET. The emergent relation is that the vertical variance of the relative humidity profile is less than what would occur for increased or decreased evaporation rates, suggesting that land-atmosphere feedback processes minimize this variance. This relation was found to hold over a wide range of climate conditions (arid to humid) and limiting factors (soil moisture, leaf area, energy) at a set of Ameriflux field sites. While the field tests in Salvucci and Gentine (2013) supported the minimum variance hypothesis, the analysis did not reveal the mechanisms responsible for the behavior. Instead the paper suggested, heuristically, that the results were due to an equilibration of the relative humidity between the land surface and the surface layer of the boundary layer. Here we apply this method using surface meteorological fields simulated by a global climate model (GCM), and compare the predicted ET to that simulated by the climate model. Similar to the field tests, the GCM simulated ET is in agreement with that predicted by minimizing the profile relative humidity variance. A reasonable interpretation of these results is that the feedbacks responsible for the minimization of the profile relative humidity variance in nature are represented in the climate model. The climate model components, in particular the land surface model and boundary layer representation, can thus be analyzed in controlled numerical experiments to discern the specific processes leading to the observed behavior. Results of this analysis will be presented.
2012-01-01
Background To investigate whether different conditions of DNA structure and radiation treatment could modify heterogeneity of response. Additionally, to study variance as a potential parameter of heterogeneity for radiosensitivity testing. Methods Two-hundred leukocytes per sample of healthy donors were split into four groups. I: Intact chromatin structure; II: Nucleoids of histone-depleted DNA; III: Nucleoids of histone-depleted DNA with 90 mM DMSO as antioxidant. Response to single (I-III) and double (IV) irradiation with 4 Gy and repair kinetics were evaluated using %Tail-DNA. Heterogeneity of DNA damage was determined by calculation of variance of DNA-damage (V) and mean variance (Mvar); mutual comparisons were done by one-way analysis of variance (ANOVA). Results Heterogeneity of initial DNA-damage (I, 0 min repair) increased without histones (II). Absence of histones was balanced by addition of antioxidants (III). Repair reduced heterogeneity of all samples (with and without irradiation). However, double irradiation plus repair led to a higher level of heterogeneity distinguishable from single irradiation and repair in intact cells. Increase of mean DNA damage was associated with a similarly elevated variance of DNA damage (r = +0.88). Conclusions Heterogeneity of DNA-damage can be modified by histone level, antioxidant concentration, repair and radiation dose and was positively correlated with DNA damage. Experimental conditions might be optimized by reducing scatter of comet assay data by repair and antioxidants, potentially allowing better discrimination of small differences. Amount of heterogeneity measured by variance might be an additional useful parameter to characterize radiosensitivity. PMID:22520045
Regional melt-pond fraction and albedo of thin Arctic first-year drift ice in late summer
NASA Astrophysics Data System (ADS)
Divine, D. V.; Granskog, M. A.; Hudson, S. R.; Pedersen, C. A.; Karlsen, T. I.; Divina, S. A.; Renner, A. H. H.; Gerland, S.
2015-02-01
The paper presents a case study of the regional (≈150 km) morphological and optical properties of a relatively thin, 70-90 cm modal thickness, first-year Arctic sea ice pack in an advanced stage of melt. The study combines in situ broadband albedo measurements representative of the four main surface types (bare ice, dark melt ponds, bright melt ponds and open water) and images acquired by a helicopter-borne camera system during ice-survey flights. The data were collected during the 8-day ICE12 drift experiment carried out by the Norwegian Polar Institute in the Arctic, north of Svalbard at 82.3° N, from 26 July to 3 August 2012. A set of > 10 000 classified images covering about 28 km2 revealed a homogeneous melt across the study area with melt-pond coverage of ≈ 0.29 and open-water fraction of ≈ 0.11. A decrease in pond fractions observed in the 30 km marginal ice zone (MIZ) occurred in parallel with an increase in open-water coverage. The moving block bootstrap technique applied to sequences of classified sea-ice images and albedo of the four surface types yielded a regional albedo estimate of 0.37 (0.35; 0.40) and regional sea-ice albedo of 0.44 (0.42; 0.46). Random sampling from the set of classified images allowed assessment of the aggregate scale of at least 0.7 km2 for the study area. For the current setup configuration it implies a minimum set of 300 images to process in order to gain adequate statistics on the state of the ice cover. Variance analysis also emphasized the importance of longer series of in situ albedo measurements conducted for each surface type when performing regional upscaling. The uncertainty in the mean estimates of surface type albedo from in situ measurements contributed up to 95% of the variance of the estimated regional albedo, with the remaining variance resulting from the spatial inhomogeneity of sea-ice cover.
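A minimal sketch of the moving block bootstrap used above for the regional estimates, applied here to the mean of a generic correlated 1-D sequence; the block length, number of resamples, and data are illustrative assumptions, not the classified-image series from the campaign.

```python
import numpy as np

def moving_block_bootstrap(x, block_len, n_boot, rng=None):
    """Resample a 1-D series by concatenating randomly chosen overlapping blocks,
    preserving short-range correlation; returns the bootstrap means."""
    rng = np.random.default_rng(rng)
    x = np.asarray(x, dtype=float)
    n = len(x)
    n_blocks = int(np.ceil(n / block_len))
    starts = np.arange(n - block_len + 1)        # all overlapping block start indices
    stats = np.empty(n_boot)
    for b in range(n_boot):
        chosen = rng.choice(starts, size=n_blocks, replace=True)
        resample = np.concatenate([x[s:s + block_len] for s in chosen])[:n]
        stats[b] = resample.mean()
    return stats

# Illustrative use on an AR(1)-like correlated series (a stand-in for a sequence
# of per-image surface-type fractions).
rng = np.random.default_rng(9)
series = np.empty(400)
series[0] = 0.0
for t in range(1, 400):
    series[t] = 0.6 * series[t - 1] + rng.standard_normal()
boot_means = moving_block_bootstrap(series, block_len=20, n_boot=1000, rng=1)
print(boot_means.mean(), np.percentile(boot_means, [2.5, 97.5]))
```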
Feasibility of digital image colorimetry--application for water calcium hardness determination.
Lopez-Molinero, Angel; Tejedor Cubero, Valle; Domingo Irigoyen, Rosa; Sipiera Piazuelo, Daniel
2013-01-15
Interpretation and relevance of basic RGB colors in Digital Image-Based Colorimetry have been treated in this paper. The studies were carried out using the chromogenic model formed by the reaction between Ca(II) ions and glyoxal bis(2-hydroxyanil). It produced orange-red colored solutions in alkaline media. Individual basic color data (RGB) and also the total intensity of colors, I(tot), were the original variables treated by Factorial Analysis. The evaluation showed that the highest variance of the system and the highest analytical sensitivity were associated with the G color. However, after the study by Fourier transform the basic R color was recognized as an important feature in the information. It was manifested as an intrinsic characteristic that appeared differentiated in terms of low frequency in Fourier transform. The Principal Components Analysis study showed that the variance of the system could be mostly retained in the first principal component, but was dependent on all basic colors. The colored complex was also applied and validated as a Digital Image Colorimetric method for the determination of Ca(II) ions. RGB intensities were linearly correlated with Ca(II) in the range 0.2-2.0 mg L(-1). In the best conditions, using green color, a simple and reliable method for Ca determination could be developed. Its detection limit was established (criterion 3s) as 0.07 mg L(-1). The reproducibility was lower than 6% for 1.0 mg L(-1) Ca. Other chromatic parameters were evaluated as dependent calibration variables. Their representativeness, variance and sensitivity were discussed in order to select the best analytical variable. The potential of the procedure as a field-ready method, suitable for 'in situ' application with minimal experimental requirements, was demonstrated. Applications of the analysis of Ca in different real water samples were carried out. Municipal tap water, bottled mineral water, and natural river water were analyzed and results were compared and evaluated statistically. The validity was assessed by the alternative techniques of flame atomic absorption spectroscopy and titrimetry. Some differences were observed, but they were consistent with the methods applied. Copyright © 2012 Elsevier B.V. All rights reserved.
Sampling design for spatially distributed hydrogeologic and environmental processes
Christakos, G.; Olea, R.A.
1992-01-01
A methodology for the design of sampling networks over space is proposed. The methodology is based on spatial random field representations of nonhomogeneous natural processes, and on optimal spatial estimation techniques. One of the most important results of random field theory for physical sciences is its rationalization of correlations in spatial variability of natural processes. This correlation is extremely important both for interpreting spatially distributed observations and for predictive performance. The extent of site sampling and the types of data to be collected will depend on the relationship of subsurface variability to predictive uncertainty. While hypothesis formulation and initial identification of spatial variability characteristics are based on scientific understanding (such as knowledge of the physics of the underlying phenomena, geological interpretations, intuition and experience), the support offered by field data is statistically modelled. This model is not limited by the geometric nature of sampling and covers a wide range in subsurface uncertainties. A factorization scheme of the sampling error variance is derived, which possesses certain attractive properties allowing significant savings in computations. By means of this scheme, a practical sampling design procedure providing suitable indices of the sampling error variance is established. These indices can be used by way of multiobjective decision criteria to obtain the best sampling strategy. Neither the actual implementation of the in-situ sampling nor the solution of the large spatial estimation systems of equations is necessary. The required values of the accuracy parameters involved in the network design are derived using reference charts (readily available for various combinations of data configurations and spatial variability parameters) and certain simple yet accurate analytical formulas. Insight is gained by applying the proposed sampling procedure to realistic examples related to sampling problems in two dimensions. © 1992.
FORTRAN implementation of Friedman's test for several related samples
NASA Technical Reports Server (NTRS)
Davidson, S. A.
1982-01-01
The FRIEDMAN program is a FORTRAN-coded implementation of Friedman's nonparametric test for several related samples with one observation per treatment/block combination, or as it is sometimes called, the two-way analysis of variance by ranks. The FRIEDMAN program is described and a test data set and its results are presented to aid potential users of this program.
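A present-day equivalent of the test the FRIEDMAN program implements, sketched with SciPy rather than FORTRAN; the data layout (blocks as rows, treatments as columns) mirrors the one-observation-per-treatment/block design described above, and the numbers are made up.

```python
import numpy as np
from scipy.stats import friedmanchisquare

# Rows are blocks (related samples), columns are treatments; one observation
# per treatment/block combination.
data = np.array([
    [7.0, 9.0, 8.0],
    [6.0, 5.0, 7.0],
    [9.0, 7.0, 6.0],
    [8.0, 5.0, 6.0],
    [7.0, 8.0, 9.0],
    [6.0, 4.0, 5.0],
])

# SciPy expects one argument per treatment (column).
stat, p = friedmanchisquare(*data.T)
print(f"Friedman chi-square = {stat:.3f}, p = {p:.3f}")
```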
Jeran, S; Steinbrecher, A; Pischon, T
2016-08-01
Activity-related energy expenditure (AEE) might be an important factor in the etiology of chronic diseases. However, measurement of free-living AEE is usually not feasible in large-scale epidemiological studies but instead has traditionally been estimated based on self-reported physical activity. Recently, accelerometry has been proposed for objective assessment of physical activity, but it is unclear to what extent this methods explains the variance in AEE. We conducted a systematic review searching MEDLINE database (until 2014) on studies that estimated AEE based on accelerometry-assessed physical activity in adults under free-living conditions (using doubly labeled water method). Extracted study characteristics were sample size, accelerometer (type (uniaxial, triaxial), metrics (for example, activity counts, steps, acceleration), recording period, body position, wear time), explained variance of AEE (R(2)) and number of additional predictors. The relation of univariate and multivariate R(2) with study characteristics was analyzed using nonparametric tests. Nineteen articles were identified. Examination of various accelerometers or subpopulations in one article was treated separately, resulting in 28 studies. Sample sizes ranged from 10 to 149. In most studies the accelerometer was triaxial, worn at the trunk, during waking hours and reported activity counts as output metric. Recording periods ranged from 5 to 15 days. The variance of AEE explained by accelerometer-assessed physical activity ranged from 4 to 80% (median crude R(2)=26%). Sample size was inversely related to the explained variance. Inclusion of 1 to 3 other predictors in addition to accelerometer output significantly increased the explained variance to a range of 12.5-86% (median total R(2)=41%). The increase did not depend on the number of added predictors. We conclude that there is large heterogeneity across studies in the explained variance of AEE when estimated based on accelerometry. Thus, data on predicted AEE based on accelerometry-assessed physical activity need to be interpreted cautiously.
Austin, Peter C
2010-04-22
Multilevel logistic regression models are increasingly being used to analyze clustered data in medical, public health, epidemiological, and educational research. Procedures for estimating the parameters of such models are available in many statistical software packages. There is currently little evidence on the minimum number of clusters necessary to reliably fit multilevel regression models. We conducted a Monte Carlo study to compare the performance of different statistical software procedures for estimating multilevel logistic regression models when the number of clusters was low. We examined procedures available in BUGS, HLM, R, SAS, and Stata. We found that there were qualitative differences in the performance of different software procedures for estimating multilevel logistic models when the number of clusters was low. Among the likelihood-based procedures, estimation methods based on adaptive Gauss-Hermite approximations to the likelihood (glmer in R and xtlogit in Stata) or adaptive Gaussian quadrature (Proc NLMIXED in SAS) tended to have superior performance for estimating variance components when the number of clusters was small, compared to software procedures based on penalized quasi-likelihood. However, only Bayesian estimation with BUGS allowed for accurate estimation of variance components when there were fewer than 10 clusters. For all statistical software procedures, estimation of variance components tended to be poor when there were only five subjects per cluster, regardless of the number of clusters.
Indirect estimation of signal-dependent noise with nonadaptive heterogeneous samples.
Azzari, Lucio; Foi, Alessandro
2014-08-01
We consider the estimation of signal-dependent noise from a single image. Unlike conventional algorithms that build a scatterplot of local mean-variance pairs from either small or adaptively selected homogeneous data samples, our proposed approach relies on arbitrarily large patches of heterogeneous data extracted at random from the image. We demonstrate the feasibility of our approach through an extensive theoretical analysis based on mixture of Gaussian distributions. A prototype algorithm is also developed in order to validate the approach on simulated data as well as on real camera raw images.
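For contrast with the approach described above, a minimal sketch of the conventional baseline it improves upon: local mean-variance pairs from small tiled patches, followed by a least-squares fit of a signal-dependent model var ≈ a·mean + b (Poissonian-Gaussian style). The patch size, model form, and synthetic image are assumptions, not the authors' heterogeneous-sample algorithm.

```python
import numpy as np

def local_mean_variance_fit(img, patch=8):
    """Conventional baseline: tile the image into small patches, compute the sample
    mean and variance of each, and fit var ~ a*mean + b by least squares."""
    img = np.asarray(img, dtype=float)
    h, w = img.shape
    means, variances = [], []
    for i in range(0, h - patch + 1, patch):
        for j in range(0, w - patch + 1, patch):
            block = img[i:i + patch, j:j + patch]
            means.append(block.mean())
            variances.append(block.var(ddof=1))
    means, variances = np.array(means), np.array(variances)
    A = np.column_stack([means, np.ones_like(means)])
    (a, b), *_ = np.linalg.lstsq(A, variances, rcond=None)
    return a, b

# Illustrative use on synthetic Poissonian-Gaussian data (true a=0.5, b=4.0);
# the slow horizontal gradient keeps the patches nearly homogeneous.
rng = np.random.default_rng(2)
clean = np.tile(np.linspace(10, 200, 256), (256, 1))
noisy = clean + rng.standard_normal(clean.shape) * np.sqrt(0.5 * clean + 4.0)
print(local_mean_variance_fit(noisy))
```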
NASA Astrophysics Data System (ADS)
Harudin, N.; Jamaludin, K. R.; Muhtazaruddin, M. Nabil; Ramlie, F.; Muhamad, Wan Zuki Azman Wan
2018-03-01
The T-Method is one of the techniques governed under the Mahalanobis Taguchi System that was developed specifically for multivariate data prediction. Prediction using the T-Method is possible even with a very limited sample size. Users of the T-Method need to understand the population data trend clearly, since the method does not account for the effect of outliers. Outliers may cause apparent non-normality and cause classical methods to break down. Robust parameter estimates exist that provide satisfactory results when the data contain outliers as well as when they are free of them; among these are the location and scale estimators known as Shamos-Bickel (SB) and Hodges-Lehmann (HL), which can be used in place of the classical mean and standard deviation. Embedding these into the T-Method normalization stage may help enhance the accuracy of the T-Method and allows the robustness of the T-Method itself to be analysed. However, the results of the larger-sample-size case study show that the T-Method had the lowest average error percentage (3.09%) on data with extreme outliers, whereas HL and SB had the lowest error percentage (4.67%) for data without extreme outliers, with only a minimal difference from the T-Method. The trend in prediction error percentages was reversed for the smaller-sample-size case study. The results show that with a minimum sample size, where the outlier risk is always low, the T-Method performs much better, while for larger sample sizes with extreme outliers the T-Method also shows better prediction than the alternatives. For the case studies conducted in this research, normalization using the T-Method gives satisfactory results, and adapting HL and SB (or the normal mean and standard deviation) into it is not worthwhile, since they provide only a minimal change in percentage error. Normalization using the T-Method is still considered to carry lower risk with respect to the effect of outliers.
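A minimal sketch of the two robust estimators named above, assuming their usual definitions (Hodges-Lehmann: median of pairwise averages; Shamos-Bickel: scaled median of absolute pairwise differences); the consistency constant and the O(n²) pair enumeration are implementation choices of this sketch, not details taken from the paper.

```python
import numpy as np
from itertools import combinations

def hodges_lehmann_location(x):
    """Hodges-Lehmann location estimator: median of the Walsh averages
    (all pairwise means, including the observations themselves)."""
    x = np.asarray(x, dtype=float)
    pair_means = [(a + b) / 2.0 for a, b in combinations(x, 2)]
    return np.median(np.concatenate([x, pair_means]))

def shamos_scale(x):
    """Shamos-Bickel scale estimator: median of absolute pairwise differences,
    multiplied by ~1.0483 to be consistent for the standard deviation under normality."""
    x = np.asarray(x, dtype=float)
    abs_diffs = [abs(a - b) for a, b in combinations(x, 2)]
    return 1.0483 * np.median(abs_diffs)

# Illustrative comparison with the classical mean and standard deviation
# on data containing two gross outliers.
rng = np.random.default_rng(8)
x = np.append(rng.normal(50.0, 5.0, 30), [95.0, 120.0])
print(np.mean(x), hodges_lehmann_location(x))
print(np.std(x, ddof=1), shamos_scale(x))
```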
Design features that affect the maneuverability of wheelchairs and scooters.
Koontz, Alicia M; Brindle, Eric D; Kankipati, Padmaja; Feathers, David; Cooper, Rory A
2010-05-01
To determine the minimum space required for wheeled mobility device users to perform 4 maneuverability tasks and to investigate the impact of selected design attributes on space. Case series. University laboratory, Veterans Affairs research facility, vocational training center, and a national wheelchair sport event. The sample of convenience included manual wheelchair (MWC; n=109), power wheelchair (PWC; n=100), and scooter users (n=14). A mock environment was constructed to create passageways to form an L-turn, a 360° turn in place, and a U-turn with and without a barrier. Passageway openings were increased in 5-cm increments until the user could successfully perform each task without hitting the walls. Structural dimensions of the device and user were collected using an electromechanical probe. Mobility devices were grouped into categories based on design features and compared using 1-way analysis of variance and post hoc pairwise Bonferroni-corrected tests. Minimum passageway widths for the 4 maneuverability tasks. Ultralight MWCs with rear axles posterior to the shoulder had the shortest lengths and required the least amount of space compared with all other types of MWCs (P<.05). Mid-wheel-drive PWCs required the least space for the 360° turn in place compared with front-wheel-drive and rear-wheel-drive PWCs (P<.01) but performed equally as well as front-wheel-drive models on all other turning tasks. PWCs with seat functions required more space to perform the tasks. Between 10% and 100% of users would not be able to maneuver in spaces that meet current Accessibility Guidelines for Buildings and Facilities specifications. This study provides data that can be used to support wheelchair prescription and home modifications and to update standards to improve the accessibility of public areas.
Wu, Zhongchen; Chen, Huanwen; Wang, Weiling; Jia, Bin; Yang, Tianlin; Zhao, Zhanfeng; Ding, Jianhua; Xiao, Xuxian
2009-10-28
Without any sample pretreatment, mass spectral fingerprints of 486 dried sea cucumber slices were rapidly recorded in the mass range of m/z 50-800 by using surface desorption atmospheric pressure chemical ionization mass spectrometry (DAPCI-MS). A set of 162 individual sea cucumbers (Apostichopus japonicus Selenka) grown in 3 different geographical regions (Weihai: 59 individuals, 177 slices; Yantai: 53 individuals, 159 slices; Dalian: 50 individuals, 150 slices) in the north China sea were successfully differentiated according to their habitats both by Principal Components Analysis (PCA) and Soft Independent Modeling of Class Analogy (SIMCA) of the mass spectral raw data, demonstrating that DAPCI-MS is a practically convenient tool for high-throughput differentiation of sea cucumber products. It has been found that the difference between the body wall tissue and the epidermal tissue is heavily dependent on the habitats. The experimental data also show that the roughness of the sample surface contributes to the variance of the signal levels to a certain extent, but this variance does not prevent the differentiation of the dried sea cucumber samples.
Dyadic Short Forms of the Wechsler Adult Intelligence Scale-IV.
Denney, David A; Ringe, Wendy K; Lacritz, Laura H
2015-08-01
Full Wechsler Adult Intelligence Scale-Fourth Edition (WAIS-IV) administration can be time-consuming and may not be necessary when intelligence quotient estimates will suffice. Estimated Full Scale Intelligence Quotient (FSIQ) and General Ability Index (GAI) scores were derived from nine dyadic short forms using individual regression equations based on data from a clinical sample (n = 113) that was then cross validated in a separate clinical sample (n = 50). Derived scores accounted for 70%-83% of the variance in FSIQ and 77%-88% of the variance in GAI. Predicted FSIQs were strongly associated with actual FSIQ (rs = .73-.88), as were predicted and actual GAIs (rs = .80-.93). Each of the nine dyadic short forms of the WAIS-IV was a good predictor of FSIQ and GAI in the validation sample. These data support the validity of WAIS-IV short forms when time is limited or lengthier batteries cannot be tolerated by patients. © The Author 2015. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
A BASIS FOR MODIFYING THE TANK 12 COMPOSITE SAMPLING DESIGN
DOE Office of Scientific and Technical Information (OSTI.GOV)
Shine, G.
The SRR sampling campaign to obtain residual solids material from the Savannah River Site (SRS) Tank Farm Tank 12 primary vessel resulted in obtaining appreciable material in all 6 planned source samples from the mound strata but only in 5 of the 6 planned source samples from the floor stratum. Consequently, the design of the compositing scheme presented in the Tank 12 Sampling and Analysis Plan, Pavletich (2014a), must be revised. Analytical Development of SRNL statistically evaluated the sampling uncertainty associated with using various compositing arrays and splitting one or more samples for compositing. The variance of the simple mean of composite sample concentrations is a reasonable standard to investigate the impact of the following sampling options. Composite Sample Design Option (a). Assign only 1 source sample from the floor stratum and 1 source sample from each of the mound strata to each of the composite samples. Each source sample contributes material to only 1 composite sample. Two source samples from the floor stratum would not be used. Composite Sample Design Option (b). Assign 2 source samples from the floor stratum and 1 source sample from each of the mound strata to each composite sample. This implies that one source sample from the floor must be used twice, with 2 composite samples sharing material from this particular source sample. All five source samples from the floor would be used. Composite Sample Design Option (c). Assign 3 source samples from the floor stratum and 1 source sample from each of the mound strata to each composite sample. This implies that several of the source samples from the floor stratum must be assigned to more than one composite sample. All 5 source samples from the floor would be used. Using fewer than 12 source samples will increase the sampling variability over that of the Basic Composite Sample Design, Pavletich (2013). Considering the impact to the variance of the simple mean of the composite sample concentrations, the recommendation is to construct each sample composite using four or five source samples. Although the variance using 5 source samples per composite sample (Composite Sample Design Option (c)) was slightly less than the variance using 4 source samples per composite sample (Composite Sample Design Option (b)), there is no practical difference between those variances. This does not consider that the measurement error variance, which is the same for all composite sample design options considered in this report, will further dilute any differences. Composite Sample Design Option (a) had the largest variance for the mean concentration in the three composite samples and should be avoided. These results are consistent with Pavletich (2014b) which utilizes a low elevation and a high elevation mound source sample and two floor source samples for each composite sample. Utilizing the four source samples per composite design, Pavletich (2014b) utilizes aliquots of Floor Sample 4 for two composite samples.
2014-03-27
[Extraction residue from the source thesis: contents/list-of-figures entries (Number of Hops Hs, Number of Sensors M, Standard deviation vs. Ns, Bias) and an acronym list including MTM (multiple taper method), MUSIC (multiple signal classification), MVDR (minimum variance distortionless response), PSK (phase shift keying), and QAM.]
Chiu, Mei Choi; Pun, Chi Seng; Wong, Hoi Ying
2017-08-01
Investors interested in the global financial market must analyze financial securities internationally. Making an optimal global investment decision involves processing a huge amount of data for a high-dimensional portfolio. This article investigates the big data challenges of two mean-variance optimal portfolios: continuous-time precommitment and constant-rebalancing strategies. We show that both optimized portfolios implemented with the traditional sample estimates converge to the worst-performing portfolio when the portfolio size becomes large. The crux of the problem is the estimation error accumulated from the huge dimension of stock data. We then propose a linear programming optimal (LPO) portfolio framework, which applies a constrained ℓ1 minimization to the theoretical optimal control to mitigate the risk associated with the dimensionality issue. The resulting portfolio becomes a sparse portfolio that selects stocks with a data-driven procedure and hence offers a stable mean-variance portfolio in practice. When the number of observations becomes large, the LPO portfolio converges to the oracle optimal portfolio, which is free of estimation error, even though the number of stocks grows faster than the number of observations. Our numerical and empirical studies demonstrate the superiority of the proposed approach. © 2017 Society for Risk Analysis.
Fast computation of an optimal controller for large-scale adaptive optics.
Massioni, Paolo; Kulcsár, Caroline; Raynaud, Henri-François; Conan, Jean-Marc
2011-11-01
The linear quadratic Gaussian regulator provides the minimum-variance control solution for a linear time-invariant system. For adaptive optics (AO) applications, under the hypothesis of a deformable mirror with instantaneous response, such a controller boils down to a minimum-variance phase estimator (a Kalman filter) and a projection onto the mirror space. The Kalman filter gain can be computed by solving an algebraic Riccati matrix equation, whose computational complexity grows very quickly with the size of the telescope aperture. This "curse of dimensionality" makes the standard solvers for Riccati equations very slow in the case of extremely large telescopes. In this article, we propose a way of computing the Kalman gain for AO systems by means of an approximation that considers the turbulence phase screen as the cropped version of an infinite-size screen. We demonstrate the advantages of the methods for both off- and on-line computational time, and we evaluate its performance for classical AO as well as for wide-field tomographic AO with multiple natural guide stars. Simulation results are reported.
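A small-scale sketch of the standard route mentioned above, computing a steady-state Kalman gain from the discrete algebraic Riccati equation with SciPy; the toy state-space matrices are assumptions and bear no relation to an adaptive optics model, which is precisely the large-dimension regime where this direct solve becomes too slow.

```python
import numpy as np
from scipy.linalg import solve_discrete_are

# Toy LTI system: x_{k+1} = A x_k + w_k,  y_k = C x_k + v_k
A = np.array([[0.99, 0.10],
              [0.00, 0.95]])
C = np.array([[1.0, 0.0]])
Q = np.diag([0.01, 0.02])     # process noise covariance (assumed)
R = np.array([[0.1]])         # measurement noise covariance (assumed)

# Steady-state prediction error covariance from the filtering Riccati equation
# (obtained by passing the transposed system to the DARE solver).
P = solve_discrete_are(A.T, C.T, Q, R)

# Steady-state (asymptotic) Kalman gain.
K = P @ C.T @ np.linalg.inv(C @ P @ C.T + R)
print(K)
```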
The analysis of morphometric data on rocky mountain wolves and arctic wolves using statistical methods
NASA Astrophysics Data System (ADS)
Ammar Shafi, Muhammad; Saifullah Rusiman, Mohd; Hamzah, Nor Shamsidah Amir; Nor, Maria Elena; Ahmad, Noor’ani; Azia Hazida Mohamad Azmi, Nur; Latip, Muhammad Faez Ab; Hilmi Azman, Ahmad
2018-04-01
Morphometrics is a quantitative analysis based on the shape and size of specimens. Morphometric quantitative analyses are commonly used to analyse the fossil record, the shape and size of specimens, and other features. The aim of the study is to find the differences between rocky mountain wolves and arctic wolves based on gender. The sample utilised secondary data comprising seven independent variables and two dependent variables. Statistical modelling used in the analysis included the analysis of variance (ANOVA) and multivariate analysis of variance (MANOVA). The results showed that arctic wolves and rocky mountain wolves differ with respect to the independent factors and gender.
Calibrating SALT: a sampling scheme to improve estimates of suspended sediment yield
Robert B. Thomas
1986-01-01
SALT (Selection At List Time) is a variable probability sampling scheme that provides unbiased estimates of suspended sediment yield and its variance. SALT performs better than standard schemes at estimating variance. Sampling probabilities are based on a sediment rating function which promotes greater sampling intensity during periods of high...
How does variance in fertility change over the demographic transition?
Hruschka, Daniel J.; Burger, Oskar
2016-01-01
Most work on the human fertility transition has focused on declines in mean fertility. However, understanding changes in the variance of reproductive outcomes can be equally important for evolutionary questions about the heritability of fertility, individual determinants of fertility and changing patterns of reproductive skew. Here, we document how variance in completed fertility among women (45–49 years) differs across 200 surveys in 72 low- to middle-income countries where fertility transitions are currently in progress at various stages. Nearly all (91%) of samples exhibit variance consistent with a Poisson process of fertility, which places systematic, and often severe, theoretical upper bounds on the proportion of variance that can be attributed to individual differences. In contrast to the pattern of total variance, these upper bounds increase from high- to mid-fertility samples, then decline again as samples move from mid to low fertility. Notably, the lowest fertility samples often deviate from a Poisson process. This suggests that as populations move to low fertility their reproduction shifts from a rate-based process to a focus on an ideal number of children. We discuss the implications of these findings for predicting completed fertility from individual-level variables. PMID:27022082
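A minimal sketch of the comparison underlying the argument above: the observed variance of completed fertility set against the Poisson benchmark (variance equal to the mean). The data are simulated and the function name is mine; the paper's actual survey-based bounds are not reproduced here.

```python
import numpy as np

def poisson_variance_check(completed_fertility):
    """Compare the observed variance of completed fertility with the Poisson
    benchmark (variance = mean) and report the variance-to-mean ratio."""
    x = np.asarray(completed_fertility, dtype=float)
    mean, var = x.mean(), x.var(ddof=1)
    return {"mean": mean, "variance": var, "ratio_to_poisson": var / mean}

# Hypothetical sample of completed parities for women aged 45-49.
rng = np.random.default_rng(3)
sample = rng.poisson(lam=5.2, size=500)
print(poisson_variance_check(sample))   # ratio near 1 is consistent with a Poisson process
```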
Satellites for the study of ocean primary productivity
NASA Technical Reports Server (NTRS)
Smith, R. C.; Baker, K. S.
1983-01-01
The use of remote sensing techniques for obtaining estimates of global marine primary productivity is examined. It is shown that remote sensing and multiplatform (ship, aircraft, and satellite) sampling strategies can be used to significantly lower the variance in estimates of phytoplankton abundance and of population growth rates from the values obtained using the C-14 method. It is noted that multiplatform sampling strategies are essential to assess the mean and variance of phytoplankton biomass on a regional or on a global basis. The relative errors associated with shipboard and satellite estimates of phytoplankton biomass and primary productivity, as well as the increased statistical accuracy possible from the utilization of contemporaneous data from both sampling platforms, are examined. It is shown to be possible to follow changes in biomass and the distribution patterns of biomass as a function of time with the use of satellite imagery.
The Consequences of Indexing the Minimum Wage to Average Wages in the U.S. Economy.
ERIC Educational Resources Information Center
Macpherson, David A.; Even, William E.
The consequences of indexing the minimum wage to average wages in the U.S. economy were analyzed. The study data were drawn from the 1974-1978 May Current Population Survey (CPS) and the 180 monthly CPS Outgoing Rotation Group files for 1979-1993 (approximate annual sample sizes of 40,000 and 180,000, respectively). The effects of indexing on the…
Qu, Long; Guennel, Tobias; Marshall, Scott L
2013-12-01
Following the rapid development of genome-scale genotyping technologies, genetic association mapping has become a popular tool to detect genomic regions responsible for certain (disease) phenotypes, especially in early-phase pharmacogenomic studies with limited sample size. In response to such applications, a good association test needs to be (1) applicable to a wide range of possible genetic models, including, but not limited to, the presence of gene-by-environment or gene-by-gene interactions and non-linearity of a group of marker effects, (2) accurate in small samples, fast to compute on the genomic scale, and amenable to large scale multiple testing corrections, and (3) reasonably powerful to locate causal genomic regions. The kernel machine method represented in linear mixed models provides a viable solution by transforming the problem into testing the nullity of variance components. In this study, we consider score-based tests by choosing a statistic linear in the score function. When the model under the null hypothesis has only one error variance parameter, our test is exact in finite samples. When the null model has more than one variance parameter, we develop a new moment-based approximation that performs well in simulations. Through simulations and analysis of real data, we demonstrate that the new test possesses most of the aforementioned characteristics, especially when compared to existing quadratic score tests or restricted likelihood ratio tests. © 2013, The International Biometric Society.
Impact of multicollinearity on small sample hydrologic regression models
NASA Astrophysics Data System (ADS)
Kroll, Charles N.; Song, Peter
2013-06-01
Often hydrologic regression models are developed with ordinary least squares (OLS) procedures. The use of OLS with highly correlated explanatory variables produces multicollinearity, which creates highly sensitive parameter estimators with inflated variances and improper model selection. It is not clear how best to address multicollinearity in hydrologic regression models. Here a Monte Carlo simulation is developed to compare four techniques to address multicollinearity: OLS, OLS with variance inflation factor screening (VIF), principal component regression (PCR), and partial least squares regression (PLS). The performance of these four techniques was observed for varying sample sizes, correlation coefficients between the explanatory variables, and model error variances consistent with hydrologic regional regression models. The negative effects of multicollinearity are magnified at smaller sample sizes, higher correlations between the variables, and larger model error variances (smaller R2). The Monte Carlo simulation indicates that if the true model is known, multicollinearity is present, and the estimation and statistical testing of regression parameters are of interest, then PCR or PLS should be employed. If the model is unknown, or if the interest is solely in model predictions, it is recommended that OLS be employed since using more complicated techniques did not produce any improvement in model performance. A leave-one-out cross-validation case study was also performed using low-streamflow data sets from the eastern United States. Results indicate that OLS with stepwise selection generally produces models across study regions with varying levels of multicollinearity that are as good as biased regression techniques such as PCR and PLS.
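A minimal sketch of one of the four techniques compared above, variance inflation factor screening, using statsmodels; the predictor names and synthetic collinear data are assumptions, not the low-streamflow data sets used in the study.

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

# Synthetic regional-regression-style predictors with strong collinearity
# between log drainage area and basin slope (illustrative names only).
rng = np.random.default_rng(4)
n = 60
area = rng.lognormal(3.0, 0.5, n)
slope = 0.8 * np.log(area) + 0.1 * rng.standard_normal(n)   # highly correlated with area
precip = rng.normal(1000.0, 150.0, n)

X = sm.add_constant(np.column_stack([np.log(area), slope, precip]))
# VIF for each non-constant column; values above roughly 5-10 flag collinear predictors.
vifs = [variance_inflation_factor(X, i) for i in range(1, X.shape[1])]
print(vifs)
```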
23 CFR 1340.5 - Documentation requirements.
Code of Federal Regulations, 2010 CFR
2010-04-01
... STATE OBSERVATIONAL SURVEYS OF SEAT BELT USE § 1340.5 Documentation requirements. All sample design, data collection, and estimation procedures used in State surveys conducted in accordance with this part must be well documented. At a minimum, the documentation must: (a) For sample design— (1) Define all...
23 CFR 1340.5 - Documentation requirements.
Code of Federal Regulations, 2011 CFR
2011-04-01
... STATE OBSERVATIONAL SURVEYS OF SEAT BELT USE § 1340.5 Documentation requirements. All sample design, data collection, and estimation procedures used in State surveys conducted in accordance with this part must be well documented. At a minimum, the documentation must: (a) For sample design— (1) Define all...
Applications of GARCH models to energy commodities
NASA Astrophysics Data System (ADS)
Humphreys, H. Brett
This thesis uses GARCH methods to examine different aspects of the energy markets. The first part of the thesis examines seasonality in the variance. This study modifies the standard univariate GARCH models to test for seasonal components in both the constant and the persistence in natural gas, heating oil and soybeans. These commodities exhibit seasonal price movements and, therefore, may exhibit seasonal variances. In addition, the heating oil model is tested for a structural change in variance during the Gulf War. The results indicate the presence of an annual seasonal component in the persistence for all commodities. Out-of-sample volatility forecasting for natural gas outperforms standard forecasts. The second part of this thesis uses a multivariate GARCH model to examine volatility spillovers within the crude oil forward curve and between the London and New York crude oil futures markets. Using these results the effect of spillovers on dynamic hedging is examined. In addition, this research examines cointegration within the oil markets using investable returns rather than fixed prices. The results indicate the presence of strong volatility spillovers between both markets, weak spillovers from the front of the forward curve to the rest of the curve, and cointegration between the long term oil price on the two markets. The spillover dynamic hedge models lead to a marginal benefit in terms of variance reduction, but a substantial decrease in the variability of the dynamic hedge; thereby decreasing the transactions costs associated with the hedge. The final portion of the thesis uses portfolio theory to demonstrate how the energy mix consumed in the United States could be chosen given a national goal to reduce the risks to the domestic macroeconomy of unanticipated energy price shocks. An efficient portfolio frontier of U.S. energy consumption is constructed using a covariance matrix estimated with GARCH models. The results indicate that while the electric utility industry is operating close to the minimum variance position, a shift towards coal consumption would reduce price volatility for overall U.S. energy consumption. With the inclusion of potential externality costs, the shift remains away from oil but towards natural gas instead of coal.
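A minimal sketch of the conditional-variance recursion at the core of the univariate GARCH models discussed above; the GARCH(1,1) parameters and simulated returns are placeholders, and the seasonal modification described in the thesis would enter through time-varying coefficients rather than the constants used here.

```python
import numpy as np

def garch11_variance(returns, omega, alpha, beta):
    """Conditional variance recursion of a GARCH(1,1) model:
    sigma2_t = omega + alpha * r_{t-1}^2 + beta * sigma2_{t-1}."""
    r = np.asarray(returns, dtype=float)
    sigma2 = np.empty_like(r)
    sigma2[0] = r.var()                     # a common initialization choice
    for t in range(1, len(r)):
        sigma2[t] = omega + alpha * r[t - 1]**2 + beta * sigma2[t - 1]
    return sigma2

# Illustrative use on simulated returns; a seasonal variant could, for example,
# let omega follow a deterministic annual cycle.
rng = np.random.default_rng(5)
r = 0.01 * rng.standard_normal(1000)
print(garch11_variance(r, omega=1e-6, alpha=0.08, beta=0.90)[:5])
```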
Performance of the all-digital data-transition tracking loop in the advanced receiver
NASA Astrophysics Data System (ADS)
Cheng, U.; Hinedi, S.
1989-11-01
The performance of the all-digital data-transition tracking loop (DTTL) with coherent or noncoherent sampling is described. The effects of few samples per symbol and of noncommensurate sampling rates and symbol rates are addressed and analyzed. Their impacts on the loop phase-error variance and the mean time to lose lock (MTLL) are quantified through computer simulations. The analysis and preliminary simulations indicate that with three to four samples per symbol, the DTTL can track with negligible jitter because of the presence of earth Doppler rate. Furthermore, the MTLL is also expected to be large enough to maintain lock over a Deep Space Network track.
Factors associated to quality of life in active elderly.
Alexandre, Tiago da Silva; Cordeiro, Renata Cereda; Ramos, Luiz Roberto
2009-08-01
To analyze whether quality of life in active, healthy elderly individuals is influenced by functional status and sociodemographic characteristics, as well as psychological parameters. Study conducted in a sample of 120 active elderly subjects recruited from two open universities of the third age in the cities of São Paulo and São José dos Campos (Southeastern Brazil) between May 2005 and April 2006. Quality of life was measured using the abbreviated Brazilian version of the World Health Organization Quality of Life (WHOQOL-bref) questionnaire. Sociodemographic, clinical and functional variables were measured through cross-culturally validated assessments: the Mini Mental State Examination, Geriatric Depression Scale, Functional Reach, One-Leg Balance Test, Timed Up and Go Test, Six-Minute Walk Test, Human Activity Profile and a complementary questionnaire. Simple descriptive analyses, Pearson's correlation coefficient, Student's t-test for independent samples, analyses of variance, linear regression analyses and variance inflation factor were performed. The significance level for all statistical tests was set at 0.05. Linear regression analysis showed an independent correlation without collinearity between depressive symptoms measured by the Geriatric Depression Scale and four domains of the WHOQOL-bref. Not having a conjugal life implied greater perception in the social domain; developing leisure activities and having an income over five minimum wages implied greater perception in the environment domain. Functional status had no influence on the Quality of Life variable in the analysis models in active elderly. In contrast, psychological factors, as assessed by the Geriatric Depression Scale, and sociodemographic characteristics, such as marital status, income and leisure activities, had an impact on quality of life.
SU-F-T-18: The Importance of Immobilization Devices in Brachytherapy Treatments of Vaginal Cuff
DOE Office of Scientific and Technical Information (OSTI.GOV)
Shojaei, M; Dumitru, N; Pella, S
2016-06-15
Purpose: High dose rate brachytherapy is a highly localized radiation therapy that has a very high dose gradient. Thus one of the most important parts of the treatment is the immobilization. The smallest movement of the patient or applicator can result in dose variation to the surrounding tissues as well as to the tumor to be treated. We review the ML Cylinder treatments and their localization challenges. Methods: A retrospective study of 25 patients with 5 treatments each, looking into the applicator’s placement in regard to the organs at risk. Motion possibilities for each applicator, intra- and inter-fraction, and their dosimetric implications were covered and measured with regard to their dose variance. The localization and immobilization devices used were assessed for the capability to prevent motion before and during the treatment delivery. Results: We focused on the 100% isodose on the central axis and a 15 degree displacement due to possible rotation, analyzing the dose variations to the bladder and rectum walls. The average dose variation for the bladder was 15% of the accepted tolerance, with a minimum variance of 11.1% and a maximum of 23.14% on the central axis. For the off-axis measurements we found an average variation of 16.84% of the accepted tolerance, with a minimum variance of 11.47% and a maximum of 27.69%. For the rectum we focused on the rectum wall closest to the 120% isodose line. The average dose variation was 19.4%, with a minimum of 11.3% and a maximum of 34.02% of the accepted tolerance values. Conclusion: Improved immobilization devices are recommended. For inter-fractionation, localization devices are recommended in place, with planning consistent with the initial fraction. Many of the present immobilization devices produced for external radiotherapy can be used to improve the localization of HDR applicators during transportation of the patient and during treatment.
Importance Sampling Variance Reduction in GRESS ATMOSIM
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wakeford, Daniel Tyler
This document is intended to introduce the importance sampling method of variance reduction to a Geant4 user for application to neutral particle Monte Carlo transport through the atmosphere, as implemented in GRESS ATMOSIM.
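As context for how importance sampling reduces Monte Carlo variance, the toy sketch below estimates a small tail probability by sampling from a biased proposal and reweighting the results. It is generic numpy/scipy code illustrating the principle, not the GRESS ATMOSIM or Geant4 implementation.

```python
# Toy importance-sampling estimate of the tail probability P(X > 4) for
# X ~ N(0, 1): sample from a shifted proposal N(4, 1) and reweight.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 100_000

# Naive Monte Carlo: almost no samples land in the tail, so the relative
# variance of the estimate is enormous.
x = rng.standard_normal(n)
naive = np.mean(x > 4.0)

# Importance sampling: proposal q = N(4, 1), weight w = p(x) / q(x).
y = rng.normal(loc=4.0, scale=1.0, size=n)
w = stats.norm.pdf(y, 0.0, 1.0) / stats.norm.pdf(y, 4.0, 1.0)
is_est = np.mean((y > 4.0) * w)
is_var = np.var((y > 4.0) * w, ddof=1) / n   # variance of the IS estimator

print(naive, is_est, is_var)   # the exact value is about 3.17e-5
```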
McClure, Foster D; Lee, Jung K
2012-01-01
The validation process for an analytical method usually employs an interlaboratory study conducted as a balanced completely randomized model involving a specified number of randomly chosen laboratories, each analyzing a specified number of randomly allocated replicates. For such studies, formulas for approximate unbiased estimates of the variance and uncertainty of the sample laboratory-to-laboratory (lab-to-lab) standard deviation, S(L), have been developed, primarily for cases where an uncertainty budget must include the uncertainty of S(L). For completeness, formulas to estimate the variance and uncertainty of the sample lab-to-lab variance, S(L)2, were also developed. In some cases, it was necessary to derive the formulas based on an approximate distribution for S(L)2.
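For a balanced one-way random-effects design (p laboratories, n replicates each), the lab-to-lab variance and a rough uncertainty for it can be formed from the ANOVA mean squares. The sketch below uses the standard moment estimator and a first-order approximation for its variance; it is an illustration of the general idea, not necessarily the exact formulas derived in the paper.

```python
# Moment estimator of the lab-to-lab variance s_L^2 from a balanced
# interlaboratory study, with an approximate standard uncertainty.
# data: 2-D array, rows = laboratories, columns = replicates.
import numpy as np

def lab_to_lab_variance(data: np.ndarray):
    p, n = data.shape
    lab_means = data.mean(axis=1)
    grand_mean = data.mean()
    msb = n * np.sum((lab_means - grand_mean) ** 2) / (p - 1)        # between-lab MS
    msw = np.sum((data - lab_means[:, None]) ** 2) / (p * (n - 1))   # within-lab MS
    s_L2 = max((msb - msw) / n, 0.0)
    # First-order approximation: Var(MS) ~ 2*MS^2/df, MSB and MSW independent.
    var_s_L2 = (2.0 / n**2) * (msb**2 / (p - 1) + msw**2 / (p * (n - 1)))
    return s_L2, np.sqrt(var_s_L2)
```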
NASA Astrophysics Data System (ADS)
Satyanarayanan, M.; Eswaramoorthi, S.; Subramanian, S.; Periakali, P.
2017-09-01
Geochemical analytical data for 15 representative rock samples, 34 soil samples and 55 groundwater samples collected from the Salem magnesite mines and surrounding area in Salem, southern India, were subjected to R-mode factor analysis. A maximum of three factors account for 93.8% of the variance in the rock data, six factors for 84% of the variance in the soil data, and five factors for 71.2% in the groundwater data during summer and six factors for 73.7% during winter. Total dissolved solids are predominantly contributed by Mg, Na, Cl and SO4 ions in both seasons and are derived from the country rock and mining waste by dissolution of minerals such as magnesite, gypsum and halite. The results also show that the groundwater contains considerable amounts of minor and trace elements (Fe, Mn, Ni, Cr and Co). Nickel, chromium and cobalt in groundwater and soil are derived from leaching of the huge mine dumps deposited by selective magnesite mining activity. Factor analysis of trivalent, hexavalent and total Cr in groundwater indicates that most of the Cr is trivalent in summer and hexavalent in winter. The gradational decrease in topographic elevation from the northern mine area to the southern residential area, combined with regional hydrogeological factors and the distribution of ultramafic rocks in the northern part of the study area, indicates that these toxic trace elements in the water were derived from the mine dumps.
Comparative test on several forms of background error covariance in 3DVar
NASA Astrophysics Data System (ADS)
Shao, Aimei
2013-04-01
The background error covariance matrix (hereinafter referred to as the B matrix) plays an important role in the three-dimensional variational (3DVar) data assimilation method. However, it is difficult to obtain the B matrix accurately because the true atmospheric state is unknown. Therefore, several methods have been developed to estimate it (e.g. the NMC method, the innovation analysis method, recursive filters, and ensemble methods such as the EnKF). Prior to further development and application of these methods, the behaviour in 3DVar of B matrices estimated by these methods is worth studying and evaluating. For this reason, NCEP reanalysis and forecast data are used to test the effectiveness of several B matrices with the VAF method (Huang, 1999). Here the NCEP analysis is treated as the truth, so the forecast error is known. Data from 2006 to 2007 are used as the samples to estimate the B matrix, and data from 2008 are used to verify the assimilation effects. The 48-h and 24-h forecasts valid at the same time are used to estimate the B matrix with the NMC method. The B matrix can be represented by a correlation part (a non-diagonal matrix) and a variance part (a diagonal matrix of variances). A Gaussian filter function is used in numerous 3DVar systems as an approximation for the variation of correlation coefficients with distance. On the basis of this assumption, the following forms of the B matrix are designed and tested with VAF in comparative experiments: (1) the error variance and characteristic lengths are fixed and set to their mean values averaged over the analysis domain; (2) as in (1), but the mean characteristic lengths are reduced to 50 percent (for height) and 60 percent (for temperature) of the original values; (3) as in (2), but the error variance calculated directly from the historical data is space-dependent; (4) the error variance and characteristic lengths are all calculated directly from the historical data; (5) the B matrix is estimated directly from the historical data; (6) as in (5), but a localization procedure is applied; (7) the B matrix is estimated by the NMC method but the error variance is reduced by a factor of 1.7 so that its value is close to that calculated from the true forecast error samples; (8) as in (7), but with the localization of (6) applied. Experimental results with the different B matrices show that, for the Gaussian-type B matrix, the characteristic lengths calculated from the true error samples do not yield good analysis results, whereas reduced characteristic lengths (about half of the original) do. If the B matrix estimated directly from the historical data is used in 3DVar, the assimilation does not reach its best performance; better results are obtained with the reduced characteristic length and localization. Even so, this has no obvious advantage over a Gaussian-type B matrix with the optimal characteristic length. This implies that the Gaussian-type B matrix, widely used in operational 3DVar systems, can produce a good analysis with appropriate characteristic lengths; the crucial problem is how to determine them. (This work is supported by the National Natural Science Foundation of China (41275102, 40875063) and the Fundamental Research Funds for the Central Universities (lzujbky-2010-9).)
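To make the role of the Gaussian-type B matrix concrete, the sketch below builds a 1-D background error covariance from a single error variance and characteristic length and computes the standard 3DVar/optimal-interpolation analysis increment. It is a schematic illustration with assumed values (sigma_b, length, grid), not the VAF system used in the study.

```python
# Schematic 1-D 3DVar/OI analysis with a Gaussian-shaped background error
# covariance B, a diagonal observation error covariance R, and a linear
# observation operator H that picks out observed grid points.
import numpy as np

def gaussian_B(x, sigma_b, length):
    d = x[:, None] - x[None, :]
    return sigma_b**2 * np.exp(-0.5 * (d / length) ** 2)

def analysis(xb, y, obs_idx, B, sigma_o):
    H = np.zeros((len(obs_idx), len(xb)))
    H[np.arange(len(obs_idx)), obs_idx] = 1.0
    R = sigma_o**2 * np.eye(len(obs_idx))
    K = B @ H.T @ np.linalg.inv(H @ B @ H.T + R)   # gain matrix
    return xb + K @ (y - H @ xb)                   # analysis = background + increment

x = np.linspace(0.0, 1000.0, 101)                  # grid (km); assumed
B = gaussian_B(x, sigma_b=1.5, length=150.0)       # assumed variance and length
xb = np.zeros_like(x)                              # background state
xa = analysis(xb, y=np.array([1.0]), obs_idx=np.array([50]), B=B, sigma_o=0.5)
```

Shrinking the characteristic length localizes the analysis increment around the observation, which is the kind of sensitivity the comparative experiments above explore.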
Self-perception and value system as possible predictors of stress.
Sivberg, B
1998-03-01
This study was directed towards personality-related, value system and sociodemographic variables of nursing students in a situation of change, using a longitudinal perspective to measure their improvement in principle-based moral judgement (Kohlberg; Rest) as possible predictors of stress. Three subgroups of students were included from the commencement of the first three-year academic nursing programme in 1993. The students came from the colleges of health at Jönköping, Växjö and Kristianstad in the south of Sweden. A principal component factor analysis (varimax) was performed using data obtained from the students in the spring of 1994 (n = 122) and in the spring of 1996 (n = 112). There were 23 variables, of which two were sociodemographic, eight represented self-image, six were self-values, six were interpersonal values, and one was principle-based moral judgement. The analysis of data from students in the first year of the three-year programme yielded eight factors that explained 68.8% of the variance. The most important factors were: (1) ascendant decisive disorderly sociability and nonpractical mindedness (18.1% of the variance); (2) original vigour person-related trust (13.3% of the variance); (3) orderly nonvigour achievement (8.9% of the variance); and (4) independent leadership (7.9% of the variance). (The term 'ascendancy' refers to self-confidence, and 'vigour' denotes responding well to challenges and coping with stress.) The analysis in 1996 yielded nine factors, of which the most important were: (1) ascendant original sociability with decisive nonconformist leadership (18.2% of the variance); (2) cautious person-related responsibility (12.6% of the variance); (3) orderly nonvariety achievement (8.4% of the variance); and (4) nonsupportive benevolent conformity (7.2% of the variance). A comparison of the two most prominent factors in 1994 and 1996 showed the process of change to be stronger for 18.2% and weaker for 30% of the variance. Principle-based moral judgement was measured in March 1994 and in May 1996, using the Swedish version of the Defining Issues Test and Index P. Index P for the students at Jönköping changed significantly (paired samples t-test) between 1994 and 1996 (p = 0.028), but that for the Växjö and Kristianstad students did not. The mean Index P was 44.3% at Växjö, greater than the international average for college students (42.3%); in the spring of 1996 it differed significantly (independent samples t-test) from the students at Jönköping (p = 0.032) and Kristianstad (p = 0.025), but not in 1994. Index P was very heterogeneous for the group of students at Växjö, with the result that the paired samples t-test only approached significance. The conclusion of this study was that, if self-perception and value system are predictors of stress, only one-third of the students had improved their ability to cope with stress by the end of the programme. This article contains the author's application to the teaching process of reflecting on the structure of expectations in professional ethical relationships.
Data assimilation method based on the constraints of confidence region
NASA Astrophysics Data System (ADS)
Li, Yong; Li, Siming; Sheng, Yao; Wang, Luheng
2018-03-01
The ensemble Kalman filter (EnKF) is a distinguished data assimilation method that is widely used and studied in various fields, including meteorology and oceanography. However, due to the limited sample size or an imprecise dynamics model, the forecast error variance is often underestimated, which further leads to the phenomenon of filter divergence. Additionally, the assimilation results of the initial stage are poor if the initial condition settings differ greatly from the true initial state. To address these problems, a variance inflation procedure is usually adopted. In this paper, we propose a new method, called EnCR, based on the constraints of a confidence region constructed from the observations, to estimate the inflation parameter of the forecast error variance in the EnKF. In the new method, the state estimate is more robust to both inaccurate forecast models and initial condition settings. The new method is compared with other adaptive data assimilation methods in the Lorenz-63 and Lorenz-96 models under various model parameter settings. The simulation results show that the new method performs better than the competing methods.
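The variance-inflation step this paper builds on is easiest to see in code. The sketch below shows a stochastic EnKF analysis step with a simple multiplicative inflation factor applied to the forecast perturbations; the EnCR criterion itself, which would choose the factor from a confidence region of the observations, is not reproduced here, and the fixed factor of 1.05 is an arbitrary placeholder.

```python
# Stochastic (perturbed-observation) EnKF analysis step with multiplicative
# covariance inflation. ens: (n_state, n_members) forecast ensemble;
# H: linear observation operator; R: observation error covariance.
import numpy as np

def enkf_analysis(ens, y, H, R, inflation=1.05, rng=None):
    if rng is None:
        rng = np.random.default_rng()
    mean = ens.mean(axis=1, keepdims=True)
    pert = inflation * (ens - mean)            # inflate forecast spread
    ens = mean + pert
    n = ens.shape[1]
    Pf = pert @ pert.T / (n - 1)               # (inflated) forecast covariance
    K = Pf @ H.T @ np.linalg.inv(H @ Pf @ H.T + R)
    # One perturbed observation vector per ensemble member.
    y_pert = y[:, None] + rng.multivariate_normal(np.zeros(len(y)), R, size=n).T
    return ens + K @ (y_pert - H @ ens)
```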
Gonçalves, M A D; Bello, N M; Dritz, S S; Tokach, M D; DeRouchey, J M; Woodworth, J C; Goodband, R D
2016-05-01
Advanced methods for dose-response assessments are used to estimate the minimum concentrations of a nutrient that maximizes a given outcome of interest, thereby determining nutritional requirements for optimal performance. Contrary to standard modeling assumptions, experimental data often present a design structure that includes correlations between observations (i.e., blocking, nesting, etc.) as well as heterogeneity of error variances; either can mislead inference if disregarded. Our objective is to demonstrate practical implementation of linear and nonlinear mixed models for dose-response relationships accounting for correlated data structure and heterogeneous error variances. To illustrate, we modeled data from a randomized complete block design study to evaluate the standardized ileal digestible (SID) Trp:Lys ratio dose-response on G:F of nursery pigs. A base linear mixed model was fitted to explore the functional form of G:F relative to Trp:Lys ratios and assess model assumptions. Next, we fitted 3 competing dose-response mixed models to G:F, namely a quadratic polynomial (QP) model, a broken-line linear (BLL) ascending model, and a broken-line quadratic (BLQ) ascending model, all of which included heteroskedastic specifications, as dictated by the base model. The GLIMMIX procedure of SAS (version 9.4) was used to fit the base and QP models and the NLMIXED procedure was used to fit the BLL and BLQ models. We further illustrated the use of a grid search of initial parameter values to facilitate convergence and parameter estimation in nonlinear mixed models. Fit between competing dose-response models was compared using a maximum likelihood-based Bayesian information criterion (BIC). The QP, BLL, and BLQ models fitted on G:F of nursery pigs yielded BIC values of 353.7, 343.4, and 345.2, respectively, thus indicating a better fit of the BLL model. The BLL breakpoint estimate of the SID Trp:Lys ratio was 16.5% (95% confidence interval [16.1, 17.0]). Problems with the estimation process rendered results from the BLQ model questionable. Importantly, accounting for heterogeneous variance enhanced inferential precision as the breadth of the confidence interval for the mean breakpoint decreased by approximately 44%. In summary, the article illustrates the use of linear and nonlinear mixed models for dose-response relationships accounting for heterogeneous residual variances, discusses important diagnostics and their implications for inference, and provides practical recommendations for computational troubleshooting.
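As a simplified illustration of the broken-line linear (ascending) dose-response form, ignoring the blocking, random effects, and heterogeneous error variances handled by the mixed models in the paper, one could fit the fixed-effects curve with scipy. The data values below are made up for illustration only; the p0 starting values play the role of the grid search mentioned above.

```python
# Broken-line linear (plateau) fit: the response rises linearly up to a
# breakpoint and is flat beyond it. Fixed-effects sketch only; it does not
# reproduce the GLIMMIX/NLMIXED mixed-model analysis in the paper.
import numpy as np
from scipy.optimize import curve_fit

def broken_line(x, plateau, slope, breakpoint):
    return np.where(x < breakpoint, plateau - slope * (breakpoint - x), plateau)

# Hypothetical SID Trp:Lys ratios (%) and G:F means (values are invented).
x = np.array([14.0, 15.0, 16.0, 17.0, 18.0, 19.0])
y = np.array([0.60, 0.63, 0.66, 0.67, 0.67, 0.67])

params, cov = curve_fit(broken_line, x, y, p0=[0.67, 0.02, 16.5])
plateau, slope, brk = params
se_brk = np.sqrt(cov[2, 2])      # rough standard error of the breakpoint
```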
Variable variance Preisach model for multilayers with perpendicular magnetic anisotropy
NASA Astrophysics Data System (ADS)
Franco, A. F.; Gonzalez-Fuentes, C.; Morales, R.; Ross, C. A.; Dumas, R.; Åkerman, J.; Garcia, C.
2016-08-01
We present a variable variance Preisach model that fully accounts for the different magnetization processes of a multilayer structure with perpendicular magnetic anisotropy by adjusting the evolution of the interaction variance as the magnetization changes. We successfully compare, in a quantitative manner, the results obtained with this model to experimental hysteresis loops of several [CoFeB/Pd]n multilayers. The effect of the number of repetitions and of the thicknesses of the CoFeB and Pd layers on the magnetization reversal of the multilayer structure is studied, and it is found that many of the observed phenomena can be attributed to an increase of the magnetostatic interactions and a subsequent decrease of the size of the magnetic domains. Increasing the CoFeB thickness leads to the disappearance of the perpendicular anisotropy, and a minimum thickness of the Pd layer is necessary to achieve an out-of-plane magnetization.
Li, Qing; Li, Xiaoming; Stanton, Bonita; Fang, Xiaoyi; Zhao, Ran
2010-11-01
Multilevel analytical techniques are being applied in condom use research to ensure the validity of investigations of environmental/structural influences and of clustered data from venue-based sampling. The literature contains reports of consistent associations between perceived gatekeeper support and condom use among entertainment establishment-based female sex workers (FSWs) in Guangxi, China. However, the clustering inherent in the data (FSWs being clustered within establishments) has not been accounted for in most of the analyses. We used multilevel analyses to examine perceived features of gatekeepers and individual correlates of consistent condom use among FSWs and to validate the findings in the existing literature. We analyzed cross-sectional data from 318 FSWs from 29 entertainment establishments in Guangxi, China in 2004, with a minimum of 5 FSWs per establishment. The Hierarchical Linear Models program with Laplace estimation was used to estimate the parameters in models containing random effects and binary outcomes. About 11.6% of women reported consistent condom use with clients. The intraclass correlation coefficient indicated that 18.5% of the variance in condom use could be attributed to similarity between FSWs within the same establishments. Women's perceived gatekeeper support and education remained positively associated with condom use (P < 0.05), after controlling for other individual characteristics and clustering. After adjusting for data clustering, perceived gatekeeper support remains associated with consistent condom use with clients among FSWs in China. The results imply that combined interventions targeting both gatekeepers and individual FSWs may effectively promote consistent condom use.
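An intraclass correlation for a binary outcome such as consistent condom use is commonly computed on the latent (logistic) scale as sigma_u^2 / (sigma_u^2 + pi^2/3), where sigma_u^2 is the establishment-level random-intercept variance. A minimal sketch of that calculation follows; the variance value used is only an illustrative assumption chosen to reproduce roughly the 18.5% reported, not a number taken from the study.

```python
# Latent-scale ICC for a two-level random-intercept logistic model:
# ICC = sigma_u^2 / (sigma_u^2 + pi^2 / 3).
import math

def latent_icc(sigma_u2: float) -> float:
    return sigma_u2 / (sigma_u2 + math.pi**2 / 3.0)

# An assumed establishment-level variance of about 0.75 gives roughly the
# 18.5% quoted in the abstract: 0.75 / (0.75 + 3.29) ~= 0.186.
print(latent_icc(0.75))
```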
Zhang, Bao; Yao, Yibin; Fok, Hok Sum; Hu, Yufeng; Chen, Qiang
2016-01-01
This study uses the observed vertical displacements of Global Positioning System (GPS) time series obtained from the Crustal Movement Observation Network of China (CMONOC), with careful pre- and post-processing, to estimate the seasonal crustal deformation in response to hydrological loading in the lower three-rivers headwater region of southwest China, followed by inferring the annual EWH changes through geodetic inversion methods. The Helmert Variance Component Estimation (HVCE) and the Minimum Mean Square Error (MMSE) criterion were successfully employed. The GPS-inferred EWH changes agree well qualitatively with the Gravity Recovery and Climate Experiment (GRACE)-inferred and the Global Land Data Assimilation System (GLDAS)-inferred EWH changes, with discrepancies of 3.2–3.9 cm and 4.8–5.2 cm, respectively. In the research areas, the EWH changes in the Lancang basin are larger than in the other regions, with a maximum of 21.8–24.7 cm and a minimum of 3.1–6.9 cm.
Endogenous fluorescence emission of the ovary
NASA Astrophysics Data System (ADS)
Utzinger, Urs; Kirkpatrick, Nathaniel D.; Drezek, Rebekah A.; Brewer, Molly A.
2005-03-01
Epithelial ovarian cancer has the highest mortality rate among the gynecologic cancers. Early detection would significantly improve survival and quality of life of women at increased risk of developing ovarian cancer. We have constructed a device to investigate endogenous signals of the ovarian tissue surface in the UV-C to visible range and describe our initial investigation of the use of optical spectroscopy to characterize the condition of the ovary. We have acquired data from more than 33 patients. A table-top spectroscopy system was used to collect endogenous fluorescence with a fiberoptic probe that is compatible with endoscopic techniques. Samples were grouped as Normal-Low Risk (for developing ovarian cancer), Normal-High Risk, Benign, and Cancer. Rigorous statistical analysis was applied to the data using variance tests for direct intensity versus diagnostic group comparisons and principal component analysis (PCA) to study the variance of the whole data set. We conclude that the diagnostically most useful excitation wavelengths are located in the UV; our results indicate that UV-B and UV-C are most useful. A safety analysis indicates that UV-C imaging can be conducted at exposure levels below safety thresholds. We found that fluorescence excited in the UV-C and UV-B range increases from benign to normal to cancerous tissues. This is in contrast to the emission created with UV-A excitation, which decreased in the same order. We hypothesize that an increase of protein production and a decrease of fluorescence contributions from the extracellular matrix could explain this behavior. Variance analysis also identified fluctuation of fluorescence at 320/380, which is associated with collagen cross-link residues. Small differences were observed between the group at high risk and the group at normal risk for ovarian cancer. High-risk samples deviated towards the cancer group and low-risk samples towards the benign group.
Rhodes, Kirsty M; Turner, Rebecca M; White, Ian R; Jackson, Dan; Spiegelhalter, David J; Higgins, Julian P T
2016-12-20
Many meta-analyses combine results from only a small number of studies, a situation in which the between-study variance is imprecisely estimated when standard methods are applied. Bayesian meta-analysis allows incorporation of external evidence on heterogeneity, providing the potential for more robust inference on the effect size of interest. We present a method for performing Bayesian meta-analysis using data augmentation, in which we represent an informative conjugate prior for between-study variance by pseudo data and use meta-regression for estimation. To assist in this, we derive predictive inverse-gamma distributions for the between-study variance expected in future meta-analyses. These may serve as priors for heterogeneity in new meta-analyses. In a simulation study, we compare approximate Bayesian methods using meta-regression and pseudo data against fully Bayesian approaches based on importance sampling techniques and Markov chain Monte Carlo (MCMC). We compare the frequentist properties of these Bayesian methods with those of the commonly used frequentist DerSimonian and Laird procedure. The method is implemented in standard statistical software and provides a less complex alternative to standard MCMC approaches. An importance sampling approach produces almost identical results to standard MCMC approaches, and results obtained through meta-regression and pseudo data are very similar. On average, data augmentation provides closer results to MCMC, if implemented using restricted maximum likelihood estimation rather than DerSimonian and Laird or maximum likelihood estimation. The methods are applied to real datasets, and an extension to network meta-analysis is described. The proposed method facilitates Bayesian meta-analysis in a way that is accessible to applied researchers. © 2016 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd.
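For reference, the frequentist DerSimonian and Laird estimator that the Bayesian approaches are compared against is a simple method-of-moments calculation. The sketch below shows that baseline estimator and the resulting random-effects pooled mean; it is not the paper's data-augmentation method.

```python
# DerSimonian-Laird method-of-moments estimate of the between-study variance
# tau^2, from study effect estimates y_i and within-study variances v_i.
import numpy as np

def dersimonian_laird(y, v):
    y, v = np.asarray(y, float), np.asarray(v, float)
    w = 1.0 / v                                   # fixed-effect weights
    mu_fe = np.sum(w * y) / np.sum(w)
    q = np.sum(w * (y - mu_fe) ** 2)              # Cochran's Q
    k = len(y)
    c = np.sum(w) - np.sum(w**2) / np.sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)            # truncate at zero
    # Random-effects pooled estimate using the updated weights.
    w_re = 1.0 / (v + tau2)
    mu_re = np.sum(w_re * y) / np.sum(w_re)
    return tau2, mu_re
```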
Qualitative data analysis for an exploratory sensory study of Grechetto wine.
Esti, Marco; González Airola, Ricardo L; Moneta, Elisabetta; Paperaio, Marina; Sinesio, Fiorella
2010-02-15
Grechetto is a traditional white-grape vine, widespread in the Umbria and Lazio regions of central Italy. Despite the wine's commercial diffusion, little literature on its sensory characteristics is available. The present study is exploratory research conducted with the aim of identifying the sensory markers of Grechetto wine and of evaluating the effect of clone, geographical area, vintage and producer on sensory attributes. A qualitative sensory study was conducted on 16 wines, differing in vintage, Typical Geographic Indication, and clone, collected from 7 wineries, using a trained panel working in isolation that referred to a glossary of 133 white wine descriptors. Sixty-five attributes identified by a minimum of 50% of the respondents were submitted to a correspondence analysis to link wine samples to the sensory attributes. Seventeen terms identified as common to all samples are considered characteristic of Grechetto wine, 10 of which are olfactory: fruity, apple, acacia flower, pineapple, banana, floral, herbaceous, honey, apricot and peach. To interpret the relationship between design variables and sensory attributes for the 2005 and 2006 wines, the 28 most discriminating descriptors were projected in a principal component analysis. The first principal component was best described by olfactory terms and the second by gustative attributes. Good reproducibility of results was obtained for the two vintages. For one winery, the vintage effect (2002-2006) was described in a new principal component analysis model applied to the 39 most discriminating descriptors, which globally explained about 84% of the variance. In the young wines, notes of sulphur, yeast, dried fruit and butter, combined with fresh herbaceous and tropical fruity notes (melon, grapefruit), were dominant. During wine aging, sweeter notes, such as honey, caramel and jam, became more dominant, as did some mineral notes, such as tuff and flint. Copyright 2009 Elsevier B.V. All rights reserved.
Experimental study on an FBG strain sensor
NASA Astrophysics Data System (ADS)
Liu, Hong-lin; Zhu, Zheng-wei; Zheng, Yong; Liu, Bang; Xiao, Feng
2018-01-01
Landslides and other geological disasters occur frequently and often cause high financial and humanitarian costs. Real-time, early-warning monitoring of landslides is therefore important for reducing casualties and property losses. In this paper, taking advantage of the high initial precision and high sensitivity of FBGs, an FBG strain sensor is designed by combining FBGs with an inclinometer. The sensor was treated as a cantilever beam with one end fixed. According to the anisotropic material properties of the inclinometer, a theoretical formula relating the FBG wavelength to the deflection of the sensor was established using elastic mechanics principles. The accuracy of the established formula was verified through laboratory calibration testing and model slope monitoring experiments. The displacement of the landslide could be calculated from the established theoretical formula using the changes in FBG central wavelength obtained remotely by the demodulation instrument. Results showed that the maximum error at different heights was 9.09%; the average of the maximum error was 6.35%, with a corresponding variance of 2.12; the minimum error was 4.18%; the average of the minimum error was 5.99%, with a corresponding variance of 0.50. The maximum error between the theoretical and measured displacements decreased gradually, and the variance of the error also decreased gradually, indicating that the theoretical results are increasingly reliable. It also shows that the sensor and the theoretical formula established in this paper can be used for remote, real-time, high-precision, early-warning monitoring of the slope.
Standard Deviation for Small Samples
ERIC Educational Resources Information Center
Joarder, Anwar H.; Latif, Raja M.
2006-01-01
Neater representations for variance are given for small sample sizes, especially for 3 and 4. With these representations, variance can be calculated without a calculator if sample sizes are small and observations are integers, and an upper bound for the standard deviation is immediate. Accessible proofs of lower and upper bounds are presented for…
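One such neat representation, stated here as a standard identity consistent with the abstract's theme rather than quoted from it, writes the sample variance in terms of pairwise differences, which is especially convenient for n = 3 or 4 and makes an upper bound on the standard deviation immediate:

```latex
s^2 \;=\; \frac{1}{n(n-1)}\sum_{i<j}\left(x_i - x_j\right)^2,
\qquad\text{so for } n = 3:\quad
s^2 \;=\; \frac{(x_1-x_2)^2 + (x_1-x_3)^2 + (x_2-x_3)^2}{6}.
```

Since every pairwise difference is at most the range R, the sum has at most n(n-1)/2 terms each bounded by R^2, giving s <= R / sqrt(2).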
Noise level in a neonatal intensive care unit in Santa Marta - Colombia.
Garrido Galindo, Angélica Patricia; Camargo Caicedo, Yiniva; Velez-Pereira, Andres M
2017-09-30
The environment of neonatal intensive care units is influenced by numerous sources of noise emission, which contribute to raised noise levels and may cause hearing impairment and other physiological and psychological changes in the newborn, as well as problems for care staff. The aim was to evaluate the level and sources of noise in the neonatal intensive care unit. Noise was sampled for 20 consecutive days, every 60 seconds, on the A-weighting curve in fast mode with a Type I sound level meter. The average, maximum and minimum levels, and the 10th, 50th and 90th percentiles, were recorded. The values were aggregated by hour and work shift and analysed by analysis of variance. The sources were characterized in one-third octave bands. The average level was 64.00 ±3.62 dB(A), with a maximum of 76.04 ±5.73 dB(A), a minimum of 54.84 ±2.61 dB(A), and background noise of 57.95 ±2.83 dB(A). We found four sources with levels between 16.8-63.3 dB(A). Statistical analysis showed significant differences between hours and work shifts, with higher values in the early hours of the day. The values exceed the standards suggested by several organizations.
Terluin, Berend; de Boer, Michiel R; de Vet, Henrica C W
2016-01-01
The network approach to psychopathology conceives mental disorders as sets of symptoms causally impacting on each other. The strengths of the connections between symptoms are key elements in the description of those symptom networks. Typically, the connections are analysed as linear associations (i.e., correlations or regression coefficients). However, there is insufficient awareness of the fact that differences in variance may account for differences in connection strength. Differences in variance frequently occur when subgroups are based on skewed data. An illustrative example is a study published in PLoS One (2013;8(3):e59559) that aimed to test the hypothesis that the development of psychopathology through "staging" was characterized by increasing connection strength between mental states. Three mental states (negative affect, positive affect, and paranoia) were studied in severity subgroups of a general population sample. The connection strength was found to increase with increasing severity in six of nine models. However, the method used (linear mixed modelling) is not suitable for skewed data. We reanalysed the data using inverse Gaussian generalized linear mixed modelling, a method suited for positively skewed data (such as symptoms in the general population). The distribution of positive affect was normal, but the distributions of negative affect and paranoia were heavily skewed. The variance of the skewed variables increased with increasing severity. Reanalysis of the data did not confirm increasing connection strength, except for one of nine models. Reanalysis of the data did not provide convincing evidence in support of staging as characterized by increasing connection strength between mental states. Network researchers should be aware that differences in connection strength between symptoms may be caused by differences in variances, in which case they should not be interpreted as differences in impact of one symptom on another symptom.
Data Centric Sensor Stream Reduction for Real-Time Applications in Wireless Sensor Networks
Aquino, Andre Luiz Lins; Nakamura, Eduardo Freire
2009-01-01
This work presents a data-centric strategy to meet deadlines in soft real-time applications in wireless sensor networks. This strategy considers three main aspects: (i) the design of real-time applications to obtain the minimum deadlines; (ii) an analytic model to estimate the ideal sample size used by data-reduction algorithms; and (iii) two data-centric stream-based sampling algorithms to perform data reduction whenever necessary. Simulation results show that our data-centric strategies meet deadlines without losing data representativeness.
NASA Astrophysics Data System (ADS)
Loher, Timothy; Woods, Monica A.; Jimenez-Hidalgo, Isadora; Hauser, Lorenz
2016-01-01
Declines in size at age of Pacific halibut Hippoglossus stenolepis, in concert with sexually-dimorphic growth and a constant minimum commercial size limit, have led to the expectation that the sex composition of commercial catches should be increasingly female-biased. Sensitivity analyses suggest that variance in sex composition of landings may be the most influential source of uncertainty affecting current understanding of spawning stock biomass. However, there is no reliable way to determine sex at landing because all halibut are eviscerated at sea. In 2014, a statistical method based on survey data was developed to estimate the probability that fish of any given length at age (LAA) would be female, derived from the fundamental observation that large, young fish are likely female whereas small, old fish have a high probability of being male. Here, we examine variability in age-specific sex composition using at-sea commercial and closed-season survey catches, and compare the accuracy of the survey-based LAA technique to genetic markers for reconstructing the sex composition of catches. Sexing by LAA performed best for summer-collected samples, consistent with the hypothesis that the ability to characterize catches can be influenced by seasonal demographic shifts. Additionally, differences between survey and commercial selectivity that allow fishers to harvest larger fish within cohorts may generate important mismatch between survey and commercial datasets. Length-at-age-based estimates ranged from 4.7% underestimation of female proportion to 12.0% overestimation, with mean error of 5.8 ± 1.5%. Ratios determined by genetics were closer to true sample proportions and displayed less variability; estimation to within < 1% of true ratios was limited to genetics. Genetic estimation of female proportions ranged from 4.9% underestimation to 2.5% overestimation, with a mean absolute error of 1.2 ± 1.2%. Males were generally more difficult to assign than females: 6.7% of males and 3.4% of females were incorrectly assigned. Although nuclear microsatellites proved more consistent at partitioning catches by sex, we recommend that SNP assays be developed to allow for rapid, cost-effective, and accurate sex identification.
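The length-at-age (LAA) assignment described here amounts to modelling the probability that a fish is female as a function of its length and age, fitted on sexed survey fish and applied to unsexed landings. The sketch below is a generic logistic-regression version of that idea, not the published survey-based estimator, and the column names (female, length_cm, age) are assumptions.

```python
# Probability of being female as a function of length and age, fitted to
# sexed survey fish and then used to estimate the female proportion of
# unsexed commercial landings.
import pandas as pd
import statsmodels.formula.api as smf

def fit_laa_model(survey: pd.DataFrame):
    # survey columns assumed: 'female' (0/1), 'length_cm', 'age'
    return smf.logit("female ~ length_cm * age", data=survey).fit()

def predicted_female_proportion(model, landings: pd.DataFrame) -> float:
    # Expected proportion of females in a set of unsexed fish.
    return model.predict(landings[["length_cm", "age"]]).mean()
```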
Estimation of density of mongooses with capture-recapture and distance sampling
Corn, J.L.; Conroy, M.J.
1998-01-01
We captured mongooses (Herpestes javanicus) in live traps arranged in trapping webs in Antigua, West Indies, and used capture-recapture and distance sampling to estimate density. Distance estimation and program DISTANCE were used to provide estimates of density from the trapping-web data. Mean density based on trapping webs was 9.5 mongooses/ha (range, 5.9-10.2/ha); estimates had coefficients of variation ranging from 29.82-31.58% (x̄ = 30.46%). Mark-recapture models were used to estimate abundance, which was converted to density using estimates of effective trap area. Tests of model assumptions provided by CAPTURE indicated pronounced heterogeneity in capture probabilities and some indication of behavioral response and variation over time. Mean estimated density was 1.80 mongooses/ha (range, 1.37-2.15/ha) with estimated coefficients of variation of 4.68-11.92% (x̄ = 7.46%). Estimates of density based on mark-recapture data depended heavily on assumptions about animal home ranges; variances of densities also may be underestimated, leading to unrealistically narrow confidence intervals. Estimates based on trap webs require fewer assumptions, and estimated variances may be a more realistic representation of sampling variation. Because trap webs are established easily and provide adequate data for estimation in a few sample occasions, the method should be efficient and reliable for estimating densities of mongooses.
A Study of the Southern Ocean: Mean State, Eddy Genesis & Demise, and Energy Pathways
NASA Astrophysics Data System (ADS)
Zajaczkovski, Uriel
The Southern Ocean (SO), due to its deep penetrating jets and eddies, is well-suited for studies that combine surface and sub-surface data. This thesis explores the use of Argo profiles and sea surface height (SSH) altimeter data from a statistical point of view. A linear regression analysis of SSH and hydrographic data reveals that the altimeter can explain, on average, about 35% of the variance contained in the hydrographic fields and more than 95% if estimated locally. Correlation maxima are found at mid-depth, where dynamics are dominated by geostrophy. Near the surface, diabatic processes are significant, and the variance explained by the altimeter is lower. Since SSH variability is associated with eddies, the regression of SSH with temperature (T) and salinity (S) shows the relative importance of S vs T in controlling density anomalies. The AAIW salinity minimum separates two distinct regions; above the minimum density changes are dominated by T, while below the minimum S dominates over T. The regression analysis provides a method to remove eddy variability, effectively reducing the variance of the hydrographic fields. We use satellite altimetry and output from an assimilating numerical model to show that the SO has two distinct eddy motion regimes. North and south of the Antarctic Circumpolar Current (ACC), eddies propagate westward with a mean meridional drift directed poleward for cyclonic eddies (CEs) and equatorward for anticyclonic eddies (AEs). Eddies formed within the boundaries of the ACC have an effective eastward propagation with respect to the mean deep ACC flow, and the mean meridional drift is reversed, with warm-core AEs propagating poleward and cold-core CEs propagating equatorward. This circulation pattern drives downgradient eddy heat transport, which could potentially transport a significant fraction (24 to 60 × 10¹³ W) of the net poleward ACC eddy heat flux. We show that the generation of relatively large amplitude eddies is not a ubiquitous feature of the SO but rather a phenomenon that is constrained to five isolated, well-defined "hotspots". These hotspots are located downstream of major topographic features, with their boundaries closely following f/H contours. Eddies generated in these locations show no evidence of a bias in polarity and decay within the boundaries of the generation area. Eddies tend to disperse along f/H contours rather than following lines of latitude. We found enhanced values of both buoyancy (BP) and shear production (SP) inside the hotspots, with BP one order of magnitude larger than SP. This is consistent with baroclinic instability being the main mechanism of eddy generation. The mean potential density field estimated from Argo floats shows that inside the hotspots, isopycnal slopes are steep, indicating availability of potential energy. The hotspots identified in this thesis overlap with previously identified regions of standing meanders. We provide evidence that hotspot locations can be explained by the combined effect of topography, standing meanders that enhance baroclinic instability, and availability of potential energy to generate eddies via baroclinic instabilities.
Model for spectral and chromatographic data
Jarman, Kristin [Richland, WA]; Willse, Alan [Richland, WA]; Wahl, Karen [Richland, WA]; Wahl, Jon [Richland, WA]
2002-11-26
A method and apparatus using a spectral analysis technique are disclosed. In one form of the invention, probabilities are selected to characterize the presence (and in another form, also a quantification of a characteristic) of peaks in an indexed data set for samples that match a reference species, and other probabilities are selected for samples that do not match the reference species. An indexed data set is acquired for a sample, and a determination is made according to techniques exemplified herein as to whether the sample matches or does not match the reference species. When quantification of peak characteristics is undertaken, the model is appropriately expanded, and the analysis accounts for the characteristic model and data. Further techniques are provided to apply the methods and apparatuses to process control, cluster analysis, hypothesis testing, analysis of variance, and other procedures involving multiple comparisons of indexed data.
Msimanga, Huggins Z; Ollis, Robert J
2010-06-01
Principal component analysis (PCA) and partial least squares discriminant analysis (PLS-DA) were used to classify acetaminophen-containing medicines using their attenuated total reflection Fourier transform infrared (ATR-FT-IR) spectra. Four formulations of Tylenol (Arthritis Pain Relief, Extra Strength Pain Relief, 8 Hour Pain Relief, and Extra Strength Pain Relief Rapid Release) along with 98% pure acetaminophen were selected for this study because of the similarity of their spectral features, with correlation coefficients ranging from 0.9857 to 0.9988. Before acquiring spectra for the predictor matrix, the effects on spectral precision with respect to sample particle size (determined by sieve size opening), force gauge of the ATR accessory, sample reloading, and between-tablet variation were examined. Spectra were baseline corrected and normalized to unity before multivariate analysis. Analysis of variance (ANOVA) was used to study spectral precision. The large particles (35 mesh) showed large variance between spectra, while fine particles (120 mesh) indicated good spectral precision based on the F-test. Force gauge setting did not significantly affect precision. Sample reloading using the fine particle size and a constant force gauge setting of 50 units also did not compromise precision. Based on these observations, data acquisition for the predictor matrix was carried out with the fine particles (sieve size opening of 120 mesh) at a constant force gauge setting of 50 units. After removing outliers, PCA successfully classified the five samples in the first and second components, accounting for 45.0% and 24.5% of the variances, respectively. The four-component PLS-DA model (R² = 0.925 and Q² = 0.906) gave good test spectra predictions with an overall average of 0.961 ± 7.1% RSD versus the expected 1.0 prediction for the 20 test spectra used.
Newell, Felicity L.; Sheehan, James; Wood, Petra Bohall; Rodewald, Amanda D.; Buehler, David A.; Keyser, Patrick D.; Larkin, Jeffrey L.; Beachy, Tiffany A.; Bakermans, Marja H.; Boves, Than J.; Evans, Andrea; George, Gregory A.; McDermott, Molly E.; Perkins, Kelly A.; White, Matthew; Wigley, T. Bently
2013-01-01
Point counts are commonly used to assess changes in bird abundance, including analytical approaches such as distance sampling that estimate density. Point-count methods have come under increasing scrutiny because effects of detection probability and field error are difficult to quantify. For seven forest songbirds, we compared fixed-radii counts (50 m and 100 m) and density estimates obtained from distance sampling to known numbers of birds determined by territory mapping. We applied point-count analytic approaches to a typical forest management question and compared results to those obtained by territory mapping. We used a before–after control impact (BACI) analysis with a data set collected across seven study areas in the central Appalachians from 2006 to 2010. Using a 50-m fixed radius, variance in error was at least 1.5 times that of the other methods, whereas a 100-m fixed radius underestimated actual density by >3 territories per 10 ha for the most abundant species. Distance sampling improved accuracy and precision compared to fixed-radius counts, although estimates were affected by birds counted outside 10-ha units. In the BACI analysis, territory mapping detected an overall treatment effect for five of the seven species, and effects were generally consistent each year. In contrast, all point-count methods failed to detect two treatment effects due to variance and error in annual estimates. Overall, our results highlight the need for adequate sample sizes to reduce variance, and skilled observers to reduce the level of error in point-count data. Ultimately, the advantages and disadvantages of different survey methods should be considered in the context of overall study design and objectives, allowing for trade-offs among effort, accuracy, and power to detect treatment effects.
A powerful and flexible approach to the analysis of RNA sequence count data.
Zhou, Yi-Hui; Xia, Kai; Wright, Fred A
2011-10-01
A number of penalization and shrinkage approaches have been proposed for the analysis of microarray gene expression data. Similar techniques are now routinely applied to RNA sequence transcriptional count data, although the value of such shrinkage has not been conclusively established. If penalization is desired, the explicit modeling of mean-variance relationships provides a flexible testing regimen that 'borrows' information across genes, while easily incorporating design effects and additional covariates. We describe BBSeq, which incorporates two approaches: (i) a simple beta-binomial generalized linear model, which has not been extensively tested for RNA-Seq data and (ii) an extension of an expression mean-variance modeling approach to RNA-Seq data, involving modeling of the overdispersion as a function of the mean. Our approaches are flexible, allowing for general handling of discrete experimental factors and continuous covariates. We report comparisons with other alternate methods to handle RNA-Seq data. Although penalized methods have advantages for very small sample sizes, the beta-binomial generalized linear model, combined with simple outlier detection and testing approaches, appears to have favorable characteristics in power and flexibility. An R package containing examples and sample datasets is available at http://www.bios.unc.edu/research/genomic_software/BBSeq. Contact: yzhou@bios.unc.edu; fwright@bios.unc.edu. Supplementary data are available at Bioinformatics online.
Powerful Statistical Inference for Nested Data Using Sufficient Summary Statistics
Dowding, Irene; Haufe, Stefan
2018-01-01
Hierarchically-organized data arise naturally in many psychology and neuroscience studies. As the standard assumption of independent and identically distributed samples does not hold for such data, two important problems are to accurately estimate group-level effect sizes, and to obtain powerful statistical tests against group-level null hypotheses. A common approach is to summarize subject-level data by a single quantity per subject, which is often the mean or the difference between class means, and treat these as samples in a group-level t-test. This “naive” approach is, however, suboptimal in terms of statistical power, as it ignores information about the intra-subject variance. To address this issue, we review several approaches to deal with nested data, with a focus on methods that are easy to implement. With what we call the sufficient-summary-statistic approach, we highlight a computationally efficient technique that can improve statistical power by taking into account within-subject variances, and we provide step-by-step instructions on how to apply this approach to a number of frequently-used measures of effect size. The properties of the reviewed approaches and the potential benefits over a group-level t-test are quantitatively assessed on simulated data and demonstrated on EEG data from a simulated-driving experiment.
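A minimal sketch of the idea behind weighting by within-subject precision, as opposed to the naive unweighted group-level t-test, is shown below. It is a generic precision-weighted combination under the simplifying assumption that the within-subject variances are well estimated, not the authors' exact sufficient-summary-statistic procedure.

```python
# Precision-weighted group-level estimate: subject-level effects d_i with
# within-subject variances v_i are combined with weights w_i = 1 / v_i,
# rather than the unweighted mean used by the naive group-level t-test.
import numpy as np
from scipy import stats

def weighted_group_effect(d, v):
    d, v = np.asarray(d, float), np.asarray(v, float)
    w = 1.0 / v
    effect = np.sum(w * d) / np.sum(w)
    se = np.sqrt(1.0 / np.sum(w))          # assumes the v_i are known/reliable
    z = effect / se
    p = 2.0 * stats.norm.sf(abs(z))        # two-sided test against zero effect
    return effect, se, p
```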
ESTIMATING LOW-FLOW FREQUENCIES OF UNGAGED STREAMS IN NEW ENGLAND.
Wandle, S. William
1987-01-01
Equations to estimate low flows were developed using multiple-regression analysis with a sample of 48 river basins, which were selected from the U. S. Geological Survey's network of gaged river basins in Massachusetts, New Hampshire, Rhode Island, Vermont, and southwestern Maine. Low-flow characteristics are represented by the 7Q2 and 7Q10 (the annual minimum 7-day mean low flow at the 2- and 10-year recurrence intervals). These statistics for each of the 48 basins were determined from a low-flow frequency analysis of streamflow records for 1942-71, or from a graphical or mathematical relationship if the record did not cover this 30-year period. Estimators for the mean and variance of the 7-day low flows at the index and short-term sites were used for two stations where discharge measurements of base flow were available and for two sites where the graphical technique was unsatisfactory.
Heritability of physical activity traits in Brazilian families: the Baependi Heart Study.
Horimoto, Andréa R V R; Giolo, Suely R; Oliveira, Camila M; Alvim, Rafael O; Soler, Júlia P; de Andrade, Mariza; Krieger, José E; Pereira, Alexandre C
2011-11-29
It is commonly recognized that physical activity has familial aggregation; however, the genetic influences on physical activity phenotypes are not well characterized. This study aimed to (1) estimate the heritability of physical activity traits in Brazilian families; and (2) investigate whether genetic and environmental variance components contribute differently to the expression of these phenotypes in males and females. The sample that constitutes the Baependi Heart Study is comprised of 1,693 individuals in 95 Brazilian families. The phenotypes were self-reported in a questionnaire based on the WHO-MONICA instrument. Variance component approaches, implemented in the SOLAR (Sequential Oligogenic Linkage Analysis Routines) computer package, were applied to estimate the heritability and to evaluate the heterogeneity of variance components by gender on the studied phenotypes. The heritability estimates were intermediate (35%) for weekly physical activity among non-sedentary subjects (weekly PA_NS), and low (9-14%) for sedentarism, weekly physical activity (weekly PA), and level of daily physical activity (daily PA). Significant evidence for heterogeneity in variance components by gender was observed for the sedentarism and weekly PA phenotypes. No significant gender differences in genetic or environmental variance components were observed for the weekly PA_NS trait. The daily PA phenotype was predominantly influenced by environmental factors, with larger effects in males than in females. Heritability estimates for physical activity phenotypes in this sample of the Brazilian population were significant in both males and females, and varied from low to intermediate magnitude. Significant evidence for heterogeneity in variance components by gender was observed. These data add to the knowledge of the physical activity traits in the Brazilian study population, and are concordant with the notion of significant biological determination in active behavior.
The minimum distance approach to classification
NASA Technical Reports Server (NTRS)
Wacker, A. G.; Landgrebe, D. A.
1971-01-01
The work to advance the state of the art of minimum distance classification is reported. This is accomplished through a combination of theoretical and comprehensive experimental investigations based on multispectral scanner data. A survey of the literature for suitable distance measures was conducted and the results of this survey are presented. It is shown that minimum distance classification, using density estimators and Kullback-Leibler numbers as the distance measure, is equivalent to a form of maximum likelihood sample classification. It is also shown that for the parametric case, minimum distance classification is equivalent to nearest neighbor classification in the parameter space.
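A toy version of parametric minimum distance classification (assign each sample to the class whose mean vector is closest under a chosen distance) is sketched below. It illustrates the general idea with a Euclidean distance to class means, rather than the density-estimator/Kullback-Leibler formulation analyzed in the report.

```python
# Minimum-distance-to-means classifier: each feature vector is assigned to
# the class whose mean (e.g. mean spectral response) is nearest.
import numpy as np

def fit_class_means(X: np.ndarray, labels: np.ndarray) -> dict:
    # X: (n_samples, n_features); labels: (n_samples,)
    return {c: X[labels == c].mean(axis=0) for c in np.unique(labels)}

def classify(X: np.ndarray, means: dict) -> np.ndarray:
    classes = list(means)
    centers = np.stack([means[c] for c in classes])              # (k, d)
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)    # squared distances
    return np.asarray(classes)[np.argmin(d2, axis=1)]
```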
K-Fold Crossvalidation in Canonical Analysis.
ERIC Educational Resources Information Center
Liang, Kun-Hsia; And Others
1995-01-01
A computer-assisted, K-fold cross-validation technique is discussed in the framework of canonical correlation analysis of randomly generated data sets. Analysis results suggest that this technique can effectively reduce the contamination of canonical variates and canonical correlations by sample-specific variance components. (Author/SLD)
NASA Technical Reports Server (NTRS)
Craig, R. G. (Principal Investigator)
1983-01-01
Richmond, Virginia, and Denver, Colorado, were study sites in an effort to determine the effect of autocorrelation on the accuracy of a parallelepiped classifier of LANDSAT digital data. The autocorrelation was assumed to decay to insignificant levels when sampled at distances of at least ten pixels. Spectral themes were developed using blocks of adjacent pixels and using groups of pixels spaced at least 10 pixels apart. Effects of geometric distortions were minimized by using only pixels from the interiors of land cover sections. Accuracy was evaluated for three classes (agriculture, residential, and "all other"); both type 1 and type 2 errors were evaluated by means of overall classification accuracy. All classes gave comparable results. Accuracy is approximately the same for both techniques; however, the variance in accuracy is significantly higher using the themes developed from autocorrelated data. The vectors of mean spectral response were nearly identical regardless of the sampling method used. The estimated variances were much larger when using autocorrelated pixels.
Qiu, Xing; Hu, Rui; Wu, Zhixin
2014-01-01
Normalization procedures are widely used in high-throughput genomic data analyses to remove various technological noise and variations. They are known to have a profound impact on the subsequent gene differential expression analysis. Although there has been some research in evaluating different normalization procedures, few attempts have been made to systematically evaluate the gene detection performances of normalization procedures from the bias-variance trade-off point of view, especially with strong gene differentiation effects and large sample size. In this paper, we conduct a thorough study to evaluate the effects of normalization procedures combined with several commonly used statistical tests and MTPs under different configurations of effect size and sample size. We conduct theoretical evaluation based on a random effect model, as well as simulation and biological data analyses to verify the results. Based on our findings, we provide some practical guidance for selecting a suitable normalization procedure under different scenarios.
Hidden Item Variance in Multiple Mini-Interview Scores
ERIC Educational Resources Information Center
Zaidi, Nikki L.; Swoboda, Christopher M.; Kelcey, Benjamin M.; Manuel, R. Stephen
2017-01-01
The extant literature has largely ignored a potentially significant source of variance in multiple mini-interview (MMI) scores by "hiding" the variance attributable to the sample of attributes used on an evaluation form. This potential source of hidden variance can be defined as rating items, which typically comprise an MMI evaluation…
Gray, B.R.; Haro, R.J.; Rogala, J.T.; Sauer, J.S.
2005-01-01
1. Macroinvertebrate count data often exhibit nested or hierarchical structure. Examples include multiple measurements along each of a set of streams, and multiple synoptic measurements from each of a set of ponds. With data exhibiting hierarchical structure, outcomes at both sampling (e.g. within-stream) and aggregated (e.g. stream) scales are often of interest. Unfortunately, methods for modelling hierarchical count data have received little attention in the ecological literature. 2. We demonstrate the use of hierarchical count models using fingernail clam (Family: Sphaeriidae) count data and habitat predictors derived from sampling and aggregated spatial scales. The sampling scale corresponded to that of a standard Ponar grab (0.052 m²) and the aggregated scale to impounded and backwater regions within 38-197 km reaches of the Upper Mississippi River. Impounded and backwater regions were resampled annually for 10 years. Consequently, measurements on clams were nested within years. Counts were treated as negative binomial random variates, and means from each resampling event as random departures from the impounded and backwater region grand means. 3. Clam models were improved by the addition of covariates that varied at both the sampling and regional scales. Substrate composition varied at the sampling scale and was associated with model improvements, and with reductions (for a given mean) in variance at the sampling scale. Inorganic suspended solids (ISS) levels, measured in the summer preceding sampling, also yielded model improvements and were associated with reductions in variances at the regional rather than sampling scales. ISS levels were negatively associated with mean clam counts. 4. Hierarchical models allow hierarchically structured data to be modelled without ignoring information specific to levels of the hierarchy. In addition, information at each hierarchical level may be modelled as functions of covariates that themselves vary by and within levels. As a result, hierarchical models provide researchers and resource managers with a method for modelling hierarchical data that explicitly recognises both the sampling design and the information contained in the corresponding data.
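A full hierarchical (random-effects) negative binomial fit is beyond a short sketch, but the fragment below shows the basic building block assumed above: negative binomial counts modelled as a log-linear function of a sampling-scale covariate (substrate) and a region-scale covariate (ISS) using statsmodels. The simulated data, coefficients, and dispersion are hypothetical; a hierarchical version would add region- and year-level random intercepts.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(2)
n_regions, n_grabs = 6, 40
region = np.repeat(np.arange(n_regions), n_grabs)
substrate = rng.normal(size=n_regions * n_grabs)        # sampling-scale covariate
iss = rng.normal(size=n_regions)[region]                # region-scale covariate
mu = np.exp(1.0 + 0.5 * substrate - 0.6 * iss)
counts = rng.negative_binomial(n=2.0, p=2.0 / (2.0 + mu))   # NB counts with mean mu

df = pd.DataFrame({"counts": counts, "substrate": substrate, "iss": iss})
X = sm.add_constant(df[["substrate", "iss"]])
fit = sm.GLM(df["counts"], X, family=sm.families.NegativeBinomial(alpha=0.5)).fit()
print(fit.summary())
```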
Increased gender variance in autism spectrum disorders and attention deficit hyperactivity disorder.
Strang, John F; Kenworthy, Lauren; Dominska, Aleksandra; Sokoloff, Jennifer; Kenealy, Laura E; Berl, Madison; Walsh, Karin; Menvielle, Edgardo; Slesaransky-Poe, Graciela; Kim, Kyung-Eun; Luong-Tran, Caroline; Meagher, Haley; Wallace, Gregory L
2014-11-01
Evidence suggests over-representation of autism spectrum disorders (ASDs) and behavioral difficulties among people referred for gender issues, but rates of the wish to be the other gender (gender variance) among different neurodevelopmental disorders are unknown. This chart review study explored rates of gender variance as reported by parents on the Child Behavior Checklist (CBCL) in children with different neurodevelopmental disorders: ASD (N = 147, 24 females and 123 males), attention deficit hyperactivity disorder (ADHD; N = 126, 38 females and 88 males), or a medical neurodevelopmental disorder (N = 116, 57 females and 59 males), were compared with two non-referred groups [control sample (N = 165, 61 females and 104 males) and non-referred participants in the CBCL standardization sample (N = 1,605, 754 females and 851 males)]. Significantly greater proportions of participants with ASD (5.4%) or ADHD (4.8%) had parent reported gender variance than in the combined medical group (1.7%) or non-referred comparison groups (0-0.7%). As compared to non-referred comparisons, participants with ASD were 7.59 times more likely to express gender variance; participants with ADHD were 6.64 times more likely to express gender variance. The medical neurodevelopmental disorder group did not differ from non-referred samples in likelihood to express gender variance. Gender variance was related to elevated emotional symptoms in ADHD, but not in ASD. After accounting for sex ratio differences between the neurodevelopmental disorder and non-referred comparison groups, gender variance occurred equally in females and males.
Structural Studies of Amorphous Materials by Fluctuation Electron Microscopy
DOE Office of Scientific and Technical Information (OSTI.GOV)
Treacy, Michael M. J.
Fluctuation Electron Microscopy (FEM) is a technique that examines the fluctuations in electron scattering across a uniformly thin amorphous sample. The statistics of the intensity fluctuations, mean and variance, reveal any underlying medium-range order present in the structure. The goals of this project were: (1) To determine the fundamentals of the scattering physics that gives rise to the variance signal in fluctuation electron microscopy (FEM); (2) To use these discoveries to find ways to quantify FEM; (3) To apply the FEM method to interesting and technologically important families of amorphous materials, particularly those with important applications in energy-related processes. Excellent progress was made in items (1) and (2). In stage (3) we did not examine the metamict zircons, as proposed. Instead, we examined films of polycrystalline and amorphous semi-conducting diamond. Significant accomplishments are: (1) A Reverse Monte Carlo procedure was successfully implemented to invert FEM data into a structural model. This is computer-intensive, but it demonstrated that diffraction and FEM data from amorphous silicon are most consistent with a paracrystallite model. This means that there is more diamond-like topology present in amorphous silicon than is predicted by the continuous random network model. (2) There is significant displacement decoherence arising in diffraction from amorphous silicon and carbon. The samples are being bombarded by the electron beam and atoms do not stay still while being irradiated – much more than was formerly understood. The atom motions cause the destructive and constructive interferences in the diffraction pattern to fluctuate with time, and it is the time-averaged speckle that is being measured. The variance is reduced by a factor m, 4 ≤ m ≤ 1000, relative to that predicted by kinematical scattering theory. (3) Speckle intensity obeys a gamma distribution, where the mean intensity $\overline{I}$ and m are the two parameters governing the shape of the gamma distribution profile. m is determined by the illumination spatial coherence, which is normally very high, and mostly by the displacement decoherence within the sample. (4) Amorphous materials are more affected by the electron beam than are crystalline materials. Different samples exhibit different disruptibility, as measured by the effective values of m that fit the data. (5) Understanding the origin of the displacement decoherence better should lead to efficient methods for computing the observed variance from amorphous materials.
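The gamma-distributed speckle statistics described in item (3) imply a simple relation between the normalized intensity variance and the decoherence parameter m; a tiny sketch, with an assumed m, is:

```python
import numpy as np

rng = np.random.default_rng(15)
mean_I, m = 100.0, 12.0                 # m reflects displacement decoherence (assumed value)
I = rng.gamma(shape=m, scale=mean_I / m, size=50_000)   # gamma-distributed speckle intensities
V = I.var() / I.mean() ** 2             # normalized variance of the speckle
print(V, 1.0 / m)                       # for gamma speckle, V = 1/m
```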
Hsiao, C Y; Lan, C F; Chang, P L; Li, I C
2015-01-01
Our aim was to establish the psychometric properties of the Minimum Data-Set-Based Depression Rating Scale (MDS-DRS) to ensure its use to assess service needs and guide care plans for institutionalized residents. A total of 378 residents were recruited from the Haoran Senior Citizen Home in northern Taiwan and participated in this study. The MDS-DRS and the Geriatric Depression Scale-Short Form (GDS-SF) were used to identify observable features of depression symptoms in the elderly residents. The receiver operating characteristic (ROC) curve indicated that the MDS-DRS has a 43.3% sensitivity and a 90.6% specificity when screening for depression symptoms. The total variance explained by the two factors, 'sadness' and 'distress,' was 58.1% based on the factor analysis. Reliable assessment tools for nurses are important because they allow the early detection of depression symptoms. The MDS-DRS items perform as well as the GDS-SF items in detecting depression symptoms. Furthermore, the MDS-DRS has the advantage of providing information to staff about care process implementation, which can facilitate the identification of areas that need improvement. Further research is needed to validate the use of the MDS-DRS in long-term care facilities.
High resolution beamforming on large aperture vertical line arrays: Processing synthetic data
NASA Astrophysics Data System (ADS)
Tran, Jean-Marie Q.; Hodgkiss, William S.
1990-09-01
This technical memorandum studies the beamforming of large aperture line arrays deployed vertically in the water column. The work concentrates on the use of high resolution techniques. Two processing strategies are envisioned: (1) full aperture coherent processing, which offers in theory the best processing gain; and (2) subaperture processing, which consists of extracting subapertures from the array and recombining the angular spectra estimated from these subarrays. The conventional beamformer, the minimum variance distortionless response (MVDR) processor, the multiple signal classification (MUSIC) algorithm, and the minimum norm method are used in this study. To validate the various processing techniques, the ATLAS normal mode program is used to generate synthetic data which constitute a realistic signal environment. A deep-water, range-independent sound velocity profile environment, characteristic of the North-East Pacific, is studied for two different 128-sensor arrays: a very long one cut for 30 Hz and operating at 20 Hz, and a shorter one cut for 107 Hz and operating at 100 Hz. The simulated sound source is 5 m deep. The full aperture and subaperture processing are implemented with curved and plane wavefront replica vectors. The beamforming results are examined and compared to the ray-theory results produced by the generic sonar model.
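A minimal sketch of the MVDR processor named above, for a uniform (plane-wavefront) line array with diagonal loading, is given below; the array geometry, source level, and snapshot count are hypothetical, and the sketch omits the curved-wavefront replicas and subaperture recombination used in the memorandum.

```python
import numpy as np

def steering_vector(n_sensors, d_over_lambda, theta_rad):
    """Plane-wave replica vector for a uniform line array."""
    k = 2 * np.pi * d_over_lambda * np.arange(n_sensors)
    return np.exp(-1j * k * np.sin(theta_rad))

def mvdr_spectrum(R, angles, n_sensors, d_over_lambda, loading=1e-3):
    """MVDR power estimate P(theta) = 1 / (a^H R^-1 a), with diagonal loading."""
    Rinv = np.linalg.inv(R + loading * np.trace(R).real / n_sensors * np.eye(n_sensors))
    out = []
    for th in angles:
        a = steering_vector(n_sensors, d_over_lambda, th)
        out.append(1.0 / np.real(a.conj() @ Rinv @ a))
    return np.array(out)

# toy example: one strong source at +10 degrees on a 32-element, half-wavelength array
rng = np.random.default_rng(3)
n, snaps = 32, 500
a0 = steering_vector(n, 0.5, np.deg2rad(10.0))
snap = 10.0 * rng.normal(size=snaps) * a0[:, None] + \
       (rng.normal(size=(n, snaps)) + 1j * rng.normal(size=(n, snaps))) / np.sqrt(2)
R = snap @ snap.conj().T / snaps                      # sample covariance matrix
angles = np.deg2rad(np.linspace(-90, 90, 361))
P = mvdr_spectrum(R, angles, n, 0.5)
print("peak at", np.rad2deg(angles[P.argmax()]), "deg")
```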
NASA Technical Reports Server (NTRS)
Vasquez, Bernard J.; Farrugia, Charles J.; Markovskii, Sergei A.; Hollweg, Joseph V.; Richardson, Ian G.; Ogilvie, Keith W.; Lepping, Ronald P.; Lin, Robert P.; Larson, Davin; White, Nicholas E. (Technical Monitor)
2001-01-01
A solar ejection passed the Wind spacecraft between December 23 and 26, 1996. On closer examination, we find a sequence of ejecta material, as identified by abnormally low proton temperatures, separated by plasmas with typical solar wind temperatures at 1 AU. Large and abrupt changes in field and plasma properties occurred near the separation boundaries of these regions. At the one boundary we examine here, a series of directional discontinuities was observed. We argue that Alfvenic fluctuations in the immediate vicinity of these discontinuities distort minimum variance normals, introducing uncertainty into the identification of the discontinuities as either rotational or tangential. Carrying out a series of tests on plasma and field data including minimum variance, velocity and magnetic field correlations, and jump conditions, we conclude that the discontinuities are tangential. Furthermore, we find waves superposed on these tangential discontinuities (TDs). The presence of discontinuities allows the existence of both surface waves and ducted body waves. Both probably form in the solar atmosphere where many transverse nonuniformities exist and where theoretically they have been expected. We add to prior speculation that waves on discontinuities may in fact be a common occurrence. In the solar wind, these waves can attain large amplitudes and low frequencies. We argue that such waves can generate dynamical changes at TDs through advection or forced reconnection. The dynamics might so extensively alter the internal structure that the discontinuity would no longer be identified as tangential. Such processes could help explain why the occurrence frequency of TDs observed throughout the solar wind falls off with increasing heliocentric distance. The presence of waves may also alter the nature of the interactions of TDs with the Earth's bow shock in so-called hot flow anomalies.
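For readers unfamiliar with the minimum variance technique referred to above, the sketch below shows the standard eigen-decomposition of the magnetic field covariance matrix on a synthetic discontinuity crossing; the field model and noise level are invented, and a poorly separated intermediate/minimum eigenvalue pair is the symptom of the normal-direction ambiguity the authors attribute to Alfvenic fluctuations.

```python
import numpy as np

def minimum_variance_normal(B):
    """Minimum variance analysis: B is an (N, 3) series of magnetic field vectors.
    Returns the eigenvalues (ascending) and the minimum variance eigenvector,
    the usual estimate of the discontinuity normal."""
    M = np.cov(B, rowvar=False)              # 3x3 magnetic variance matrix
    w, V = np.linalg.eigh(M)                 # eigh returns ascending eigenvalues
    return w, V[:, 0]

# synthetic discontinuity-like interval with a field rotation in the x-y plane
t = np.linspace(-1, 1, 400)
B = np.column_stack([5 * np.tanh(t / 0.2), 5 / np.cosh(t / 0.2), np.zeros_like(t)])
B += np.random.default_rng(4).normal(scale=0.3, size=B.shape)   # superposed fluctuations
w, n_hat = minimum_variance_normal(B)
print("eigenvalues:", w, "normal estimate:", n_hat)
# A small ratio of intermediate to minimum eigenvalue signals an ill-determined
# normal -- the kind of ambiguity large-amplitude fluctuations introduce.
```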
NASA Astrophysics Data System (ADS)
Reis, D. S.; Stedinger, J. R.; Martins, E. S.
2005-10-01
This paper develops a Bayesian approach to analysis of a generalized least squares (GLS) regression model for regional analyses of hydrologic data. The new approach allows computation of the posterior distributions of the parameters and the model error variance using a quasi-analytic approach. Two regional skew estimation studies illustrate the value of the Bayesian GLS approach for regional statistical analysis of a shape parameter and demonstrate that regional skew models can be relatively precise with effective record lengths in excess of 60 years. With Bayesian GLS the marginal posterior distribution of the model error variance and the corresponding mean and variance of the parameters can be computed directly, thereby providing a simple but important extension of the regional GLS regression procedures popularized by Tasker and Stedinger (1989), which is sensitive to the likely values of the model error variance when it is small relative to the sampling error in the at-site estimator.
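A non-Bayesian sketch of the underlying GLS regional regression may help fix ideas: the total covariance is the model error variance plus the (known) sampling-error covariance of the at-site estimators. The Bayesian extension described above would place a prior on the model error variance and integrate over it rather than plugging in a point value; all numbers below are hypothetical.

```python
import numpy as np

def gls_fit(X, y, sampling_cov, model_error_var):
    """GLS regional regression: total covariance = model error variance * I
    plus the (known) sampling-error covariance of the at-site estimators."""
    Lam = model_error_var * np.eye(len(y)) + sampling_cov
    Lam_inv = np.linalg.inv(Lam)
    cov_beta = np.linalg.inv(X.T @ Lam_inv @ X)
    beta = cov_beta @ X.T @ Lam_inv @ y
    return beta, cov_beta

rng = np.random.default_rng(5)
n_sites = 40
X = np.column_stack([np.ones(n_sites), rng.normal(size=n_sites)])  # intercept + basin attribute
true_beta, sigma2_model = np.array([0.2, 0.1]), 0.05
samp_var = rng.uniform(0.02, 0.2, n_sites)          # varies with at-site record length
y = X @ true_beta + rng.normal(scale=np.sqrt(sigma2_model + samp_var))
beta_hat, cov = gls_fit(X, y, np.diag(samp_var), sigma2_model)
print(beta_hat, np.sqrt(np.diag(cov)))
```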
NASA Astrophysics Data System (ADS)
Oware, E. K.
2017-12-01
Geophysical quantification of hydrogeological parameters typically involves limited noisy measurements coupled with inadequate understanding of the target phenomenon. Hence, a deterministic solution is unrealistic in light of the largely uncertain inputs. Stochastic imaging (SI), in contrast, provides multiple equiprobable realizations that enable probabilistic assessment of aquifer properties in a realistic manner. Generation of geologically realistic prior models is central to SI frameworks. Higher-order statistics for representing prior geological features in SI are, however, usually borrowed from training images (TIs), which may produce undesirable outcomes if the TIs are unrepresentative of the target structures. The Markov random field (MRF)-based SI strategy provides a data-driven alternative to TI-based SI algorithms. In the MRF-based method, the simulation of spatial features is guided by Gibbs energy (GE) minimization. Local configurations with smaller GEs have higher likelihood of occurrence and vice versa. The parameters of the Gibbs distribution for computing the GE are estimated from the hydrogeophysical data, thereby enabling the generation of site-specific structures in the absence of reliable TIs. In Metropolis-like SI methods, the variance of the transition probability controls the jump size. The procedure is a standard Markov chain Monte Carlo (McMC) method when a constant variance is assumed, and becomes simulated annealing (SA) when the variance (cooling temperature) is allowed to decrease gradually with time. We observe that in certain problems, the large variance typically employed at the beginning to hasten burn-in may not be ideal for sampling at the equilibrium state. The power of SA stems from its flexibility to adaptively scale the variance at different stages of the sampling. Degeneration of results was reported in a previous implementation of the MRF-based SI strategy based on a constant variance. Here, we present an updated version of the algorithm based on SA that appears to resolve the degeneration problem with seemingly improved results. We illustrate the performance of the SA version with a joint inversion of time-lapse concentration and electrical resistivity measurements in a hypothetical trinary hydrofacies aquifer characterization problem.
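The constant-variance McMC versus cooled-variance SA distinction can be illustrated with a generic Metropolis sketch in which the proposal scale doubles as the temperature; this is only a toy, not the MRF/Gibbs-energy algorithm described above, and the energy function and cooling schedule are arbitrary assumptions.

```python
import numpy as np

def simulated_annealing(energy, x0, n_iter=5000, sigma0=1.0, sigma_min=0.01, seed=0):
    """Metropolis sampler whose proposal scale (the 'jump size') cools geometrically.
    With a constant sigma this is plain McMC; letting sigma decay gives SA."""
    rng = np.random.default_rng(seed)
    x, e = np.array(x0, float), energy(x0)
    decay = (sigma_min / sigma0) ** (1.0 / n_iter)
    sigma = sigma0
    for _ in range(n_iter):
        prop = x + rng.normal(scale=sigma, size=x.shape)
        e_prop = energy(prop)
        # lower energy always accepted; higher energy accepted with prob exp(-dE / sigma)
        if e_prop < e or rng.random() < np.exp(-(e_prop - e) / sigma):
            x, e = prop, e_prop
        sigma *= decay
    return x, e

# toy multimodal energy; SA escapes shallow minima early, then refines as it cools
energy = lambda x: float(0.1 * x[0] ** 2 + np.sin(3 * x[0]) ** 2)
print(simulated_annealing(energy, [4.0]))
```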
Theory of Financial Risk and Derivative Pricing
NASA Astrophysics Data System (ADS)
Bouchaud, Jean-Philippe; Potters, Marc
2009-01-01
Foreword; Preface; 1. Probability theory: basic notions; 2. Maximum and addition of random variables; 3. Continuous time limit, Ito calculus and path integrals; 4. Analysis of empirical data; 5. Financial products and financial markets; 6. Statistics of real prices: basic results; 7. Non-linear correlations and volatility fluctuations; 8. Skewness and price-volatility correlations; 9. Cross-correlations; 10. Risk measures; 11. Extreme correlations and variety; 12. Optimal portfolios; 13. Futures and options: fundamental concepts; 14. Options: hedging and residual risk; 15. Options: the role of drift and correlations; 16. Options: the Black and Scholes model; 17. Options: some more specific problems; 18. Options: minimum variance Monte-Carlo; 19. The yield curve; 20. Simple mechanisms for anomalous price statistics; Index of most important symbols; Index.
Theory of Financial Risk and Derivative Pricing - 2nd Edition
NASA Astrophysics Data System (ADS)
Bouchaud, Jean-Philippe; Potters, Marc
2003-12-01
Foreword; Preface; 1. Probability theory: basic notions; 2. Maximum and addition of random variables; 3. Continuous time limit, Ito calculus and path integrals; 4. Analysis of empirical data; 5. Financial products and financial markets; 6. Statistics of real prices: basic results; 7. Non-linear correlations and volatility fluctuations; 8. Skewness and price-volatility correlations; 9. Cross-correlations; 10. Risk measures; 11. Extreme correlations and variety; 12. Optimal portfolios; 13. Futures and options: fundamental concepts; 14. Options: hedging and residual risk; 15. Options: the role of drift and correlations; 16. Options: the Black and Scholes model; 17. Options: some more specific problems; 18. Options: minimum variance Monte-Carlo; 19. The yield curve; 20. Simple mechanisms for anomalous price statistics; Index of most important symbols; Index.
Transition of Attention in Terminal Area NextGen Operations Using Synthetic Vision Systems
NASA Technical Reports Server (NTRS)
Ellis, Kyle K. E.; Kramer, Lynda J.; Shelton, Kevin J.; Arthur, J. J., III; Prinzel, Lance J., III; Norman, Robert M.
2011-01-01
This experiment investigates the capability of Synthetic Vision Systems (SVS) to provide significant situation awareness in terminal area operations, specifically in low visibility conditions. The use of a Head-Up Display (HUD) and Head-Down Displays (HDD) with SVS is contrasted with baseline standard head-down displays in terms of induced workload and pilot behavior in 1400 RVR visibility levels. Variances across performance and pilot behavior were reviewed for acceptability when using HUD or HDD with SVS under reduced minimums to acquire the necessary visual components to continue to land. The data suggest superior performance for HUD implementations. Improved attentional behavior is also suggested for HDD implementations of SVS for low-visibility approach and landing operations.
Empirical single sample quantification of bias and variance in Q-ball imaging.
Hainline, Allison E; Nath, Vishwesh; Parvathaneni, Prasanna; Blaber, Justin A; Schilling, Kurt G; Anderson, Adam W; Kang, Hakmook; Landman, Bennett A
2018-02-06
The bias and variance of high angular resolution diffusion imaging metrics have not been thoroughly explored in the literature; the simulation extrapolation (SIMEX) and bootstrap techniques offer a way to estimate them. The SIMEX approach is well established in the statistics literature and uses simulation of increasingly noisy data to extrapolate back to a hypothetical case with no noise. The bias of calculated metrics can then be computed by subtracting the SIMEX estimate from the original pointwise measurement. The SIMEX technique has been studied in the context of diffusion imaging to accurately capture the bias in fractional anisotropy measurements in DTI. Herein, we extend the application of SIMEX and bootstrap approaches to characterize bias and variance in metrics obtained from a Q-ball imaging reconstruction of high angular resolution diffusion imaging data. The results demonstrate that SIMEX and bootstrap approaches provide consistent estimates of the bias and variance of generalized fractional anisotropy, respectively. The RMSE for the generalized fractional anisotropy estimates shows a 7% decrease in white matter and an 8% decrease in gray matter when compared with the observed generalized fractional anisotropy estimates. On average, the bootstrap technique results in SD estimates that are approximately 97% of the true variation in white matter, and 86% in gray matter. Both SIMEX and bootstrap methods are flexible, estimate population characteristics based on single scans, and may be extended for bias and variance estimation on a variety of high angular resolution diffusion imaging metrics. © 2018 International Society for Magnetic Resonance in Medicine.
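A generic SIMEX sketch, not the authors' Q-ball implementation, is shown below: extra noise is added at several multiples of the known noise level, the metric is tracked as a function of the added-noise multiplier, and a quadratic trend is extrapolated back to the hypothetical noise-free case. The toy metric (the standard deviation of a noisy signal) and all constants are assumptions.

```python
import numpy as np

def simex_estimate(y_obs, noise_sd, statistic, lambdas=(0.5, 1.0, 1.5, 2.0),
                   n_rep=200, seed=0):
    """SIMEX sketch: add noise at several multiples (lambda) of the known noise
    variance, fit a quadratic trend in lambda, and extrapolate to lambda = -1."""
    rng = np.random.default_rng(seed)
    lam = np.array([0.0, *lambdas])
    est = [statistic(y_obs)]
    for l in lambdas:
        extra_sd = noise_sd * np.sqrt(l)
        est.append(np.mean([statistic(y_obs + rng.normal(scale=extra_sd, size=y_obs.shape))
                            for _ in range(n_rep)]))
    coef = np.polyfit(lam, est, deg=2)
    return np.polyval(coef, -1.0)            # extrapolated, bias-reduced estimate

# toy metric with noise-dependent bias: the sample SD of a noisy signal
rng = np.random.default_rng(1)
truth = np.sin(np.linspace(0, 2 * np.pi, 500))
y = truth + rng.normal(scale=0.5, size=truth.shape)
print("true", np.std(truth), "naive", np.std(y), "SIMEX", simex_estimate(y, 0.5, np.std))
```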
Chaudhuri, Shomesh E; Merfeld, Daniel M
2013-03-01
Psychophysics generally relies on estimating a subject's ability to perform a specific task as a function of an observed stimulus. For threshold studies, the fitted functions are called psychometric functions. While fitting psychometric functions to data acquired using adaptive sampling procedures (e.g., "staircase" procedures), investigators have encountered a bias in the spread ("slope" or "threshold") parameter that has been attributed to the serial dependency of the adaptive data. Using simulations, we confirm this bias for cumulative Gaussian parametric maximum likelihood fits on data collected via adaptive sampling procedures, and then present a bias-reduced maximum likelihood fit that substantially reduces the bias without reducing the precision of the spread parameter estimate and without reducing the accuracy or precision of the other fit parameters. As a separate topic, we explain how to implement this bias reduction technique using generalized linear model fits as well as other numeric maximum likelihood techniques such as the Nelder-Mead simplex. We then provide a comparison of the iterative bootstrap and observed information matrix techniques for estimating parameter fit variance from adaptive sampling procedure data sets. The iterative bootstrap technique is shown to be slightly more accurate; however, the observed information technique executes in a small fraction (0.005 %) of the time required by the iterative bootstrap technique, which is an advantage when a real-time estimate of parameter fit variance is required.
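For context, the sketch below shows a plain maximum likelihood fit of a cumulative Gaussian psychometric function using a Nelder-Mead search; it uses fixed (non-adaptive) stimulus levels and does not include the bias-reduction modification proposed in the paper, so it only illustrates the baseline fit that the modification targets.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

def fit_cumulative_gaussian(stimulus, response):
    """Maximum likelihood fit of p(positive response | x) = Phi((x - mu) / sigma)."""
    def neg_log_lik(params):
        mu, log_sigma = params
        p = norm.cdf((stimulus - mu) / np.exp(log_sigma))
        p = np.clip(p, 1e-9, 1 - 1e-9)
        return -np.sum(response * np.log(p) + (1 - response) * np.log(1 - p))
    res = minimize(neg_log_lik, x0=[np.median(stimulus), 0.0], method="Nelder-Mead")
    mu, log_sigma = res.x
    return mu, np.exp(log_sigma)

# simulate a fixed-level experiment; an adaptive staircase would concentrate
# trials near threshold, which is where the spread-parameter bias appears
rng = np.random.default_rng(6)
x = rng.uniform(-3, 3, 400)
y = (rng.random(400) < norm.cdf((x - 0.5) / 1.2)).astype(int)
print(fit_cumulative_gaussian(x, y))   # should recover roughly (0.5, 1.2)
```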
NASA Technical Reports Server (NTRS)
Jacobson, R. A.
1975-01-01
Difficulties arise in guiding a solar electric propulsion spacecraft due to nongravitational accelerations caused by random fluctuations in the magnitude and direction of the thrust vector. These difficulties may be handled by using a low thrust guidance law based on the linear-quadratic-Gaussian problem of stochastic control theory with a minimum terminal miss performance criterion. Explicit constraints are imposed on the variances of the control parameters, and an algorithm based on the Hilbert space extension of a parameter optimization method is presented for calculation of gains in the guidance law. The terminal navigation of a 1980 flyby mission to the comet Encke is used as an example.
The mean and variance of phylogenetic diversity under rarefaction.
Nipperess, David A; Matsen, Frederick A
2013-06-01
Phylogenetic diversity (PD) depends on sampling depth, which complicates the comparison of PD between samples of different depth. One approach to dealing with differing sample depth for a given diversity statistic is to rarefy, which means to take a random subset of a given size of the original sample. Exact analytical formulae for the mean and variance of species richness under rarefaction have existed for some time but no such solution exists for PD. We have derived exact formulae for the mean and variance of PD under rarefaction. We confirm that these formulae are correct by comparing exact solution mean and variance to that calculated by repeated random (Monte Carlo) subsampling of a dataset of stem counts of woody shrubs of Toohey Forest, Queensland, Australia. We also demonstrate the application of the method using two examples: identifying hotspots of mammalian diversity in Australasian ecoregions, and characterising the human vaginal microbiome. There is a very high degree of correspondence between the analytical and random subsampling methods for calculating mean and variance of PD under rarefaction, although the Monte Carlo method requires a large number of random draws to converge on the exact solution for the variance. Rarefaction of mammalian PD of ecoregions in Australasia to a common standard of 25 species reveals very different rank orderings of ecoregions, indicating quite different hotspots of diversity than those obtained for unrarefied PD. The application of these methods to the vaginal microbiome shows that a classical score used to quantify bacterial vaginosis is correlated with the shape of the rarefaction curve. The analytical formulae for the mean and variance of PD under rarefaction are both exact and more efficient than repeated subsampling. Rarefaction of PD allows for many applications where comparisons of samples of different depth is required.
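The species-richness analogue of the result above has a closed form that is easy to check by Monte Carlo; the sketch below computes the exact expected richness of a rarefied sample and compares it with repeated random subsampling. The PD formulae generalize this by summing branch lengths weighted by the probability that at least one descendant of each branch survives rarefaction; the counts used here are invented.

```python
import numpy as np
from scipy.special import gammaln

def log_comb(n, k):
    return gammaln(n + 1) - gammaln(k + 1) - gammaln(n - k + 1)

def expected_richness_rarefied(counts, m):
    """Exact expected species richness in a random subsample of m individuals:
    E[S_m] = sum_i (1 - C(N - N_i, m) / C(N, m))."""
    counts = np.asarray(counts)
    N = counts.sum()
    p_absent = np.zeros(len(counts))
    ok = (N - counts) >= m            # species that can be entirely missed
    p_absent[ok] = np.exp(log_comb(N - counts[ok], m) - log_comb(N, m))
    return np.sum(1.0 - p_absent)

counts = np.array([50, 20, 10, 5, 5, 3, 2, 1, 1, 1])
print(expected_richness_rarefied(counts, 20))

# Monte Carlo check by repeated random subsampling
rng = np.random.default_rng(7)
pool = np.repeat(np.arange(len(counts)), counts)
sims = [len(np.unique(rng.choice(pool, 20, replace=False))) for _ in range(2000)]
print(np.mean(sims))
```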
Filtering Drifter Trajectories Sampled at Submesoscale Resolution
2015-05-11
Dimensionality of Women's Career Orientation.
ERIC Educational Resources Information Center
Marshall, Sandra J.; Wijting, Jan P.
1982-01-01
Factor analysis of data from two samples identified nine indices of women's career orientation. Two factors accounted for significant variance common to the indices: career centeredness, which reflects the importance attached to a career relative to other life activities, and career commitment, which implies a commitment to lifetime employment.…
Effect size measures in a two-independent-samples case with nonnormal and nonhomogeneous data.
Li, Johnson Ching-Hong
2016-12-01
In psychological science, the "new statistics" refer to the new statistical practices that focus on effect size (ES) evaluation instead of conventional null-hypothesis significance testing (Cumming, Psychological Science, 25, 7-29, 2014). In a two-independent-samples scenario, Cohen's (1988) standardized mean difference (d) is the most popular ES, but its accuracy relies on two assumptions: normality and homogeneity of variances. Five other ESs may be robust to violations of these assumptions: the unscaled robust d (d_r*; Hogarty & Kromrey, 2001), scaled robust d (d_r; Algina, Keselman, & Penfield, Psychological Methods, 10, 317-328, 2005), point-biserial correlation (r_pb; McGrath & Meyer, Psychological Methods, 11, 386-401, 2006), common-language ES (CL; Cliff, Psychological Bulletin, 114, 494-509, 1993), and the nonparametric estimator for CL (A_w; Ruscio, Psychological Methods, 13, 19-30, 2008). However, no study has systematically evaluated their performance. Thus, in this simulation study the performance of these six ESs was examined across five factors: data distribution, sample, base rate, variance ratio, and sample size. The results showed that A_w and d_r were generally robust to these violations, and A_w slightly outperformed d_r. Implications for the use of A_w and d_r in real-world research are discussed.
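For concreteness, a minimal sketch of two of the estimators discussed above follows: Cohen's d with the pooled standard deviation, and a probability-of-superiority statistic of the kind underlying CL/A_w. The skewed, heteroscedastic toy data are assumptions, and the statistic shown is the basic probability-of-superiority form rather than Ruscio's exact implementation.

```python
import numpy as np

def cohens_d(x, y):
    """Standardized mean difference with the pooled SD (assumes normality
    and equal variances)."""
    nx, ny = len(x), len(y)
    sp = np.sqrt(((nx - 1) * np.var(x, ddof=1) + (ny - 1) * np.var(y, ddof=1)) / (nx + ny - 2))
    return (np.mean(x) - np.mean(y)) / sp

def prob_superiority(x, y):
    """Nonparametric estimator P(X > Y) + 0.5 * P(X = Y), robust to nonnormality
    and variance heterogeneity."""
    diff = np.subtract.outer(x, y)
    return (np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / (len(x) * len(y))

rng = np.random.default_rng(8)
x = rng.lognormal(mean=0.4, sigma=1.0, size=60)      # skewed, larger variance
y = rng.lognormal(mean=0.0, sigma=0.5, size=40)
print("d =", cohens_d(x, y), "A =", prob_superiority(x, y))
```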
Sethuraman, Kavita; Lansdown, Richard; Sullivan, Keith
2006-06-01
Moderate malnutrition continues to affect 46% of children under five years of age and 47% of rural women in India. Women's lack of empowerment is believed to be an important factor in the persistent prevalence of malnutrition. In India, women's empowerment often varies by community, with tribes sometimes being the most progressive. To explore the relationship between women's empowerment, maternal nutritional status, and the nutritional status of their children aged 6 to 24 months in rural and tribal communities. This study in rural Karnataka, India, included tribal and rural subjects and used both qualitative and quantitative methods of data collection. Structured interviews with mothers were performed and anthropometric measurements were obtained for 820 mother-child pairs. The data were analyzed by multivariate and logistic regression. Some degree of malnutrition was seen in 83.5% of children and 72.4% of mothers in the sample. Biological variables explained most of the variance in nutritional status, followed by health-care seeking and women's empowerment variables; socioeconomic variables explained the least amount of variance. Women's empowerment variables were significantly associated with child nutrition and explained 5.6% of the variance in the sample. Maternal experience of psychological abuse and sexual coercion increased the risk of malnutrition in mothers and children. Domestic violence was experienced by 34% of mothers in the sample. In addition to the known investments needed to reduce malnutrition, improving women's nutrition, promoting gender equality, empowering women, and ending violence against women could further reduce the prevalence of malnutrition in this segment of the Indian population.
NASA Astrophysics Data System (ADS)
Gao, Jing; Burt, James E.
2017-12-01
This study investigates the usefulness of a per-pixel bias-variance error decomposition (BVD) for understanding and improving spatially-explicit data-driven models of continuous variables in environmental remote sensing (ERS). BVD is a model evaluation method that originated in machine learning and has not been examined for ERS applications. Demonstrated with a showcase regression tree model mapping land imperviousness (0-100%) using Landsat images, our results showed that BVD can reveal sources of estimation errors, map how these sources vary across space, reveal the effects of various model characteristics on estimation accuracy, and enable in-depth comparison of different error metrics. Specifically, BVD bias maps can help analysts identify and delineate model spatial non-stationarity; BVD variance maps can indicate potential effects of ensemble methods (e.g. bagging), and inform efficient training sample allocation - training samples should capture the full complexity of the modeled process, and more samples should be allocated to regions with more complex underlying processes rather than regions covering larger areas. Through examining the relationships between model characteristics and their effects on estimation accuracy revealed by BVD for both absolute and squared errors (i.e. error is the absolute or the squared value of the difference between observation and estimate), we found that the two error metrics embody different diagnostic emphases, can lead to different conclusions about the same model, and may suggest different solutions for performance improvement. We emphasize BVD's strength in revealing the connection between model characteristics and estimation accuracy, as understanding this relationship empowers analysts to effectively steer performance through model adjustments.
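A simulation-style sketch of the per-pixel decomposition follows: a bootstrap ensemble of regression trees is fit, and squared bias and variance are computed pixel-by-pixel against a known synthetic surface. With real imagery the true surface is unknown and reference observations take its place; the surface, tree depth, and ensemble size below are assumptions, not the paper's configuration.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(9)
n, n_models = 1500, 200
X = rng.uniform(-2, 2, size=(n, 2))
f_true = np.clip(50 + 25 * X[:, 0] - 15 * X[:, 1] ** 2, 0, 100)   # imperviousness-like surface
y = f_true + rng.normal(scale=5, size=n)

X_test, f_test = X[-300:], f_true[-300:]                 # held-out "pixels"
preds = np.empty((n_models, len(X_test)))
for b in range(n_models):
    idx = rng.integers(0, n - 300, size=n - 300)         # bootstrap resample of training pixels
    tree = DecisionTreeRegressor(max_depth=5, random_state=b).fit(X[idx], y[idx])
    preds[b] = tree.predict(X_test)

bias2 = (preds.mean(axis=0) - f_test) ** 2               # per-pixel squared bias
var = preds.var(axis=0)                                  # per-pixel variance
print("mean bias^2:", bias2.mean(), "mean variance:", var.mean())
# Mapping bias2 and var back to pixel locations gives the spatial BVD view
# described in the abstract.
```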
Applications of non-parametric statistics and analysis of variance on sample variances
NASA Technical Reports Server (NTRS)
Myers, R. H.
1981-01-01
Nonparametric methods that are available for NASA-type applications are discussed. An attempt is made here to survey what can be used, to offer recommendations as to when each would be applicable, and to compare the methods, when possible, with the usual normal-theory procedures that are available for the Gaussian analog. It is important here to point out the hypotheses that are being tested, the assumptions that are being made, and the limitations of the nonparametric procedures. The appropriateness of doing analysis of variance on sample variances is also discussed and studied. This procedure is followed in several NASA simulation projects. On the surface this would appear to be a reasonably sound procedure. However, difficulties center around the normality problem and the basic homogeneous variance assumption that is made in usual analysis of variance problems. These difficulties are discussed and guidelines are given for using the methods.
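A brief sketch of the options implied above, using SciPy, compares normal-theory and more robust tests for homogeneity of spread and then shows the 'ANOVA on sample variances' idea itself; group sizes and scales are arbitrary.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(10)
groups = [rng.normal(0, s, size=30) for s in (1.0, 1.0, 2.0)]   # third group more variable

# Classical options for comparing spread:
print("Bartlett (normal-theory, sensitive to nonnormality):", stats.bartlett(*groups))
print("Levene (ANOVA on absolute deviations, more robust):", stats.levene(*groups, center="median"))
print("Fligner-Killeen (nonparametric, rank-based):", stats.fligner(*groups))

# The 'ANOVA on sample variances' idea: compute a variance per replicate batch,
# then run a one-way ANOVA on those variances -- simple, but it inherits the
# normality and homogeneity assumptions discussed in the abstract.
batch_vars = [np.var(rng.normal(0, s, size=(20, 10)), axis=1, ddof=1) for s in (1.0, 1.0, 2.0)]
print("one-way ANOVA on batch variances:", stats.f_oneway(*batch_vars))
```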
Amberg, Alexander; Barrett, Dave; Beale, Michael H.; Beger, Richard; Daykin, Clare A.; Fan, Teresa W.-M.; Fiehn, Oliver; Goodacre, Royston; Griffin, Julian L.; Hankemeier, Thomas; Hardy, Nigel; Harnly, James; Higashi, Richard; Kopka, Joachim; Lane, Andrew N.; Lindon, John C.; Marriott, Philip; Nicholls, Andrew W.; Reily, Michael D.; Thaden, John J.; Viant, Mark R.
2013-01-01
There is a general consensus that supports the need for standardized reporting of metadata or information describing large-scale metabolomics and other functional genomics data sets. Reporting of standard metadata provides a biological and empirical context for the data, facilitates experimental replication, and enables the re-interrogation and comparison of data by others. Accordingly, the Metabolomics Standards Initiative is building a general consensus concerning the minimum reporting standards for metabolomics experiments of which the Chemical Analysis Working Group (CAWG) is a member of this community effort. This article proposes the minimum reporting standards related to the chemical analysis aspects of metabolomics experiments including: sample preparation, experimental analysis, quality control, metabolite identification, and data pre-processing. These minimum standards currently focus mostly upon mass spectrometry and nuclear magnetic resonance spectroscopy due to the popularity of these techniques in metabolomics. However, additional input concerning other techniques is welcomed and can be provided via the CAWG on-line discussion forum at http://msi-workgroups.sourceforge.net/ or http://Msi-workgroups-feedback@lists.sourceforge.net. Further, community input related to this document can also be provided via this electronic forum. PMID:24039616
1984-05-01
By means of the concept of the change-of-variance function we investigate the stability properties of the asymptotic variance of R-estimators. This allows us to construct the optimal V-robust R-estimator that minimizes the asymptotic variance at the model, under the side condition of a bounded change-of-variance function. Finally, we discuss the connection between this function and an influence function for two-sample rank tests introduced by Eplett (1980). (Author)
Evaluation of sampling plans to detect Cry9C protein in corn flour and meal.
Whitaker, Thomas B; Trucksess, Mary W; Giesbrecht, Francis G; Slate, Andrew B; Thomas, Francis S
2004-01-01
StarLink is a genetically modified corn that produces an insecticidal protein, Cry9C. Studies were conducted to determine the variability and Cry9C distribution among sample test results when Cry9C protein was estimated in a bulk lot of corn flour and meal. Emphasis was placed on measuring sampling and analytical variances associated with each step of the test procedure used to measure Cry9C in corn flour and meal. Two commercially available enzyme-linked immunosorbent assay kits were used: one for the determination of Cry9C protein concentration and the other for % StarLink seed. The sampling and analytical variances associated with each step of the Cry9C test procedures were determined for flour and meal. Variances were found to be functions of Cry9C concentration, and regression equations were developed to describe the relationships. Because of the larger particle size, sampling variability associated with cornmeal was about double that for corn flour. For cornmeal, the sampling variance accounted for 92.6% of the total testing variability. The observed sampling and analytical distributions were compared with the Normal distribution. In almost all comparisons, the null hypothesis that the Cry9C protein values were sampled from a Normal distribution could not be rejected at 95% confidence limits. The Normal distribution and the variance estimates were used to evaluate the performance of several Cry9C protein sampling plans for corn flour and meal. Operating characteristic curves were developed and used to demonstrate the effect of increasing sample size on reducing false positives (seller's risk) and false negatives (buyer's risk).
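An operating characteristic curve of the kind described can be sketched directly from a variance-concentration relationship under the Normal assumption supported by the study; the acceptance limit, the variance function, and the concentrations below are purely hypothetical placeholders, not the fitted regression equations from the paper.

```python
import numpy as np
from scipy.stats import norm

def oc_curve(true_conc, accept_limit, n_samples, var_fn):
    """Probability a lot is accepted (mean of n test results below the limit)
    as a function of true concentration, assuming Normal test results with a
    concentration-dependent variance var_fn(c) per single analysis."""
    sd = np.sqrt(var_fn(true_conc) / n_samples)
    return norm.cdf((accept_limit - true_conc) / sd)

conc = np.linspace(0.01, 2.0, 50)                      # hypothetical Cry9C levels
var_fn = lambda c: 0.05 * c ** 1.5                     # hypothetical variance-concentration relation
for n in (1, 2, 4):
    pa = oc_curve(conc, accept_limit=1.0, n_samples=n, var_fn=var_fn)
    print(f"n = {n}: P(accept) at true conc 1.2 = {pa[np.argmin(abs(conc - 1.2))]:.2f}")
# Larger n steepens the OC curve, cutting both false positives and false negatives.
```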
Model determination in a case of heterogeneity of variance using sampling techniques.
Varona, L; Moreno, C; Garcia-Cortes, L A; Altarriba, J
1997-01-12
A sampling-based model determination procedure is described for a case of heterogeneity of variance. The procedure makes use of the predictive distributions of each data point given the rest of the data and the structure of the assumed model. The computation of these predictive distributions is carried out using a Gibbs Sampling procedure. The final criterion to compare between models is the Mean Square Error between the expectation of the predictive distributions and the real data. The procedure has been applied to a data set of weight at 210 days in the Spanish Pirenaica beef cattle breed. Three proposed models have been compared: (a) Single Trait Animal Model; (b) Heterogeneous Variance Animal Model; and (c) Multiple Trait Animal Model. After applying the procedure, the best-fitting model was the Heterogeneous Variance Animal Model. This result is probably due to a compromise between the complexity of the model and the amount of available information. The estimated heritabilities under the preferred model were 0.489 ± 0.076 for males and 0.331 ± 0.082 for females. ABSTRACT (translated from Spanish): Model comparison in a case of heterogeneity of variances using sampling methods. A model comparison method based on sampling techniques is described for a case of heterogeneity of variance between sexes. The procedure uses the predictive distributions of each data point, given the rest of the data and the model structure. The criterion for comparing models is the mean square error between the expectation of the predictive distributions and the observed data. The procedure was applied to data on weight at 210 days in the Pirenaica beef cattle breed. Three candidate models were proposed: (a) single-trait animal model; (b) animal model with heterogeneous variances; (c) multi-trait animal model. The best-fitting model was the animal model with heterogeneous variances, a result probably due to a compromise between model complexity and the amount of available data. The estimated heritabilities under the preferred model were 0.489 ± 0.076 in males and 0.331 ± 0.082 in females. 1997 Blackwell Verlag GmbH.
NASA Technical Reports Server (NTRS)
Lugo, Rafael A.; Tolson, Robert H.; Schoenenberger, Mark
2013-01-01
As part of the Mars Science Laboratory (MSL) trajectory reconstruction effort at NASA Langley Research Center, free-flight aeroballistic experiments of instrumented MSL scale models were conducted at Aberdeen Proving Ground in Maryland. The models carried an inertial measurement unit (IMU) and a flush air data system (FADS) similar to the MSL Entry Atmospheric Data System (MEADS) that provided data types similar to those from the MSL entry. Multiple sources of redundant data were available, including tracking radar and on-board magnetometers. These experimental data enabled the testing and validation of the various tools and methodologies that will be used for MSL trajectory reconstruction. The aerodynamic parameters Mach number, angle of attack, and sideslip angle were estimated using a minimum variance estimator with a priori information to combine the pressure data and pre-flight computational fluid dynamics (CFD) data. Both linear and non-linear pressure model terms were also estimated for each pressure transducer as a measure of the errors introduced by CFD and transducer calibration. Parameter uncertainties were estimated using a "consider parameters" approach.
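The estimation step described above can be illustrated with the standard linear minimum variance (a priori weighted least squares) update; the sketch below is generic, with an invented three-state vector and a made-up linearized pressure model H, and it omits the iteration and nonlinear CFD pressure models used in the actual MEADS processing.

```python
import numpy as np

def min_variance_with_prior(x_prior, P_prior, z, H, R):
    """Minimum variance (weighted least squares) update of a prior state estimate
    with a new linear measurement z = H x + noise(R)."""
    S = H @ P_prior @ H.T + R
    K = P_prior @ H.T @ np.linalg.inv(S)
    x_post = x_prior + K @ (z - H @ x_prior)
    P_post = (np.eye(len(x_prior)) - K @ H) @ P_prior
    return x_post, P_post

# hypothetical 3-state example (e.g. Mach, angle of attack, sideslip) observed
# through a made-up linearized pressure model H around an a priori CFD prediction
x_prior = np.array([2.0, 0.05, 0.0])
P_prior = np.diag([0.04, 1e-3, 1e-3])
H = np.array([[400.0, 50.0, 0.0],
              [380.0, -30.0, 20.0],
              [390.0, 0.0, -25.0]])
R = np.diag([25.0, 25.0, 25.0])              # pressure transducer noise variance (assumed)
z = H @ np.array([2.1, 0.03, 0.01]) + np.array([2.0, -3.0, 1.0])
print(min_variance_with_prior(x_prior, P_prior, z, H, R))
```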
Piazza, Alexander M; Binversie, Emily E; Baker, Lauren A; Nemke, Brett; Sample, Susannah J; Muir, Peter
2017-04-01
OBJECTIVE To determine whether walking at specific ranges of absolute and relative (V*) velocity would aid efficient capture of gait trial data with low ground reaction force (GRF) variance in a heterogeneous sample of dogs. ANIMALS 17 clinically normal dogs of various breeds, ages, and sexes. PROCEDURES Each dog was walked across a force platform at its preferred velocity, with controlled acceleration within 0.5 m/s². Ranges in V* were created for height at the highest point of the shoulders (withers; WHV*). Variance effects from 8 walking absolute velocity ranges and associated WHV* ranges were examined by means of repeated-measures ANCOVA. RESULTS The individual dog effect provided the greatest contribution to variance. Narrow velocity ranges typically resulted in capture of a smaller percentage of valid trials and were not consistently associated with lower variance. The WHV* range of 0.33 to 0.46 allowed capture of valid trials efficiently, with no significant effects on peak vertical force and vertical impulse. CONCLUSIONS AND CLINICAL RELEVANCE Dogs with severe lameness may be unable to trot or may have a decline in mobility with gait trial repetition. Gait analysis involving evaluation of individual dogs at their preferred absolute velocity, such that dogs are evaluated at a similar V*, may facilitate efficient capture of valid trials without significant effects on GRF. Use of individual velocity ranges derived from a WHV* range of 0.33 to 0.46 can account for heterogeneity and appears suitable for use in clinical trials involving dogs at a walking gait.
On the significance of δ13C correlations in ancient sediments
NASA Astrophysics Data System (ADS)
Derry, Louis A.
2010-08-01
A graphical analysis of the correlations between δ_c and ε_TOC was introduced by Rothman et al. (2003) to obtain estimates of the carbon isotopic composition of inputs to the oceans and the organic carbon burial fraction. Applied to Cenozoic data, the method agrees with independent estimates, but with Neoproterozoic data the method yields results that cannot be accommodated within standard models of sedimentary carbon isotope mass balance. We explore the sensitivity of the graphical correlation method and find that the variance ratio between δ_c and δ_o is an important control on the correlation of δ_c and ε. If the variance ratio σ_c/σ_o ≥ 1, highly correlated arrays, very similar to those obtained from the data, are produced even from independent random variables. The Neoproterozoic data show such variance patterns, and the regression parameters for the Neoproterozoic data are statistically indistinguishable from the randomized model at the 95% confidence interval. The projection of the data into δ_c-ε space cannot distinguish between signal and noise, such as post-depositional alteration, under these circumstances. There appears to be no need to invoke unusual carbon cycle dynamics to explain the Neoproterozoic δ_c-ε array. The Cenozoic data have σ_c/σ_o < 1 and the δ_c vs. ε correlation is probably geologically significant, but the analyzed sample size is too small to yield statistically significant results.
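The variance-ratio effect is easy to reproduce numerically: with δ_c and δ_o drawn independently and ε = δ_c - δ_o, the expected correlation of δ_c with ε is σ_c/√(σ_c² + σ_o²), which already exceeds 0.7 once σ_c/σ_o = 1. A small sketch, with arbitrary means and scales, is:

```python
import numpy as np

rng = np.random.default_rng(11)
n = 200
for ratio in (0.5, 1.0, 2.0):
    d_carb = rng.normal(0.0, 1.0 * ratio, n)     # independent carbonate d13C
    d_org = rng.normal(-28.0, 1.0, n)            # independent organic d13C
    eps = d_carb - d_org                         # isotopic separation
    r = np.corrcoef(d_carb, eps)[0, 1]
    # With independence, corr = ratio / sqrt(ratio**2 + 1): already ~0.7 at ratio = 1
    print(f"sigma_c/sigma_o = {ratio}: r(d_carb, eps) = {r:.2f}")
```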
NASA Astrophysics Data System (ADS)
Chappell, N. A.; Jones, T.; Young, P.; Krishnaswamy, J.
2015-12-01
There is increasing awareness that under-sampling may have resulted in the omission of important physicochemical information present in water quality signatures of surface waters, thereby affecting interpretation of biogeochemical processes. For dissolved organic carbon (DOC) and nitrogen this under-sampling can now be avoided using UV-visible spectroscopy measured in-situ and continuously at a fine resolution (e.g., 15 minutes; "real time"). Few methods are available to extract biogeochemical process information directly from such high-frequency data. Jones, Chappell & Tych (2014 Environ Sci Technol: 13289-97) developed one such method for optically-derived DOC data based upon a sophisticated time-series modelling tool. Within this presentation we extend the methodology to quantify the minimum sampling interval required to avoid distortion of model structures and parameters that describe fundamental biogeochemical processes. This shifting of parameters, which results from under-sampling, is called "aliasing". We demonstrate that storm dynamics at a variety of sites dominate over diurnal and seasonal changes and that these must be characterised by sampling that may be sub-hourly to avoid aliasing. This is considerably shorter than the intervals used by other water quality studies examining aliasing (e.g. Kirchner 2005 Phys Rev: 069902). The modelling approach presented is being developed into a generic tool to calculate the minimum sampling interval for water quality monitoring in systems driven primarily by hydrology. This is illustrated with fine-resolution, optical data from watersheds in temperate Europe through to the humid tropics.
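The kind of distortion at issue can be illustrated with a toy DOC-like series containing storm pulses and a diurnal cycle; progressively coarser subsampling clips the storm peaks and shrinks the apparent variance, which is the information loss a model fitted to under-sampled data inherits. The series, storm timing, and sampling steps below are invented, and the sketch is not the authors' time-series modelling tool.

```python
import numpy as np

rng = np.random.default_rng(16)
dt_fine = 0.25                                  # hours, i.e. 15-minute sampling
t = np.arange(0, 14 * 24, dt_fine)              # two weeks
diurnal = 0.3 * np.sin(2 * np.pi * t / 24)
storms = np.zeros_like(t)
for start in (50, 120, 260):                    # three storm pulses with ~6 h recessions
    storms += np.where(t >= start, 3.0 * np.exp(-(t - start) / 6.0), 0.0)
doc = 2.0 + diurnal + storms + rng.normal(0, 0.05, t.size)

for step_h in (0.25, 1, 6, 24):                 # progressively coarser sampling
    sub = doc[:: int(step_h / dt_fine)]
    print(f"{step_h:>5} h sampling: max = {sub.max():.2f}, variance = {sub.var():.3f}")
# Coarse sampling clips the storm peaks and shrinks the apparent variance.
```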
Du, Gang; Jiang, Zhibin; Diao, Xiaodi; Yao, Yang
2013-07-01
Takagi-Sugeno (T-S) fuzzy neural networks (FNNs) can be used to handle complex, fuzzy, uncertain clinical pathway (CP) variances. However, there are many drawbacks, such as slow training rate, propensity to become trapped in a local minimum and poor ability to perform a global search. In order to improve overall performance of variance handling by T-S FNNs, a new CP variance handling method is proposed in this study. It is based on random cooperative decomposing particle swarm optimization with double mutation mechanism (RCDPSO_DM) for T-S FNNs. Moreover, the proposed integrated learning algorithm, combining the RCDPSO_DM algorithm with a Kalman filtering algorithm, is applied to optimize antecedent and consequent parameters of constructed T-S FNNs. Then, a multi-swarm cooperative immigrating particle swarm algorithm ensemble method is used for intelligent ensemble T-S FNNs with RCDPSO_DM optimization to further improve stability and accuracy of CP variance handling. Finally, two case studies on liver and kidney poisoning variances in osteosarcoma preoperative chemotherapy are used to validate the proposed method. The result demonstrates that intelligent ensemble T-S FNNs based on the RCDPSO_DM achieves superior performances, in terms of stability, efficiency, precision and generalizability, over PSO ensemble of all T-S FNNs with RCDPSO_DM optimization, single T-S FNNs with RCDPSO_DM optimization, standard T-S FNNs, standard Mamdani FNNs and T-S FNNs based on other algorithms (cooperative particle swarm optimization and particle swarm optimization) for CP variance handling. Therefore, it makes CP variance handling more effective. Copyright © 2013 Elsevier Ltd. All rights reserved.
2012-09-01
... by the ARL Translational Neuroscience Branch. It covers the Emotiv EPOC, Advanced Brain Monitoring (ABM) B-Alert X10, and Quasar DSI helmet-based systems (ARL-TR-5945; U.S. Army Research Laboratory: Aberdeen Proving Ground, MD, 2012).
ERIC Educational Resources Information Center
Johnson, Jim
2017-01-01
A growing number of U.S. business schools now offer an undergraduate degree in international business (IB), for which training in a foreign language is a requirement. However, there appears to be considerable variance in the minimum requirements for foreign language training across U.S. business schools, including the provision of…
Climatological variables and the incidence of Dengue fever in Barbados.
Depradine, Colin; Lovell, Ernest
2004-12-01
A retrospective study to determine relationships between the incidence of dengue cases and climatological variables and to obtain a predictive equation was carried out for the relatively small Caribbean island of Barbados, which is divided into 11 parishes. The study used the weekly dengue cases and precipitation data for the years 1995-2000 recorded in the small area of a single parish. Other climatological data were obtained from the local meteorological offices. The study used primarily cross-correlation analysis and found the strongest correlation with the vapour pressure at a lag of 6 weeks. A weaker correlation occurred at a lag of 7 weeks for the precipitation. The minimum temperature had its strongest correlation at a lag of 12 weeks and the maximum temperature at a lag of 16 weeks. There was a negative correlation with the wind speed at a lag of 3 weeks. The predictive models showed a maximum explained variance of 35%.
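The lagged cross-correlation screening described above can be sketched in a few lines; the synthetic weekly series below, with a built-in six-week lag to a vapour-pressure-like driver, is invented purely to show the mechanics, not to reproduce the Barbados data.

```python
import numpy as np

def lagged_correlations(cases, driver, max_lag=20):
    """Pearson correlation between weekly case counts and a climate driver
    shifted back by 0..max_lag weeks."""
    out = {}
    for lag in range(max_lag + 1):
        if lag == 0:
            out[lag] = np.corrcoef(cases, driver)[0, 1]
        else:
            out[lag] = np.corrcoef(cases[lag:], driver[:-lag])[0, 1]
    return out

rng = np.random.default_rng(12)
weeks = 312                                   # six years of weekly data
vapour = 28 + 2 * np.sin(2 * np.pi * np.arange(weeks) / 52) + rng.normal(0, 0.5, weeks)
cases = np.roll(5 * (vapour - vapour.mean()), 6) + rng.normal(0, 2, weeks)  # 6-week lagged response
cc = lagged_correlations(cases, vapour)
print(max(cc, key=cc.get), "week lag has the strongest correlation")
```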
Darzi, Soodabeh; Tiong, Sieh Kiong; Tariqul Islam, Mohammad; Rezai Soleymanpour, Hassan; Kibria, Salehin
2016-01-01
An experience-oriented, convergence-improved gravitational search algorithm (ECGSA) based on two new modifications, searching through the best experiences and the use of a dynamic gravitational damping coefficient (α), is introduced in this paper. ECGSA saves its best fitness function evaluations and uses those as the agents' positions in the searching process. In this way, the optimal found trajectories are retained and the search starts from these trajectories, which allows the algorithm to avoid local optimums. Also, the agents can move faster in the search space to obtain better exploration during the first stage of the searching process, and they can converge rapidly to the optimal solution at the final stage of the search process by means of the proposed dynamic gravitational damping coefficient. The performance of ECGSA has been evaluated by applying it to eight standard benchmark functions along with six complicated composite test functions. It is also applied to the adaptive beamforming problem as a practical issue, to improve the weight vectors computed by the minimum variance distortionless response (MVDR) beamforming technique. The results of the implementation of the proposed algorithm are compared with some well-known heuristic methods and verify the proposed method in both reaching optimal solutions and robustness.
NASA Astrophysics Data System (ADS)
Zhou, Ming; Wu, Jianyang; Xu, Xiaoyi; Mu, Xin; Dou, Yunping
2018-02-01
In order to obtain improved electrical discharge machining (EDM) performance, we have dedicated more than a decade to correcting one essential EDM defect, the weak stability of the machining, by developing adaptive control systems. The instabilities of machining are mainly caused by complicated disturbances in discharging. To counteract the effects of these disturbances on machining, we theoretically developed three control laws, from a minimum variance (MV) control law to a coupled minimum variance and pole placement (MVPPC) control law, and then to a two-step-ahead prediction (TP) control law. Based on real-time estimation of EDM process model parameters and the measured ratio of arcing pulses (also called the gap state), the electrode discharging cycle was directly and adaptively tuned so that stable machining could be achieved. We not only provide theoretical derivations of the three control laws for the developed EDM adaptive control system, but also show in practice that the TP control law deals best with machining instability and machining efficiency, even though the MVPPC control law provided much better EDM performance than the MV control law. It was also shown that the TP control law provided burn-free machining.
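The basic idea behind a minimum variance control law can be shown on a textbook first-order plant with known parameters: the control cancels the predictable part of the next output so that only the unpredictable disturbance remains. This is only the classical MV regulator, not the EDM-specific MV, MVPPC, or TP laws derived in the paper, and the plant parameters are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(14)
a, b, noise_sd, T = 0.8, 0.5, 0.1, 2000      # plant y[t+1] = a*y[t] + b*u[t] + e[t+1]
y_open, y_mv = np.zeros(T), np.zeros(T)
for t in range(T - 1):
    e = rng.normal(0, noise_sd)
    y_open[t + 1] = a * y_open[t] + e                      # no control
    u = -(a / b) * y_mv[t]                                 # MV law: cancel the predictable part
    y_mv[t + 1] = a * y_mv[t] + b * u + e
# Under MV control the output reduces to the unpredictable disturbance alone,
# so its variance approaches the noise variance.
print(y_open.var(), y_mv.var(), noise_sd ** 2)
```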
Multivariate Models of Men's and Women's Partner Aggression
ERIC Educational Resources Information Center
O'Leary, K. Daniel; Smith Slep, Amy M.; O'Leary, Susan G.
2007-01-01
This exploratory study was designed to address how multiple factors drawn from varying focal models and ecological levels of influence might operate relative to each other to predict partner aggression, using data from 453 representatively sampled couples. The resulting cross-validated models predicted approximately 50% of the variance in men's…
The Coopersmith Self-Esteem Inventory in an Adult Sample.
ERIC Educational Resources Information Center
Noller, Patricia; Shugm, David
1988-01-01
The reliability and validity of the Self-Esteem Inventory developed by S. C. Coopersmith (1975) were evaluated via item-total correlation, discriminant analysis, factor analysis, and analysis of variance of data for 352 Australian adults. The instrument had high internal consistency and discriminated well between subjects with high and low…
Genetic and Environmental Contributions to Educational Attainment in Australia.
ERIC Educational Resources Information Center
Miller, Paul; Mulvey, Charles; Martin, Nick
2001-01-01
Data from a large sample of Australian twins indicate that 50 to 65 percent of variance in educational attainments can be attributed to genetic endowments. Only about 25 to 40 percent may be due to environmental factors, depending on adjustments for measurement error and assortative mating. (Contains 51 references.) (MLH)
ERIC Educational Resources Information Center
Bui, Hoan N.
2009-01-01
This study examines delinquent behavior among schoolchildren in a nationally representative sample from the United States and seeks an understanding of the factors contributing to variances in delinquency across immigration generations. Data analysis indicates that the levels of self-reported substance use, property delinquency, and violent…
Chambaz, Antoine; Zheng, Wenjing; van der Laan, Mark J
2017-01-01
This article studies the targeted sequential inference of an optimal treatment rule (TR) and its mean reward in the non-exceptional case, i.e., assuming that there is no stratum of the baseline covariates where treatment is neither beneficial nor harmful, and under a companion margin assumption. Our pivotal estimator, whose definition hinges on the targeted minimum loss estimation (TMLE) principle, actually infers the mean reward under the current estimate of the optimal TR. This data-adaptive statistical parameter is worthy of interest on its own. Our main result is a central limit theorem which enables the construction of confidence intervals on both mean rewards under the current estimate of the optimal TR and under the optimal TR itself. The asymptotic variance of the estimator takes the form of the variance of an efficient influence curve at a limiting distribution, allowing us to discuss the efficiency of inference. As a by-product, we also derive confidence intervals on two cumulated pseudo-regrets, a key notion in the study of bandit problems. A simulation study illustrates the procedure. One of the cornerstones of the theoretical study is a new maximal inequality for martingales with respect to the uniform entropy integral.
Engelmann Spruce Site Index Models: A Comparison of Model Functions and Parameterizations
Nigh, Gordon
2015-01-01
Engelmann spruce (Picea engelmannii Parry ex Engelm.) is a high-elevation species found in western Canada and the western USA. As this species becomes increasingly targeted for harvesting, better height growth information is required for its good management. This project was initiated to fill that need. The objective of the project was threefold: to develop a site index model for Engelmann spruce; to compare the fits and the modelling and application issues of three model formulations and four parameterizations; and to examine the grounded Generalized Algebraic Difference Approach (g-GADA) model parameterization more closely. The model fitting data consisted of 84 stem-analyzed Engelmann spruce site trees sampled across the Engelmann Spruce – Subalpine Fir biogeoclimatic zone. The fitted models were based on the Chapman-Richards function, a modified Hossfeld IV function, and the Schumacher function. The parameterizations tested were indicator variables, mixed effects, GADA, and g-GADA. Model evaluation was based on the finite-sample-corrected version of Akaike's Information Criterion and the estimated variance. Model parameterization had more influence on the fit than did model formulation: the indicator variable method provided the best fit, followed by mixed-effects modelling (a 9% increase in the variance over the indicator variable parameterization for the Chapman-Richards and Schumacher formulations), g-GADA with the optimal approach (a 335% increase in the variance), and GADA/g-GADA with the GADA parameterization (a 346% increase in the variance). Factors related to the application of the model must also be considered when selecting a model for use, as the best-fitting methods have the most barriers to application in terms of data and software requirements. PMID:25853472
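As a small illustration of one of the model formulations mentioned above, the sketch below fits the base Chapman-Richards height-age form H = A(1 - e^(-kt))^p by nonlinear least squares. The data values and starting parameters are hypothetical, and the site index parameterizations compared in the paper (indicator variables, mixed effects, GADA, g-GADA) require considerably more machinery than this.

```python
import numpy as np
from scipy.optimize import curve_fit

def chapman_richards(age, asymptote, rate, shape):
    """Base Chapman-Richards height-age form: H = A * (1 - exp(-k*t))**p."""
    return asymptote * (1.0 - np.exp(-rate * age)) ** shape

# Hypothetical stem-analysis data: age (years) and total height (m).
age = np.array([10, 20, 30, 40, 60, 80, 100, 120], dtype=float)
height = np.array([2.1, 5.0, 8.2, 11.1, 16.0, 19.5, 22.0, 23.8])

# Nonlinear least-squares fit; p0 supplies rough starting values.
params, cov = curve_fit(chapman_richards, age, height,
                        p0=[30.0, 0.02, 1.3], maxfev=10000)
print("A, k, p =", params)
```

For a single least-squares fit like this, the finite-sample-corrected AIC used in the paper could be computed from the residual sum of squares, but comparing the mixed-effects and GADA parameterizations requires fitting machinery beyond this sketch.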
Parajulee, M N; Shrestha, R B; Leser, J F
2006-04-01
A 2-yr field study was conducted to examine the effectiveness of two sampling methods (visual and plant washing) for western flower thrips, Frankliniella occidentalis (Pergande), and five sampling methods (visual, beat bucket, drop cloth, sweep net, and vacuum) for cotton fleahopper, Pseudatomoscelis seriatus (Reuter), in Texas cotton, Gossypium hirsutum (L.), and to develop sequential sampling plans for each pest. The plant washing technique gave results similar to the visual method in detecting adult thrips, but it detected a significantly higher number of thrips larvae than visual sampling. Visual sampling detected the highest number of fleahoppers, followed by beat bucket, drop cloth, vacuum, and sweep net sampling, with no significant difference in catch efficiency between the vacuum and sweep net methods. However, based on fixed-precision cost reliability, sweep net sampling was the most cost-effective method, followed by vacuum, beat bucket, drop cloth, and visual sampling. Taylor's power law analysis revealed that the field dispersion patterns of both thrips and fleahoppers were aggregated throughout the growing season. For thrips management decisions based on visual sampling (0.25 precision), the minimum sample size was estimated to be 15 plants when the population density was one thrips per plant, and nine plants when the density approached 10 thrips per plant. The minimum visual sample size for cotton fleahoppers was 16 plants when the density was one fleahopper per plant, but the sample size decreased rapidly with increasing fleahopper density, requiring only four plants when the density was 10 fleahoppers per plant. Sequential sampling plans were developed and validated with independent data for both thrips and cotton fleahoppers.
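Minimum sample sizes of this kind are commonly related to Taylor's power law: if the variance follows s² = a·mᵇ, a fixed-precision (D = SE/mean) sample size is n = a·m^(b-2)/D². The sketch below implements that standard formula with arbitrary illustrative coefficients; the a and b values are not the estimates from this study, and the authors' sequential plans involve additional stop lines not shown here.

```python
import math

def min_sample_size(mean_density, a, b, precision=0.25):
    """Minimum sample size at fixed precision D (SE/mean) when the
    variance follows Taylor's power law, s**2 = a * m**b:
        n = a * m**(b - 2) / D**2
    """
    return math.ceil(a * mean_density ** (b - 2) / precision ** 2)

# Arbitrary illustrative Taylor coefficients (not the paper's estimates):
# sample size drops steeply as mean density per plant increases.
for m in (1.0, 10.0):
    print(m, min_sample_size(m, a=2.0, b=1.3))
```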