Sample records for unbiased parameter estimates

  1. Uncertainty relation based on unbiased parameter estimations

    NASA Astrophysics Data System (ADS)

    Sun, Liang-Liang; Song, Yong-Shun; Qiao, Cong-Feng; Yu, Sixia; Chen, Zeng-Bing

    2017-02-01

    Heisenberg's uncertainty relation has been extensively studied in the spirit of its well-known original form, in which the inaccuracy measures used exhibit some controversial properties and do not conform with quantum metrology, where measurement precision is well defined in terms of estimation theory. In this paper, we treat the joint measurement of incompatible observables as a parameter estimation problem, i.e., estimating the parameters characterizing the statistics of the incompatible observables. Our crucial observation is that, in a sequential measurement scenario, the bias induced by the first unbiased measurement in the subsequent measurement can be eradicated by the information acquired, allowing one to extract unbiased information about the second measurement of an incompatible observable. In terms of Fisher information, we propose a kind of information comparison measure and explore various trade-offs between information gains and measurement precisions, which interpret the uncertainty relation as a surplus-variance trade-off over individual perfect measurements rather than as a constraint on extracting complete information about incompatible observables.

  2. Minimum mean squared error (MSE) adjustment and the optimal Tykhonov-Phillips regularization parameter via reproducing best invariant quadratic uniformly unbiased estimates (repro-BIQUUE)

    NASA Astrophysics Data System (ADS)

    Schaffrin, Burkhard

    2008-02-01

    In a linear Gauss-Markov model, the parameter estimates from BLUUE (Best Linear Uniformly Unbiased Estimate) are not robust against possible outliers in the observations. Moreover, by giving up the unbiasedness constraint, the mean squared error (MSE) risk may be further reduced, in particular when the problem is ill-posed. In this paper, the α-weighted S-homBLE (Best homogeneously Linear Estimate) is derived via formulas originally used for variance component estimation on the basis of the repro-BIQUUE (reproducing Best Invariant Quadratic Uniformly Unbiased Estimate) principle in a model with stochastic prior information. In the present model, however, such prior information is not included, which allows the comparison of the stochastic approach (α-weighted S-homBLE) with the well-established algebraic approach of Tykhonov-Phillips regularization, also known as R-HAPS (Hybrid APproximation Solution), whenever the inverse of the “substitute matrix” S exists and is chosen as the R matrix that defines the relative impact of the regularizing term on the final result.

  3. Control system estimation and design for aerospace vehicles

    NASA Technical Reports Server (NTRS)

    Stefani, R. T.; Williams, T. L.; Yakowitz, S. J.

    1972-01-01

    The selection of an estimator which is unbiased when applied to structural parameter estimation is discussed. The mathematical relationships for structural parameter estimation are defined. It is shown that a conventional weighted least squares (CWLS) estimate is biased when applied to structural parameter estimation. Two approaches to bias removal are suggested: (1) change the CWLS estimator or (2) change the objective function. The advantages of each approach are analyzed.

  4. Diallel analysis for sex-linked and maternal effects.

    PubMed

    Zhu, J; Weir, B S

    1996-01-01

    Genetic models including sex-linked and maternal effects as well as autosomal gene effects are described. Monte Carlo simulations were conducted to compare efficiencies of estimation by minimum norm quadratic unbiased estimation (MINQUE) and restricted maximum likelihood (REML) methods. MINQUE(1), which has 1 for all prior values, has a similar efficiency to MINQUE(θ), which requires prior estimates of parameter values. MINQUE(1) has the advantage over REML of unbiased estimation and convenient computation. An adjusted unbiased prediction (AUP) method is developed for predicting random genetic effects. AUP is desirable for its easy computation and unbiasedness of both mean and variance of predictors. The jackknife procedure is appropriate for estimating the sampling variances of estimated variances (or covariances) and of predicted genetic effects. A t-test based on jackknife variances is applicable for detecting significance of variation. Worked examples from mice and silkworm data are given in order to demonstrate variance and covariance estimation and genetic effect prediction.
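The jackknife variance procedure mentioned in the abstract can be sketched generically; this is a minimal delete-one jackknife (the data and the statistic here are illustrative, not taken from the paper):

```python
def jackknife_var(data, stat):
    # Delete-one jackknife estimate of the sampling variance of stat(data)
    n = len(data)
    loo = [stat(data[:i] + data[i + 1:]) for i in range(n)]  # leave-one-out values
    m = sum(loo) / n
    return (n - 1) / n * sum((v - m) ** 2 for v in loo)

def mean(xs):
    return sum(xs) / len(xs)

data = [2.1, 3.4, 1.9, 4.0, 2.8, 3.1, 2.5, 3.7]
jv = jackknife_var(data, mean)
```

For the sample mean the jackknife variance reduces exactly to s²/n, which gives a quick sanity check before applying the same machinery to variance components or predicted genetic effects.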

  5. An Example of an Improvable Rao-Blackwell Improvement, Inefficient Maximum Likelihood Estimator, and Unbiased Generalized Bayes Estimator.

    PubMed

    Galili, Tal; Meilijson, Isaac

    2016-01-02

    The Rao-Blackwell theorem offers a procedure for converting a crude unbiased estimator of a parameter θ into a "better" one, in fact unique and optimal if the improvement is based on a minimal sufficient statistic that is complete. In contrast, behind every minimal sufficient statistic that is not complete, there is an improvable Rao-Blackwell improvement. This is illustrated via a simple example based on the uniform distribution, in which a rather natural Rao-Blackwell improvement is uniformly improvable. Furthermore, in this example the maximum likelihood estimator is inefficient, and an unbiased generalized Bayes estimator performs exceptionally well. Counterexamples of this sort can be useful didactic tools for explaining the true nature of a methodology and possible consequences when some of the assumptions are violated. [Received December 2014. Revised September 2015.].
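The uniform-distribution setting the abstract builds on can be checked numerically. This sketch uses the textbook example (crude unbiased estimator 2x̄ of θ for U(0, θ), improved by conditioning on the sufficient statistic max), not the authors' specific counterexample:

```python
import random

def _mean(v):
    return sum(v) / len(v)

def _var(v):
    m = _mean(v)
    return sum((t - m) ** 2 for t in v) / len(v)

def simulate(theta=1.0, n=10, reps=20000, seed=42):
    # Crude unbiased estimator 2*mean versus its Rao-Blackwell improvement
    # (n+1)/n * max, for X_i ~ Uniform(0, theta)
    rng = random.Random(seed)
    crude, rb = [], []
    for _ in range(reps):
        x = [rng.uniform(0, theta) for _ in range(n)]
        crude.append(2 * sum(x) / n)          # E[2*mean] = theta
        rb.append((n + 1) / n * max(x))       # E[2*mean | max] = (n+1)/n * max
    return _mean(crude), _var(crude), _mean(rb), _var(rb)

m_crude, v_crude, m_rb, v_rb = simulate()
```

Both estimators are unbiased, but the Rao-Blackwellized one has markedly smaller variance, as the theorem guarantees.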

  6. Statistical Properties of Maximum Likelihood Estimators of Power Law Spectra Information

    NASA Technical Reports Server (NTRS)

    Howell, L. W., Jr.

    2003-01-01

    A simple power law model consisting of a single spectral index, sigma(sub 1), is believed to be an adequate description of the galactic cosmic-ray (GCR) proton flux at energies below 10(exp 13) eV, with a transition at the knee energy, E(sub k), to a steeper spectral index sigma(sub 2) greater than sigma(sub 1) above E(sub k). The maximum likelihood (ML) procedure was developed for estimating the single parameter sigma(sub 1) of a simple power law energy spectrum and generalized to estimate the three spectral parameters of the broken power law energy spectrum from simulated detector responses and real cosmic-ray data. The statistical properties of the ML estimator were investigated and shown to have the three desirable properties: (P1) consistency (asymptotically unbiased), (P2) efficiency (asymptotically attains the Cramer-Rao minimum variance bound), and (P3) asymptotically normally distributed, under a wide range of potential detector response functions. Attainment of these properties necessarily implies that the ML estimation procedure provides the best unbiased estimator possible. While simulation studies can easily determine if a given estimation procedure provides an unbiased estimate of the spectra information, and whether or not the estimator is approximately normally distributed, attainment of the Cramer-Rao bound (CRB) can only be ascertained by calculating the CRB for an assumed energy spectrum-detector response function combination, which can be quite formidable in practice. However, the effort in calculating the CRB is very worthwhile because it provides the necessary means to compare the efficiency of competing estimation techniques and, furthermore, provides a stopping rule in the search for the best unbiased estimator. Consequently, the CRBs for both the simple and broken power law energy spectra are derived herein, and the conditions under which they are attained in practice are investigated.
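For the simple (unbroken) power law, the ML estimate of the index has a closed form, and its near-unbiasedness and closeness to the Cramer-Rao bound can be checked by simulation. A minimal sketch with illustrative values (threshold energy 1, index 2.7), not the paper's detector-response setting:

```python
import math, random

def sample_power_law(alpha, e0, n, rng):
    # Inverse-CDF sampling for dN/dE proportional to E^(-alpha), E >= e0 (alpha > 1)
    return [e0 * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0)) for _ in range(n)]

def ml_index(events, e0):
    # Closed-form ML estimate of the spectral index
    return 1.0 + len(events) / sum(math.log(e / e0) for e in events)

rng = random.Random(1)
est = [ml_index(sample_power_law(2.7, 1.0, 1000, rng), 1.0) for _ in range(500)]
mean_est = sum(est) / len(est)
var_est = sum((a - mean_est) ** 2 for a in est) / len(est)
crb = (2.7 - 1.0) ** 2 / 1000  # Cramer-Rao bound on the variance
```

The empirical variance of the estimates sits essentially on the CRB, illustrating property (P2) in this idealized case.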

  7. Mixed model approaches for diallel analysis based on a bio-model.

    PubMed

    Zhu, J; Weir, B S

    1996-12-01

    A MINQUE(1) procedure, which is minimum norm quadratic unbiased estimation (MINQUE) method with 1 for all the prior values, is suggested for estimating variance and covariance components in a bio-model for diallel crosses. Unbiasedness and efficiency of estimation were compared for MINQUE(1), restricted maximum likelihood (REML) and MINQUE theta which has parameter values for the prior values. MINQUE(1) is almost as efficient as MINQUE theta for unbiased estimation of genetic variance and covariance components. The bio-model is efficient and robust for estimating variance and covariance components for maternal and paternal effects as well as for nuclear effects. A procedure of adjusted unbiased prediction (AUP) is proposed for predicting random genetic effects in the bio-model. The jack-knife procedure is suggested for estimation of sampling variances of estimated variance and covariance components and of predicted genetic effects. Worked examples are given for estimation of variance and covariance components and for prediction of genetic merits.

  8. Statistical Properties of Maximum Likelihood Estimators of Power Law Spectra Information

    NASA Technical Reports Server (NTRS)

    Howell, L. W.

    2002-01-01

    A simple power law model consisting of a single spectral index, alpha(sub 1), is believed to be an adequate description of the galactic cosmic-ray (GCR) proton flux at energies below 10(exp 13) eV, with a transition at the knee energy, E(sub k), to a steeper spectral index alpha(sub 2) greater than alpha(sub 1) above E(sub k). The maximum likelihood (ML) procedure was developed for estimating the single parameter alpha(sub 1) of a simple power law energy spectrum and generalized to estimate the three spectral parameters of the broken power law energy spectrum from simulated detector responses and real cosmic-ray data. The statistical properties of the ML estimator were investigated and shown to have the three desirable properties: (P1) consistency (asymptotically unbiased), (P2) efficiency (asymptotically attains the Cramer-Rao minimum variance bound), and (P3) asymptotically normally distributed, under a wide range of potential detector response functions. Attainment of these properties necessarily implies that the ML estimation procedure provides the best unbiased estimator possible. While simulation studies can easily determine if a given estimation procedure provides an unbiased estimate of the spectra information, and whether or not the estimator is approximately normally distributed, attainment of the Cramer-Rao bound (CRB) can only be ascertained by calculating the CRB for an assumed energy spectrum-detector response function combination, which can be quite formidable in practice. However, the effort in calculating the CRB is very worthwhile because it provides the necessary means to compare the efficiency of competing estimation techniques and, furthermore, provides a stopping rule in the search for the best unbiased estimator. Consequently, the CRBs for both the simple and broken power law energy spectra are derived herein, and the conditions under which they are attained in practice are investigated.
The ML technique is then extended to estimate spectra information from an arbitrary number of astrophysics data sets produced by vastly different science instruments. This theory and its successful implementation will facilitate the interpretation of spectral information from multiple astrophysics missions and thereby permit the derivation of superior spectral parameter estimates based on the combination of data sets.

  9. Estimation of genetic parameters and response to selection for a continuous trait subject to culling before testing.

    PubMed

    Arnason, T; Albertsdóttir, E; Fikse, W F; Eriksson, S; Sigurdsson, A

    2012-02-01

    The consequences of assuming a zero environmental covariance between a binary trait 'test-status' and a continuous trait on the estimates of genetic parameters by restricted maximum likelihood and Gibbs sampling and on response from genetic selection when the true environmental covariance deviates from zero were studied. Data were simulated for two traits (one that culling was based on and a continuous trait) using the following true parameters, on the underlying scale: h² = 0.4; r(A) = 0.5; r(E) = 0.5, 0.0 or -0.5. The selection on the continuous trait was applied to five subsequent generations where 25 sires and 500 dams produced 1500 offspring per generation. Mass selection was applied in the analysis of the effect on estimation of genetic parameters. Estimated breeding values were used in the study of the effect of genetic selection on response and accuracy. The culling frequency was either 0.5 or 0.8 within each generation. Each of 10 replicates included 7500 records on 'test-status' and 9600 animals in the pedigree file. Results from bivariate analysis showed unbiased estimates of variance components and genetic parameters when true r(E) = 0.0. For r(E) = 0.5, variance components (13-19% bias) and especially (50-80%) were underestimated for the continuous trait, while heritability estimates were unbiased. For r(E) = -0.5, heritability estimates of test-status were unbiased, while genetic variance and heritability of the continuous trait together with were overestimated (25-50%). The bias was larger for the higher culling frequency. Culling always reduced genetic progress from selection, but the genetic progress was found to be robust to the use of wrong parameter values of the true environmental correlation between test-status and the continuous trait. Use of a bivariate linear-linear model reduced bias in genetic evaluations, when data were subject to culling. © 2011 Blackwell Verlag GmbH.

  10. A Bayesian approach to parameter and reliability estimation in the Poisson distribution.

    NASA Technical Reports Server (NTRS)

    Canavos, G. C.

    1972-01-01

    For life testing procedures, a Bayesian analysis is developed with respect to a random intensity parameter in the Poisson distribution. Bayes estimators are derived for the Poisson parameter and the reliability function based on uniform and gamma prior distributions of that parameter. A Monte Carlo procedure is implemented to make possible an empirical mean-squared error comparison between Bayes and existing minimum variance unbiased, as well as maximum likelihood, estimators. As expected, the Bayes estimators have mean-squared errors that are appreciably smaller than those of the other two.
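A toy version of the Monte Carlo comparison the abstract describes, for the gamma prior (conjugate to the Poisson, so the posterior mean is (a + Σx)/(b + n)). All values are illustrative, and the prior here is deliberately centered on the true rate, which favors the Bayes estimator as in the "as expected" conclusion:

```python
import math, random

def poisson(lam, rng):
    # Knuth's multiplication method (adequate for small lam)
    L, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= L:
            return k
        k += 1

def mse_comparison(lam=2.0, n=5, a=2.0, b=1.0, reps=20000, seed=7):
    # Gamma(shape a, rate b) prior on the Poisson intensity; conjugacy gives
    # posterior mean (a + sum(x)) / (b + n).  The sample mean is both the MLE
    # and the minimum variance unbiased estimator here.
    rng = random.Random(seed)
    se_bayes = se_mle = 0.0
    for _ in range(reps):
        s = sum(poisson(lam, rng) for _ in range(n))
        se_bayes += ((a + s) / (b + n) - lam) ** 2
        se_mle += (s / n - lam) ** 2
    return se_bayes / reps, se_mle / reps

bayes_mse, mle_mse = mse_comparison()
```

The MLE's mean-squared error is lam/n = 0.4 here, while the shrunken Bayes estimate does appreciably better when the prior is well centered.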

  11. Unbiased multi-fidelity estimate of failure probability of a free plane jet

    NASA Astrophysics Data System (ADS)

    Marques, Alexandre; Kramer, Boris; Willcox, Karen; Peherstorfer, Benjamin

    2017-11-01

    Estimating failure probability related to fluid flows is a challenge because it requires a large number of evaluations of expensive models. We address this challenge by leveraging multiple low fidelity models of the flow dynamics to create an optimal unbiased estimator. In particular, we investigate the effects of uncertain inlet conditions on the width of a free plane jet. We classify a condition as failure when the corresponding jet width is below a small threshold, such that failure is a rare event (failure probability is smaller than 0.001). We estimate failure probability by combining the frameworks of multi-fidelity importance sampling and optimal fusion of estimators. Multi-fidelity importance sampling uses a low fidelity model to explore the parameter space and create a biasing distribution. An unbiased estimate is then computed with a relatively small number of evaluations of the high fidelity model. In the presence of multiple low fidelity models, this framework offers multiple competing estimators. Optimal fusion combines all competing estimators into a single estimator with minimal variance. We show that this combined framework can significantly reduce the cost of estimating failure probabilities, and thus can have a large impact on fluid flow applications. This work was funded by DARPA.
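The unbiasedness of importance sampling for rare events is easy to illustrate in one dimension (a stand-alone toy, not the multi-fidelity jet model): sample from a biasing density centered in the failure region and reweight each failure sample by the likelihood ratio.

```python
import math, random

def failure_prob_is(n=20000, thresh=-3.0, seed=3):
    # Estimate p = P(X < thresh) for X ~ N(0, 1) by sampling from the
    # biasing density N(thresh, 1) and reweighting by the likelihood ratio
    # phi(x) / phi_biased(x)
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x = rng.gauss(thresh, 1.0)
        if x < thresh:
            total += math.exp(-0.5 * x * x + 0.5 * (x - thresh) ** 2)
    return total / n

p_hat = failure_prob_is()
```

The true value is Phi(-3) ≈ 0.00135; naive Monte Carlo with the same budget would see only ~27 failures, while the importance-sampling estimate is unbiased with far smaller variance.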

  12. A Test-Length Correction to the Estimation of Extreme Proficiency Levels

    ERIC Educational Resources Information Center

    Magis, David; Beland, Sebastien; Raiche, Gilles

    2011-01-01

    In this study, the estimation of extremely large or extremely small proficiency levels, given the item parameters of a logistic item response model, is investigated. On one hand, the estimation of proficiency levels by maximum likelihood (ML), despite being asymptotically unbiased, may yield infinite estimates. On the other hand, with an…

  13. Comparison of estimators of standard deviation for hydrologic time series

    USGS Publications Warehouse

    Tasker, Gary D.; Gilroy, Edward J.

    1982-01-01

    Unbiasing factors as a function of serial correlation, ρ, and sample size, n for the sample standard deviation of a lag one autoregressive model were generated by random number simulation. Monte Carlo experiments were used to compare the performance of several alternative methods for estimating the standard deviation σ of a lag one autoregressive model in terms of bias, root mean square error, probability of underestimation, and expected opportunity design loss. Three methods provided estimates of σ which were much less biased but had greater mean square errors than the usual estimate of σ: s = [(1/(n − 1)) ∑ (x_i − x̄)²]^(1/2). The three methods may be briefly characterized as (1) a method using a maximum likelihood estimate of the unbiasing factor, (2) a method using an empirical Bayes estimate of the unbiasing factor, and (3) a robust nonparametric estimate of σ suggested by Quenouille. Because s tends to underestimate σ, its use as an estimate of a model parameter results in a tendency to underdesign. If underdesign losses are considered more serious than overdesign losses, then the choice of one of the less biased methods may be wise.
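The underestimation the authors describe is straightforward to reproduce; a sketch for ρ = 0.6 and n = 25 (values illustrative), where the expected value of s falls noticeably below the true σ = 1:

```python
import math, random

def ar1_series(rho, n, rng, sigma=1.0):
    # Stationary lag-one autoregressive series with marginal std dev sigma
    innov_sd = sigma * math.sqrt(1.0 - rho * rho)
    x = rng.gauss(0.0, sigma)  # start in the stationary distribution
    out = []
    for _ in range(n):
        out.append(x)
        x = rho * x + rng.gauss(0.0, innov_sd)
    return out

def sample_sd(xs):
    m = sum(xs) / len(xs)
    return math.sqrt(sum((v - m) ** 2 for v in xs) / (len(xs) - 1))

rng = random.Random(11)
mean_s = sum(sample_sd(ar1_series(0.6, 25, rng)) for _ in range(4000)) / 4000
```

Positive serial correlation inflates the variability of the sample mean, so the usual s underestimates σ by more than the familiar iid small-sample bias, which is the source of the underdesign tendency discussed above.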

  14. Parameter estimation in linear models of the human operator in a closed loop with application of deterministic test signals

    NASA Technical Reports Server (NTRS)

    Vanlunteren, A.; Stassen, H. G.

    1973-01-01

    Parameter estimation techniques are discussed with emphasis on unbiased estimates in the presence of noise. A distinction between open and closed loop systems is made. A method is given based on the application of external forcing functions consisting of a sum of sinusoids; this method is thus based on the estimation of Fourier coefficients and is applicable for models with poles and zeros in open and closed loop systems.
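The Fourier-coefficient step can be sketched as follows; with an integer number of periods of each forcing sinusoid in the record, the correlation estimates are exact in the noise-free case (sampling rate, frequencies, and amplitudes are illustrative):

```python
import math

def fourier_coeffs(y, freqs, fs):
    # Correlate the record with cos/sin at each forcing frequency.  An integer
    # number of periods per frequency makes the estimates free of leakage
    # from the other sinusoids in the sum.
    n = len(y)
    out = {}
    for f in freqs:
        a = 2.0 / n * sum(y[k] * math.cos(2 * math.pi * f * k / fs) for k in range(n))
        b = 2.0 / n * sum(y[k] * math.sin(2 * math.pi * f * k / fs) for k in range(n))
        out[f] = (a, b)
    return out

fs, n = 100.0, 1000  # 10-second record sampled at 100 Hz
sig = [1.5 * math.sin(2 * math.pi * 2.0 * k / fs)
       + 0.8 * math.cos(2 * math.pi * 5.0 * k / fs) for k in range(n)]
c = fourier_coeffs(sig, [2.0, 5.0], fs)
```

With additive noise the same correlations remain unbiased estimates of the coefficients, which is the property the closed-loop identification method relies on.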

  15. Empirical Likelihood in Nonignorable Covariate-Missing Data Problems.

    PubMed

    Xie, Yanmei; Zhang, Biao

    2017-04-20

    Missing covariate data occurs often in regression analysis, which frequently arises in the health and social sciences as well as in survey sampling. We study methods for the analysis of a nonignorable covariate-missing data problem in an assumed conditional mean function when some covariates are completely observed but other covariates are missing for some subjects. We adopt the semiparametric perspective of Bartlett et al. (Improving upon the efficiency of complete case analysis when covariates are MNAR. Biostatistics 2014;15:719-30) on regression analyses with nonignorable missing covariates, in which they have introduced the use of two working models, the working probability model of missingness and the working conditional score model. In this paper, we study an empirical likelihood approach to nonignorable covariate-missing data problems with the objective of effectively utilizing the two working models in the analysis of covariate-missing data. We propose a unified approach to constructing a system of unbiased estimating equations, where there are more equations than unknown parameters of interest. One useful feature of these unbiased estimating equations is that they naturally incorporate the incomplete data into the data analysis, making it possible to seek efficient estimation of the parameter of interest even when the working regression function is not specified to be the optimal regression function. We apply the general methodology of empirical likelihood to optimally combine these unbiased estimating equations. We propose three maximum empirical likelihood estimators of the underlying regression parameters and compare their efficiencies with other existing competitors. We present a simulation study to compare the finite-sample performance of various methods with respect to bias, efficiency, and robustness to model misspecification. 
The proposed empirical likelihood method is also illustrated by an analysis of a data set from the US National Health and Nutrition Examination Survey (NHANES).

  16. Smoothed Biasing Forces Yield Unbiased Free Energies with the Extended-System Adaptive Biasing Force Method

    PubMed Central

    2016-01-01

    We report a theoretical description and numerical tests of the extended-system adaptive biasing force method (eABF), together with an unbiased estimator of the free energy surface from eABF dynamics. Whereas the original ABF approach uses its running estimate of the free energy gradient as the adaptive biasing force, eABF is built on the idea that the exact free energy gradient is not necessary for efficient exploration, and that it is still possible to recover the exact free energy separately with an appropriate estimator. eABF does not directly bias the collective coordinates of interest, but rather fictitious variables that are harmonically coupled to them; therefore it does not require second derivative estimates, making it easily applicable to a wider range of problems than ABF. Furthermore, the extended variables present a smoother, coarse-grain-like sampling problem on a mollified free energy surface, leading to faster exploration and convergence. We also introduce CZAR, a simple, unbiased free energy estimator from eABF trajectories. eABF/CZAR converges to the physical free energy surface faster than standard ABF for a wide range of parameters. PMID:27959559

  17. An empirical Bayes approach for the Poisson life distribution.

    NASA Technical Reports Server (NTRS)

    Canavos, G. C.

    1973-01-01

    A smooth empirical Bayes estimator is derived for the intensity parameter (hazard rate) in the Poisson distribution as used in life testing. The reliability function is also estimated either by using the empirical Bayes estimate of the parameter, or by obtaining the expectation of the reliability function. The behavior of the empirical Bayes procedure is studied through Monte Carlo simulation in which estimates of mean-squared errors of the empirical Bayes estimators are compared with those of conventional estimators such as minimum variance unbiased or maximum likelihood. Results indicate a significant reduction in mean-squared error of the empirical Bayes estimators over the conventional variety.

  18. Estimating genetic effects and quantifying missing heritability explained by identified rare-variant associations.

    PubMed

    Liu, Dajiang J; Leal, Suzanne M

    2012-10-05

    Next-generation sequencing has led to many complex-trait rare-variant (RV) association studies. Although single-variant association analysis can be performed, it is grossly underpowered. Therefore, researchers have developed many RV association tests that aggregate multiple variant sites across a genetic region (e.g., gene), and test for the association between the trait and the aggregated genotype. After these aggregate tests detect an association, it is only possible to estimate the average genetic effect for a group of RVs. As a result of the "winner's curse," such an estimate can be biased. Although for common variants one can obtain unbiased estimates of genetic parameters by analyzing a replication sample, for RVs it is desirable to obtain unbiased genetic estimates for the study where the association is identified. This is because there can be substantial heterogeneity of RV sites and frequencies even among closely related populations. In order to obtain an unbiased estimate for aggregated RV analysis, we developed bootstrap-sample-split algorithms to reduce the bias of the winner's curse. The unbiased estimates are greatly important for understanding the population-specific contribution of RVs to the heritability of complex traits. We also demonstrate both theoretically and via simulations that for aggregate RV analysis the genetic variance for a gene or region will always be underestimated, sometimes substantially, because of the presence of noncausal variants or because of the presence of causal variants with effects of different magnitudes or directions. Therefore, even if RVs play a major role in the complex-trait etiologies, a portion of the heritability will remain missing, and the contribution of RVs to the complex-trait etiologies will be underestimated. Copyright © 2012 The American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.

  19. A novel SURE-based criterion for parametric PSF estimation.

    PubMed

    Xue, Feng; Blu, Thierry

    2015-02-01

    We propose an unbiased estimate of a filtered version of the mean squared error--the blur-SURE (Stein's unbiased risk estimate)--as a novel criterion for estimating an unknown point spread function (PSF) from the degraded image only. The PSF is obtained by minimizing this new objective functional over a family of Wiener processings. Based on this estimated blur kernel, we then perform nonblind deconvolution using our recently developed algorithm. The SURE-based framework is exemplified with a number of parametric PSF, involving a scaling factor that controls the blur size. A typical example of such parametrization is the Gaussian kernel. The experimental results demonstrate that minimizing the blur-SURE yields highly accurate estimates of the PSF parameters, which also result in a restoration quality that is very similar to the one obtained with the exact PSF, when plugged into our recent multi-Wiener SURE-LET deconvolution algorithm. The highly competitive results obtained outline the great potential of developing more powerful blind deconvolution algorithms based on SURE-like estimates.
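Stein's unbiased risk estimate itself is easiest to see in the Gaussian sequence model with soft-thresholding (the textbook SURE formula from wavelet denoising, not the paper's blur-SURE functional): on average, SURE tracks the true squared-error risk without access to the clean signal.

```python
import random

def soft(v, t):
    # Soft-threshold shrinkage
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0

def sure_soft(y, t, sigma=1.0):
    # Stein's unbiased risk estimate of E||soft(y) - theta||^2 for y ~ N(theta, sigma^2 I)
    return sum(sigma ** 2
               - 2 * sigma ** 2 * (1 if abs(v) <= t else 0)
               + min(abs(v), t) ** 2 for v in y)

rng = random.Random(9)
theta = [3.0] + [0.0] * 49   # one strong coefficient, the rest zero
t, reps = 1.5, 5000
avg_sure = avg_sse = 0.0
for _ in range(reps):
    y = [m + rng.gauss(0.0, 1.0) for m in theta]
    avg_sure += sure_soft(y, t) / reps
    avg_sse += sum((soft(v, t) - m) ** 2 for v, m in zip(y, theta)) / reps
```

Averaged over noise realizations, SURE and the true squared error agree, which is what licenses minimizing a SURE-type objective over a family of candidate restorations.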

  20. Effects of sampling close relatives on some elementary population genetics analyses.

    PubMed

    Wang, Jinliang

    2018-01-01

    Many molecular ecology analyses assume the genotyped individuals are sampled at random from a population and thus are representative of the population. Realistically, however, a sample may contain excessive close relatives (ECR) because, for example, localized juveniles are drawn from fecund species. Our knowledge is limited about how ECR affect the routinely conducted elementary genetics analyses, and how ECR are best dealt with to yield unbiased and accurate parameter estimates. This study quantifies the effects of ECR on some popular population genetics analyses of marker data, including the estimation of allele frequencies, F-statistics, expected heterozygosity (He), effective and observed numbers of alleles, and the tests of Hardy-Weinberg equilibrium (HWE) and linkage equilibrium (LE). It also investigates several strategies for handling ECR to mitigate their impact and to yield accurate parameter estimates. My analytical work, assisted by simulations, shows that ECR have large and global effects on all of the above marker analyses. The naïve approach of simply ignoring ECR could yield low-precision and often biased parameter estimates, and could cause too many false rejections of HWE and LE. The bold approach, which simply identifies and removes ECR, and the cautious approach, which estimates target parameters (e.g., He) by accounting for ECR and using naïve allele frequency estimates, eliminate the bias and the false HWE and LE rejections, but could reduce estimation precision substantially. The likelihood approach, which accounts for ECR in estimating allele frequencies and thus target parameters relying on allele frequencies, usually yields unbiased and the most accurate parameter estimates. Which of the four approaches is the most effective and efficient may depend on the particular marker analysis to be conducted. The results are discussed in the context of using marker data for understanding population properties and marker properties. 
© 2017 John Wiley & Sons Ltd.

  1. Epidemiologic Evaluation of Measurement Data in the Presence of Detection Limits

    PubMed Central

    Lubin, Jay H.; Colt, Joanne S.; Camann, David; Davis, Scott; Cerhan, James R.; Severson, Richard K.; Bernstein, Leslie; Hartge, Patricia

    2004-01-01

    Quantitative measurements of environmental factors greatly improve the quality of epidemiologic studies but can pose challenges because of the presence of upper or lower detection limits or interfering compounds, which do not allow for precise measured values. We consider the regression of an environmental measurement (dependent variable) on several covariates (independent variables). Various strategies are commonly employed to impute values for interval-measured data, including assignment of one-half the detection limit to nondetected values or of “fill-in” values randomly selected from an appropriate distribution. On the basis of a limited simulation study, we found that the former approach can be biased unless the percentage of measurements below detection limits is small (5–10%). The fill-in approach generally produces unbiased parameter estimates but may produce biased variance estimates and thereby distort inference when 30% or more of the data are below detection limits. Truncated data methods (e.g., Tobit regression) and multiple imputation offer two unbiased approaches for analyzing measurement data with detection limits. If interest resides solely on regression parameters, then Tobit regression can be used. If individualized values for measurements below detection limits are needed for additional analysis, such as relative risk regression or graphical display, then multiple imputation produces unbiased estimates and nominal confidence intervals unless the proportion of missing data is extreme. We illustrate various approaches using measurements of pesticide residues in carpet dust in control subjects from a case–control study of non-Hodgkin lymphoma. PMID:15579415
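The bias of the one-half-detection-limit substitution is easy to reproduce on simulated lognormal "concentrations". This toy uses a deliberately high nondetect fraction (~84%); the true mean of this lognormal is exp(0.5) ≈ 1.65, and all parameter choices are illustrative:

```python
import math, random

def half_dl_mean(dl=math.e, n=100000, seed=5):
    # Lognormal(0, 1) concentrations; values below the detection limit dl
    # are replaced by dl / 2 before averaging, mimicking the naive approach
    rng = random.Random(seed)
    total, below = 0.0, 0
    for _ in range(n):
        x = math.exp(rng.gauss(0.0, 1.0))
        if x < dl:
            below += 1
            x = dl / 2.0
        total += x
    return total / n, below / n

est, frac_below = half_dl_mean()
```

With this many nondetects the substituted mean lands near 1.97, well above the true 1.65, in line with the abstract's finding that the half-DL rule is only safe when the nondetect percentage is small.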

  2. Class Enumeration and Parameter Recovery of Growth Mixture Modeling and Second-Order Growth Mixture Modeling in the Presence of Measurement Noninvariance between Latent Classes

    PubMed Central

    Kim, Eun Sook; Wang, Yan

    2017-01-01

    Population heterogeneity in growth trajectories can be detected with growth mixture modeling (GMM). It is common that researchers compute composite scores of repeated measures and use them as multiple indicators of growth factors (baseline performance and growth) assuming measurement invariance between latent classes. Considering that the assumption of measurement invariance does not always hold, we investigate the impact of measurement noninvariance on class enumeration and parameter recovery in GMM through a Monte Carlo simulation study (Study 1). In Study 2, we examine the class enumeration and parameter recovery of the second-order growth mixture modeling (SOGMM) that incorporates measurement models at the first order level. Thus, SOGMM estimates growth trajectory parameters with reliable sources of variance, that is, common factor variance of repeated measures and allows heterogeneity in measurement parameters between latent classes. The class enumeration rates are examined with information criteria such as AIC, BIC, sample-size adjusted BIC, and hierarchical BIC under various simulation conditions. The results of Study 1 showed that the parameter estimates of baseline performance and growth factor means were biased to the degree of measurement noninvariance even when the correct number of latent classes was extracted. In Study 2, the class enumeration accuracy of SOGMM depended on information criteria, class separation, and sample size. The estimates of baseline performance and growth factor mean differences between classes were generally unbiased but the size of measurement noninvariance was underestimated. Overall, SOGMM is advantageous in that it yields unbiased estimates of growth trajectory parameters and more accurate class enumeration compared to GMM by incorporating measurement models. PMID:28928691

  3. Maximum likelihood estimation for life distributions with competing failure modes

    NASA Technical Reports Server (NTRS)

    Sidik, S. M.

    1979-01-01

    Systems that are placed on test at time zero, function for a period, and die at some random time were studied. Failure may be due to one of several causes or modes. The parameters of the life distribution may depend upon the levels of various stress variables to which the item is subjected. Maximum likelihood estimation methods are discussed. Specific methods are reported for the smallest extreme-value distributions of life. Monte Carlo results indicate the methods to be promising. Under appropriate conditions, the location parameters are nearly unbiased, the scale parameter is slightly biased, and the asymptotic covariances are rapidly approached.
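For the single-mode case, a minimal sketch of ML fitting for the smallest extreme-value distribution is given below. It uses the standard trick that if X follows a smallest extreme-value law then -X follows a (largest) Gumbel law, for which the classical fixed-point iteration for the scale applies; the stress-dependence and competing-mode machinery of the paper are not reproduced:

```python
import numpy as np

def sev_mle(x, tol=1e-9, max_iter=200):
    """ML fit of the smallest extreme-value distribution SEV(loc, scale),
    via the Gumbel fixed-point iteration applied to y = -x."""
    y = -np.asarray(x, float)
    b = np.std(y) * np.sqrt(6) / np.pi          # method-of-moments start
    for _ in range(max_iter):
        w = np.exp(-y / b)
        b_new = y.mean() - (y * w).sum() / w.sum()
        if abs(b_new - b) < tol:
            b = b_new
            break
        b = b_new
    u = -b * np.log(np.mean(np.exp(-y / b)))    # Gumbel location of y
    return -u, b                                 # flip back to the SEV scale

rng = np.random.default_rng(0)
loc_true, scale_true = 5.0, 2.0
# smallest extreme-value draws as negatives of (largest) Gumbel draws
sample = -rng.gumbel(-loc_true, scale_true, size=20000)
loc_hat, scale_hat = sev_mle(sample)
```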

  4. Joint modelling compared with two stage methods for analysing longitudinal data and prospective outcomes: A simulation study of childhood growth and BP.

    PubMed

    Sayers, A; Heron, J; Smith, Adac; Macdonald-Wallis, C; Gilthorpe, M S; Steele, F; Tilling, K

    2017-02-01

    There is a growing debate with regard to the appropriate methods of analysis of growth trajectories and their association with prospective dependent outcomes. Using the example of childhood growth and adult BP, we conducted an extensive simulation study to explore four two-stage and two joint modelling methods, and compared their bias and coverage in estimation of the (unconditional) association between birth length and later BP, and the association between growth rate and later BP (conditional on birth length). We show that the two-stage method of using multilevel models to estimate growth parameters and relating these to outcome gives unbiased estimates of the conditional associations between growth and outcome. Using simulations, we demonstrate that the simple methods resulted in bias in the presence of measurement error, as did the two-stage multilevel method when looking at the total (unconditional) association of birth length with outcome. The two joint modelling methods gave unbiased results, but using the re-inflated residuals led to undercoverage of the confidence intervals. We conclude that either joint modelling or the simpler two-stage multilevel approach can be used to estimate conditional associations between growth and later outcomes, but that only joint modelling is unbiased with nominal coverage for unconditional associations.
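A minimal numpy sketch of the simplest two-stage idea (per-child least-squares growth fits, then a regression of the outcome on the estimated intercept and slope); all names and values are hypothetical, and with measurement error the stage-2 coefficients are mildly attenuated, which is the kind of bias the simulation study quantifies:

```python
import numpy as np

rng = np.random.default_rng(11)
n_child, n_occ = 20000, 6
ages = np.linspace(0.0, 5.0, n_occ)

# hypothetical growth model: child-specific birth length (intercept) and
# growth rate (slope), plus an adult outcome driven by both
intercepts = rng.normal(50.0, 2.0, n_child)
slopes = rng.normal(6.0, 0.5, n_child)
bp = 100.0 + 0.5 * intercepts + 2.0 * slopes + rng.normal(0, 5.0, n_child)
heights = (intercepts[:, None] + slopes[:, None] * ages
           + rng.normal(0, 0.5, (n_child, n_occ)))   # measurement error

# stage 1: per-child least-squares growth parameters
A = np.column_stack([np.ones(n_occ), ages])
growth = np.linalg.lstsq(A, heights.T, rcond=None)[0].T   # (n_child, 2)

# stage 2: regress the outcome on the estimated growth parameters
X = np.column_stack([np.ones(n_child), growth])
beta = np.linalg.lstsq(X, bp, rcond=None)[0]   # [const, b_birth, b_growth]
```

The multilevel and joint-modelling alternatives compared in the paper replace stage 1 (or both stages) with a shared random-effects model, which is what removes the measurement-error bias for conditional associations.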

  5. Epidemiologic research using probabilistic outcome definitions.

    PubMed

    Cai, Bing; Hennessy, Sean; Lo Re, Vincent; Small, Dylan S

    2015-01-01

    Epidemiologic studies using electronic healthcare data often define the presence or absence of binary clinical outcomes by using algorithms with imperfect specificity, sensitivity, and positive predictive value. This results in misclassification and bias in study results. We describe and evaluate a new method called probabilistic outcome definition (POD) that uses logistic regression to estimate the probability of a clinical outcome using multiple potential algorithms and then uses multiple imputation to make valid inferences about the risk ratio or other epidemiologic parameters of interest. We conducted a simulation to evaluate the performance of the POD method with two variables that can predict the true outcome and compared the POD method with the conventional method. The simulation results showed that when the true risk ratio is equal to 1.0 (null), the conventional method based on a binary outcome provides unbiased estimates. However, when the risk ratio is not equal to 1.0, the traditional method, whether using one predictive variable or both predictive variables to define the outcome, is biased when the positive predictive value is <100%, and the bias is very severe when the sensitivity or positive predictive value is poor (less than 0.75 in our simulation). In contrast, the POD method provides unbiased estimates of the risk ratio both when this measure of effect is equal to 1.0 and when it is not. Even when the sensitivity and positive predictive value are low, the POD method continues to provide unbiased estimates of the risk ratio. The POD method provides an improved way to define outcomes in database research. This method has a major advantage over the conventional method in that it provides unbiased estimates of risk ratios and is easy to use. Copyright © 2014 John Wiley & Sons, Ltd.
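The imputation step of the POD idea can be sketched as follows. The block assumes the per-subject outcome probability has already been estimated (here the true simulation probabilities stand in for the fitted logistic-regression probabilities); hypothetical names and numbers throughout:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50000
exposed = rng.integers(0, 2, n)
p_true = np.where(exposed == 1, 0.10, 0.05)   # true risks: risk ratio 2.0

# stand-in for the fitted outcome probability; in a real POD analysis
# this would come from a logistic regression of the gold-standard
# outcome on several imperfect algorithms in a validation sample
p_hat = p_true

M = 20                                         # number of imputations
log_rr = []
for _ in range(M):
    y_imp = rng.random(n) < p_hat              # impute one binary outcome
    r1 = y_imp[exposed == 1].mean()
    r0 = y_imp[exposed == 0].mean()
    log_rr.append(np.log(r1 / r0))
rr_pooled = np.exp(np.mean(log_rr))            # pooled point estimate
```

A full analysis would also combine the within- and between-imputation variances via Rubin's rules to get valid confidence intervals.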

  6. Unbiasedness

    USGS Publications Warehouse

    Link, W.A.; Armitage, Peter; Colton, Theodore

    1998-01-01

    Unbiasedness is probably the best known criterion for evaluating the performance of estimators. This note describes unbiasedness, demonstrating various failings of the criterion. It is shown that unbiased estimators might not exist, or might not be unique; an example of a unique but clearly unacceptable unbiased estimator is given. It is shown that unbiased estimators are not translation invariant. Various alternative criteria are described, and are illustrated through examples.
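A quick numerical illustration of the flavor of failings discussed, using the normal variance: the divisor-(n-1) estimator is unbiased for σ², the ML divisor-n version is not, and taking the square root of an unbiased estimator of σ² does not give an unbiased estimator of σ (unbiasedness is not preserved under transformation):

```python
import numpy as np

rng = np.random.default_rng(2)
sigma2 = 4.0                 # true variance (sigma = 2)
n, reps = 5, 200000
x = rng.normal(0.0, np.sqrt(sigma2), size=(reps, n))

s2 = x.var(axis=1, ddof=1)       # unbiased for sigma^2
s2_mle = x.var(axis=1, ddof=0)   # biased: expectation (n-1)/n * sigma^2
s = np.sqrt(s2)                  # biased low for sigma, though s2 is unbiased

mean_s2, mean_s2_mle, mean_s = s2.mean(), s2_mle.mean(), s.mean()
```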

  7. OPTIMAL EXPERIMENT DESIGN FOR MAGNETIC RESONANCE FINGERPRINTING

    PubMed Central

    Zhao, Bo; Haldar, Justin P.; Setsompop, Kawin; Wald, Lawrence L.

    2017-01-01

    Magnetic resonance (MR) fingerprinting is an emerging quantitative MR imaging technique that simultaneously acquires multiple tissue parameters in an efficient experiment. In this work, we present an estimation-theoretic framework to evaluate and design MR fingerprinting experiments. More specifically, we derive the Cramér-Rao bound (CRB), a lower bound on the covariance of any unbiased estimator, to characterize parameter estimation for MR fingerprinting. We then formulate an optimal experiment design problem based on the CRB to choose a set of acquisition parameters (e.g., flip angles and/or repetition times) that maximizes the signal-to-noise ratio efficiency of the resulting experiment. The utility of the proposed approach is validated by numerical studies. Representative results demonstrate that the optimized experiments allow for substantial reduction in the length of an MR fingerprinting acquisition, and substantial improvement in parameter estimation performance. PMID:28268369

  8. Optimal experiment design for magnetic resonance fingerprinting.

    PubMed

    Bo Zhao; Haldar, Justin P; Setsompop, Kawin; Wald, Lawrence L

    2016-08-01

    Magnetic resonance (MR) fingerprinting is an emerging quantitative MR imaging technique that simultaneously acquires multiple tissue parameters in an efficient experiment. In this work, we present an estimation-theoretic framework to evaluate and design MR fingerprinting experiments. More specifically, we derive the Cramér-Rao bound (CRB), a lower bound on the covariance of any unbiased estimator, to characterize parameter estimation for MR fingerprinting. We then formulate an optimal experiment design problem based on the CRB to choose a set of acquisition parameters (e.g., flip angles and/or repetition times) that maximizes the signal-to-noise ratio efficiency of the resulting experiment. The utility of the proposed approach is validated by numerical studies. Representative results demonstrate that the optimized experiments allow for substantial reduction in the length of an MR fingerprinting acquisition, and substantial improvement in parameter estimation performance.
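The CRB machinery used in both records above can be sketched generically. For a signal model with additive white Gaussian noise, the Fisher information is JᵀJ/σ² with J the Jacobian of the model in the parameters, and the CRB is its inverse. The block below applies this to a toy exponential-decay signal as a hypothetical stand-in for the Bloch-simulation signal model of MR fingerprinting:

```python
import numpy as np

def crb_gaussian(model, theta, t, sigma, eps=1e-6):
    """Cramér-Rao bound for unbiased estimation of theta from
    y = model(t, theta) + N(0, sigma^2) noise: CRB = sigma^2 (J^T J)^{-1},
    with the Jacobian J taken by central finite differences."""
    theta = np.asarray(theta, float)
    J = np.empty((t.size, theta.size))
    for j in range(theta.size):
        d = np.zeros_like(theta)
        d[j] = eps * max(1.0, abs(theta[j]))
        J[:, j] = (model(t, theta + d) - model(t, theta - d)) / (2 * d[j])
    return sigma**2 * np.linalg.inv(J.T @ J)

def decay(t, th):
    # toy signal: amplitude th[0], relaxation time constant th[1]
    return th[0] * np.exp(-t / th[1])

t = np.linspace(0.1, 5.0, 50)
crb = crb_gaussian(decay, [1.0, 1.5], t, sigma=0.01)
std_A, std_T = np.sqrt(np.diag(crb))     # lower bounds on the std. devs.
```

An experiment-design loop like the one in the paper would then search over acquisition parameters (here, the sample times t) to minimize a scalar function of this bound.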

  9. Elimination of trait blocks from multiple trait mixed model equations with singular (Co)variance parameter matrices

    USDA-ARS?s Scientific Manuscript database

    Transformations to multiple trait mixed model equations (MME) which are intended to improve computational efficiency in best linear unbiased prediction (BLUP) and restricted maximum likelihood (REML) are described. It is shown that traits that are expected or estimated to have zero residual variance...

  10. On the robustness of a Bayes estimate. [in reliability theory

    NASA Technical Reports Server (NTRS)

    Canavos, G. C.

    1974-01-01

    This paper examines the robustness of a Bayes estimator with respect to the assigned prior distribution. A Bayesian analysis for a stochastic scale parameter of a Weibull failure model is summarized in which the natural conjugate is assigned as the prior distribution of the random parameter. The sensitivity analysis is carried out by the Monte Carlo method in which, although an inverted gamma is the assigned prior, realizations are generated using distribution functions of varying shape. For several distributional forms and even for some fixed values of the parameter, simulated mean squared errors of Bayes and minimum variance unbiased estimators are determined and compared. Results indicate that the Bayes estimator remains squared-error superior and appears to be largely robust to the form of the assigned prior distribution.
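The Bayes-versus-MVUE MSE comparison can be reproduced in miniature. The sketch below simplifies the Weibull scale problem to the exponential case (Weibull with shape 1), where the natural conjugate prior is inverse-gamma and the posterior mean has closed form; all prior settings are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(10)
theta_true, n, reps = 2.0, 5, 20000
a, b = 3.0, 4.0      # inverse-gamma prior; prior mean b/(a-1) = 2.0

x = rng.exponential(theta_true, size=(reps, n))
s = x.sum(axis=1)
mvue = s / n                      # minimum variance unbiased estimator
bayes = (b + s) / (a + n - 1)     # posterior mean: IG(a+n, b+sum x)

mse_mvue = np.mean((mvue - theta_true) ** 2)     # theory: theta^2/n = 0.8
mse_bayes = np.mean((bayes - theta_true) ** 2)   # smaller, as in the paper
```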

  11. An unbiased risk estimator for image denoising in the presence of mixed Poisson-Gaussian noise.

    PubMed

    Le Montagner, Yoann; Angelini, Elsa D; Olivo-Marin, Jean-Christophe

    2014-03-01

    The behavior and performance of denoising algorithms are governed by one or several parameters, whose optimal settings depend on the content of the processed image and the characteristics of the noise, and are generally designed to minimize the mean squared error (MSE) between the denoised image returned by the algorithm and a virtual ground truth. In this paper, we introduce a new Poisson-Gaussian unbiased risk estimator (PG-URE) of the MSE applicable to a mixed Poisson-Gaussian noise model that unifies the widely used Gaussian and Poisson noise models in fluorescence bioimaging applications. We propose a stochastic methodology to evaluate this estimator in the case when little is known about the internal machinery of the considered denoising algorithm, and we analyze both theoretically and empirically the characteristics of the PG-URE estimator. Finally, we evaluate the PG-URE-driven parametrization for three standard denoising algorithms, with and without variance stabilizing transforms, and different characteristics of the Poisson-Gaussian noise mixture.
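The idea of selecting a denoising parameter by minimizing an unbiased risk estimate rather than the (inaccessible) true MSE can be illustrated with the pure-Gaussian special case: Stein's unbiased risk estimate (SURE) for soft thresholding, whose closed form is classical (Donoho-Johnstone). This is a simpler relative of the PG-URE construction, not the paper's estimator itself:

```python
import numpy as np

def sure_soft(y, lam, sigma):
    """SURE for the MSE of soft-thresholding y under y = x + N(0, sigma^2):
    -n sigma^2 + sum(min(|y|, lam)^2) + 2 sigma^2 #{|y| > lam}."""
    return (-y.size * sigma**2
            + np.sum(np.minimum(np.abs(y), lam) ** 2)
            + 2 * sigma**2 * np.sum(np.abs(y) > lam))

rng = np.random.default_rng(3)
sigma = 1.0
x = np.zeros(10000); x[:500] = 5.0        # sparse ground truth
y = x + rng.normal(0, sigma, x.size)

soft = lambda lam: np.sign(y) * np.maximum(np.abs(y) - lam, 0.0)
lams = np.linspace(0.0, 4.0, 81)
sure = np.array([sure_soft(y, l, sigma) for l in lams])
mse = np.array([np.sum((soft(l) - x) ** 2) for l in lams])   # needs truth
lam_sure = lams[sure.argmin()]       # computable from the data alone
lam_mse = lams[mse.argmin()]         # oracle choice
```

The SURE curve tracks the true MSE curve closely, so minimizing it picks a near-oracle threshold without access to the ground truth.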

  12. A Comparison of Methods for a Priori Bias Correction in Soil Moisture Data Assimilation

    NASA Technical Reports Server (NTRS)

    Kumar, Sujay V.; Reichle, Rolf H.; Harrison, Kenneth W.; Peters-Lidard, Christa D.; Yatheendradas, Soni; Santanello, Joseph A.

    2011-01-01

    Data assimilation is being increasingly used to merge remotely sensed land surface variables such as soil moisture, snow and skin temperature with estimates from land models. Its success, however, depends on unbiased model predictions and unbiased observations. Here, a suite of continental-scale, synthetic soil moisture assimilation experiments is used to compare two approaches that address typical biases in soil moisture prior to data assimilation: (i) parameter estimation to calibrate the land model to the climatology of the soil moisture observations, and (ii) scaling of the observations to the model's soil moisture climatology. To enable this research, an optimization infrastructure was added to the NASA Land Information System (LIS) that includes gradient-based optimization methods and global, heuristic search algorithms. The land model calibration eliminates the bias but does not necessarily result in more realistic model parameters. Nevertheless, the experiments confirm that model calibration yields assimilation estimates of surface and root zone soil moisture that are as skillful as those obtained through scaling of the observations to the model's climatology. Analysis of innovation diagnostics underlines the importance of addressing bias in soil moisture assimilation and confirms that both approaches adequately address the issue.
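Approach (ii), scaling observations to the model's climatology, is commonly implemented as CDF matching. A minimal sketch with hypothetical climatologies (the beta-distributed "soil moisture" values are illustrative only):

```python
import numpy as np

def cdf_match(obs, model_clim, obs_clim):
    """Map observations into the model's climatology by matching empirical
    CDFs: find each observation's quantile within obs_clim, then read off
    the model_clim value at that quantile."""
    q = np.searchsorted(np.sort(obs_clim), obs) / obs_clim.size
    return np.quantile(model_clim, np.clip(q, 0.0, 1.0))

rng = np.random.default_rng(4)
# hypothetical climatologies: the "satellite" product is biased wet and
# has a different dynamic range than the "model"
model_clim = rng.beta(2, 5, 5000) * 0.5
obs_clim = 0.15 + 0.6 * rng.beta(2, 5, 5000) * 0.5
obs = 0.15 + 0.6 * rng.beta(2, 5, 1000) * 0.5

scaled = cdf_match(obs, model_clim, obs_clim)
```

After scaling, the observations share the model's mean and dynamic range, so the assimilation update only has to correct random (not systematic) errors.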

  13. An Unbiased Estimator of Gene Diversity with Improved Variance for Samples Containing Related and Inbred Individuals of any Ploidy

    PubMed Central

    Harris, Alexandre M.; DeGiorgio, Michael

    2016-01-01

    Gene diversity, or expected heterozygosity (H), is a common statistic for assessing genetic variation within populations. Estimation of this statistic decreases in accuracy and precision when individuals are related or inbred, due to increased dependence among allele copies in the sample. The original unbiased estimator of expected heterozygosity underestimates true population diversity in samples containing relatives, as it only accounts for sample size. More recently, a general unbiased estimator of expected heterozygosity was developed that explicitly accounts for related and inbred individuals in samples. Though unbiased, this estimator's variance is greater than that of the original estimator. To address this issue, we introduce a general unbiased estimator of gene diversity for samples containing related or inbred individuals, which employs the best linear unbiased estimator of allele frequencies, rather than the commonly used sample proportion. We examine the properties of this estimator, H̃_BLUE, relative to alternative estimators using simulations and theoretical predictions, and show that it predominantly has the smallest mean squared error relative to others. Further, we empirically assess the performance of H̃_BLUE on a global human microsatellite dataset of 5795 individuals, from 267 populations, genotyped at 645 loci. Additionally, we show that the improved variance of H̃_BLUE leads to improved estimates of the population differentiation statistic, FST, which employs measures of gene diversity within its calculation. Finally, we provide an R script, BestHet, to compute this estimator from genomic and pedigree data. PMID:28040781
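For unrelated, non-inbred individuals, the "original unbiased estimator" referred to above is the classical sample-size correction of the plug-in gene diversity, which the following sketch demonstrates by Monte Carlo (hypothetical allele frequencies):

```python
import numpy as np

def gene_diversity(counts, unbiased=True):
    """Expected heterozygosity from allele counts at one locus. The
    plug-in estimate 1 - sum(p_hat^2) is biased low; multiplying by
    n/(n-1), with n the number of sampled allele copies, removes the bias
    for unrelated, non-inbred individuals (the classical correction)."""
    counts = np.asarray(counts, float)
    n = counts.sum()
    h = 1.0 - np.sum((counts / n) ** 2)
    return h * n / (n - 1.0) if unbiased else h

p = np.array([0.5, 0.3, 0.2])        # true allele frequencies
h_true = 1.0 - np.sum(p ** 2)        # true gene diversity = 0.62
rng = np.random.default_rng(5)
n_copies = 20                        # ten diploid individuals
draws = rng.multinomial(n_copies, p, size=20000)

p_hat = draws / n_copies
h_plugin = (1.0 - np.sum(p_hat ** 2, axis=1)).mean()
h_corrected = h_plugin * n_copies / (n_copies - 1.0)
```

The paper's H̃_BLUE generalizes this by replacing the sample allele proportions with BLUE frequency estimates that account for kinship and inbreeding.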

  14. Using object-based image analysis to guide the selection of field sample locations

    USDA-ARS?s Scientific Manuscript database

    One of the most challenging tasks for resource management and research is designing field sampling schemes to achieve unbiased estimates of ecosystem parameters as efficiently as possible. This study focused on the potential of fine-scale image objects from object-based image analysis (OBIA) to be u...

  15. A Comparison of Latent Growth Models for Constructs Measured by Multiple Items

    ERIC Educational Resources Information Center

    Leite, Walter L.

    2007-01-01

    Univariate latent growth modeling (LGM) of composites of multiple items (e.g., item means or sums) has been frequently used to analyze the growth of latent constructs. This study evaluated whether LGM of composites yields unbiased parameter estimates, standard errors, chi-square statistics, and adequate fit indexes. Furthermore, LGM was compared…

  16. Optimal estimation of diffusion coefficients from single-particle trajectories

    NASA Astrophysics Data System (ADS)

    Vestergaard, Christian L.; Blainey, Paul C.; Flyvbjerg, Henrik

    2014-02-01

    How does one optimally determine the diffusion coefficient of a diffusing particle from a single-time-lapse recorded trajectory of the particle? We answer this question with an explicit, unbiased, and practically optimal covariance-based estimator (CVE). This estimator is regression-free and is far superior to commonly used methods based on measured mean squared displacements. In experimentally relevant parameter ranges, it also outperforms the analytically intractable and computationally more demanding maximum likelihood estimator (MLE). For the case of diffusion on a flexible and fluctuating substrate, the CVE is biased by substrate motion. However, given some long time series and a substrate under some tension, an extended MLE can separate particle diffusion on the substrate from substrate motion in the laboratory frame. This provides benchmarks that allow removal of bias caused by substrate fluctuations in CVE. The resulting unbiased CVE is optimal also for short time series on a fluctuating substrate. We have applied our estimators to human 8-oxoguanine DNA glycolase proteins diffusing on flow-stretched DNA, a fluctuating substrate, and found that diffusion coefficients are severely overestimated if substrate fluctuations are not accounted for.
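The CVE in its simplest form (no motion blur, a fixed-substrate special case of the estimators in the paper) combines the mean squared displacement with the covariance of consecutive displacements, which cancels the upward bias that localization noise induces in MSD-based estimates:

```python
import numpy as np

def cve_diffusion(x, dt):
    """Covariance-based estimator (CVE) of the diffusion coefficient from
    a single trajectory x sampled every dt, in the no-motion-blur form:
    D = <dx^2>/(2 dt) + <dx_n dx_{n+1}>/dt. The (negative) covariance of
    consecutive displacements cancels the localization-noise bias."""
    dx = np.diff(x)
    return np.mean(dx ** 2) / (2 * dt) + np.mean(dx[1:] * dx[:-1]) / dt

rng = np.random.default_rng(6)
D_true, dt, sigma_loc, n = 0.5, 0.1, 0.05, 100000
steps = rng.normal(0.0, np.sqrt(2 * D_true * dt), n)
x = np.cumsum(steps) + rng.normal(0.0, sigma_loc, n)  # localization noise

D_cve = cve_diffusion(x, dt)
D_msd = np.mean(np.diff(x) ** 2) / (2 * dt)  # naive MSD estimate, biased high
```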

  17. The influence of SO4 and NO3 to the acidity (pH) of rainwater using minimum variance quadratic unbiased estimation (MIVQUE) and maximum likelihood methods

    NASA Astrophysics Data System (ADS)

    Dilla, Shintia Ulfa; Andriyana, Yudhie; Sudartianto

    2017-03-01

    Acid rain causes many harmful effects. It is formed by two strong acids, sulfuric acid (H2SO4) and nitric acid (HNO3), where sulfuric acid is derived from SO2 and nitric acid from NOx (x = 1, 2). The purpose of the research is to determine the influence of the SO4 and NO3 levels contained in rain on the acidity (pH) of rainwater. The data are incomplete panel data with a two-way error component model. Panel data are a collection of observations on the same individuals made repeatedly over time; the panel is said to be incomplete if individuals have different numbers of observations. The model used in this research is a random effects model (REM). Minimum variance quadratic unbiased estimation (MIVQUE) is used to estimate the variance error components, while maximum likelihood estimation is used to estimate the parameters. As a result, we obtain the following model: Ŷ* = 0.41276446 - 0.00107302X1 + 0.00215470X2.

  18. Evaluation of unconfined-aquifer parameters from pumping test data by nonlinear least squares

    NASA Astrophysics Data System (ADS)

    Heidari, Manoutchehr; Moench, Allen

    1997-05-01

    Nonlinear least squares (NLS) with automatic differentiation was used to estimate aquifer parameters from drawdown data obtained from published pumping tests conducted in homogeneous, water-table aquifers. The method is based on a technique that seeks to minimize the squares of residuals between observed and calculated drawdown subject to bounds that are placed on the parameter of interest. The analytical model developed by Neuman for flow to a partially penetrating well of infinitesimal diameter situated in an infinite, homogeneous and anisotropic aquifer was used to obtain calculated drawdown. NLS was first applied to synthetic drawdown data from a hypothetical but realistic aquifer to demonstrate that the relevant hydraulic parameters (storativity, specific yield, and horizontal and vertical hydraulic conductivity) can be evaluated accurately. Next the method was used to estimate the parameters at three field sites with widely varying hydraulic properties. NLS produced unbiased estimates of the aquifer parameters that are close to the estimates obtained with the same data using a visual curve-matching approach. Small differences in the estimates are a consequence of subjective interpretation introduced in the visual approach.

  19. Evaluation of unconfined-aquifer parameters from pumping test data by nonlinear least squares

    USGS Publications Warehouse

    Heidari, M.; Moench, A.

    1997-01-01

    Nonlinear least squares (NLS) with automatic differentiation was used to estimate aquifer parameters from drawdown data obtained from published pumping tests conducted in homogeneous, water-table aquifers. The method is based on a technique that seeks to minimize the squares of residuals between observed and calculated drawdown subject to bounds that are placed on the parameter of interest. The analytical model developed by Neuman for flow to a partially penetrating well of infinitesimal diameter situated in an infinite, homogeneous and anisotropic aquifer was used to obtain calculated drawdown. NLS was first applied to synthetic drawdown data from a hypothetical but realistic aquifer to demonstrate that the relevant hydraulic parameters (storativity, specific yield, and horizontal and vertical hydraulic conductivity) can be evaluated accurately. Next the method was used to estimate the parameters at three field sites with widely varying hydraulic properties. NLS produced unbiased estimates of the aquifer parameters that are close to the estimates obtained with the same data using a visual curve-matching approach. Small differences in the estimates are a consequence of subjective interpretation introduced in the visual approach.
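The bounded-NLS idea in the two records above can be sketched with a simpler forward model: the Theis solution for a fully penetrating well (a hypothetical stand-in for Neuman's partial-penetration model), fitted in log-parameter space with `scipy.optimize.least_squares`. All pumping-test numbers are invented:

```python
import numpy as np
from scipy.optimize import least_squares
from scipy.special import exp1

def drawdown(t, T, S, Q=1e-3, r=30.0):
    """Theis drawdown: s = Q/(4 pi T) W(u), u = r^2 S/(4 T t), W = exp1."""
    u = r**2 * S / (4.0 * T * t)
    return Q / (4.0 * np.pi * T) * exp1(u)

rng = np.random.default_rng(7)
t = np.logspace(1, 5, 40)             # seconds since pumping began
T_true, S_true = 5e-4, 2e-4           # transmissivity, storativity
obs = drawdown(t, T_true, S_true) * (1 + 0.02 * rng.normal(size=t.size))

# fit log10(T), log10(S) subject to physically plausible bounds
resid = lambda p: drawdown(t, 10 ** p[0], 10 ** p[1]) - obs
fit = least_squares(resid, x0=[-3.0, -3.0], bounds=([-6, -7], [-1, -1]))
T_hat, S_hat = 10 ** fit.x
```

Fitting the logarithms is a common convenience for parameters that span orders of magnitude; it does not change the bounded least-squares principle described in the abstract.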

  20. The Efficacy of Galaxy Shape Parameters in Photometric Redshift Estimation: A Neural Network Approach

    NASA Astrophysics Data System (ADS)

    Singal, J.; Shmakova, M.; Gerke, B.; Griffith, R. L.; Lotz, J.

    2011-05-01

    We present a determination of the effects of including galaxy morphological parameters in photometric redshift estimation with an artificial neural network method. Neural networks, which recognize patterns in the information content of data in an unbiased way, can be a useful estimator of the additional information contained in extra parameters, such as those describing morphology, if the input data are treated on an equal footing. We use imaging and five band photometric magnitudes from the All-wavelength Extended Groth Strip International Survey (AEGIS). It is shown that certain principal components of the morphology information are correlated with galaxy type. However, we find that for the data used the inclusion of morphological information does not have a statistically significant benefit for photometric redshift estimation with the techniques employed here. The inclusion of these parameters may result in a tradeoff between extra information and additional noise, with the additional noise becoming more dominant as more parameters are added.

  1. [Application of ordinary Kriging method in entomologic ecology].

    PubMed

    Zhang, Runjie; Zhou, Qiang; Chen, Cuixian; Wang, Shousong

    2003-01-01

    Geostatistics is a statistical method based on regionalized variables that uses the variogram as a tool to analyze the spatial structure and patterns of organisms. When simulating the variogram over a large range, an optimal fit cannot be obtained directly, but an interactive (human-computer) procedure can be used to optimize the parameters of the spherical models. In this paper, this method and weighted polynomial regression were used to fit the one-step spherical model, the two-step spherical model, and a linear function model, and the available neighboring samples were used in the ordinary Kriging procedure, which provides a best linear estimate under the unbiasedness constraint. The sums of squared deviations between the estimated and measured values for the various theoretical models were computed, and the corresponding graphs are shown. The results indicate that the fit based on the two-step spherical model was the best, and that the one-step spherical model was better than the linear function model.
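The two ingredients described, a spherical variogram model and the ordinary Kriging system whose sum-to-one constraint enforces unbiasedness, can be sketched as follows (all parameter values and coordinates are hypothetical):

```python
import numpy as np

def spherical_variogram(h, nugget, sill, a):
    """One-step spherical model: gamma rises as 1.5(h/a) - 0.5(h/a)^3 from
    the nugget up to the sill at range a, then stays flat; gamma(0) = 0."""
    h = np.asarray(h, float)
    g = nugget + (sill - nugget) * (1.5 * h / a - 0.5 * (h / a) ** 3)
    return np.where(h >= a, sill, np.where(h == 0, 0.0, g))

def ordinary_kriging_weights(coords, target, gamma):
    """Ordinary Kriging weights: solve the variogram system augmented with
    the Lagrange row that forces the weights to sum to 1 (unbiasedness)."""
    n = len(coords)
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    A = np.ones((n + 1, n + 1)); A[:n, :n] = gamma(d); A[-1, -1] = 0.0
    b = np.ones(n + 1); b[:n] = gamma(np.linalg.norm(coords - target, axis=1))
    return np.linalg.solve(A, b)[:n]

gamma = lambda h: spherical_variogram(h, nugget=0.1, sill=1.0, a=50.0)
coords = np.array([[0.0, 0.0], [30.0, 0.0], [0.0, 40.0], [60.0, 60.0]])
w = ordinary_kriging_weights(coords, np.array([10.0, 10.0]), gamma)
```

The estimate at the target is then the weighted sum of the sampled values; the nearest sample typically receives the largest weight.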

  2. Identification of modal parameters including unmeasured forces and transient effects

    NASA Astrophysics Data System (ADS)

    Cauberghe, B.; Guillaume, P.; Verboven, P.; Parloo, E.

    2003-08-01

    In this paper, a frequency-domain method to estimate modal parameters from short data records with known (measured) input forces and unknown input forces is presented. The method can be used for an experimental modal analysis, an operational modal analysis (output-only data), and the combination of both. A traditional experimental or operational modal analysis in the frequency domain starts, respectively, from frequency response functions or spectral density functions. To estimate these functions accurately, sufficient data have to be available. The technique developed in this paper estimates the modal parameters directly from the Fourier spectra of the outputs and the known input. Instead of applying Hanning windows to these short data records, the transient effects are estimated simultaneously with the modal parameters. The method is illustrated, tested, and validated by Monte Carlo simulations and experiments. The presented method for processing short data sequences leads to unbiased estimates with a small variance in comparison with more traditional approaches.

  3. Density estimation in wildlife surveys

    USGS Publications Warehouse

    Bart, Jonathan; Droege, Sam; Geissler, Paul E.; Peterjohn, Bruce G.; Ralph, C. John

    2004-01-01

    Several authors have recently discussed the problems with using index methods to estimate trends in population size. Some have expressed the view that index methods should virtually never be used. Others have responded by defending index methods and questioning whether better alternatives exist. We suggest that index methods are often a cost-effective component of valid wildlife monitoring but that double-sampling or another procedure that corrects for bias or establishes bounds on bias is essential. The common assertion that index methods require constant detection rates for trend estimation is mathematically incorrect; the requirement is no long-term trend in detection "ratios" (index result/parameter of interest), a requirement that is probably approximately met by many well-designed index surveys. We urge that more attention be given to defining bird density rigorously and in ways useful to managers. Once this is done, 4 sources of bias in density estimates may be distinguished: coverage, closure, surplus birds, and detection rates. Distance, double-observer, and removal methods do not reduce bias due to coverage, closure, or surplus birds. These methods may yield unbiased estimates of the number of birds present at the time of the survey, but only if their required assumptions are met, which we doubt occurs very often in practice. Double-sampling, in contrast, produces unbiased density estimates if the plots are randomly selected and estimates on the intensive surveys are unbiased. More work is needed, however, to determine the feasibility of double-sampling in different populations and habitats. We believe the tension that has developed over appropriate survey methods can best be resolved through increased appreciation of the mathematical aspects of indices, especially the effects of bias, and through studies in which candidate methods are evaluated against known numbers determined through intensive surveys.
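The double-sampling correction argued for above has a simple ratio-estimator form: index counts on all plots, intensive (truth-measuring) surveys on a random subsample, and division by the estimated detection ratio. A toy simulation with hypothetical numbers:

```python
import numpy as np

rng = np.random.default_rng(9)
n_plots = 400
true_density = rng.poisson(20, n_plots)      # birds actually present
detected = rng.binomial(true_density, 0.6)   # index counts miss some birds

# double sampling: intensive surveys on a random subsample measure the
# true numbers, giving the detection ratio for the whole survey
intensive = rng.choice(n_plots, 40, replace=False)
ratio = detected[intensive].sum() / true_density[intensive].sum()

corrected = detected.mean() / ratio          # approximately unbiased
naive = detected.mean()                      # index alone: biased low
```

The correction is unbiased (to first order) as long as the subsample is random and the intensive counts themselves are unbiased, which is exactly the pair of conditions the abstract emphasizes.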

  4. Application of maximum-likelihood estimation in optical coherence tomography for nanometer-class thickness estimation

    NASA Astrophysics Data System (ADS)

    Huang, Jinxin; Yuan, Qun; Tankam, Patrice; Clarkson, Eric; Kupinski, Matthew; Hindman, Holly B.; Aquavella, James V.; Rolland, Jannick P.

    2015-03-01

    In biophotonics imaging, one important and quantitative task is layer-thickness estimation. In this study, we investigate the approach of combining optical coherence tomography and a maximum-likelihood (ML) estimator for layer thickness estimation in the context of tear film imaging. The motivation of this study is to extend our understanding of tear film dynamics, which is the prerequisite to advance the management of Dry Eye Disease, through the simultaneous estimation of the thickness of the tear film lipid and aqueous layers. The estimator takes into account the different statistical processes associated with the imaging chain. We theoretically investigated the impact of key system parameters, such as the axial point spread functions (PSF) and various sources of noise on measurement uncertainty. Simulations show that an OCT system with a 1 μm axial PSF (FWHM) allows unbiased estimates down to nanometers with nanometer precision. In implementation, we built a customized Fourier domain OCT system that operates in the 600 to 1000 nm spectral window and achieves 0.93 micron axial PSF in corneal epithelium. We then validated the theoretical framework with physical phantoms made of custom optical coatings, with layer thicknesses from tens of nanometers to microns. Results demonstrate unbiased nanometer-class thickness estimates in three different physical phantoms.

  5. Pharmacokinetic design optimization in children and estimation of maturation parameters: example of cytochrome P450 3A4.

    PubMed

    Bouillon-Pichault, Marion; Jullien, Vincent; Bazzoli, Caroline; Pons, Gérard; Tod, Michel

    2011-02-01

    The aim of this work was to determine whether optimizing the study design in terms of ages and sampling times for a drug eliminated solely via cytochrome P450 3A4 (CYP3A4) would allow us to accurately estimate the pharmacokinetic parameters throughout the entire childhood timespan, while taking into account age- and weight-related changes. A linear monocompartmental model with first-order absorption was used successively with three different residual error models and previously published pharmacokinetic parameters ("true values"). The optimal ages were established by D-optimization using the CYP3A4 maturation function to create "optimized demographic databases." The post-dose times for each previously selected age were determined by D-optimization using the pharmacokinetic model to create "optimized sparse sampling databases." We simulated concentrations by applying the population pharmacokinetic model to the optimized sparse sampling databases to create optimized concentration databases. The latter were modeled to estimate population pharmacokinetic parameters. We then compared true and estimated parameter values. The established optimal design comprised four age ranges: 0.008 years old (i.e., around 3 days), 0.192 years old (i.e., around 2 months), 1.325 years old, and adults, with the same number of subjects per group and three or four samples per subject, in accordance with the error model. The population pharmacokinetic parameters that we estimated with this design were precise and unbiased (root mean square error [RMSE] and mean prediction error [MPE] less than 11% for clearance and distribution volume and less than 18% for k(a)), whereas the maturation parameters were unbiased but less precise (MPE < 6% and RMSE < 37%). Based on our results, taking growth and maturation into account a priori in a pediatric pharmacokinetic study is theoretically feasible. However, it requires that very early ages be included in studies, which may present an obstacle to the use of this approach. First-pass effects, alternative elimination routes, and combined elimination pathways should also be investigated.
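The D-optimization of sampling times can be sketched generically: maximize the log-determinant of the Fisher information of the schedule under the pharmacokinetic model. The block below uses a one-compartment first-order-absorption model with hypothetical parameter values and a small candidate-time grid searched exhaustively; the paper's maturation function and age dimension are not reproduced:

```python
import numpy as np
from itertools import combinations

def conc(t, ka, ke, V, dose=100.0):
    """One-compartment model with first-order absorption (unit F)."""
    return dose * ka / (V * (ka - ke)) * (np.exp(-ke * t) - np.exp(-ka * t))

def d_criterion(times, theta, eps=1e-5):
    """log det(J^T J): the (unit-variance) Fisher information of a
    sampling schedule. D-optimal design maximizes this quantity."""
    theta = np.asarray(theta, float)
    J = np.empty((len(times), theta.size))
    for j in range(theta.size):
        d = np.zeros_like(theta); d[j] = eps * theta[j]
        J[:, j] = (conc(times, *(theta + d))
                   - conc(times, *(theta - d))) / (2 * d[j])
    sign, logdet = np.linalg.slogdet(J.T @ J)
    return logdet if sign > 0 else -np.inf

theta = [1.5, 0.2, 10.0]                   # hypothetical ka, ke, V
candidates = np.array([0.25, 0.5, 1, 2, 4, 8, 12, 24])   # hours post-dose
best = max(combinations(candidates, 3),
           key=lambda ts: d_criterion(np.array(ts), theta))
```

Real designs use richer information matrices (population models, residual error models), but the selection principle is the same.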

  6. Stretchy binary classification.

    PubMed

    Toh, Kar-Ann; Lin, Zhiping; Sun, Lei; Li, Zhengguo

    2018-01-01

    In this article, we introduce an analytic formulation for compressive binary classification. The formulation seeks to solve the least ℓp-norm of the parameter vector subject to a classification error constraint. An analytic and stretchable estimation is conjectured where the estimation can be viewed as an extension of the pseudoinverse with left and right constructions. Our variance analysis indicates that the estimation based on the left pseudoinverse is unbiased and the estimation based on the right pseudoinverse is biased. Sparseness can be obtained for the biased estimation under certain mild conditions. The proposed estimation is investigated numerically using both synthetic and real-world data. Copyright © 2017 Elsevier Ltd. All rights reserved.
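The left and right pseudoinverse constructions that the article extends can be sketched directly (this shows only the standard constructions, not the article's stretchable estimator):

```python
import numpy as np

rng = np.random.default_rng(8)

# Overdetermined system (more samples than features): the left
# pseudoinverse (X^T X)^{-1} X^T gives the least-squares solution.
X_tall = rng.normal(size=(50, 5))
w_true = rng.normal(size=5)
y = X_tall @ w_true
w_left = np.linalg.inv(X_tall.T @ X_tall) @ X_tall.T @ y

# Underdetermined system (more features than samples): the right
# pseudoinverse X^T (X X^T)^{-1} gives the minimum-norm interpolant.
X_wide = rng.normal(size=(5, 50))
y2 = rng.normal(size=5)
w_right = X_wide.T @ np.linalg.inv(X_wide @ X_wide.T) @ y2
```

Both coincide with `np.linalg.pinv` applied to the respective matrices; the distinction (exact least-squares fit versus minimum-norm interpolation) is what drives the unbiased/biased split reported in the abstract.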

  7. ZASPE: A Code to Measure Stellar Atmospheric Parameters and their Covariance from Spectra

    NASA Astrophysics Data System (ADS)

    Brahm, Rafael; Jordán, Andrés; Hartman, Joel; Bakos, Gáspár

    2017-05-01

    We describe the Zonal Atmospheric Stellar Parameters Estimator (zaspe), a new algorithm, and its associated code, for determining precise stellar atmospheric parameters and their uncertainties from high-resolution echelle spectra of FGK-type stars. zaspe estimates stellar atmospheric parameters by comparing the observed spectrum against a grid of synthetic spectra only in the spectral zones most sensitive to changes in the atmospheric parameters. Realistic uncertainties in the parameters are computed from the data itself, by taking into account the systematic mismatches between the observed spectrum and the best-fitting synthetic one. The covariances between the parameters are also estimated in the process. zaspe can in principle use any pre-calculated grid of synthetic spectra, but unbiased grids are required to obtain accurate parameters. We tested the performance of two existing libraries, and we concluded that neither is suitable for computing precise atmospheric parameters. We describe a process to synthesize a new library of synthetic spectra that was found to generate consistent results when compared with parameters obtained with different methods (interferometry, asteroseismology, equivalent widths).

  8. Towards a sampling strategy for the assessment of forest condition at European level: combining country estimates.

    PubMed

    Travaglini, Davide; Fattorini, Lorenzo; Barbati, Anna; Bottalico, Francesca; Corona, Piermaria; Ferretti, Marco; Chirici, Gherardo

    2013-04-01

    A correct characterization of the status and trend of forest condition is essential to support reporting processes at national and international level. An international forest condition monitoring programme has been implemented in Europe since 1987 under the auspices of the International Co-operative Programme on Assessment and Monitoring of Air Pollution Effects on Forests (ICP Forests). The monitoring is based on harmonized methodologies, with individual countries being responsible for its implementation. Due to inconsistencies and problems in sampling design, however, the ICP Forests network is not able to produce reliable quantitative estimates of forest condition at European and sometimes at country level. This paper proposes (1) a set of requirements for status and change assessment and (2) a harmonized sampling strategy able to provide unbiased and consistent estimators of forest condition parameters and of their changes at both country and European level. Under the assumption that a common definition of forest holds among European countries, monitoring objectives, parameters of concern and accuracy indexes are stated. On the basis of fixed-area plot sampling performed independently in each country, an unbiased and consistent estimator of forest defoliation indexes is obtained at both country and European level, together with conservative estimators of their sampling variance and power in the detection of changes. The strategy adopts a probabilistic sampling scheme based on fixed-area plots selected by means of systematic or stratified schemes. Operative guidelines for its application are provided.
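Why fixed-area plot sampling yields an unbiased expansion estimator can be checked with a toy simulation. The sketch below is illustrative and not the ICP Forests design: it assumes every tree has equal inclusion probability, enforced here with wrap-around (torus) distances to sidestep edge effects.

```python
import numpy as np

rng = np.random.default_rng(9)

# N trees scattered over a unit-square region; with uniformly placed circular
# plots and wrap-around distances every tree has inclusion probability
# pi*r**2, so the expansion estimator of the total is exactly unbiased.
N, r = 5000, 0.03
xy = rng.random((N, 2))

def plot_count(center):
    d = np.abs(xy - center)
    d = np.minimum(d, 1.0 - d)          # torus wrap avoids edge effects
    return np.count_nonzero(np.hypot(d[:, 0], d[:, 1]) < r)

reps, n_plots = 1000, 20
est = [np.mean([plot_count(c) for c in rng.random((n_plots, 2))])
       / (np.pi * r ** 2)
       for _ in range(reps)]
print(round(np.mean(est)))              # close to the true total N
```

Averaging the per-plot expansions `count / (pi*r**2)` recovers the true total in expectation, regardless of how the trees are spatially clustered.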

  9. Fast genomic predictions via Bayesian G-BLUP and multilocus models of threshold traits including censored Gaussian data.

    PubMed

    Kärkkäinen, Hanni P; Sillanpää, Mikko J

    2013-09-04

    Because of the increased availability of genome-wide sets of molecular markers along with reduced cost of genotyping large samples of individuals, genomic estimated breeding values have become an essential resource in plant and animal breeding. Bayesian methods for breeding value estimation have proven to be accurate and efficient; however, the ever-increasing data sets are placing heavy demands on the parameter estimation algorithms. Although a commendable number of fast estimation algorithms are available for Bayesian models of continuous Gaussian traits, there is a shortage of corresponding models of discrete or censored phenotypes. In this work, we consider a threshold approach of binary, ordinal, and censored Gaussian observations for Bayesian multilocus association models and Bayesian genomic best linear unbiased prediction and present a high-speed generalized expectation maximization algorithm for parameter estimation under these models. We demonstrate our method with simulated and real data. Our example analyses suggest that the use of the extra information present in an ordered categorical or censored Gaussian data set, instead of dichotomizing the data into case-control observations, increases the accuracy of genomic breeding values predicted by Bayesian multilocus association models or by Bayesian genomic best linear unbiased prediction. Furthermore, the example analyses indicate that the correct threshold model is more accurate than the directly used Gaussian model with censored Gaussian data, while with binary or ordinal data the superiority of the threshold model could not be confirmed.

  10. Fast Genomic Predictions via Bayesian G-BLUP and Multilocus Models of Threshold Traits Including Censored Gaussian Data

    PubMed Central

    Kärkkäinen, Hanni P.; Sillanpää, Mikko J.

    2013-01-01

    Because of the increased availability of genome-wide sets of molecular markers along with reduced cost of genotyping large samples of individuals, genomic estimated breeding values have become an essential resource in plant and animal breeding. Bayesian methods for breeding value estimation have proven to be accurate and efficient; however, the ever-increasing data sets are placing heavy demands on the parameter estimation algorithms. Although a commendable number of fast estimation algorithms are available for Bayesian models of continuous Gaussian traits, there is a shortage of corresponding models of discrete or censored phenotypes. In this work, we consider a threshold approach of binary, ordinal, and censored Gaussian observations for Bayesian multilocus association models and Bayesian genomic best linear unbiased prediction and present a high-speed generalized expectation maximization algorithm for parameter estimation under these models. We demonstrate our method with simulated and real data. Our example analyses suggest that the use of the extra information present in an ordered categorical or censored Gaussian data set, instead of dichotomizing the data into case-control observations, increases the accuracy of genomic breeding values predicted by Bayesian multilocus association models or by Bayesian genomic best linear unbiased prediction. Furthermore, the example analyses indicate that the correct threshold model is more accurate than the directly used Gaussian model with censored Gaussian data, while with binary or ordinal data the superiority of the threshold model could not be confirmed. PMID:23821618
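The threshold idea for binary traits can be sketched with a small EM toy: treat the observed 0/1 phenotype as a thresholded latent Gaussian liability, take the truncated-normal posterior mean in the E-step, and apply a ridge (BLUP-like) update in the M-step. This is an illustrative probit-style sketch under assumed settings, not the authors' generalized EM algorithm.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
n, p, lam = 400, 20, 1.0

# Simulate a binary trait from a latent Gaussian liability z = Xb + e.
X = rng.standard_normal((n, p))
b_true = rng.standard_normal(p) * 0.5
y = (X @ b_true + rng.standard_normal(n) > 0).astype(float)

b = np.zeros(p)
XtX = X.T @ X + lam * np.eye(p)          # ridge-penalized normal equations
for _ in range(50):
    mu = X @ b
    # E-step: posterior mean of the latent liability, truncated at zero.
    z = np.where(
        y == 1,
        mu + norm.pdf(mu) / np.clip(norm.cdf(mu), 1e-12, None),
        mu - norm.pdf(mu) / np.clip(norm.cdf(-mu), 1e-12, None),
    )
    # M-step: ridge (BLUP-like) update of the marker effects.
    b = np.linalg.solve(XtX, X.T @ z)

# The recovered effects should correlate strongly with the simulated truth.
print(round(np.corrcoef(b, b_true)[0, 1], 2))
```

Because the M-step is a single linear solve, each iteration is cheap; this is the kind of structure that makes EM-style threshold models fast on large marker sets.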

  11. Identification of internal properties of fibres and micro-swimmers

    NASA Astrophysics Data System (ADS)

    Plouraboué, Franck; Thiam, E. Ibrahima; Delmotte, Blaise; Climent, Eric

    2017-01-01

    In this paper, we address the identifiability of constitutive parameters of passive or active micro-swimmers. We first present a general framework for describing fibres or micro-swimmers using a bead-model description. Using a kinematic constraint formulation to describe fibres, flagellum or cilia, we find an explicit linear relationship between elastic constitutive parameters and generalized velocities by computing contact forces. This linear formulation then permits one to address identifiability conditions explicitly and solve for parameter identification. We show that active forcing and passive parameters are each identifiable independently, but not simultaneously. We also provide unbiased estimators for generalized elastic parameters in the presence of Langevin-like forcing with Gaussian noise using a Bayesian approach. These theoretical results are illustrated in various configurations showing the efficiency of the proposed approach for direct parameter identification. The convergence of the proposed estimators is successfully tested numerically.

  12. Occupancy models to study wildlife

    USGS Publications Warehouse

    Bailey, Larissa; Adams, Michael John

    2005-01-01

    Many wildlife studies seek to understand changes or differences in the proportion of sites occupied by a species of interest. These studies are hampered by imperfect detection of these species, which can result in some sites appearing to be unoccupied that are actually occupied. Occupancy models solve this problem and produce unbiased estimates of occupancy and related parameters. Required data (detection/non-detection information) are relatively simple and inexpensive to collect. Software is available free of charge to aid investigators in occupancy estimation.
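A basic single-season occupancy likelihood can be fitted in a few lines. The sketch below follows the standard occupancy-model structure (constant occupancy ψ and per-visit detection probability p; all parameter values are illustrative): sites with at least one detection are certainly occupied, while all-zero histories mix "occupied but never detected" with "truly unoccupied".

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit

rng = np.random.default_rng(2)
S, K = 500, 5                  # sites, repeat visits
psi_true, p_true = 0.6, 0.4    # occupancy and per-visit detection probability

# Simulate detection/non-detection histories.
occupied = rng.random(S) < psi_true
det = rng.random((S, K)) < (p_true * occupied[:, None])
d = det.sum(axis=1)            # detections per site

def nll(theta):
    psi, p = expit(theta)      # logit parametrization keeps probabilities in (0, 1)
    like = np.where(
        d > 0,
        psi * p**d * (1 - p)**(K - d),              # detected at least once
        psi * (1 - p)**K + (1 - psi),               # never detected
    )
    return -np.log(like).sum()

res = minimize(nll, x0=[0.0, 0.0])
psi_hat, p_hat = expit(res.x)
print(round(psi_hat, 2), round(p_hat, 2))
```

The key point matches the abstract: naively using the fraction of sites with detections would underestimate occupancy, whereas the mixture term for all-zero histories corrects for imperfect detection.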

  13. Fuzzy C-mean clustering on kinetic parameter estimation with generalized linear least square algorithm in SPECT

    NASA Astrophysics Data System (ADS)

    Choi, Hon-Chit; Wen, Lingfeng; Eberl, Stefan; Feng, Dagan

    2006-03-01

    Dynamic Single Photon Emission Computed Tomography (SPECT) has the potential to quantitatively estimate physiological parameters by fitting compartment models to the tracer kinetics. The generalized linear least square method (GLLS) is an efficient method to estimate unbiased kinetic parameters and parametric images. However, due to the low sensitivity of SPECT, noisy data can cause voxel-wise parameter estimation by GLLS to fail. Fuzzy C-Mean (FCM) clustering and modified FCM, which also utilizes information from the immediate neighboring voxels, are proposed to improve the voxel-wise parameter estimation of GLLS. Monte Carlo simulations were performed to generate dynamic SPECT data with different noise levels, which were then processed with both the standard and the modified FCM clustering. Parametric images were estimated by Logan and Yokoi graphical analysis and GLLS. The influx rate (KI) and volume of distribution (Vd) were estimated for the cerebellum, thalamus and frontal cortex. Our results show that (1) FCM reduces the bias and improves the reliability of parameter estimates for noisy data, (2) GLLS provides estimates of micro parameters (KI-k4) as well as macro parameters, such as volume of distribution (Vd) and binding potential (BPI & BPII) and (3) FCM clustering incorporating neighboring voxel information does not improve the parameter estimates, but improves noise in the parametric images. These findings indicate that pre-segmentation with traditional FCM clustering is desirable for generating voxel-wise parametric images with GLLS from dynamic SPECT data.
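Standard (non-modified) fuzzy C-mean clustering is straightforward to sketch. The 1-D toy below is purely illustrative, standing in for tissue classes with different kinetics; it is not the paper's SPECT pipeline.

```python
import numpy as np

rng = np.random.default_rng(8)

# Two well-separated 1-D clusters standing in for two tissue classes.
x = np.concatenate([rng.normal(1.0, 0.2, 100), rng.normal(5.0, 0.2, 100)])

def fcm(x, c=2, m=2.0, iters=100):
    """Basic fuzzy C-mean clustering on 1-D data (illustrative sketch)."""
    u = rng.dirichlet(np.ones(c), size=len(x))      # random initial memberships
    for _ in range(iters):
        um = u ** m
        centers = um.T @ x / um.sum(axis=0)         # membership-weighted centroids
        d = np.abs(x[:, None] - centers[None, :]) + 1e-12
        inv = d ** (-2.0 / (m - 1.0))
        u = inv / inv.sum(axis=1, keepdims=True)    # standard FCM membership update
    return centers, u

centers, u = fcm(x)
print(np.sort(centers).round(1))
```

Unlike hard k-means, each voxel keeps a graded membership `u` in every cluster, which is what allows FCM-based pre-segmentation to pool kinetic information across similar voxels before fitting.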

  14. Uncertainty importance analysis using parametric moment ratio functions.

    PubMed

    Wei, Pengfei; Lu, Zhenzhou; Song, Jingwen

    2014-02-01

    This article presents a new importance analysis framework, called parametric moment ratio function, for measuring the reduction of model output uncertainty when the distribution parameters of inputs are changed, and the emphasis is put on the mean and variance ratio functions with respect to the variances of model inputs. The proposed concepts efficiently guide the analyst to achieve a targeted reduction on the model output mean and variance by operating on the variances of model inputs. The unbiased and progressive unbiased Monte Carlo estimators are also derived for the parametric mean and variance ratio functions, respectively. Only a single set of samples is needed for implementing the proposed importance analysis with the proposed estimators, so the computational cost is independent of the input dimensionality. An analytical test example with highly nonlinear behavior is introduced for illustrating the engineering significance of the proposed importance analysis technique and verifying the efficiency and convergence of the derived Monte Carlo estimators. Finally, the moment ratio function is applied to a planar 10-bar structure for achieving a targeted 50% reduction of the model output variance. © 2013 Society for Risk Analysis.

  15. Correcting for Measurement Error in Time-Varying Covariates in Marginal Structural Models.

    PubMed

    Kyle, Ryan P; Moodie, Erica E M; Klein, Marina B; Abrahamowicz, Michał

    2016-08-01

    Unbiased estimation of causal parameters from marginal structural models (MSMs) requires a fundamental assumption of no unmeasured confounding. Unfortunately, the time-varying covariates used to obtain inverse probability weights are often error-prone. Although substantial measurement error in important confounders is known to undermine control of confounders in conventional unweighted regression models, this issue has received comparatively limited attention in the MSM literature. Here we propose a novel application of the simulation-extrapolation (SIMEX) procedure to address measurement error in time-varying covariates, and we compare 2 approaches. The direct approach to SIMEX-based correction targets outcome model parameters, while the indirect approach corrects the weights estimated using the exposure model. We assess the performance of the proposed methods in simulations under different clinically plausible assumptions. The simulations demonstrate that measurement errors in time-dependent covariates may induce substantial bias in MSM estimators of causal effects of time-varying exposures, and that both proposed SIMEX approaches yield practically unbiased estimates in scenarios featuring low-to-moderate degrees of error. We illustrate the proposed approach in a simple analysis of the relationship between sustained virological response and liver fibrosis progression among persons infected with hepatitis C virus, while accounting for measurement error in γ-glutamyltransferase, using data collected in the Canadian Co-infection Cohort Study from 2003 to 2014. © The Author 2016. Published by Oxford University Press on behalf of the Johns Hopkins Bloomberg School of Public Health. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
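The SIMEX idea itself can be sketched on a toy linear regression with known measurement-error variance (the MSM weighting machinery is omitted here; all settings are illustrative): add extra noise at several levels λ, refit, and extrapolate the coefficient-versus-λ curve back to λ = −1, the zero-error point. A quadratic extrapolant is assumed below.

```python
import numpy as np

rng = np.random.default_rng(3)
n, beta, sig_u = 2000, 1.0, 0.8

x = rng.standard_normal(n)                  # true covariate
w = x + sig_u * rng.standard_normal(n)      # error-prone measurement
y = beta * x + 0.5 * rng.standard_normal(n)

def slope(a, b):
    return np.cov(a, b)[0, 1] / np.var(a, ddof=1)

naive = slope(w, y)   # attenuated towards 0 by the measurement error

# SIMEX: add extra noise of variance lam*sig_u**2, refit, average over
# replicates, then extrapolate the slope-vs-lambda curve back to lam = -1.
lams = np.array([0.0, 0.5, 1.0, 1.5, 2.0])
means = [np.mean([slope(w + np.sqrt(lam) * sig_u * rng.standard_normal(n), y)
                  for _ in range(200)])
         for lam in lams]
simex = np.polyval(np.polyfit(lams, means, deg=2), -1.0)
print(round(naive, 2), round(simex, 2))
```

The quadratic extrapolant removes most, but not all, of the attenuation; the choice of extrapolation function is part of the method's assumptions.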

  16. Maximum likelihood solution for inclination-only data in paleomagnetism

    NASA Astrophysics Data System (ADS)

    Arason, P.; Levi, S.

    2010-08-01

    We have developed a new robust maximum likelihood method for estimating the unbiased mean inclination from inclination-only data. In paleomagnetic analysis, the arithmetic mean of inclination-only data is known to introduce a shallowing bias. Several methods have been introduced to estimate the unbiased mean inclination of inclination-only data together with measures of the dispersion. Some inclination-only methods were designed to maximize the likelihood function of the marginal Fisher distribution. However, the exact analytical form of the maximum likelihood function is fairly complicated, and all the methods require various assumptions and approximations that are often inappropriate. For some steep and dispersed data sets, these methods provide estimates that are significantly displaced from the peak of the likelihood function to systematically shallower inclination. The problem of locating the maximum of the likelihood function is partly due to difficulties in accurately evaluating the function for all values of interest, because some elements of the likelihood function increase exponentially as precision parameters increase, leading to numerical instabilities. In this study, we succeeded in analytically cancelling exponential elements from the log-likelihood function, and we are now able to calculate its value anywhere in the parameter space and for any inclination-only data set. Furthermore, we can now calculate the partial derivatives of the log-likelihood function with desired accuracy, and locate the maximum likelihood without the assumptions required by previous methods. To assess the reliability and accuracy of our method, we generated large numbers of random Fisher-distributed data sets, for which we calculated mean inclinations and precision parameters. The comparisons show that our new robust Arason-Levi maximum likelihood method is the most reliable, and the mean inclination estimates are the least biased towards shallow values.
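The numerical-stability issue the authors describe can be illustrated on the full (not marginal) Fisher density, whose normalizer κ/(2 sinh κ) overflows for large precision κ unless the exponentials are cancelled analytically in log space. The sketch below is illustrative only; the Arason-Levi inclination-only likelihood itself is considerably more involved.

```python
import numpy as np
import warnings

# Fisher-density terms like k/(2*sinh(k)) * exp(k*cos(t)) overflow for large
# precision k when evaluated directly.
def log_fisher_naive(k, t):
    return np.log(k / (2.0 * np.sinh(k)) * np.exp(k * np.cos(t)))

# Cancelling the exponentials analytically keeps the log-density finite:
# log k - k - log(1 - exp(-2k)) + k*cos(t), using 2*sinh(k) = exp(k)*(1 - exp(-2k)).
def log_fisher_stable(k, t):
    return np.log(k) - k - np.log1p(-np.exp(-2.0 * k)) + k * np.cos(t)

k, t = 1000.0, 0.3
with warnings.catch_warnings():
    warnings.simplefilter("ignore")        # overflow/invalid in the naive form
    naive = log_fisher_naive(k, t)
stable = log_fisher_stable(k, t)
print(np.isfinite(naive), np.isfinite(stable))
```

The naive form returns a non-finite value for large κ, while the cancelled form is evaluable anywhere in the parameter space, which is exactly what a maximizer needs.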

  17. Creel survey sampling designs for estimating effort in short-duration Chinook salmon fisheries

    USGS Publications Warehouse

    McCormick, Joshua L.; Quist, Michael C.; Schill, Daniel J.

    2013-01-01

    Chinook Salmon Oncorhynchus tshawytscha sport fisheries in the Columbia River basin are commonly monitored using roving creel survey designs and require precise, unbiased catch estimates. The objective of this study was to examine the relative bias and precision of total catch estimates using various sampling designs to estimate angling effort under the assumption that mean catch rate was known. We obtained information on angling populations based on direct visual observations of portions of Chinook Salmon fisheries in three Idaho river systems over a 23-d period. Based on the angling population, Monte Carlo simulations were used to evaluate the properties of effort and catch estimates for each sampling design. All sampling designs evaluated were relatively unbiased. Systematic random sampling (SYS) resulted in the most precise estimates. The SYS and simple random sampling designs had mean square error (MSE) estimates that were generally half of those observed with cluster sampling designs. The SYS design was more efficient (i.e., higher accuracy per unit cost) than a two-cluster design. Increasing the number of clusters available for sampling within a day decreased the MSE of estimates of daily angling effort, but the MSE of total catch estimates was variable depending on the fishery. The results of our simulations provide guidelines on the relative influence of sample sizes and sampling designs on parameters of interest in short-duration Chinook Salmon fisheries.
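A toy Monte Carlo comparison of systematic versus simple random sampling of hourly effort counts shows why SYS gains precision when effort follows a smooth diurnal trend. The numbers below are illustrative, not the study's fisheries data.

```python
import numpy as np

rng = np.random.default_rng(4)

# Hourly angler counts for one day: a smooth diurnal trend plus noise.
hours = np.arange(24)
effort = 20 + 15 * np.sin((hours - 6) * np.pi / 12) + rng.poisson(3, 24)
total = effort.sum()

n = 6                                   # hours sampled per day
step = 24 // n
reps = 20000
err_srs, err_sys = [], []
for _ in range(reps):
    srs = rng.choice(24, size=n, replace=False)         # simple random sample
    sys_ = rng.integers(0, step) + step * np.arange(n)  # systematic sample
    err_srs.append(24 / n * effort[srs].sum() - total)
    err_sys.append(24 / n * effort[sys_].sum() - total)

mse_srs = np.mean(np.square(err_srs))
mse_sys = np.mean(np.square(err_sys))
print(mse_sys < mse_srs)    # systematic usually wins under a smooth trend
```

Both expansion estimators are unbiased, but the systematic sample spreads its hours across the diurnal cycle, so the trend contributes almost nothing to its variance.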

  18. Unbiased Estimates of Variance Components with Bootstrap Procedures

    ERIC Educational Resources Information Center

    Brennan, Robert L.

    2007-01-01

    This article provides general procedures for obtaining unbiased estimates of variance components for any random-model balanced design under any bootstrap sampling plan, with the focus on designs of the type typically used in generalizability theory. The results reported here are particularly helpful when the bootstrap is used to estimate standard…

  19. Contextual classification of multispectral image data: An unbiased estimator for the context distribution

    NASA Technical Reports Server (NTRS)

    Tilton, J. C.; Swain, P. H. (Principal Investigator); Vardeman, S. B.

    1981-01-01

    A key input to a statistical classification algorithm, which exploits the tendency of certain ground cover classes to occur more frequently in some spatial context than in others, is a statistical characterization of the context: the context distribution. An unbiased estimator of the context distribution is discussed which, besides having the advantage of statistical unbiasedness, has the additional advantage over other estimation techniques of being amenable to an adaptive implementation in which the context distribution estimate varies according to local contextual information. Results from applying the unbiased estimator to the contextual classification of three real LANDSAT data sets are presented and contrasted with results from non-contextual classifications and from contextual classifications utilizing other context distribution estimation techniques.

  20. VA-Index: Quantifying Assortativity Patterns in Networks with Multidimensional Nodal Attributes (Open Access)

    DTIC Science & Technology

    2016-01-27

    bias of the estimator U, bias(U), the difference between this estimator's expected value and the true value of the parameter being estimated, i.e. bias(U) = E(U − y) = E(U) − y (9). Based on the above definition, an unbiased estimator is one whose expected value is equal to the true value being... equal to 0.94 (p-value < 0.05), if we consider the pure ER network model as our baseline, and 0.31 (p-value < 0.05), if we control for the home

  1. Quantifying Transmission Heterogeneity Using Both Pathogen Phylogenies and Incidence Time Series

    PubMed Central

    Li, Lucy M.; Grassly, Nicholas C.; Fraser, Christophe

    2017-01-01

    Heterogeneity in individual-level transmissibility can be quantified by the dispersion parameter k of the offspring distribution. Quantifying heterogeneity is important as it affects other parameter estimates, it modulates the degree of unpredictability of an epidemic, and it needs to be accounted for in models of infection control. Aggregated data such as incidence time series are often not sufficiently informative to estimate k. Incorporating phylogenetic analysis can help to estimate k concurrently with other epidemiological parameters. We have developed an inference framework that uses particle Markov Chain Monte Carlo to estimate k and other epidemiological parameters using both incidence time series and the pathogen phylogeny. Using the framework to fit a modified compartmental transmission model that includes the parameter k to simulated data, we found that more accurate and less biased estimates of the reproductive number were obtained by combining epidemiological and phylogenetic analyses. However, k was most accurately estimated using pathogen phylogeny alone. Accurately estimating k was necessary for unbiased estimates of the reproductive number, but it did not affect the accuracy of reporting probability and epidemic start date estimates. We further demonstrated that inference was possible in the presence of phylogenetic uncertainty by sampling from the posterior distribution of phylogenies. Finally, we used the inference framework to estimate transmission parameters from epidemiological and genetic data collected during a poliovirus outbreak. Despite the large degree of phylogenetic uncertainty, we demonstrated that incorporating phylogenetic data in parameter inference improved the accuracy and precision of estimates. PMID:28981709

  2. Multinomial mixture model with heterogeneous classification probabilities

    USGS Publications Warehouse

    Holland, M.D.; Gray, B.R.

    2011-01-01

    Royle and Link (Ecology 86(9):2505-2512, 2005) proposed an analytical method that allowed estimation of multinomial distribution parameters and classification probabilities from categorical data measured with error. While useful, we demonstrate algebraically and by simulations that this method yields biased multinomial parameter estimates when the probabilities of correct category classifications vary among sampling units. We address this shortcoming by treating these probabilities as logit-normal random variables within a Bayesian framework. We use Markov chain Monte Carlo to compute Bayes estimates from a simulated sample from the posterior distribution. Based on simulations, this elaborated Royle-Link model yields nearly unbiased estimates of multinomial and correct classification probability estimates when classification probabilities are allowed to vary according to the normal distribution on the logit scale or according to the Beta distribution. The method is illustrated using categorical submersed aquatic vegetation data. © 2010 Springer Science+Business Media, LLC.

  3. Effects of social organization, trap arrangement and density, sampling scale, and population density on bias in population size estimation using some common mark-recapture estimators.

    PubMed

    Gupta, Manan; Joshi, Amitabh; Vidya, T N C

    2017-01-01

    Mark-recapture estimators are commonly used for population size estimation, and typically yield unbiased estimates for most solitary species with low to moderate home range sizes. However, these methods assume independence of captures among individuals, an assumption that is clearly violated in social species that show fission-fusion dynamics, such as the Asian elephant. In the specific case of Asian elephants, doubts have been raised about the accuracy of population size estimates. More importantly, the potential problem for the use of mark-recapture methods posed by social organization in general has not been systematically addressed. We developed an individual-based simulation framework to systematically examine the potential effects of type of social organization, as well as other factors such as trap density and arrangement, spatial scale of sampling, and population density, on bias in population sizes estimated by POPAN, Robust Design, and Robust Design with detection heterogeneity. In the present study, we ran simulations with biological, demographic and ecological parameters relevant to Asian elephant populations, but the simulation framework is easily extended to address questions relevant to other social species. We collected capture history data from the simulations, and used those data to test for bias in population size estimation. Social organization significantly affected bias in most analyses, but the effect sizes were variable, depending on other factors. Social organization tended to introduce large bias when trap arrangement was uniform and sampling effort was low. POPAN clearly outperformed the two Robust Design models we tested, yielding close to zero bias if traps were arranged at random in the study area, and when population density and trap density were not too low. Social organization did not have a major effect on bias for these parameter combinations at which POPAN gave more or less unbiased population size estimates. Therefore, the effect of social organization on bias in population estimation could be removed by using POPAN with specific parameter combinations, to obtain population size estimates in a social species.

  4. Effects of social organization, trap arrangement and density, sampling scale, and population density on bias in population size estimation using some common mark-recapture estimators

    PubMed Central

    Joshi, Amitabh; Vidya, T. N. C.

    2017-01-01

    Mark-recapture estimators are commonly used for population size estimation, and typically yield unbiased estimates for most solitary species with low to moderate home range sizes. However, these methods assume independence of captures among individuals, an assumption that is clearly violated in social species that show fission-fusion dynamics, such as the Asian elephant. In the specific case of Asian elephants, doubts have been raised about the accuracy of population size estimates. More importantly, the potential problem for the use of mark-recapture methods posed by social organization in general has not been systematically addressed. We developed an individual-based simulation framework to systematically examine the potential effects of type of social organization, as well as other factors such as trap density and arrangement, spatial scale of sampling, and population density, on bias in population sizes estimated by POPAN, Robust Design, and Robust Design with detection heterogeneity. In the present study, we ran simulations with biological, demographic and ecological parameters relevant to Asian elephant populations, but the simulation framework is easily extended to address questions relevant to other social species. We collected capture history data from the simulations, and used those data to test for bias in population size estimation. Social organization significantly affected bias in most analyses, but the effect sizes were variable, depending on other factors. Social organization tended to introduce large bias when trap arrangement was uniform and sampling effort was low. POPAN clearly outperformed the two Robust Design models we tested, yielding close to zero bias if traps were arranged at random in the study area, and when population density and trap density were not too low. Social organization did not have a major effect on bias for these parameter combinations at which POPAN gave more or less unbiased population size estimates. Therefore, the effect of social organization on bias in population estimation could be removed by using POPAN with specific parameter combinations, to obtain population size estimates in a social species. PMID:28306735

  5. Building unbiased estimators from non-gaussian likelihoods with application to shear estimation

    DOE PAGES

    Madhavacheril, Mathew S.; McDonald, Patrick; Sehgal, Neelima; ...

    2015-01-15

    We develop a general framework for generating estimators of a given quantity which are unbiased to a given order in the difference between the true value of the underlying quantity and the fiducial position in theory space around which we expand the likelihood. We apply this formalism to rederive the optimal quadratic estimator and show how the replacement of the second derivative matrix with the Fisher matrix is a generic way of creating an unbiased estimator (assuming choice of the fiducial model is independent of data). Next we apply the approach to estimation of shear lensing, closely following the work of Bernstein and Armstrong (2014). Our first order estimator reduces to their estimator in the limit of zero shear, but it also naturally allows for the case of non-constant shear and the easy calculation of correlation functions or power spectra using standard methods. Both our first-order estimator and Bernstein and Armstrong’s estimator exhibit a bias which is quadratic in true shear. Our third-order estimator is, at least in the realm of the toy problem of Bernstein and Armstrong, unbiased to 0.1% in relative shear errors Δg/g for shears up to |g| = 0.2.

  6. Building unbiased estimators from non-Gaussian likelihoods with application to shear estimation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Madhavacheril, Mathew S.; Sehgal, Neelima; McDonald, Patrick

    2015-01-01

    We develop a general framework for generating estimators of a given quantity which are unbiased to a given order in the difference between the true value of the underlying quantity and the fiducial position in theory space around which we expand the likelihood. We apply this formalism to rederive the optimal quadratic estimator and show how the replacement of the second derivative matrix with the Fisher matrix is a generic way of creating an unbiased estimator (assuming choice of the fiducial model is independent of data). Next we apply the approach to estimation of shear lensing, closely following the work of Bernstein and Armstrong (2014). Our first order estimator reduces to their estimator in the limit of zero shear, but it also naturally allows for the case of non-constant shear and the easy calculation of correlation functions or power spectra using standard methods. Both our first-order estimator and Bernstein and Armstrong's estimator exhibit a bias which is quadratic in true shear. Our third-order estimator is, at least in the realm of the toy problem of Bernstein and Armstrong, unbiased to 0.1% in relative shear errors Δg/g for shears up to |g|=0.2.
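The flavor of a first-order estimator expanded around a fiducial point, with the Fisher information in place of the second-derivative matrix, can be sketched on a scalar Poisson toy (a generic one-step score estimator, not the shear estimator of the paper). As in the abstract, the residual bias is quadratic in the offset from the fiducial value.

```python
import numpy as np

rng = np.random.default_rng(7)

theta0 = 0.0            # fiducial expansion point
n, reps = 50, 100000

biases = []
for delta in (0.05, 0.1, 0.2):          # true theta = theta0 + delta
    x = rng.poisson(np.exp(theta0 + delta), size=(reps, n))
    # For x ~ Poisson(exp(theta)): score = sum(x) - n*exp(theta0),
    # Fisher information = n*exp(theta0), both evaluated at the fiducial theta0.
    score = x.sum(axis=1) - n * np.exp(theta0)
    theta_hat = theta0 + score / (n * np.exp(theta0))
    biases.append(theta_hat.mean() - (theta0 + delta))
print([round(b, 3) for b in biases])    # grows roughly like delta**2 / 2
```

Doubling the offset roughly quadruples the bias, mirroring the "bias quadratic in true shear" behavior of the first-order estimator; higher-order corrections push the residual bias to higher powers of the offset.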

  7. A spatially explicit capture-recapture estimator for single-catch traps.

    PubMed

    Distiller, Greg; Borchers, David L

    2015-11-01

    Single-catch traps are frequently used in live-trapping studies of small mammals. Thus far, a likelihood for single-catch traps has proven elusive and usually the likelihood for multicatch traps is used for spatially explicit capture-recapture (SECR) analyses of such data. Previous work found the multicatch likelihood to provide a robust estimator of average density. We build on a recently developed continuous-time model for SECR to derive a likelihood for single-catch traps. We use this to develop an estimator based on observed capture times and compare its performance by simulation to that of the multicatch estimator for various scenarios with nonconstant density surfaces. While the multicatch estimator is found to be a surprisingly robust estimator of average density, its performance deteriorates with high trap saturation and increasing density gradients. Moreover, it is found to be a poor estimator of the height of the detection function. By contrast, the single-catch estimators of density, distribution, and detection function parameters are found to be unbiased or nearly unbiased in all scenarios considered. This gain comes at the cost of higher variance. If there is no interest in interpreting the detection function parameters themselves, and if density is expected to be fairly constant over the survey region, then the multicatch estimator performs well with single-catch traps. However if accurate estimation of the detection function is of interest, or if density is expected to vary substantially in space, then there is merit in using the single-catch estimator when trap saturation is above about 60%. The estimator's performance is improved if care is taken to place traps so as to span the range of variables that affect animal distribution. As a single-catch likelihood with unknown capture times remains intractable for now, researchers using single-catch traps should aim to incorporate timing devices with their traps.

  8. Estimation of the simple correlation coefficient.

    PubMed

    Shieh, Gwowen

    2010-11-01

This article investigates some unfamiliar properties of the Pearson product-moment correlation coefficient for the estimation of the simple correlation coefficient. Although Pearson's r is biased, except in limited situations, and a minimum variance unbiased estimator has been proposed in the literature, researchers routinely employ the sample correlation coefficient in practical applications because of its simplicity and popularity. In order to support such practice, this study examines the mean squared errors of r and several prominent formulas. The results reveal specific situations in which the sample correlation coefficient performs better than the unbiased and nearly unbiased estimators, facilitating recommendation of r as an effect size index for the strength of linear association between two variables. In addition, related issues of estimating the squared simple correlation coefficient are also considered.
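The bias discussed here can be sketched numerically. Below is a minimal simulation (all settings invented, not from the article) comparing the mean of Pearson's r against the one-term Olkin-Pratt-style correction r(1 + (1 − r²)/(2(n − 3))), a common approximation to the minimum variance unbiased estimator:

```python
import numpy as np

rng = np.random.default_rng(0)
rho, n, reps = 0.5, 20, 20000   # invented simulation settings

# Draw many bivariate-normal samples and compute Pearson's r for each.
x = rng.multivariate_normal([0.0, 0.0], [[1.0, rho], [rho, 1.0]], size=(reps, n))
xc = x - x.mean(axis=1, keepdims=True)
num = (xc[:, :, 0] * xc[:, :, 1]).sum(axis=1)
den = np.sqrt((xc[:, :, 0] ** 2).sum(axis=1) * (xc[:, :, 1] ** 2).sum(axis=1))
r = num / den

# Approximate Olkin-Pratt bias correction.
r_adj = r * (1.0 + (1.0 - r ** 2) / (2.0 * (n - 3)))

print(r.mean(), r_adj.mean())  # the corrected mean lies closer to rho
```

For n = 20 and rho = 0.5 the raw r is biased toward zero by roughly 0.01, and the correction removes most of that bias.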

  9. Finite mixture model: A maximum likelihood estimation approach on time series data

    NASA Astrophysics Data System (ADS)

    Yen, Phoong Seuk; Ismail, Mohd Tahir; Hamzah, Firdaus Mohamad

    2014-09-01

Recently, statisticians have emphasized fitting finite mixture models by maximum likelihood estimation because of its asymptotic properties: the estimator is consistent as the sample size increases to infinity and is asymptotically unbiased, and the resulting parameter estimates attain the smallest variance among competing statistical methods as the sample size grows. Maximum likelihood estimation is therefore adopted in this paper to fit a two-component mixture model in order to explore the relationship between rubber price and exchange rate for Malaysia, Thailand, the Philippines and Indonesia. The results show a negative relationship between rubber price and exchange rate for all selected countries.
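As a sketch of fitting a two-component mixture by maximum likelihood, here is the standard EM algorithm on synthetic univariate Gaussian data (the data and all settings are invented; the paper's rubber-price and exchange-rate series are not reproduced):

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic two-component data: weight 0.4 at mean -2, weight 0.6 at mean 3.
x = np.concatenate([rng.normal(-2.0, 1.0, 400), rng.normal(3.0, 1.0, 600)])

def norm_pdf(v, mu, s):
    return np.exp(-0.5 * ((v - mu) / s) ** 2) / (s * np.sqrt(2 * np.pi))

# EM iterations for the maximum likelihood fit of a two-component Gaussian mixture.
w, mu1, mu2, s1, s2 = 0.5, x.min(), x.max(), x.std(), x.std()
for _ in range(300):
    # E-step: posterior responsibility of component 1 for each point.
    p1 = w * norm_pdf(x, mu1, s1)
    p2 = (1 - w) * norm_pdf(x, mu2, s2)
    g = p1 / (p1 + p2)
    # M-step: responsibility-weighted updates of the mixture parameters.
    w = g.mean()
    mu1 = (g * x).sum() / g.sum()
    mu2 = ((1 - g) * x).sum() / (1 - g).sum()
    s1 = np.sqrt((g * (x - mu1) ** 2).sum() / g.sum())
    s2 = np.sqrt(((1 - g) * (x - mu2) ** 2).sum() / (1 - g).sum())

print(w, mu1, mu2)  # roughly 0.4, -2, 3
```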

  10. Perturbation analysis of queueing systems with a time-varying arrival rate

    NASA Technical Reports Server (NTRS)

    Cassandras, Christos G.; Pan, Jie

    1991-01-01

The authors consider an M/G/1 queueing system with a time-varying arrival rate. The objective is to obtain infinitesimal perturbation analysis (IPA) gradient estimates for various performance measures of interest with respect to certain system parameters. In particular, the authors consider the mean system time over n arrivals and an arrival rate alternating between two values. By choosing a convenient sample path representation of this system, they derive an unbiased IPA gradient estimator which, however, is not consistent, and investigate the nature of this problem.
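IPA for a single-server queue can be sketched in the simpler constant-arrival-rate case (parameters invented; the letter's time-varying-rate model is not reproduced). With service times S_i = θX_i, the sample-path derivative of customer i's system time is the sum of dS_j/dθ = S_j/θ over the busy period containing i:

```python
import numpy as np

rng = np.random.default_rng(11)
n = 200000
theta, lam = 1.0, 0.5                     # mean service time and arrival rate (invented)
S = theta * rng.exponential(1.0, n)       # service times S_i = theta * X_i
A = rng.exponential(1.0 / lam, n)         # interarrival times

W = 0.0      # waiting time of the current customer (Lindley recursion)
acc = 0.0    # running sum of dS_j/dtheta over the current busy period
ipa = 0.0    # accumulated sample-path derivative of system times
for i in range(n):
    if i > 0:
        W = max(0.0, W + S[i - 1] - A[i])
        if W == 0.0:
            acc = 0.0                     # new busy period: reset the accumulator
    acc += S[i] / theta
    ipa += acc

grad = ipa / n
print(grad)   # for this M/M/1 case, d E[T]/d theta = 1/(1 - lam*theta)^2 = 4
```

For exponential services this IPA estimate converges to the analytical derivative of the mean system time, d/dθ [θ/(1 − λθ)] = 1/(1 − λθ)².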

  11. Extreme Mean and Its Applications

    NASA Technical Reports Server (NTRS)

    Swaroop, R.; Brownlow, J. D.

    1979-01-01

Extreme value statistics obtained from normally distributed data are considered. An extreme mean is defined as the mean of the p-th probability truncated normal distribution. An unbiased estimate of this extreme mean and its large-sample distribution are derived. The distribution of this estimate is found to be nonnormal even for very large samples. Further, as the sample size increases, the variance of the unbiased estimate converges to the Cramér-Rao lower bound. The computer program used to obtain the density and distribution functions of the standardized unbiased estimate, and the confidence intervals of the extreme mean for any data, is included for ready application. An example is included to demonstrate the usefulness of the extreme mean application.
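The quantity itself can be illustrated with a standard result: for the standard normal truncated to its upper tail of probability p, the mean is φ(z_p)/p. A sketch with an arbitrarily chosen tail probability, checked by Monte Carlo (this illustrates the truncated mean only, not the report's unbiased estimator):

```python
from statistics import NormalDist
import random

nd = NormalDist()
p_tail = 0.05                          # arbitrary choice for the demo
z = nd.inv_cdf(1.0 - p_tail)           # truncation point for the upper 5% tail

# Closed form for the standard normal: E[X | X > z] = pdf(z) / p_tail
extreme_mean = nd.pdf(z) / p_tail

# Monte-Carlo sample analogue
random.seed(1)
draws = [random.gauss(0.0, 1.0) for _ in range(200000)]
tail = [v for v in draws if v > z]
estimate = sum(tail) / len(tail)

print(extreme_mean, estimate)  # both close to 2.06
```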

  12. Movement patterns and study area boundaries: Influences on survival estimation in capture-mark-recapture studies

    USGS Publications Warehouse

    Horton, G.E.; Letcher, B.H.

    2008-01-01

    The inability to account for the availability of individuals in the study area during capture-mark-recapture (CMR) studies and the resultant confounding of parameter estimates can make correct interpretation of CMR model parameter estimates difficult. Although important advances based on the Cormack-Jolly-Seber (CJS) model have resulted in estimators of true survival that work by unconfounding either death or recapture probability from availability for capture in the study area, these methods rely on the researcher's ability to select a method that is correctly matched to emigration patterns in the population. If incorrect assumptions regarding site fidelity (non-movement) are made, it may be difficult or impossible as well as costly to change the study design once the incorrect assumption is discovered. Subtleties in characteristics of movement (e.g. life history-dependent emigration, nomads vs territory holders) can lead to mixtures in the probability of being available for capture among members of the same population. The result of these mixtures may be only a partial unconfounding of emigration from other CMR model parameters. Biologically-based differences in individual movement can combine with constraints on study design to further complicate the problem. Because of the intricacies of movement and its interaction with other parameters in CMR models, quantification of and solutions to these problems are needed. Based on our work with stream-dwelling populations of Atlantic salmon Salmo salar, we used a simulation approach to evaluate existing CMR models under various mixtures of movement probabilities. The Barker joint data model provided unbiased estimates of true survival under all conditions tested. The CJS and robust design models provided similarly unbiased estimates of true survival but only when emigration information could be incorporated directly into individual encounter histories. 
For the robust design model, Markovian emigration (future availability for capture depends on an individual's current location) was a difficult emigration pattern to detect unless survival and especially recapture probability were high. Additionally, when local movement was high relative to study area boundaries and movement became more diffuse (e.g. a random walk), local movement and permanent emigration were difficult to distinguish and had consequences for correctly interpreting the survival parameter being estimated (apparent survival vs true survival). © 2008 The Authors.

  13. Calculation of Weibull strength parameters and Batdorf flow-density constants for volume- and surface-flaw-induced fracture in ceramics

    NASA Technical Reports Server (NTRS)

Pai, Shantaram S.; Gyekenyesi, John P.

    1989-01-01

The calculation of shape and scale parameters of the two-parameter Weibull distribution is described using the least-squares analysis and maximum likelihood methods for volume- and surface-flaw-induced fracture in ceramics with complete and censored samples. Detailed procedures are given for evaluating 90 percent confidence intervals for maximum likelihood estimates of shape and scale parameters, the unbiased estimates of the shape parameters, and the Weibull mean values and corresponding standard deviations. Furthermore, the necessary steps are described for detecting outliers and for calculating the Kolmogorov-Smirnov and the Anderson-Darling goodness-of-fit statistics and 90 percent confidence bands about the Weibull distribution. It also shows how to calculate the Batdorf flaw-density constants by using the Weibull distribution statistical parameters. The techniques described were verified with several example problems, from the open literature, and were coded in the Structural Ceramics Analysis and Reliability Evaluation (SCARE) design program.
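The maximum likelihood step described above can be sketched as follows. For a two-parameter Weibull sample, the shape solves a monotone profile-likelihood equation, so bisection suffices; this is a generic MLE sketch with invented data, not the SCARE code:

```python
import numpy as np

def weibull_mle(x, k_lo=0.05, k_hi=50.0):
    """Maximum likelihood shape and scale of a two-parameter Weibull sample.

    The shape k solves the profile-likelihood equation
        sum(x^k ln x)/sum(x^k) - 1/k - mean(ln x) = 0,
    which is monotone increasing in k, so bisection suffices;
    the scale is then (mean(x^k))**(1/k).
    """
    lx = np.log(x)

    def f(k):
        xk = x ** k
        return (xk * lx).sum() / xk.sum() - 1.0 / k - lx.mean()

    for _ in range(200):
        k_mid = 0.5 * (k_lo + k_hi)
        if f(k_mid) > 0.0:
            k_hi = k_mid
        else:
            k_lo = k_mid
    k = 0.5 * (k_lo + k_hi)
    scale = np.mean(x ** k) ** (1.0 / k)
    return k, scale

rng = np.random.default_rng(2)
sample = 3.0 * rng.weibull(2.0, 5000)   # true shape 2, true scale 3 (invented)
print(weibull_mle(sample))               # close to (2.0, 3.0)
```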

  14. Identification of internal properties of fibers and micro-swimmers

    NASA Astrophysics Data System (ADS)

    Plouraboue, Franck; Thiam, Ibrahima; Delmotte, Blaise; Climent, Eric; PSC Collaboration

    2016-11-01

In this presentation we discuss the identifiability of constitutive parameters of passive or active micro-swimmers. We first present a general framework for describing fibers or micro-swimmers using a bead-model description. Using a kinematic constraint formulation to describe fibers, flagella or cilia, we find an explicit linear relationship between elastic constitutive parameters and generalised velocities obtained from computing contact forces. This linear formulation makes it possible to state identifiability conditions explicitly and to solve the parameter-identification problem. We show that active forcing and passive parameters are each identifiable separately, but not simultaneously. We also provide unbiased estimators for elastic as well as active parameters in the presence of Langevin-like forcing with Gaussian noise, using normal linear regression models and the maximum likelihood method. These theoretical results are illustrated in various configurations of relaxed or actuated passive fibers and of active filaments with known passive properties, showing the efficiency of the proposed approach for direct parameter identification. The convergence of the proposed estimators is successfully tested numerically.

  15. Fitting N-mixture models to count data with unmodeled heterogeneity: Bias, diagnostics, and alternative approaches

    USGS Publications Warehouse

    Duarte, Adam; Adams, Michael J.; Peterson, James T.

    2018-01-01

Monitoring animal populations is central to wildlife and fisheries management, and the use of N-mixture models toward these efforts has markedly increased in recent years. Nevertheless, relatively little work has evaluated estimator performance when basic assumptions are violated. Moreover, diagnostics to identify when bias in parameter estimates from N-mixture models is likely are largely unexplored. We simulated count data sets using 837 combinations of detection probability, number of sample units, number of survey occasions, and type and extent of heterogeneity in abundance or detectability. We fit Poisson N-mixture models to these data, quantified the bias associated with each combination, and evaluated whether the parametric bootstrap goodness-of-fit (GOF) test can be used to indicate bias in parameter estimates. We also explored whether assumption violations can be diagnosed prior to fitting N-mixture models. In doing so, we propose a new model diagnostic, which we term the quasi-coefficient of variation (QCV). N-mixture models performed well when assumptions were met and detection probabilities were moderate (i.e., ≥0.3), and the performance of the estimator improved with increasing survey occasions and sample units. However, the magnitude of bias in estimated mean abundance with even slight amounts of unmodeled heterogeneity was substantial. The parametric bootstrap GOF test did not perform well as a diagnostic for bias in parameter estimates when detectability and sample sizes were low. The results indicate the QCV is useful to diagnose potential bias and that potential bias associated with unidirectional trends in abundance or detectability can be diagnosed using Poisson regression. This study represents the most thorough assessment to date of assumption violations and diagnostics when fitting N-mixture models using the most commonly implemented error distribution.
Unbiased estimates of population state variables are needed to properly inform management decision making. Therefore, we also discuss alternative approaches to yield unbiased estimates of population state variables using similar data types, and we stress that there is no substitute for an effective sample design that is grounded upon well-defined management objectives.
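As a minimal illustration of the Poisson N-mixture likelihood evaluated in this study, the sketch below simulates homogeneous count data and maximizes the marginal likelihood by a coarse grid search (all settings invented; the authors' 837-combination design is not reproduced):

```python
import numpy as np
from scipy.stats import poisson, binom

rng = np.random.default_rng(3)
M, T, lam_true, p_true = 200, 5, 5.0, 0.5          # sites, occasions, abundance, detection
N = rng.poisson(lam_true, M)                        # latent site abundances
y = rng.binomial(N[:, None], p_true, size=(M, T))   # counts per site and occasion

K = 40                      # truncation point of the latent-abundance sum
Ns = np.arange(K + 1)

def nll(lam, p):
    # Site likelihood: sum_N Poisson(N | lam) * prod_t Binomial(y_t | N, p)
    prior = poisson.pmf(Ns, lam)
    per_occ = binom.pmf(y[:, :, None], Ns[None, None, :], p)
    site_lik = (per_occ.prod(axis=1) * prior).sum(axis=1)
    return -np.log(site_lik).sum()

# Coarse grid search over (lambda, p) stands in for a full optimizer.
lams = np.arange(3.0, 8.01, 0.2)
ps = np.arange(0.30, 0.71, 0.05)
best = min(((nll(l, q), l, q) for l in lams for q in ps))
print(best[1], best[2])   # near the true values 5.0 and 0.5
```

With the model's assumptions met (no unmodeled heterogeneity), the maximum likelihood recovers the generating values; the study's point is that even slight heterogeneity breaks this.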

  16. Simultaneous unbiased estimates of multiple downed wood attributes in perpendicular distance sampling

    Treesearch

    Mark J. Ducey; Jeffrey H. Gove; Harry T. Valentine

    2008-01-01

    Perpendicular distance sampling (PDS) is a fast probability-proportional-to-size method for inventory of downed wood. However, previous development of PDS had limited the method to estimating only one variable (such as volume per hectare, or surface area per hectare) at a time. Here, we develop a general design-unbiased estimator for PDS. We then show how that...

  17. Current and efficiency of Brownian particles under oscillating forces in entropic barriers

    NASA Astrophysics Data System (ADS)

Nutku, Ferhat; Aydıner, Ekrem

    2015-04-01

In this study, considering a temporally unbiased force and different forms of oscillating forces, we investigate the current and efficiency of Brownian particles in an entropic tube structure and present numerically obtained results. We show that different force forms give rise to different current and efficiency profiles in different optimized parameter intervals. We find that the current and efficiency produced by an unbiased oscillating force and an unbiased temporal force depend on the force parameters. We also observe that the current and efficiency caused by temporal and different oscillating forces have maximum and minimum values in different parameter intervals. We conclude that the current or efficiency can be controlled dynamically by adjusting the parameters of the entropic barriers and the applied force. Project supported by the Funds from Istanbul University (Grant No. 45662).

  18. Statistical and population genetics issues of two Hungarian datasets from the aspect of DNA evidence interpretation.

    PubMed

    Szabolcsi, Zoltán; Farkas, Zsuzsa; Borbély, Andrea; Bárány, Gusztáv; Varga, Dániel; Heinrich, Attila; Völgyi, Antónia; Pamjav, Horolma

    2015-11-01

When the DNA profile from a crime scene matches that of a suspect, the weight of the DNA evidence depends on the unbiased estimation of the match probability of the profiles. For this reason, it is necessary to establish and expand databases that reflect the actual allele frequencies in the population in question. 21,473 complete DNA profiles from Databank samples were used to establish the allele frequency database to represent the population of Hungarian suspects. We used fifteen STR loci (PowerPlex ESI16), including five new ESS loci. The aim was to calculate the statistical and forensic efficiency parameters for the Databank samples and compare the newly detected data to the earlier report. The population substructure caused by relatedness may influence the estimated frequency of profiles. As our Databank profiles were considered non-random samples, possible relationships between the suspects can be assumed. Therefore, the population inbreeding effect was estimated using the FIS calculation. The overall inbreeding parameter was found to be 0.0106. Furthermore, we tested the impact of the two allele frequency datasets on 101 randomly chosen STR profiles, including full and partial profiles. The 95% confidence interval estimates for the profile frequencies (pM) resulted in a tighter range when we used the new dataset compared to the previously published ones. We found that the FIS had less effect on frequency values in the 21,473 samples than the application of minimum allele frequency. No genetic substructure was detected by STRUCTURE analysis. Due to the low level of inbreeding effect and the high number of samples, the new dataset provides unbiased and precise estimates of LR for statistical interpretation of forensic casework and allows us to use lower allele frequencies. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
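The inbreeding-corrected genotype frequencies underlying such match-probability calculations follow the standard population-genetics formulas p² + p(1 − p)F for homozygotes and 2pq(1 − F) for heterozygotes. A sketch with made-up allele frequencies and the F_IS value reported above:

```python
# Genotype frequencies with an inbreeding correction F (standard formulas).
def genotype_freq(p, q=None, f=0.0106):
    if q is None:                           # homozygote A_iA_i
        return p * p + p * (1.0 - p) * f
    return 2.0 * p * q * (1.0 - f)          # heterozygote A_iA_j

# Hypothetical single-source profile: (p, q) per locus, q=None for homozygotes.
# These allele frequencies are invented for illustration.
profile = [(0.12, 0.08), (0.21, None), (0.09, 0.30), (0.17, None), (0.11, 0.25)]

rmp = 1.0
for alleles in profile:
    rmp *= genotype_freq(*alleles)
print(rmp)   # random-match probability across the five (hypothetical) loci
```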

  19. Statistics as Unbiased Estimators: Exploring the Teaching of Standard Deviation

    ERIC Educational Resources Information Center

    Wasserman, Nicholas H.; Casey, Stephanie; Champion, Joe; Huey, Maryann

    2017-01-01

    This manuscript presents findings from a study about the knowledge for and planned teaching of standard deviation. We investigate how understanding variance as an unbiased (inferential) estimator--not just a descriptive statistic for the variation (spread) in data--is related to teachers' instruction regarding standard deviation, particularly…

  20. Population genetics of polymorphism and divergence for diploid selection models with arbitrary dominance.

    PubMed

    Williamson, Scott; Fledel-Alon, Adi; Bustamante, Carlos D

    2004-09-01

    We develop a Poisson random-field model of polymorphism and divergence that allows arbitrary dominance relations in a diploid context. This model provides a maximum-likelihood framework for estimating both selection and dominance parameters of new mutations using information on the frequency spectrum of sequence polymorphisms. This is the first DNA sequence-based estimator of the dominance parameter. Our model also leads to a likelihood-ratio test for distinguishing nongenic from genic selection; simulations indicate that this test is quite powerful when a large number of segregating sites are available. We also use simulations to explore the bias in selection parameter estimates caused by unacknowledged dominance relations. When inference is based on the frequency spectrum of polymorphisms, genic selection estimates of the selection parameter can be very strongly biased even for minor deviations from the genic selection model. Surprisingly, however, when inference is based on polymorphism and divergence (McDonald-Kreitman) data, genic selection estimates of the selection parameter are nearly unbiased, even for completely dominant or recessive mutations. Further, we find that weak overdominant selection can increase, rather than decrease, the substitution rate relative to levels of polymorphism. This nonintuitive result has major implications for the interpretation of several popular tests of neutrality.

  1. Investigation of Procedures for Automatic Resonance Extraction from Noisy Transient Electromagnetics Data. Volume III. Translation of Prony’s Original Paper and Bibliography of Prony’s Method

    DTIC Science & Technology

    1981-08-17

Van Blaricum, "On the Source of Parameter Bias in Prony’s Method," 1980 NEM Conference, Disneyland Hotel, August 1980. Auton, J.R., "An Unbiased...Method for the Estimation of the SEM Parameters of an Electromagnetic System," 1980 NEM Conference, Disneyland Hotel, August 1980. Auton, J.R. and M.L..." 1980 NEM Conference, Disneyland Hotel, August 5-7, 1980. Chuang, C.W. and D.L. Moffatt, "Complex Natural Resonances of Radar Targets via Prony’s

  2. The dependability of medical students' performance ratings as documented on in-training evaluations.

    PubMed

    van Barneveld, Christina

    2005-03-01

    To demonstrate an approach to obtain an unbiased estimate of the dependability of students' performance ratings during training, when the data-collection design includes nesting of student in rater, unbalanced nest sizes, and dependent observations. In 2003, two variance components analyses of in-training evaluation (ITE) report data were conducted using urGENOVA software. In the first analysis, the dependability for the nested and unbalanced data-collection design was calculated. In the second analysis, an approach using multiple generalizability studies was used to obtain an unbiased estimate of the student variance component, resulting in an unbiased estimate of dependability. Results suggested that there is bias in estimates of the dependability of students' performance on ITEs that are attributable to the data-collection design. When the bias was corrected, the results indicated that the dependability of ratings of student performance was almost zero. The combination of the multiple generalizability studies method and the use of specialized software provides an unbiased estimate of the dependability of ratings of student performance on ITE scores for data-collection designs that include nesting of student in rater, unbalanced nest sizes, and dependent observations.

  3. Unbiased estimators for spatial distribution functions of classical fluids

    NASA Astrophysics Data System (ADS)

    Adib, Artur B.; Jarzynski, Christopher

    2005-01-01

We use a statistical-mechanical identity, closely related to the familiar virial theorem, to derive unbiased estimators for spatial distribution functions of classical fluids. In particular, we obtain estimators for both the fluid density ρ(r) in the vicinity of a fixed solute and the pair correlation g(r) of a homogeneous classical fluid. We illustrate the utility of our estimators with numerical examples, which reveal advantages over traditional histogram-based methods of computing such distributions.
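For contrast, the traditional histogram-based estimator of g(r) that the authors compare against can be sketched on an ideal-gas configuration, where g(r) ≈ 1 everywhere (box size, particle number, and binning invented):

```python
import numpy as np

rng = np.random.default_rng(4)
L, n = 10.0, 300
pos = rng.uniform(0.0, L, (n, 3))      # ideal-gas configuration in a periodic box

# All pair distances with the minimum-image convention.
diff = pos[:, None, :] - pos[None, :, :]
diff -= L * np.round(diff / L)
r = np.sqrt((diff ** 2).sum(-1))[np.triu_indices(n, 1)]

# Histogram estimator: pair counts normalized by the ideal-shell expectation.
edges = np.linspace(0.2, 4.0, 20)
counts, _ = np.histogram(r, edges)
rho = n / L ** 3
shells = 4.0 / 3.0 * np.pi * (edges[1:] ** 3 - edges[:-1] ** 3)
g = counts / (0.5 * n * rho * shells)

print(g.round(2))   # bins scatter around 1 for an ideal gas
```

The scatter in the narrow inner bins, where few pairs fall, is exactly the kind of histogram noise the virial-theorem-based estimators are designed to avoid.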

  4. Toward unbiased estimations of the statefinder parameters

    NASA Astrophysics Data System (ADS)

    Aviles, Alejandro; Klapp, Jaime; Luongo, Orlando

    2017-09-01

With the use of simulated supernova catalogs, we show that the statefinder parameters are poorly estimated, and with considerable bias, by standard cosmography. To this end, we compute their standard deviations and several bias statistics on cosmologies near the concordance model, demonstrating that these are very large, making standard cosmography unsuitable for future and wider compilations of data. To overcome this issue, we propose a new method that consists in introducing the series of the Hubble function into the luminosity distance, instead of considering the usual direct Taylor expansions of the luminosity distance. Moreover, in order to speed up the numerical computations, we estimate the coefficients of our expansions in a hierarchical manner, in which the order of the expansion depends on the redshift of every single piece of data. In addition, we propose two hybrid methods that incorporate standard cosmography at low redshifts. The methods presented here perform better than the standard approach of cosmography in both the errors and the bias of the estimated statefinders. We further propose a one-parameter diagnostic to reject non-viable methods in cosmography.
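The key idea, expanding the Hubble function and integrating it inside the luminosity distance rather than Taylor-expanding d_L directly, can be sketched against an assumed flat ΛCDM background (parameter values invented for the demo):

```python
import numpy as np
from scipy.integrate import quad

# Exact flat-LCDM background, used only to generate "truth" for the demo.
H0, Om, c = 70.0, 0.3, 299792.458
H = lambda z: H0 * np.sqrt(Om * (1 + z) ** 3 + 1 - Om)
dL_exact = lambda z: (1 + z) * c * quad(lambda u: 1.0 / H(u), 0.0, z)[0]

# Expand H(z) itself to second order about z = 0 (finite differences),
# then integrate 1/H_series -- instead of Taylor-expanding d_L directly.
eps = 1e-4
h1 = (H(eps) - H(-eps)) / (2 * eps) / H0
h2 = (H(eps) - 2 * H(0.0) + H(-eps)) / eps ** 2 / H0
H_series = lambda z: H0 * (1 + h1 * z + 0.5 * h2 * z ** 2)
dL_series = lambda z: (1 + z) * c * quad(lambda u: 1.0 / H_series(u), 0.0, z)[0]

z = 0.5
print(dL_exact(z), dL_series(z))   # agree to well under 1%
```

Because the series sits inside the integral, truncation errors in H(z) are smoothed over, which is part of why this expansion behaves better than a direct Taylor series of d_L at moderate redshift.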

  5. Supersensitive ancilla-based adaptive quantum phase estimation

    NASA Astrophysics Data System (ADS)

    Larson, Walker; Saleh, Bahaa E. A.

    2017-10-01

    The supersensitivity attained in quantum phase estimation is known to be compromised in the presence of decoherence. This is particularly patent at blind spots—phase values at which sensitivity is totally lost. One remedy is to use a precisely known reference phase to shift the operation point to a less vulnerable phase value. Since this is not always feasible, we present here an alternative approach based on combining the probe with an ancillary degree of freedom containing adjustable parameters to create an entangled quantum state of higher dimension. We validate this concept by simulating a configuration of a Mach-Zehnder interferometer with a two-photon probe and a polarization ancilla of adjustable parameters, entangled at a polarizing beam splitter. At the interferometer output, the photons are measured after an adjustable unitary transformation in the polarization subspace. Through calculation of the Fisher information and simulation of an estimation procedure, we show that optimizing the adjustable polarization parameters using an adaptive measurement process provides globally supersensitive unbiased phase estimates for a range of decoherence levels, without prior information or a reference phase.

  6. Signal detection theory and vestibular perception: III. Estimating unbiased fit parameters for psychometric functions.

    PubMed

    Chaudhuri, Shomesh E; Merfeld, Daniel M

    2013-03-01

    Psychophysics generally relies on estimating a subject's ability to perform a specific task as a function of an observed stimulus. For threshold studies, the fitted functions are called psychometric functions. While fitting psychometric functions to data acquired using adaptive sampling procedures (e.g., "staircase" procedures), investigators have encountered a bias in the spread ("slope" or "threshold") parameter that has been attributed to the serial dependency of the adaptive data. Using simulations, we confirm this bias for cumulative Gaussian parametric maximum likelihood fits on data collected via adaptive sampling procedures, and then present a bias-reduced maximum likelihood fit that substantially reduces the bias without reducing the precision of the spread parameter estimate and without reducing the accuracy or precision of the other fit parameters. As a separate topic, we explain how to implement this bias reduction technique using generalized linear model fits as well as other numeric maximum likelihood techniques such as the Nelder-Mead simplex. We then provide a comparison of the iterative bootstrap and observed information matrix techniques for estimating parameter fit variance from adaptive sampling procedure data sets. The iterative bootstrap technique is shown to be slightly more accurate; however, the observed information technique executes in a small fraction (0.005 %) of the time required by the iterative bootstrap technique, which is an advantage when a real-time estimate of parameter fit variance is required.
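A plain (non-adaptive) version of the fitting problem can be sketched as follows: a maximum likelihood fit of a cumulative Gaussian psychometric function to simulated binary responses. The authors' bias-reduction technique is not reproduced here; with fixed stimulus sampling (all settings invented), the ordinary MLE behaves well:

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(7)
mu_true, sigma_true = 0.0, 2.0                   # threshold and spread (invented)
x = rng.uniform(-6.0, 6.0, 400)                  # fixed, non-adaptive stimulus levels
y = rng.random(400) < norm.cdf(x, mu_true, sigma_true)   # simulated binary responses

def nll(theta):
    # Negative log-likelihood of a cumulative Gaussian psychometric function.
    mu, log_sigma = theta
    p = np.clip(norm.cdf(x, mu, np.exp(log_sigma)), 1e-9, 1 - 1e-9)
    return -np.where(y, np.log(p), np.log(1.0 - p)).sum()

fit = minimize(nll, x0=[0.5, 0.0], method="Nelder-Mead")
mu_hat, sigma_hat = fit.x[0], np.exp(fit.x[1])
print(mu_hat, sigma_hat)   # near 0 and 2
```

The spread bias the abstract describes arises when x is chosen by an adaptive staircase rather than sampled independently as above.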

  7. Conservative classical and quantum resolution limits for incoherent imaging

    NASA Astrophysics Data System (ADS)

    Tsang, Mankei

    2018-06-01

I propose classical and quantum limits to the statistical resolution of two incoherent optical point sources from the perspective of minimax parameter estimation. Unlike earlier results based on the Cramér-Rao bound (CRB), the limits proposed here, based on the worst-case error criterion and a Bayesian version of the CRB, are valid for any biased or unbiased estimator and obey photon-number scalings that are consistent with the behaviours of actual estimators. These results prove that, from the minimax perspective, the spatial-mode demultiplexing measurement scheme recently proposed by Tsang, Nair, and Lu [Phys. Rev. X 6, 031033 (2016)] remains superior to direct imaging for sufficiently high photon numbers.

  8. Absolute magnitude calibration using trigonometric parallax - Incomplete, spectroscopic samples

    NASA Technical Reports Server (NTRS)

    Ratnatunga, Kavan U.; Casertano, Stefano

    1991-01-01

    A new numerical algorithm is used to calibrate the absolute magnitude of spectroscopically selected stars from their observed trigonometric parallax. This procedure, based on maximum-likelihood estimation, can retrieve unbiased estimates of the intrinsic absolute magnitude and its dispersion even from incomplete samples suffering from selection biases in apparent magnitude and color. It can also make full use of low accuracy and negative parallaxes and incorporate censorship on reported parallax values. Accurate error estimates are derived for each of the fitted parameters. The algorithm allows an a posteriori check of whether the fitted model gives a good representation of the observations. The procedure is described in general and applied to both real and simulated data.

  9. Five instruments for measuring tree height: an evaluation

    Treesearch

    Michael S. Williams; William A. Bechtold; V.J. LaBau

    1994-01-01

    Five instruments were tested for reliability in measuring tree heights under realistic conditions. Four linear models were used to determine if tree height can be measured unbiasedly over all tree sizes and if any of the instruments were more efficient in estimating tree height. The laser height finder was the only instrument to produce unbiased estimates of the true...

  10. Calculation of Weibull strength parameters and Batdorf flow-density constants for volume- and surface-flaw-induced fracture in ceramics

    NASA Technical Reports Server (NTRS)

    Pai, Shantaram S.; Gyekenyesi, John P.

    1988-01-01

The calculation of shape and scale parameters of the two-parameter Weibull distribution is described using the least-squares analysis and maximum likelihood methods for volume- and surface-flaw-induced fracture in ceramics with complete and censored samples. Detailed procedures are given for evaluating 90 percent confidence intervals for maximum likelihood estimates of shape and scale parameters, the unbiased estimates of the shape parameters, and the Weibull mean values and corresponding standard deviations. Furthermore, the necessary steps are described for detecting outliers and for calculating the Kolmogorov-Smirnov and the Anderson-Darling goodness-of-fit statistics and 90 percent confidence bands about the Weibull distribution. It also shows how to calculate the Batdorf flaw-density constants by using the Weibull distribution statistical parameters. The techniques described were verified with several example problems from the open literature, and were coded in the Structural Ceramics Analysis and Reliability Evaluation (SCARE) design program.

  11. A Simple Joint Estimation Method of Residual Frequency Offset and Sampling Frequency Offset for DVB Systems

    NASA Astrophysics Data System (ADS)

    Kwon, Ki-Won; Cho, Yongsoo

This letter presents a simple joint estimation method for residual frequency offset (RFO) and sampling frequency offset (SFO) in OFDM-based digital video broadcasting (DVB) systems. The proposed method selects a continual pilot (CP) subset from an unsymmetrically and non-uniformly distributed CP set to obtain an unbiased estimator. Simulation results show that the proposed method using a properly selected CP subset is unbiased and performs robustly.
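The joint estimation idea can be sketched with the standard phase-slope model: the per-symbol phase increment on continual pilot k is approximately linear in k, so the intercept of a least-squares line gives the RFO term and the slope gives the sampling-offset term. The pilot indices and offset values below are invented, not the DVB CP set:

```python
import numpy as np

rng = np.random.default_rng(5)
k = np.array([-45.0, -30.0, -12.0, 7.0, 22.0, 40.0])  # hypothetical pilot indices
eps_true, zeta_true = 0.01, 2e-5     # normalized residual CFO and sampling offset

# Per-symbol phase increment on pilot k: 2*pi*(eps + zeta*k), plus phase noise.
theta = 2 * np.pi * (eps_true + zeta_true * k) + rng.normal(0.0, 0.001, k.size)

# Ordinary least squares of phase vs pilot index: intercept -> RFO, slope -> SFO.
A = np.column_stack([np.ones_like(k), k])
(b0, b1), *_ = np.linalg.lstsq(A, theta, rcond=None)
eps_hat, zeta_hat = b0 / (2 * np.pi), b1 / (2 * np.pi)
print(eps_hat, zeta_hat)   # close to 0.01 and 2e-5
```

With a symmetric pilot set the intercept and slope estimates decouple; the letter's contribution concerns choosing a subset of an unsymmetric pilot set so the estimator stays unbiased.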

12. Model Reduction via Principal Component Analysis and Markov Chain Monte Carlo (MCMC) Methods

    NASA Astrophysics Data System (ADS)

    Gong, R.; Chen, J.; Hoversten, M. G.; Luo, J.

    2011-12-01

Geophysical and hydrogeological inverse problems often include a large number of unknown parameters, ranging from hundreds to millions, depending on the parameterization and the problem undertaken. This makes inverse estimation and uncertainty quantification very challenging, especially for problems in two- or three-dimensional spatial domains. Model reduction techniques have the potential to mitigate the curse of dimensionality by reducing the total number of unknowns while still describing the complex subsurface system adequately. In this study, we explore the use of principal component analysis (PCA) and Markov chain Monte Carlo (MCMC) sampling methods for model reduction through the use of synthetic datasets. We compare the performances of three different but closely related model reduction approaches: (1) PCA with geometric sampling (referred to as 'Method 1'), (2) PCA with MCMC sampling (referred to as 'Method 2'), and (3) PCA with MCMC sampling and inclusion of random effects (referred to as 'Method 3'). We consider a simple convolution model with five unknown parameters, as our goal is to understand and visualize the advantages and disadvantages of each method by comparing their inversion results with the corresponding analytical solutions. We generated synthetic data with noise added and inverted them under two different situations: (1) the noised data and the covariance matrix for PCA analysis are consistent (referred to as the unbiased case), and (2) the noised data and the covariance matrix are inconsistent (referred to as the biased case). In the unbiased case, comparison between the analytical solutions and the inversion results shows that all three methods provide good estimates of the true values, and Method 1 is computationally more efficient.
In terms of uncertainty quantification, Method 1 performs poorly because of the relatively small number of samples obtained, Method 2 performs best, and Method 3 overestimates uncertainty due to the inclusion of random effects. However, in the biased case, only Method 3 correctly estimates all the unknown parameters, and Methods 1 and 2 both provide wrong values for the biased parameters. The synthetic case study demonstrates that if the covariance matrix for PCA analysis is inconsistent with the true model, the PCA methods with geometric or MCMC sampling will provide incorrect estimates.
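The PCA reduction step common to all three methods can be sketched as follows (a generic sketch with an assumed exponential covariance, not the authors' convolution model):

```python
import numpy as np

rng = np.random.default_rng(6)
n_cells, n_real = 50, 500
x = np.arange(n_cells)
cov = np.exp(-np.abs(x[:, None] - x[None, :]) / 10.0)   # assumed correlated 1-D field
fields = rng.multivariate_normal(np.zeros(n_cells), cov, n_real)

# PCA: keep the few leading eigenvectors of the sample covariance as the reduced basis.
C = np.cov(fields.T)
vals, vecs = np.linalg.eigh(C)
order = np.argsort(vals)[::-1]
kept = 5
basis = vecs[:, order[:kept]]               # 50 unknowns -> 5 PCA coefficients
coeffs = fields @ basis
recon = coeffs @ basis.T                    # reconstruction from the reduced model
explained = vals[order[:kept]].sum() / vals.sum()
print(explained)   # fraction of variance captured by 5 components
```

Inversion then proceeds over the few PCA coefficients (by geometric or MCMC sampling) instead of the full grid of unknowns; the biased case above corresponds to building `cov` from the wrong model family.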

  13. Mutually unbiased bases and semi-definite programming

    NASA Astrophysics Data System (ADS)

    Brierley, Stephen; Weigert, Stefan

    2010-11-01

    A complex Hilbert space of dimension six supports at least three but not more than seven mutually unbiased bases. Two computer-aided analytical methods to tighten these bounds are reviewed, based on a discretization of parameter space and on Gröbner bases. A third algorithmic approach is presented: the non-existence of more than three mutually unbiased bases in composite dimensions can be decided by a global optimization method known as semidefinite programming. The method is used to confirm that the spectral matrix cannot be part of a complete set of seven mutually unbiased bases in dimension six.

  14. Time-Varying Delay Estimation Applied to the Surface Electromyography Signals Using the Parametric Approach

    NASA Astrophysics Data System (ADS)

    Luu, Gia Thien; Boualem, Abdelbassit; Duy, Tran Trung; Ravier, Philippe; Butteli, Olivier

    Muscle fiber conduction velocity (MFCV) can be calculated from the time delay between surface electromyographic (sEMG) signals recorded by electrodes aligned with the fiber direction. To account for the non-stationarity of the data during dynamic contraction (the most common situation in daily life), methods must allow the MFCV to change over time, which induces time-varying delays (TVDs), and the data to be non-stationary (changing power spectral density, PSD). In this paper, the problem of TVD estimation is addressed with a parametric method. First, a polynomial model of the TVD is proposed. The TVD model parameters are then estimated by maximum likelihood estimation (MLE), solved both by a deterministic optimization technique (Newton) and by a stochastic one, simulated annealing (SA), and the performance of the two techniques is compared. We also derive two appropriate Cramer-Rao lower bounds (CRLBs), one for the estimated TVD model parameters and one for the TVD waveforms. Monte Carlo simulation results show that the estimation of both the model parameters and the TVD function is unbiased and that the variances obtained are close to the derived CRLBs. A comparison with non-parametric approaches to TVD estimation is also presented and shows the superiority of the proposed method.
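    As a rough illustration of the parametric approach described above, the sketch below fits a first-order polynomial delay model d(t) = a0 + a1*t by maximizing a Gaussian likelihood, which reduces to least squares; a coarse grid search stands in for the paper's Newton and simulated annealing optimizers, and all signal shapes and parameter values are invented for the example.

```python
import numpy as np

# Synthetic two-channel signal with a linear time-varying delay (illustrative).
rng = np.random.default_rng(1)
t = np.linspace(0.0, 1.0, 500)
s1 = np.sin(2 * np.pi * 5 * t) * np.exp(-3 * t)      # reference channel

a0_true, a1_true = 0.02, 0.03                        # true delay parameters
delay = a0_true + a1_true * t
s2 = np.interp(t - delay, t, s1) + 0.01 * rng.standard_normal(t.size)

def sse(a0, a1):
    """Sum of squared residuals: the ML criterion under Gaussian noise."""
    pred = np.interp(t - (a0 + a1 * t), t, s1)
    return np.sum((s2 - pred) ** 2)

# Grid search over the two delay parameters (stand-in for Newton / SA).
grid = np.linspace(0.0, 0.05, 51)
a0_hat, a1_hat = min(
    ((a0, a1) for a0 in grid for a1 in grid), key=lambda p: sse(*p)
)
print(f"estimated delay parameters: a0={a0_hat:.3f}, a1={a1_hat:.3f}")
```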

  15. Angular motion estimation using dynamic models in a gyro-free inertial measurement unit.

    PubMed

    Edwan, Ezzaldeen; Knedlik, Stefan; Loffeld, Otmar

    2012-01-01

    In this paper, we summarize the results of using dynamic models borrowed from tracking theory to describe the time evolution of the state vector and thereby estimate the angular motion in a gyro-free inertial measurement unit (GF-IMU). The GF-IMU is a special type of inertial measurement unit (IMU) that uses only a set of accelerometers to infer the angular motion. Using distributed accelerometers, we obtain an angular information vector (AIV) composed of angular acceleration and quadratic angular velocity terms. We use a Kalman filter to estimate the angular velocity vector, since it is not expressed explicitly within the AIV. The bias parameters inherent in the accelerometer measurements produce a biased AIV, so the AIV bias parameters are estimated within an augmented state vector. Using dynamic models, the appended bias parameters of the AIV become observable and an unbiased angular motion estimate can be obtained. Moreover, a good model is required to extract the maximum amount of information from the observations. An observability analysis is carried out to determine the conditions for an observable state-space model. For higher grades of accelerometers and at relatively high sampling frequencies, the error of the accelerometer measurements is dominated by noise. Consequently, simulations are conducted on two models: one with the bias parameters appended to the state-space model, and a reduced model without bias parameters.
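    A toy linear analogue of the bias-observability point (this is not the GF-IMU model; the AR(1) dynamics and all numbers are invented for illustration): a measurement z = alpha + b cannot separate alpha from a constant sensor bias b on its own, but once a known dynamic model ties alpha across time steps, an augmented-state Kalman filter recovers the bias.

```python
import numpy as np

# Augmented-state Kalman filter: state [alpha, b], measurement z = alpha + b.
rng = np.random.default_rng(2)
phi, q, r = 0.95, 1e-4, 0.01             # AR(1) dynamics, process/meas. noise
alpha_true, bias_true = 2.0, 0.5

F = np.array([[phi, 0.0], [0.0, 1.0]])   # bias modeled as constant
H = np.array([[1.0, 1.0]])
Q = np.diag([q, 0.0])
x = np.zeros(2)                          # initial state estimate
P = np.eye(2) * 10.0                     # large initial uncertainty

alpha = alpha_true
for _ in range(400):
    alpha = phi * alpha + np.sqrt(q) * rng.standard_normal()
    z = alpha + bias_true + np.sqrt(r) * rng.standard_normal()
    # predict
    x = F @ x
    P = F @ P @ F.T + Q
    # update
    S = H @ P @ H.T + r
    K = P @ H.T / S
    x = x + (K * (z - H @ x)).ravel()
    P = (np.eye(2) - K @ H) @ P

print(f"estimated bias: {x[1]:.3f} (true {bias_true})")
```

    With phi = 1 (a pure random walk, i.e. no informative dynamics) the same filter cannot separate the two states, which mirrors the abstract's observability argument.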

  16. Angular Motion Estimation Using Dynamic Models in a Gyro-Free Inertial Measurement Unit

    PubMed Central

    Edwan, Ezzaldeen; Knedlik, Stefan; Loffeld, Otmar

    2012-01-01

    In this paper, we summarize the results of using dynamic models borrowed from tracking theory to describe the time evolution of the state vector and thereby estimate the angular motion in a gyro-free inertial measurement unit (GF-IMU). The GF-IMU is a special type of inertial measurement unit (IMU) that uses only a set of accelerometers to infer the angular motion. Using distributed accelerometers, we obtain an angular information vector (AIV) composed of angular acceleration and quadratic angular velocity terms. We use a Kalman filter to estimate the angular velocity vector, since it is not expressed explicitly within the AIV. The bias parameters inherent in the accelerometer measurements produce a biased AIV, so the AIV bias parameters are estimated within an augmented state vector. Using dynamic models, the appended bias parameters of the AIV become observable and an unbiased angular motion estimate can be obtained. Moreover, a good model is required to extract the maximum amount of information from the observations. An observability analysis is carried out to determine the conditions for an observable state-space model. For higher grades of accelerometers and at relatively high sampling frequencies, the error of the accelerometer measurements is dominated by noise. Consequently, simulations are conducted on two models: one with the bias parameters appended to the state-space model, and a reduced model without bias parameters. PMID:22778586

  17. Critical point relascope sampling for unbiased volume estimation of downed coarse woody debris

    Treesearch

    Jeffrey H. Gove; Michael S. Williams; Mark J. Ducey; Mark J. Ducey

    2005-01-01

    Critical point relascope sampling is developed and shown to be design-unbiased for the estimation of log volume when used with point relascope sampling for downed coarse woody debris. The method is closely related to critical height sampling for standing trees when trees are first sampled with a wedge prism. Three alternative protocols for determining the critical...

  18. Type Ia Supernova Intrinsic Magnitude Dispersion and the Fitting of Cosmological Parameters

    NASA Astrophysics Data System (ADS)

    Kim, A. G.

    2011-02-01

    I present an analysis for fitting cosmological parameters from a Hubble diagram of a standard candle with unknown intrinsic magnitude dispersion. The dispersion is determined from the data, simultaneously with the cosmological parameters. This contrasts with the strategies used to date. The advantages of the presented analysis are that it is done in a single fit (it is not iterative), it provides a statistically founded and unbiased estimate of the intrinsic dispersion, and its cosmological-parameter uncertainties account for the intrinsic-dispersion uncertainty. Applied to Type Ia supernovae, my strategy provides a statistical measure to test for subtypes and assess the significance of any magnitude corrections applied to the calibrated candle. Parameter bias and differences between likelihood distributions produced by the presented and currently used fitters are negligibly small for existing and projected supernova data sets.
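    The single-fit idea can be sketched with synthetic data (all values are illustrative assumptions): for each trial value of the intrinsic dispersion, the optimal candle magnitude mu is the inverse-variance-weighted mean, so the joint maximum likelihood fit reduces here to a one-dimensional scan over sigma_int.

```python
import numpy as np

# Joint fit of a constant candle magnitude and its intrinsic dispersion.
rng = np.random.default_rng(3)
n = 200
sigma_meas = rng.uniform(0.05, 0.15, n)       # per-object measurement errors
mu_true, sig_int_true = -19.3, 0.10
mags = mu_true + rng.standard_normal(n) * np.sqrt(sigma_meas**2 + sig_int_true**2)

best = None
for sig in np.linspace(0.0, 0.3, 301):
    var = sigma_meas**2 + sig**2              # total variance per object
    mu = np.sum(mags / var) / np.sum(1.0 / var)   # profile MLE of mu given sig
    nll = 0.5 * np.sum((mags - mu) ** 2 / var + np.log(var))
    if best is None or nll < best[0]:
        best = (nll, mu, sig)

_, mu_hat, sig_hat = best
print(f"mu = {mu_hat:.3f}, sigma_int = {sig_hat:.3f}")
```

    Because sigma_int enters the per-object weights, the uncertainty on mu automatically reflects the uncertainty on the dispersion, which is the point the abstract makes against iterative two-step fitters.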

  19. Independent contrasts and PGLS regression estimators are equivalent.

    PubMed

    Blomberg, Simon P; Lefevre, James G; Wells, Jessie A; Waterhouse, Mary

    2012-05-01

    We prove that the slope parameter of the ordinary least-squares regression of phylogenetically independent contrasts (PICs) conducted through the origin is identical to the slope parameter of generalized least-squares (GLS) regression under a Brownian motion model of evolution. This equivalence has several implications: 1. Understanding the structure of the linear model for GLS regression provides insight into when and why phylogeny is important in comparative studies. 2. The limitations of the PIC regression analysis are the same as the limitations of the GLS model. In particular, phylogenetic covariance applies only to the response variable in the regression, and the explanatory variable should be regarded as fixed. Calculation of PICs for explanatory variables should be treated as a mathematical idiosyncrasy of the PIC regression algorithm. 3. Since the GLS estimator is the best linear unbiased estimator (BLUE), the slope parameter estimated using PICs is also BLUE. 4. If the slope is estimated using different branch lengths for the explanatory and response variables in the PIC algorithm, the estimator is no longer the BLUE, so this is not recommended. Finally, we discuss whether, and how, to accommodate phylogenetic covariance in regression analyses, particularly in relation to the problem of phylogenetic uncertainty. This discussion is presented from both frequentist and Bayesian perspectives.
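    The claimed equivalence is easy to check numerically on a small balanced tree ((A:1,B:1):1,(C:1,D:1):1); the contrasts below follow Felsenstein's pruning algorithm hard-coded for this tree, and the data are random draws rather than real comparative data.

```python
import numpy as np

# Brownian-motion covariance implied by the tree (shared path lengths).
rng = np.random.default_rng(4)
V = np.array([[2.0, 1.0, 0.0, 0.0],
              [1.0, 2.0, 0.0, 0.0],
              [0.0, 0.0, 2.0, 1.0],
              [0.0, 0.0, 1.0, 2.0]])

def contrasts(v):
    """Felsenstein's pruning algorithm, hard-coded for this 4-tip tree."""
    a, b, c, d = v
    c1 = (a - b) / np.sqrt(2.0)            # contrast within (A,B), var 1+1
    c2 = (c - d) / np.sqrt(2.0)            # contrast within (C,D)
    u1, u2 = (a + b) / 2.0, (c + d) / 2.0  # ancestral node values
    # each internal branch (length 1) is extended by 1*1/(1+1) = 0.5
    c3 = (u1 - u2) / np.sqrt(1.5 + 1.5)
    return np.array([c1, c2, c3])

x = rng.standard_normal(4)
y = 0.7 * x + np.linalg.cholesky(V) @ rng.standard_normal(4)

cx, cy = contrasts(x), contrasts(y)
slope_pic = (cx @ cy) / (cx @ cx)          # OLS through the origin

X = np.column_stack([np.ones(4), x])       # GLS regression with intercept
Vi = np.linalg.inv(V)
beta = np.linalg.solve(X.T @ Vi @ X, X.T @ Vi @ y)
slope_gls = beta[1]

print(f"PIC slope = {slope_pic:.6f}, GLS slope = {slope_gls:.6f}")
```

    The agreement is exact to machine precision, not approximate, which is the content of the theorem.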

  20. Estimating tag loss of the Atlantic Horseshoe crab, Limulus polyphemus, using a multi-state model

    USGS Publications Warehouse

    Butler, Catherine Alyssa; McGowan, Conor P.; Grand, James B.; Smith, David

    2012-01-01

    The Atlantic horseshoe crab, Limulus polyphemus, is a valuable resource along the Mid-Atlantic coast that has, in recent years, seen new management paradigms due to increased concern about this species' role in the environment. While current management actions are underway, many acknowledge the need for improved and updated parameter estimates to reduce the uncertainty within the management models. Specifically, updated and improved estimates of demographic parameters, such as adult crab survival in the regional population of interest, Delaware Bay, could greatly enhance these models and improve management decisions. There is, however, some concern that difficulties in tag resighting, or complete loss of tags, may be occurring. As is apparent from the assumptions of the Jolly-Seber model, tag loss can bias estimates and lead to underestimates of survival. Given that uncertainty, as a first step toward an unbiased estimate of adult survival, we estimated the rate of tag loss. Using data from a double-tag mark-resight study conducted in Delaware Bay and Program MARK, we designed a multi-state model that allows the mortality of each tag to be estimated separately and simultaneously.

  1. Allowable SEM noise for unbiased LER measurement

    NASA Astrophysics Data System (ADS)

    Papavieros, George; Constantoudis, Vassilios; Gogolides, Evangelos

    2018-03-01

    Recently, a novel method for calculating unbiased line edge roughness (LER) based on power spectral density (PSD) analysis has been proposed. In this paper, an alternative method is first discussed and investigated, utilizing the height-height correlation function (HHCF) of the edges. The HHCF-based method enables the unbiased determination of the whole triplet of LER parameters: besides the rms roughness, the correlation length and the roughness exponent. The key to both methods is the sensitivity of the PSD and the HHCF to noise at high frequencies and short distances, respectively. Secondly, we elaborate a testbed of synthesized SEM images with controlled LER and noise to justify the effectiveness of the proposed unbiased methods. Our main objective is to find the boundaries, in terms of noise levels and roughness characteristics, within which the methods remain reliable, i.e., the maximum amount of noise for which the output results match the controlled, known inputs. At the same time, we also establish the extremes of the roughness parameters for which the methods retain their accuracy.
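    The short-distance noise sensitivity that the HHCF method exploits can be seen on a synthetic edge (all parameters are illustrative): uncorrelated noise of variance s**2 adds a constant 2*s**2 to the height-height correlation function at every nonzero distance, while the long-range roughness structure stays recoverable.

```python
import numpy as np

# Synthetic correlated edge plus white noise; compare HHCFs.
rng = np.random.default_rng(8)
npts, xi, w, s = 4096, 20.0, 2.0, 0.8     # length, corr. length, rms, noise

# correlated "true" edge: smooth white noise with a Gaussian kernel,
# then rescale to rms roughness w
kernel = np.exp(-0.5 * (np.arange(-60, 61) / xi) ** 2)
edge = np.convolve(rng.standard_normal(npts + 120), kernel, mode="valid")
edge = w * (edge - edge.mean()) / edge.std()

noisy = edge + s * rng.standard_normal(npts)

def hhcf(z, rmax):
    """HHCF(r) = mean of (z(x+r) - z(x))**2 over the profile."""
    return np.array([np.mean((z[r:] - z[:-r]) ** 2) for r in range(1, rmax)])

h_true, h_noisy = hhcf(edge, 60), hhcf(noisy, 60)
offset = (h_noisy - h_true).mean()
print(f"mean HHCF offset: {offset:.2f}  (2*s^2 = {2 * s * s:.2f})")
```

    Estimating this constant offset and subtracting it is, in essence, how the noise bias can be removed from the measured roughness parameters.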

  2. Blind Source Parameters for Performance Evaluation of Despeckling Filters.

    PubMed

    Biradar, Nagashettappa; Dewal, M L; Rohit, ManojKumar; Gowre, Sanjaykumar; Gundge, Yogesh

    2016-01-01

    The speckle noise is inherent to transthoracic echocardiographic images. A standard noise-free reference echocardiographic image does not exist. The evaluation of filters based on traditional parameters such as the peak signal-to-noise ratio, mean square error, and structural similarity index may therefore not reflect the true filter performance on echocardiographic images. Instead, despeckling performance can be evaluated using blind assessment metrics such as the speckle suppression index, the speckle suppression and mean preservation index (SMPI), and the beta metric; these three parameters remove the need for a noise-free reference image. This paper presents a comprehensive analysis and evaluation of eleven types of despeckling filters for echocardiographic images in terms of blind and traditional performance parameters, along with clinical validation. The noise is effectively suppressed using logarithmic neighborhood shrinkage (NeighShrink) embedded with Stein's unbiased risk estimation (SURE). The SMPI is three times more effective than the wavelet-based generalized likelihood estimation approach. The quantitative evaluation and clinical validation reveal that filters such as the nonlocal means, posterior sampling based Bayesian estimation, hybrid median, and probabilistic patch-based filters are acceptable, whereas the median, anisotropic diffusion, fuzzy, and Ripplet nonlinear approximation filters have limited applications for echocardiographic images.

  3. Blind Source Parameters for Performance Evaluation of Despeckling Filters

    PubMed Central

    Biradar, Nagashettappa; Dewal, M. L.; Rohit, ManojKumar; Gowre, Sanjaykumar; Gundge, Yogesh

    2016-01-01

    The speckle noise is inherent to transthoracic echocardiographic images. A standard noise-free reference echocardiographic image does not exist. The evaluation of filters based on traditional parameters such as the peak signal-to-noise ratio, mean square error, and structural similarity index may therefore not reflect the true filter performance on echocardiographic images. Instead, despeckling performance can be evaluated using blind assessment metrics such as the speckle suppression index, the speckle suppression and mean preservation index (SMPI), and the beta metric; these three parameters remove the need for a noise-free reference image. This paper presents a comprehensive analysis and evaluation of eleven types of despeckling filters for echocardiographic images in terms of blind and traditional performance parameters, along with clinical validation. The noise is effectively suppressed using logarithmic neighborhood shrinkage (NeighShrink) embedded with Stein's unbiased risk estimation (SURE). The SMPI is three times more effective than the wavelet-based generalized likelihood estimation approach. The quantitative evaluation and clinical validation reveal that filters such as the nonlocal means, posterior sampling based Bayesian estimation, hybrid median, and probabilistic patch-based filters are acceptable, whereas the median, anisotropic diffusion, fuzzy, and Ripplet nonlinear approximation filters have limited applications for echocardiographic images. PMID:27298618

  4. Parameter estimation for groundwater models under uncertain irrigation data

    USGS Publications Warehouse

    Demissie, Yonas; Valocchi, Albert J.; Cai, Ximing; Brozovic, Nicholas; Senay, Gabriel; Gebremichael, Mekonnen

    2015-01-01

    The success of modeling groundwater is strongly influenced by the accuracy of the model parameters that are used to characterize the subsurface system. However, the presence of uncertainty and possibly bias in groundwater model source/sink terms may lead to biased estimates of model parameters and model predictions when the standard regression-based inverse modeling techniques are used. This study first quantifies the levels of bias in groundwater model parameters and predictions due to the presence of errors in irrigation data. Then, a new inverse modeling technique called input uncertainty weighted least-squares (IUWLS) is presented for unbiased estimation of the parameters when pumping and other source/sink data are uncertain. The approach uses the concept of generalized least-squares method with the weight of the objective function depending on the level of pumping uncertainty and iteratively adjusted during the parameter optimization process. We have conducted both analytical and numerical experiments, using irrigation pumping data from the Republican River Basin in Nebraska, to evaluate the performance of ordinary least-squares (OLS) and IUWLS calibration methods under different levels of uncertainty of irrigation data and calibration conditions. The result from the OLS method shows the presence of statistically significant (p < 0.05) bias in estimated parameters and model predictions that persist despite calibrating the models to different calibration data and sample sizes. However, by directly accounting for the irrigation pumping uncertainties during the calibration procedures, the proposed IUWLS is able to minimize the bias effectively without adding significant computational burden to the calibration processes.
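    The principle behind input-uncertainty weighting can be illustrated with a generic linear model (this is not the paper's groundwater model; the design matrix, sensitivities and noise levels below are invented): the covariance of the uncertain input is propagated into the data weights, and the weighted estimator is compared with ordinary least squares (OLS) over many replicates.

```python
import numpy as np

# Monte Carlo comparison of OLS vs input-uncertainty-weighted least squares.
rng = np.random.default_rng(5)
n, reps = 200, 200
xcol = rng.uniform(0.0, 1.0, n)
G = np.column_stack([np.ones(n), xcol])         # design matrix
m_true = np.array([1.0, 2.5])
J_q = rng.uniform(0.5, 1.5, n)                  # sensitivity to uncertain input
sigma_obs, sigma_q = 0.05, 0.3

# total data covariance: measurement noise plus propagated input uncertainty
C = sigma_obs**2 * np.eye(n) + sigma_q**2 * np.outer(J_q, J_q)
Ci = np.linalg.inv(C)
A_w = np.linalg.solve(G.T @ Ci @ G, G.T @ Ci)   # weighted estimator matrix
A_ols = np.linalg.pinv(G)                       # OLS estimator matrix

sse_w = sse_ols = 0.0
for _ in range(reps):
    q_err = sigma_q * rng.standard_normal()     # one systematic input error
    y = G @ m_true + J_q * q_err + sigma_obs * rng.standard_normal(n)
    sse_w += np.sum((A_w @ y - m_true) ** 2)
    sse_ols += np.sum((A_ols @ y - m_true) ** 2)

print(f"mean squared error  OLS: {sse_ols/reps:.4f}  weighted: {sse_w/reps:.4f}")
```

    The weighted estimator downweights the data direction contaminated by the shared input error, which is the intuition behind adjusting the objective-function weights to the level of pumping uncertainty.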

  5. Estimating the Probability of Rare Events Occurring Using a Local Model Averaging.

    PubMed

    Chen, Jin-Hua; Chen, Chun-Shu; Huang, Meng-Fan; Lin, Hung-Chih

    2016-10-01

    In statistical applications, logistic regression is a popular method for analyzing binary data accompanied by explanatory variables. But when one of the two outcomes is rare, the estimation of model parameters has been shown to be severely biased and hence estimating the probability of rare events occurring based on a logistic regression model would be inaccurate. In this article, we focus on estimating the probability of rare events occurring based on logistic regression models. Instead of selecting a best model, we propose a local model averaging procedure based on a data perturbation technique applied to different information criteria to obtain different probability estimates of rare events occurring. Then an approximately unbiased estimator of Kullback-Leibler loss is used to choose the best one among them. We design complete simulations to show the effectiveness of our approach. For illustration, a necrotizing enterocolitis (NEC) data set is analyzed. © 2016 Society for Risk Analysis.

  6. On the use and misuse of scalar scores of confounders in design and analysis of observational studies.

    PubMed

    Pfeiffer, R M; Riedl, R

    2015-08-15

    We assess the asymptotic bias of estimates of exposure effects conditional on covariates when summary scores of confounders, instead of the confounders themselves, are used to analyze observational data. First, we study regression models for cohort data that are adjusted for summary scores. Second, we derive the asymptotic bias for case-control studies when cases and controls are matched on a summary score and then analyzed either using conditional logistic regression or by unconditional logistic regression adjusted for the summary score. Two scores, the propensity score (PS) and the disease risk score (DRS), are studied in detail. For cohort analysis, when regression models are adjusted for the PS, the estimated conditional treatment effect is unbiased only for linear models, or at the null for non-linear models. Adjustment of cohort data for the DRS yields unbiased estimates only for linear regression; all other estimates of exposure effects are biased. Matching cases and controls on the DRS and analyzing them using conditional logistic regression yields unbiased estimates of exposure effect, whereas adjusting for the DRS in unconditional logistic regression yields biased estimates, even under the null hypothesis of no association. Matching cases and controls on the PS yields unbiased estimates only under the null, for both conditional and unconditional logistic regression adjusted for the PS. We study the bias for various confounding scenarios and compare our asymptotic results with those from simulations with limited sample sizes. To create realistic correlations among multiple confounders, we also based simulations on a real dataset. Copyright © 2015 John Wiley & Sons, Ltd.

  7. Predicting minimum uncertainties in the inversion of ocean color geophysical parameters based on Cramer-Rao bounds.

    PubMed

    Jay, Sylvain; Guillaume, Mireille; Chami, Malik; Minghelli, Audrey; Deville, Yannick; Lafrance, Bruno; Serfaty, Véronique

    2018-01-22

    We present an analytical approach based on Cramer-Rao bounds (CRBs) to investigate the uncertainties in estimated ocean color parameters resulting from the propagation of uncertainties in the bio-optical reflectance modeling through the inversion process. Given bio-optical and noise probabilistic models, CRBs can be computed efficiently for any set of ocean color parameters and any sensor configuration, directly providing the minimum estimation variance attainable by any unbiased estimator of any targeted parameter. Here, CRBs are explicitly developed using (1) two water reflectance models, corresponding to deep and shallow waters respectively, and (2) four probabilistic models describing the environmental noise observed within Sentinel-2 MSI, HICO, Sentinel-3 OLCI and MODIS images, respectively. For both deep and shallow waters, the CRBs are shown to be consistent with the experimental estimation variances obtained using two published remote-sensing methods, while requiring no inversion to be performed. CRBs are also used to investigate the extent to which perfect a priori knowledge of one or several geophysical parameters can improve the estimation of the remaining unknown parameters. For example, using pre-existing knowledge of bathymetry (e.g., derived from LiDAR) within the inversion is shown to greatly improve the retrieval of bottom cover for shallow waters. Finally, CRBs are shown to provide valuable information on the best estimation performance that may be achieved with the MSI, HICO, OLCI and MODIS configurations for a variety of oceanic, coastal and inland waters. CRBs are thus demonstrated to be an informative and efficient tool for characterizing minimum uncertainties in inverted ocean color geophysical parameters.
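    The CRB logic is illustrated below on a generic linear-Gaussian problem standing in for the bio-optical model (all values are assumptions): the bound follows from the Fisher information, and the Monte Carlo variance of an unbiased estimator is checked against it without performing any tuning.

```python
import numpy as np

# Cramer-Rao bound vs empirical variance for a linear model with Gaussian noise.
rng = np.random.default_rng(6)
n, sigma = 50, 0.1
X = np.column_stack([np.ones(n), np.linspace(0.0, 1.0, n)])
theta = np.array([0.3, -0.8])

# Fisher information for Gaussian noise: I = X'X / sigma^2, so CRB = inv(I).
crb = np.linalg.inv(X.T @ X) * sigma**2

ests = []
for _ in range(2000):
    y = X @ theta + sigma * rng.standard_normal(n)
    ests.append(np.linalg.lstsq(X, y, rcond=None)[0])   # unbiased (OLS = MLE)
ests = np.asarray(ests)

emp_var = ests.var(axis=0)
print("CRB diag:", np.diag(crb), " empirical variance:", emp_var)
```

    In this linear case the maximum likelihood estimator attains the bound exactly; for the nonlinear reflectance models of the abstract the CRB is still a valid lower limit, which is why it can be compared against published inversion variances.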

  8. Responder analysis without dichotomization.

    PubMed

    Zhang, Zhiwei; Chu, Jianxiong; Rahardja, Dewi; Zhang, Hui; Tang, Li

    2016-01-01

    In clinical trials, it is common practice to categorize subjects as responders and non-responders on the basis of one or more clinical measurements under pre-specified rules. Such a responder analysis is often criticized for the loss of information in dichotomizing one or more continuous or ordinal variables. It is worth noting that a responder analysis can be performed without dichotomization, because the proportion of responders for each treatment can be derived from a model for the original clinical variables (used to define a responder) and estimated by substituting maximum likelihood estimators of model parameters. This model-based approach can be considerably more efficient and more effective for dealing with missing data than the usual approach based on dichotomization. For parameter estimation, the model-based approach generally requires correct specification of the model for the original variables. However, under the sharp null hypothesis, the model-based approach remains unbiased for estimating the treatment difference even if the model is misspecified. We elaborate on these points and illustrate them with a series of simulation studies mimicking a study of Parkinson's disease, which involves longitudinal continuous data in the definition of a responder.
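    A minimal numerical sketch of the model-based responder analysis (the normal outcome model, threshold and sample sizes are illustrative assumptions): the responder proportion P(Y > c) is estimated by plugging maximum likelihood estimates into the model, and its mean squared error is compared with that of the dichotomized proportion.

```python
import numpy as np
from math import erf, sqrt

# Model-based vs dichotomized estimation of a responder proportion P(Y > c).
rng = np.random.default_rng(10)
mu, sd, c, n, reps = 1.0, 2.0, 0.0, 50, 4000
p_true = 0.5 * (1.0 - erf((c - mu) / (sd * sqrt(2.0))))

est_model, est_dichot = [], []
for _ in range(reps):
    y = rng.normal(mu, sd, n)
    m, s = y.mean(), y.std(ddof=0)          # MLEs under the normal model
    est_model.append(0.5 * (1.0 - erf((c - m) / (s * sqrt(2.0)))))
    est_dichot.append(np.mean(y > c))       # dichotomized responder rate

mse_model = np.mean((np.array(est_model) - p_true) ** 2)
mse_dichot = np.mean((np.array(est_dichot) - p_true) ** 2)
print(f"MSE model-based: {mse_model:.5f}  dichotomized: {mse_dichot:.5f}")
```

    When the normal model holds, the plug-in estimate uses all of the continuous information and is more precise; the abstract's caveat is that this efficiency gain depends on correct model specification, except under the sharp null.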

  9. An Analysis Method for Superconducting Resonator Parameter Extraction with Complex Baseline Removal

    NASA Technical Reports Server (NTRS)

    Cataldo, Giuseppe

    2014-01-01

    A new semi-empirical model is proposed for extracting the quality (Q) factors of arrays of superconducting microwave kinetic inductance detectors (MKIDs). The determination of the total internal and coupling Q factors enables the computation of the loss in the superconducting transmission lines. The method used allows the simultaneous analysis of multiple interacting discrete resonators with the presence of a complex spectral baseline arising from reflections in the system. The baseline removal allows an unbiased estimate of the device response as measured in a cryogenic instrumentation setting.

  10. Assessing the likely value of gravity and drawdown measurements to constrain estimates of hydraulic conductivity and specific yield during unconfined aquifer testing

    USGS Publications Warehouse

    Blainey, Joan B.; Ferré, Ty P.A.; Cordova, Jeffrey T.

    2007-01-01

    Pumping of an unconfined aquifer can cause local desaturation detectable with high‐resolution gravimetry. A previous study showed that signal‐to‐noise ratios could be predicted for gravity measurements based on a hydrologic model. We show that although changes should be detectable with gravimeters, estimations of hydraulic conductivity and specific yield based on gravity data alone are likely to be unacceptably inaccurate and imprecise. In contrast, a transect of low‐quality drawdown data alone resulted in accurate estimates of hydraulic conductivity and inaccurate and imprecise estimates of specific yield. Combined use of drawdown and gravity data, or use of high‐quality drawdown data alone, resulted in unbiased and precise estimates of both parameters. This study is an example of the value of a staged assessment regarding the likely significance of a new measurement method or monitoring scenario before collecting field data.

  11. Parameter Estimation as a Problem in Statistical Thermodynamics.

    PubMed

    Earle, Keith A; Schneider, David J

    2011-03-14

    In this work, we explore the connections between parameter fitting and statistical thermodynamics using the maxent principle of Jaynes as a starting point. In particular, we show how signal averaging may be described by a suitable one particle partition function, modified for the case of a variable number of particles. These modifications lead to an entropy that is extensive in the number of measurements in the average. Systematic error may be interpreted as a departure from ideal gas behavior. In addition, we show how to combine measurements from different experiments in an unbiased way in order to maximize the entropy of simultaneous parameter fitting. We suggest that fit parameters may be interpreted as generalized coordinates and the forces conjugate to them may be derived from the system partition function. From this perspective, the parameter fitting problem may be interpreted as a process where the system (spectrum) does work against internal stresses (non-optimum model parameters) to achieve a state of minimum free energy/maximum entropy. Finally, we show how the distribution function allows us to define a geometry on parameter space, building on previous work [1, 2]. This geometry has implications for error estimation and we outline a program for incorporating these geometrical insights into an automated parameter fitting algorithm.

  12. Estimating thermal performance curves from repeated field observations

    USGS Publications Warehouse

    Childress, Evan; Letcher, Benjamin H.

    2017-01-01

    Estimating thermal performance of organisms is critical for understanding population distributions and dynamics and predicting responses to climate change. Typically, performance curves are estimated using laboratory studies to isolate temperature effects, but other abiotic and biotic factors influence temperature-performance relationships in nature reducing these models' predictive ability. We present a model for estimating thermal performance curves from repeated field observations that includes environmental and individual variation. We fit the model in a Bayesian framework using MCMC sampling, which allowed for estimation of unobserved latent growth while propagating uncertainty. Fitting the model to simulated data varying in sampling design and parameter values demonstrated that the parameter estimates were accurate, precise, and unbiased. Fitting the model to individual growth data from wild trout revealed high out-of-sample predictive ability relative to laboratory-derived models, which produced more biased predictions for field performance. The field-based estimates of thermal maxima were lower than those based on laboratory studies. Under warming temperature scenarios, field-derived performance models predicted stronger declines in body size than laboratory-derived models, suggesting that laboratory-based models may underestimate climate change effects. The presented model estimates true, realized field performance, avoiding assumptions required for applying laboratory-based models to field performance, which should improve estimates of performance under climate change and advance thermal ecology.

  13. Spatial distribution, sampling precision and survey design optimisation with non-normal variables: The case of anchovy (Engraulis encrasicolus) recruitment in Spanish Mediterranean waters

    NASA Astrophysics Data System (ADS)

    Tugores, M. Pilar; Iglesias, Magdalena; Oñate, Dolores; Miquel, Joan

    2016-02-01

    In the Mediterranean Sea, the European anchovy (Engraulis encrasicolus) plays a key role in ecological and economic terms. Ensuring stock sustainability requires the provision of crucial information, such as the species' spatial distribution and unbiased abundance and precision estimates, so that management strategies can be defined (e.g., fishing quotas, temporal closure areas or marine protected areas, MPAs). Furthermore, estimates of the precision of global abundance at different sampling intensities can be used for survey design optimisation. Geostatistics provides a priori unbiased estimates of the spatial structure, global abundance and precision for autocorrelated data. However, its application to non-Gaussian data introduces difficulties into the analysis, together with reduced robustness or unbiasedness. The present study applied intrinsic geostatistics in two dimensions in order to (i) analyse the spatial distribution of anchovy in Spanish Western Mediterranean waters during the species' recruitment season, (ii) produce distribution maps, (iii) estimate global abundance and its precision, (iv) analyse the effect of changing the sampling intensity on the precision of global abundance estimates and (v) evaluate the effects of several methodological options on the robustness of all the analysed parameters. The results suggest that while the spatial structure was usually non-robust to the tested methodological options when working with the original dataset, it became more robust for the transformed datasets (especially the log-backtransformed dataset). The global abundance was always highly robust, and the global precision was highly or moderately robust to most of the methodological options, except for data transformation.
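    The basic geostatistical tool behind such analyses, the empirical semivariogram, can be sketched in one dimension (a transect-like synthetic field with an assumed exponential covariance; all values are illustrative, not survey data).

```python
import numpy as np

# Empirical semivariogram gamma(h) = 0.5 * mean((z(x+h) - z(x))^2).
rng = np.random.default_rng(9)
n, rng_len, sill = 500, 10.0, 1.0
x = np.arange(n, dtype=float)

# simulate an exponential-covariance field via Cholesky factorization
C = sill * np.exp(-np.abs(x[:, None] - x[None, :]) / rng_len)
z = np.linalg.cholesky(C + 1e-10 * np.eye(n)) @ rng.standard_normal(n)

lags = np.arange(1, 40)
gamma = np.array([0.5 * np.mean((z[h:] - z[:-h]) ** 2) for h in lags])

# theoretical exponential variogram model for comparison
gamma_model = sill * (1.0 - np.exp(-lags / rng_len))
print("empirical vs model at lag 10:", gamma[9], gamma_model[9])
```

    Fitting a model to the empirical variogram yields the spatial-structure parameters (range, sill) from which kriged maps, global abundance and its estimation variance are derived; transforming skewed data before this step is what the study's robustness comparison is about.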

  14. An examination of effect estimation in factorial and standardly-tailored designs

    PubMed Central

    Allore, Heather G; Murphy, Terrence E

    2012-01-01

    Background: Many clinical trials are designed to test an intervention arm against a control arm wherein all subjects are equally eligible for all interventional components. Factorial designs have extended this to test multiple intervention components and their interactions. A newer design, referred to as a 'standardly-tailored' design, is a multicomponent interventional trial that applies individual interventional components to modify risk factors identified a priori and tests whether health outcomes differ between treatment arms. Standardly-tailored designs do not require that all subjects be eligible for every interventional component. Although standardly-tailored designs yield an estimate for the net effect of the multicomponent intervention, it has not yet been shown whether they permit separate, unbiased estimation of individual component effects. The ability to estimate the most potent interventional components has direct bearing on conducting second-stage translational research.

    Purpose: We present statistical issues related to the estimation of individual component effects in trials of geriatric conditions using factorial and standardly-tailored designs. The medical community is interested in second-stage translational research involving the transfer of results from a randomized clinical trial to a community setting. Before such research is undertaken, main effects and any synergistic or antagonistic interactions between them should be identified. Knowledge of the relative strength and direction of the effects of the individual components and their interactions facilitates the successful transfer of clinically significant findings and may potentially reduce the number of interventional components needed. Therefore, the current inability of the standardly-tailored design to provide unbiased estimates of individual interventional components is a serious limitation in its applicability to second-stage translational research.

    Methods: We discuss estimation of individual component effects from the family of factorial designs and this limitation for standardly-tailored designs. We use the phrase 'factorial designs' to describe full-factorial designs and their derivatives, including the fractional factorial, partial factorial, incomplete factorial and modified reciprocal designs. We suggest two potential directions for designing multicomponent interventions to facilitate unbiased estimates of individual interventional components.

    Results: Full factorial designs and their variants are the most common multicomponent trial design described in the literature and differ meaningfully from standardly-tailored designs. Factorial and standardly-tailored designs result in similar estimates of net effect with different levels of precision. Unbiased estimation of individual component effects from a standardly-tailored design will require new methodology.

    Limitations: Although clinically relevant in geriatrics, previous applications of standardly-tailored designs have not provided unbiased estimates of the effects of individual interventional components.

    Discussion: Future directions to estimate individual component effects from standardly-tailored designs include applying D-optimal designs and creating independent linear combinations of risk factors analogous to factor analysis.

    Conclusion: Methods are needed to extract unbiased estimates of the effects of individual interventional components from standardly-tailored designs. PMID:18375650

  15. Probability Theory Plus Noise: Descriptive Estimation and Inferential Judgment.

    PubMed

    Costello, Fintan; Watts, Paul

    2018-01-01

    We describe a computational model of two central aspects of people's probabilistic reasoning: descriptive probability estimation and inferential probability judgment. This model assumes that people's reasoning follows standard frequentist probability theory, but it is subject to random noise. This random noise has a regressive effect in descriptive probability estimation, moving probability estimates away from normative probabilities and toward the center of the probability scale. This random noise has an anti-regressive effect in inferential judgment, however. These regressive and anti-regressive effects explain various reliable and systematic biases seen in people's descriptive probability estimation and inferential probability judgment. This model predicts that these contrary effects will tend to cancel out in tasks that involve both descriptive estimation and inferential judgment, leading to unbiased responses in those tasks. We test this model by applying it to one such task, described by Gallistel et al. Participants' median responses in this task were unbiased, agreeing with normative probability theory over the full range of responses. Our model captures the pattern of unbiased responses in this task, while simultaneously explaining systematic biases away from normatively correct probabilities seen in other tasks. Copyright © 2018 Cognitive Science Society, Inc.
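
    The regressive effect described above is easy to see in a small simulation. The sketch below is a generic illustration of the idea only, not the authors' exact formulation: each event is assumed to be registered correctly with probability 1 − d and flipped with probability d (the flip rate `d` is an illustrative assumption), so the expected frequency estimate is (1 − 2d)p + d, pulled toward the middle of the scale.

```python
import random

def noisy_estimate(p, d, n_events=100000, seed=0):
    """Descriptive probability estimation under read-out noise: each event
    occurs with probability p but is registered incorrectly (flipped) with
    probability d, so E[estimate] = (1 - 2*d)*p + d."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_events):
        occurred = rng.random() < p
        flipped = rng.random() < d
        hits += occurred != flipped  # XOR: a flipped reading inverts the event
    return hits / n_events

# Noise regresses the estimate of p = 0.9 toward the scale midpoint:
# (1 - 2*0.1)*0.9 + 0.1 = 0.82
```

    With p = 0.9 and d = 0.1 the simulated estimate sits near 0.82 rather than 0.9, i.e., biased toward 0.5 exactly as the regressive account predicts.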

  16. A modified weighted function method for parameter estimation of Pearson type three distribution

    NASA Astrophysics Data System (ADS)

    Liang, Zhongmin; Hu, Yiming; Li, Binquan; Yu, Zhongbo

    2014-04-01

    In this paper, an unconventional method called Modified Weighted Function (MWF) is presented for the conventional moment estimation of a probability distribution function. The aim of MWF is to shift the estimation of the coefficient of variation (CV) and coefficient of skewness (CS) from the original higher-order moment computations to first-order moment calculations. The estimators for CV and CS of the Pearson type three distribution function (PE3) were derived by weighting the moments of the distribution with two weight functions, which were constructed by combining two negative exponential-type functions. The selection of these weight functions was based on two considerations: (1) to relate the weight functions to sample size in order to reflect the relationship between the quantity of sample information and the role of the weight function and (2) to allocate more weight to data close to medium-tail positions in a sample series ranked in ascending order. A Monte-Carlo experiment was conducted to simulate a large number of samples upon which the statistical properties of MWF were investigated. For the PE3 parent distribution, results of MWF were compared to those of the original Weighted Function (WF) and Linear Moments (L-M). The results indicate that MWF was superior to WF and slightly better than L-M in terms of statistical unbiasedness and effectiveness. In addition, the robustness of MWF, WF, and L-M was compared in a Monte-Carlo experiment in which samples were drawn from the Log-Pearson type three distribution (LPE3), the three-parameter Log-Normal distribution (LN3), and the Generalized Extreme Value distribution (GEV), respectively, but were all treated as samples from the PE3 distribution. The results show that, in terms of statistical unbiasedness, no method possesses an absolutely overwhelming advantage among MWF, WF, and L-M, while in terms of statistical effectiveness, MWF is superior to WF and L-M.
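
    The kind of Monte-Carlo bias study described above can be sketched in a few lines. The code below is a generic illustration, not the MWF/WF/L-M estimators themselves: it measures the small-sample bias of the conventional moment estimator of CS on gamma-distributed samples, whose true skewness is 2/√shape (the gamma parent and sample sizes are illustrative stand-ins for the PE3 experiment).

```python
import math
import random

def sample_cs(xs):
    """Conventional moment estimator of the coefficient of skewness."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    g1 = m3 / m2 ** 1.5
    return math.sqrt(n * (n - 1)) / (n - 2) * g1  # standard small-sample adjustment

def mc_bias(shape, n, reps=2000, seed=7):
    """Monte-Carlo bias of sample_cs on Gamma(shape) samples of size n;
    the true skewness of Gamma(shape) is 2/sqrt(shape)."""
    rng = random.Random(seed)
    true_cs = 2.0 / math.sqrt(shape)
    total = 0.0
    for _ in range(reps):
        total += sample_cs([rng.gammavariate(shape, 1.0) for _ in range(n)])
    return total / reps - true_cs
```

    For shape = 1 (CS = 2) and n = 30 the bias is strongly negative, and it shrinks as n grows; this persistent small-sample underestimation of CS by ordinary moments is exactly what weighted-function methods try to avoid.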

  17. Targeted estimation of nuisance parameters to obtain valid statistical inference.

    PubMed

    van der Laan, Mark J

    2014-01-01

    In order to obtain concrete results, we focus on estimation of the treatment specific mean, controlling for all measured baseline covariates, based on observing independent and identically distributed copies of a random variable consisting of baseline covariates, a subsequently assigned binary treatment, and a final outcome. The statistical model only assumes possible restrictions on the conditional distribution of treatment, given the covariates, the so-called propensity score. Estimators of the treatment specific mean involve estimation of the propensity score and/or estimation of the conditional mean of the outcome, given the treatment and covariates. In order to make these estimators asymptotically unbiased at any data distribution in the statistical model, it is essential to use data-adaptive estimators of these nuisance parameters such as ensemble learning, and specifically super-learning. Because such estimators involve optimal trade-off of bias and variance w.r.t. the infinite dimensional nuisance parameter itself, they result in a sub-optimal bias/variance trade-off for the resulting real-valued estimator of the estimand. We demonstrate that additional targeting of the estimators of these nuisance parameters guarantees that this bias for the estimand is second order and thereby allows us to prove theorems that establish asymptotic linearity of the estimator of the treatment specific mean under regularity conditions. These insights result in novel targeted minimum loss-based estimators (TMLEs) that use ensemble learning with additional targeted bias reduction to construct estimators of the nuisance parameters. In particular, we construct collaborative TMLEs (C-TMLEs) with known influence curve allowing for statistical inference, even though these C-TMLEs involve variable selection for the propensity score based on a criterion that measures how effective the resulting fit of the propensity score is in removing bias for the estimand. 
As a particular special case, we also demonstrate the required targeting of the propensity score for the inverse probability of treatment weighted estimator using super-learning to fit the propensity score.

  18. Framework for making better predictions by directly estimating variables' predictivity.

    PubMed

    Lo, Adeline; Chernoff, Herman; Zheng, Tian; Lo, Shaw-Hwa

    2016-12-13

    We propose approaching prediction from a framework grounded in the theoretical correct prediction rate of a variable set as a parameter of interest. This framework allows us to define a measure of predictivity that enables assessing variable sets for, preferably high, predictivity. We first define the prediction rate for a variable set and consider, and ultimately reject, the naive estimator, a statistic based on the observed sample data, due to its inflated bias for moderate sample size and its sensitivity to noisy useless variables. We demonstrate that the I-score of the PR method of VS yields a relatively unbiased estimate of a parameter that is not sensitive to noisy variables and is a lower bound to the parameter of interest. Thus, the PR method using the I-score provides an effective approach to selecting highly predictive variables. We offer simulations and an application of the I-score on real data to demonstrate the statistic's predictive performance on sample data. We conjecture that using the partition retention and I-score can aid in finding variable sets with promising prediction rates; however, further research in the avenue of sample-based measures of predictivity is much desired.
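
    The inflation of the naive prediction-rate estimator is easy to demonstrate. The sketch below is an illustration of that failure mode only, not the I-score itself: it scores a majority-class rule on the same data it was fit on, so a completely useless predictor still appears far better than chance.

```python
import random
from collections import defaultdict

def naive_prediction_rate(xs, ys):
    """Apparent (resubstitution) correct-prediction rate: predict the
    majority class of y within each observed x cell, then score those
    predictions on the very same data."""
    cells = defaultdict(list)
    for x, y in zip(xs, ys):
        cells[x].append(y)
    correct = sum(max(v.count(0), v.count(1)) for v in cells.values())
    return correct / len(ys)

# A useless predictor: y is a fair coin independent of x, so the true
# correct-prediction rate of any rule is 0.5 -- yet the naive estimate
# is far higher, because each small cell overfits its own noise.
rng = random.Random(3)
xs = [rng.randrange(50) for _ in range(200)]
ys = [rng.randrange(2) for _ in range(200)]
inflated = naive_prediction_rate(xs, ys)
```

    With 50 cells of roughly four observations each, the apparent rate lands near 0.69 even though the variable carries no information, which is why the abstract rejects this statistic in favor of a sample-insensitive lower bound.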

  19. Framework for making better predictions by directly estimating variables’ predictivity

    PubMed Central

    Chernoff, Herman; Lo, Shaw-Hwa

    2016-01-01

    We propose approaching prediction from a framework grounded in the theoretical correct prediction rate of a variable set as a parameter of interest. This framework allows us to define a measure of predictivity that enables assessing variable sets for, preferably high, predictivity. We first define the prediction rate for a variable set and consider, and ultimately reject, the naive estimator, a statistic based on the observed sample data, due to its inflated bias for moderate sample size and its sensitivity to noisy useless variables. We demonstrate that the I-score of the PR method of VS yields a relatively unbiased estimate of a parameter that is not sensitive to noisy variables and is a lower bound to the parameter of interest. Thus, the PR method using the I-score provides an effective approach to selecting highly predictive variables. We offer simulations and an application of the I-score on real data to demonstrate the statistic’s predictive performance on sample data. We conjecture that using the partition retention and I-score can aid in finding variable sets with promising prediction rates; however, further research in the avenue of sample-based measures of predictivity is much desired. PMID:27911830

  20. Modeling structured population dynamics using data from unmarked individuals

    USGS Publications Warehouse

    Grant, Evan H. Campbell; Zipkin, Elise; Thorson, James T.; See, Kevin; Lynch, Heather J.; Kanno, Yoichiro; Chandler, Richard; Letcher, Benjamin H.; Royle, J. Andrew

    2014-01-01

    The study of population dynamics requires unbiased, precise estimates of abundance and vital rates that account for the demographic structure inherent in all wildlife and plant populations. Traditionally, these estimates have only been available through approaches that rely on intensive mark–recapture data. We extended recently developed N-mixture models to demonstrate how demographic parameters and abundance can be estimated for structured populations using only stage-structured count data. Our modeling framework can be used to make reliable inferences on abundance as well as recruitment, immigration, stage-specific survival, and detection rates during sampling. We present a range of simulations to illustrate the data requirements, including the number of years and locations necessary for accurate and precise parameter estimates. We apply our modeling framework to a population of northern dusky salamanders (Desmognathus fuscus) in the mid-Atlantic region (USA) and find that the population is unexpectedly declining. Our approach represents a valuable advance in the estimation of population dynamics using multistate data from unmarked individuals and should additionally be useful in the development of integrated models that combine data from intensive (e.g., mark–recapture) and extensive (e.g., counts) data sources.
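
    The core of an N-mixture likelihood can be written down compactly. The sketch below is a minimal single-season version under simplifying assumptions (one constant abundance rate `lam`, one detection probability `p`, and a truncation bound `K` chosen by the user), not the stage-structured extension of the paper: the unobserved site abundances are marginalized out of repeated counts.

```python
import math

def nmixture_nll(counts, lam, p, K=80):
    """Negative log-likelihood of a basic N-mixture model:
    N_i ~ Poisson(lam) is the latent abundance at site i, and each repeated
    count satisfies y_it | N_i ~ Binomial(N_i, p).  The latent N_i is
    marginalized by summing up to the truncation bound K."""
    nll = 0.0
    for y in counts:
        site_lik = 0.0
        for N in range(max(y), K + 1):
            log_prior = -lam + N * math.log(lam) - math.lgamma(N + 1)
            lik = math.exp(log_prior)
            for yt in y:
                lik *= math.comb(N, yt) * p**yt * (1 - p) ** (N - yt)
            site_lik += lik
        nll -= math.log(site_lik)
    return nll

# Repeated counts at 5 sites, 2 visits each (illustrative data): parameters
# near the data-generating values fit far better than badly wrong ones.
counts = [[2, 3], [1, 2], [4, 3], [2, 2], [3, 1]]
```

    Minimizing this function over (lam, p) yields joint estimates of abundance and detection from counts alone, which is the property the abstract's extension builds on.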

  1. Flat-Sky Pseudo-Cls Analysis for Weak Gravitational Lensing

    NASA Astrophysics Data System (ADS)

    Asgari, Marika; Taylor, Andy; Joachimi, Benjamin; Kitching, Thomas D.

    2018-05-01

    We investigate the use of estimators of weak lensing power spectra based on a flat-sky implementation of the 'Pseudo-Cl' (PCl) technique, where the masked shear field is transformed without regard for masked regions of sky. This masking mixes power, and 'E'-convergence and 'B'-modes. To study the accuracy of forward-modelling and full-sky power spectrum recovery we consider both large-area survey geometries, and small-scale masking due to stars and a checkerboard model for field-of-view gaps. The power spectrum for the large-area survey geometry is sparsely-sampled and highly oscillatory, which makes modelling problematic. Instead, we derive an overall calibration for large-area mask bias using simulated fields. The effects of small-area star masks can be accurately corrected for, while the checkerboard mask has oscillatory and spiky behaviour which leads to percent-level biases. Apodisation of the masked fields leads to increased biases and a loss of information. We find that we can construct an unbiased forward-model of the raw PCls, and recover the full-sky convergence power to within a few percent accuracy for both Gaussian and lognormal-distributed shear fields. Propagating this through to cosmological parameters using a Fisher-matrix formalism, we find we can make unbiased estimates of parameters for surveys up to 1,200 deg² with 30 galaxies per arcmin², beyond which the percent-level biases become larger than the statistical accuracy. This implies that a flat-sky PCl analysis is accurate for current surveys but a Euclid-like survey will require higher accuracy.
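
    The lowest-order consequence of masking, an overall loss of power proportional to the unmasked sky fraction, can be shown in a few lines. The sketch below uses a white-noise field and a simple rectangular gap (illustrative choices; it is not the paper's full mode-mixing forward-model) and recovers the unmasked total power by dividing out f_sky.

```python
import numpy as np

def total_power(field):
    """Mean squared Fourier amplitude (total power) of a flat-sky field."""
    f = np.fft.fft2(field)
    return np.mean(np.abs(f) ** 2) / field.size

rng = np.random.default_rng(1)
n = 128
field = rng.normal(size=(n, n))   # stand-in for a shear/convergence map
mask = np.ones((n, n))
mask[:, : n // 4] = 0.0           # a survey gap covering 1/4 of the area
fsky = mask.mean()

raw = total_power(field)          # power of the full field
pcl = total_power(field * mask)   # pseudo-power of the masked field
corrected = pcl / fsky            # lowest-order f_sky correction
```

    For a white-noise field this correction is exact up to sampling noise; for correlated fields the mask additionally mixes neighbouring multipoles (and E- and B-modes for spin-2 fields), which is precisely what the full PCl forward-model must account for.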

  2. Species richness in soil bacterial communities: a proposed approach to overcome sample size bias.

    PubMed

    Youssef, Noha H; Elshahed, Mostafa S

    2008-09-01

    Estimates of species richness based on 16S rRNA gene clone libraries are increasingly utilized to gauge the level of bacterial diversity within various ecosystems. However, previous studies have indicated that regardless of the utilized approach, species richness estimates obtained are dependent on the size of the analyzed clone libraries. We here propose an approach to overcome sample size bias in species richness estimates in complex microbial communities. Parametric (Maximum likelihood-based and rarefaction curve-based) and non-parametric approaches were used to estimate species richness in a library of 13,001 near full-length 16S rRNA clones derived from soil, as well as in multiple subsets of the original library. Species richness estimates obtained increased with the increase in library size. To obtain a sample size-unbiased estimate of species richness, we calculated the theoretical clone library sizes required to encounter the estimated species richness at various clone library sizes, used curve fitting to determine the theoretical clone library size required to encounter the "true" species richness, and subsequently determined the corresponding sample size-unbiased species richness value. Using this approach, sample size-unbiased estimates of 17,230, 15,571, and 33,912 were obtained for the ML-based, rarefaction curve-based, and ACE-1 estimators, respectively, compared to bias-uncorrected values of 15,009, 11,913, and 20,909.
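
    For context on richness estimators of this kind, the sketch below implements the classic Chao1 nonparametric estimator. It is not one of the three estimators used in the study above, but it rests on the same intuition: the counts of the rarest observed taxa (singletons and doubletons) carry the information about how many taxa remain unseen.

```python
def chao1(abundances):
    """Chao1 lower-bound species richness estimator:
    S_chao1 = S_obs + F1^2 / (2 * F2),
    where F1 and F2 are the numbers of singleton and doubleton taxa.
    Falls back to the bias-corrected form when F2 = 0."""
    s_obs = sum(1 for a in abundances if a > 0)
    f1 = sum(1 for a in abundances if a == 1)
    f2 = sum(1 for a in abundances if a == 2)
    if f2 == 0:
        return s_obs + f1 * (f1 - 1) / 2.0
    return s_obs + f1 * f1 / (2.0 * f2)

# Seven observed taxa, three singletons and two doubletons:
# 7 + 3^2 / (2*2) = 9.25 estimated taxa.
```

    Like the estimators in the abstract, Chao1 grows as a clone library accumulates new rare taxa, which is why extrapolation to the "true" richness at theoretical library sizes is needed to remove the sample-size dependence.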

  3. Analytical performance evaluation of SAR ATR with inaccurate or estimated models

    NASA Astrophysics Data System (ADS)

    DeVore, Michael D.

    2004-09-01

    Hypothesis testing algorithms for automatic target recognition (ATR) are often formulated in terms of some assumed distribution family. The parameter values corresponding to a particular target class together with the distribution family constitute a model for the target's signature. In practice such models exhibit inaccuracy because of incorrect assumptions about the distribution family and/or because of errors in the assumed parameter values, which are often determined experimentally. Model inaccuracy can have a significant impact on performance predictions for target recognition systems. Such inaccuracy often causes model-based predictions that ignore the difference between assumed and actual distributions to be overly optimistic. This paper reports on research to quantify the effect of inaccurate models on performance prediction and to estimate the effect using only trained parameters. We demonstrate that for large observation vectors the class-conditional probabilities of error can be expressed as a simple function of the difference between two relative entropies. These relative entropies quantify the discrepancies between the actual and assumed distributions and can be used to express the difference between actual and predicted error rates. Focusing on the problem of ATR from synthetic aperture radar (SAR) imagery, we present estimators of the probabilities of error in both ideal and plug-in tests expressed in terms of the trained model parameters. These estimators are defined in terms of unbiased estimates for the first two moments of the sample statistic. We present an analytical treatment of these results and include demonstrations from simulated radar data.

  4. Unbiased Estimation of Refractive State of Aberrated Eyes

    PubMed Central

    Martin, Jesson; Vasudevan, Balamurali; Himebaugh, Nikole; Bradley, Arthur; Thibos, Larry

    2011-01-01

    To identify unbiased methods for estimating the target vergence required to maximize visual acuity based on wavefront aberration measurements. Experiments were designed to minimize the impact of confounding factors that have hampered previous research. Objective wavefront refractions and subjective acuity refractions were obtained for the same monochromatic wavelength. Accommodation and pupil fluctuations were eliminated by cycloplegia. Unbiased subjective refractions that maximize visual acuity for high contrast letters were performed with a computer-controlled forced-choice staircase procedure, using 0.125 diopter steps of defocus. All experiments were performed for two pupil diameters (3 mm and 6 mm). As reported in the literature, subjective refractive error does not change appreciably when the pupil dilates. For 3 mm pupils most metrics yielded objective refractions that were about 0.1 D more hyperopic than subjective acuity refractions. When pupil diameter increased to 6 mm, this bias changed in the myopic direction and the variability between metrics also increased. These inaccuracies were small compared to the precision of the measurements, which implies that most metrics provided unbiased estimates of refractive state for medium and large pupils. A variety of image quality metrics may be used to determine ocular refractive state for monochromatic (635 nm) light, thereby achieving accurate results without the need for empirical correction factors. PMID:21777601
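
    One widely used wavefront-derived metric, the paraxial curvature of the Zernike defocus term, has a simple closed form. The sketch below assumes the standard OSA Zernike convention (coefficient in microns, pupil radius in millimeters) and is just one of the many candidate metrics such studies compare, not necessarily the best-performing one.

```python
import math

def paraxial_sphere(c20_um, pupil_diameter_mm):
    """Paraxial spherical refractive error (diopters) from the Zernike
    defocus coefficient c_2^0 (microns, OSA convention):
    M = -4 * sqrt(3) * c20 / r^2, with r the pupil radius in mm."""
    r = pupil_diameter_mm / 2.0
    return -4.0 * math.sqrt(3.0) * c20_um / (r * r)
```

    For example, 1 micron of defocus over a 6 mm pupil corresponds to roughly −0.77 D of spherical error; the sign convention makes positive defocus coefficients myopic.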

  5. Predicting longitudinal trajectories of health probabilities with random-effects multinomial logit regression.

    PubMed

    Liu, Xian; Engel, Charles C

    2012-12-20

    Researchers often encounter longitudinal health data characterized with three or more ordinal or nominal categories. Random-effects multinomial logit models are generally applied to account for potential lack of independence inherent in such clustered data. When parameter estimates are used to describe longitudinal processes, however, random effects, both between and within individuals, need to be retransformed for correctly predicting outcome probabilities. This study attempts to go beyond existing work by developing a retransformation method that derives longitudinal growth trajectories of unbiased health probabilities. We estimated variances of the predicted probabilities by using the delta method. Additionally, we transformed the covariates' regression coefficients on the multinomial logit function, not substantively meaningful, to the conditional effects on the predicted probabilities. The empirical illustration uses the longitudinal data from the Asset and Health Dynamics among the Oldest Old. Our analysis compared three sets of the predicted probabilities of three health states at six time points, obtained from, respectively, the retransformation method, the best linear unbiased prediction, and the fixed-effects approach. The results demonstrate that neglect of retransforming random errors in the random-effects multinomial logit model results in severely biased longitudinal trajectories of health probabilities as well as overestimated effects of covariates on the probabilities. Copyright © 2012 John Wiley & Sons, Ltd.
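
    The need for retransformation is easiest to see in the binary (logit) special case. In the sketch below (illustrative parameter values; the multinomial case averages a softmax over the random effects in the same way), plugging the zero mean of the random effect into the logistic function overstates the population-averaged probability, while averaging probabilities over random-effect draws does not.

```python
import math
import random

def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))

def marginal_prob(beta, sigma, ndraws=100000, seed=0):
    """Population-averaged probability under a random intercept
    u ~ N(0, sigma^2): integrate the logistic response over the
    random-effect distribution instead of plugging in its mean."""
    rng = random.Random(seed)
    total = sum(logistic(beta + rng.gauss(0.0, sigma)) for _ in range(ndraws))
    return total / ndraws

naive = logistic(1.0)              # plug-in of the zero mean random effect
averaged = marginal_prob(1.0, 2.0) # retransformed (marginal) probability
```

    With beta = 1 and sigma = 2 the naive plug-in gives about 0.73 while the correctly retransformed probability is near 0.66; the gap grows with the random-effect variance, which is the bias the abstract warns about.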

  6. Simplified Estimation and Testing in Unbalanced Repeated Measures Designs.

    PubMed

    Spiess, Martin; Jordan, Pascal; Wendt, Mike

    2018-05-07

    In this paper we propose a simple estimator for unbalanced repeated measures design models where each unit is observed at least once in each cell of the experimental design. The estimator does not require a model of the error covariance structure. Thus, circularity of the error covariance matrix and estimation of correlation parameters and variances are not necessary. Together with a weak assumption about the reason for the varying number of observations, the proposed estimator and its variance estimator are unbiased. As an alternative to confidence intervals based on the normality assumption, a bias-corrected and accelerated bootstrap technique is considered. We also propose the naive percentile bootstrap for Wald-type tests where the standard Wald test may break down when the number of observations is small relative to the number of parameters to be estimated. In a simulation study we illustrate the properties of the estimator and the bootstrap techniques to calculate confidence intervals and conduct hypothesis tests in small and large samples under normality and non-normality of the errors. The results imply that the simple estimator is only slightly less efficient than an estimator that correctly assumes a block structure of the error correlation matrix, a special case of which is an equi-correlation matrix. Application of the estimator and the bootstrap technique is illustrated using data from a task switch experiment based on a within-subjects experimental design with 32 cells and 33 participants.
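
    The naive percentile bootstrap mentioned above takes only a few lines in its generic form. This is a generic sketch of the percentile interval, not the authors' estimator or the bias-corrected and accelerated (BCa) variant.

```python
import random

def percentile_ci(data, stat, alpha=0.05, n_boot=2000, seed=0):
    """Naive percentile bootstrap confidence interval for stat(data):
    resample with replacement, recompute the statistic, take quantiles."""
    rng = random.Random(seed)
    reps = sorted(
        stat([rng.choice(data) for _ in data]) for _ in range(n_boot)
    )
    lo = reps[int(n_boot * alpha / 2)]
    hi = reps[int(n_boot * (1 - alpha / 2)) - 1]
    return lo, hi

def mean(xs):
    return sum(xs) / len(xs)
```

    Applied to a Wald-type statistic instead of a plain mean, the same resampling scheme yields the reference distribution used when the normal-theory Wald test breaks down in small samples.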

  7. Detection of sea otters in boat-based surveys of Prince William Sound, Alaska

    USGS Publications Warehouse

    Udevitz, Mark S.; Bodkin, James L.; Costa, Daniel P.

    1995-01-01

    Boat-based surveys have been commonly used to monitor sea otter populations, but there has been little quantitative work to evaluate detection biases that may affect these surveys. We used ground-based observers to investigate sea otter detection probabilities in a boat-based survey of Prince William Sound, Alaska. We estimated that 30% of the otters present on surveyed transects were not detected by boat crews. Approximately half (53%) of the undetected otters were missed because the otters left the transects, apparently in response to the approaching boat. Unbiased estimates of detection probabilities will be required for obtaining unbiased population estimates from boat-based surveys of sea otters. Therefore, boat-based surveys should include methods to estimate sea otter detection probabilities under the conditions specific to each survey. Unbiased estimation of detection probabilities with ground-based observers requires either that the ground crews detect all of the otters in observed subunits, or that there are no errors in determining which crews saw each detected otter. Ground-based observer methods may be appropriate in areas where nearly all of the sea otter habitat is potentially visible from ground-based vantage points.
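
    The correction logic described above is simple enough to state as code. The sketch below assumes, as the abstract requires, that ground crews detect every otter in the shared subunits; the function name and the example numbers are illustrative.

```python
def boat_detection_estimate(boat_seen, ground_seen, boat_total):
    """Estimate the boat crews' detection probability from subunits where
    ground observers are assumed to see everything, then correct the
    boat-survey total.
    boat_seen:   otters detected by boat crews in the shared subunits
    ground_seen: all otters present there (ground-based reference count)
    boat_total:  raw boat count over the whole survey."""
    p_hat = boat_seen / ground_seen
    corrected_total = boat_total / p_hat
    return p_hat, corrected_total

# E.g. boat crews saw 70 of 100 otters known to be present (p_hat = 0.7),
# so a raw survey count of 350 corrects to an estimated 500 otters.
```

    The 30% miss rate reported in the abstract corresponds to p_hat = 0.7, and any error in p_hat propagates directly into the population estimate, which is why unbiased estimation of detection probability is emphasized.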

  8. Estimating Unbiased Land Cover Change Areas In The Colombian Amazon Using Landsat Time Series And Statistical Inference Methods

    NASA Astrophysics Data System (ADS)

    Arevalo, P. A.; Olofsson, P.; Woodcock, C. E.

    2017-12-01

    Unbiased estimation of the areas of conversion between land categories ("activity data") and their uncertainty is crucial for providing more robust calculations of carbon emissions to the atmosphere, as well as their removals. This is particularly important for the REDD+ mechanism of UNFCCC where an economic compensation is tied to the magnitude and direction of such fluxes. Dense time series of Landsat data and statistical protocols are becoming an integral part of forest monitoring efforts, but there are relatively few studies in the tropics focused on using these methods to advance operational MRV systems (Monitoring, Reporting and Verification). We present the results of a prototype methodology for continuous monitoring and unbiased estimation of activity data that is compliant with the IPCC Approach 3 for representation of land. We used a break detection algorithm (Continuous Change Detection and Classification, CCDC) to fit pixel-level temporal segments to time series of Landsat data in the Colombian Amazon. The segments were classified using a Random Forest classifier to obtain annual maps of land categories between 2001 and 2016. Using these maps, a biannual stratified sampling approach was implemented and unbiased stratified estimators were constructed to calculate area estimates with confidence intervals for each of the stable and change classes. Our results provide evidence of a decrease in primary forest as a result of conversion to pastures, as well as an increase in secondary forest as pastures are abandoned and the forest allowed to regenerate. Estimating areas of other land transitions proved challenging because of their very small mapped areas compared to stable classes like forest, which corresponds to almost 90% of the study area. Implications for remote sensing data processing, sample allocation and uncertainty reduction are also discussed.
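
    The unbiased stratified estimator underlying such activity-data estimates is compact. The sketch below shows the standard stratified proportion estimator only (stratum weights, class names, and counts are illustrative; the full protocol also produces standard errors and confidence intervals).

```python
def stratified_proportion(strata, cls):
    """Unbiased stratified estimate of the area proportion of class `cls`:
    p_hat = sum_h W_h * n_hk / n_h,
    where W_h is stratum h's mapped area proportion and n_hk the number of
    its reference samples labelled as class `cls`.
    strata: list of (W_h, {reference_class: sample_count}) pairs."""
    p_hat = 0.0
    for weight, counts in strata:
        n_h = sum(counts.values())
        p_hat += weight * counts.get(cls, 0) / n_h
    return p_hat

# A 'stable forest' stratum covering 90% of the map and a 'change'
# stratum covering 10%, each with 50 reference samples:
strata = [
    (0.9, {"forest": 45, "deforestation": 5}),
    (0.1, {"forest": 10, "deforestation": 40}),
]
```

    Note how the large stable-forest stratum dominates the estimate: even a few reference samples of change found inside it contribute heavily, which is why rare transition classes are hard to estimate precisely, as the abstract observes.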

  9. Estimating Unbiased Treatment Effects in Education Using a Regression Discontinuity Design

    ERIC Educational Resources Information Center

    Smith, William C.

    2014-01-01

    The ability of regression discontinuity (RD) designs to provide an unbiased treatment effect while overcoming the ethical concerns plagued by Random Control Trials (RCTs) make it a valuable and useful approach in education evaluation. RD is the only explicitly recognized quasi-experimental approach identified by the Institute of Education…

  10. Unbalanced and Minimal Point Equivalent Estimation Second-Order Split-Plot Designs

    NASA Technical Reports Server (NTRS)

    Parker, Peter A.; Kowalski, Scott M.; Vining, G. Geoffrey

    2007-01-01

    Restricting the randomization of hard-to-change factors in industrial experiments is often performed by employing a split-plot design structure. From an economic perspective, these designs minimize the experimental cost by reducing the number of resets of the hard-to-change factors. In this paper, unbalanced designs are considered for cases where the subplots are relatively expensive and the experimental apparatus accommodates an unequal number of runs per whole-plot. We provide construction methods for unbalanced second-order split-plot designs that possess the equivalence estimation optimality property, providing best linear unbiased estimates of the parameters, independent of the variance components. Unbalanced versions of the central composite and Box-Behnken designs are developed. For cases where the subplot cost approaches the whole-plot cost, minimal point designs are proposed and illustrated with a split-plot Notz design.

  11. Non-Cartesian MRI Reconstruction With Automatic Regularization Via Monte-Carlo SURE

    PubMed Central

    Weller, Daniel S.; Nielsen, Jon-Fredrik; Fessler, Jeffrey A.

    2013-01-01

    Magnetic resonance image (MRI) reconstruction from undersampled k-space data requires regularization to reduce noise and aliasing artifacts. Proper application of regularization, however, requires appropriate selection of associated regularization parameters. In this work, we develop a data-driven regularization parameter adjustment scheme that minimizes an estimate (based on the principle of Stein’s unbiased risk estimate—SURE) of a suitable weighted squared-error measure in k-space. To compute this SURE-type estimate, we propose a Monte-Carlo scheme that extends our previous approach to inverse problems (e.g., MRI reconstruction) involving complex-valued images. Our approach depends only on the output of a given reconstruction algorithm and does not require knowledge of its internal workings, so it is capable of tackling a wide variety of reconstruction algorithms and nonquadratic regularizers including total variation and those based on the ℓ1-norm. Experiments with simulated and real MR data indicate that the proposed approach is capable of providing near mean squared-error (MSE) optimal regularization parameters for single-coil undersampled non-Cartesian MRI reconstruction. PMID:23591478
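
    The Monte-Carlo SURE idea can be sketched for a real-valued toy denoiser. Below, soft thresholding stands in for the reconstruction algorithm and a plain Gaussian noise model replaces the paper's weighted k-space measure and complex-valued extension; all signal parameters are illustrative.

```python
import numpy as np

def mc_sure(denoise, y, sigma, eps=1e-3, seed=0):
    """Monte-Carlo SURE estimate of the per-sample MSE of denoise(y) for
    y = x + N(0, sigma^2) noise.  The divergence term is estimated with a
    single random probe b:  div ~= b . (f(y + eps*b) - f(y)) / eps,
    using only the denoiser's output (black-box access)."""
    rng = np.random.default_rng(seed)
    b = rng.normal(size=y.shape)
    n = y.size
    fy = denoise(y)
    div = float(b.ravel() @ (denoise(y + eps * b) - fy).ravel()) / eps
    return float(np.sum((fy - y) ** 2)) / n - sigma**2 + 2.0 * sigma**2 * div / n

def soft_threshold(y, t):
    return np.sign(y) * np.maximum(np.abs(y) - t, 0.0)

rng = np.random.default_rng(42)
x = np.zeros(4096)
x[:400] = 5.0                        # sparse ground truth
y = x + rng.normal(size=x.size)      # unit-variance noise

sure_identity = mc_sure(lambda v: v, y, 1.0)                    # ~ sigma^2
sure_soft = mc_sure(lambda v: soft_threshold(v, 1.5), y, 1.0)   # much lower
```

    Because SURE estimates the true MSE without access to the clean signal x, scanning it over the regularization parameter (here the threshold t) selects a near-MSE-optimal value, which is exactly the adjustment scheme's purpose.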

  12. Parameter Estimation for Thurstone Choice Models

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Vojnovic, Milan; Yun, Seyoung

    We consider the estimation accuracy of individual strength parameters of a Thurstone choice model when each input observation consists of a choice of one item from a set of two or more items (so-called top-1 lists). This model accommodates well-known choice models such as the Luce choice model for comparison sets of two or more items and the Bradley-Terry model for pair comparisons. We provide a tight characterization of the mean squared error of the maximum likelihood parameter estimator. We also provide similar characterizations for parameter estimators defined by a rank-breaking method, which amounts to deducing one or more pair comparisons from a comparison of two or more items, assuming independence of these pair comparisons, and maximizing a likelihood function derived under these assumptions. We also consider a related binary classification problem where each individual parameter takes a value from a set of two possible values and the goal is to correctly classify all items within a prescribed classification error. The results of this paper shed light on how the parameter estimation accuracy depends on the given Thurstone choice model and the structure of comparison sets. In particular, we found that for unbiased input comparison sets of a given cardinality, i.e., when each comparison set of that cardinality occurs in expectation the same number of times, for a broad class of Thurstone choice models the mean squared error decreases with the cardinality of comparison sets, but only marginally, according to a diminishing-returns relation. On the other hand, we found that there exist Thurstone choice models for which the mean squared error of the maximum likelihood parameter estimator can decrease much faster with the cardinality of comparison sets. We report an empirical evaluation of some claims and key parameters revealed by theory using both synthetic and real-world input data from some popular sport competitions and online labor platforms.
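
    For the Bradley-Terry pair-comparison special case named above, the maximum likelihood strengths can be computed with the classic minorization-maximization (MM) iteration. This is a generic sketch of Bradley-Terry fitting, not the paper's rank-breaking estimator or its MSE analysis.

```python
def bradley_terry_mm(wins, n_items, iters=500):
    """Maximum-likelihood Bradley-Terry strengths via the MM update
    w_i <- W_i / sum_j n_ij / (w_i + w_j),
    where W_i is item i's total win count and n_ij the number of games
    between items i and j.  wins: dict mapping (winner, loser) -> count."""
    w = [1.0] * n_items
    n_games = [[0] * n_items for _ in range(n_items)]
    total_wins = [0] * n_items
    for (i, j), c in wins.items():
        n_games[i][j] += c
        n_games[j][i] += c
        total_wins[i] += c
    for _ in range(iters):
        new = [
            total_wins[i]
            / sum(n_games[i][j] / (w[i] + w[j])
                  for j in range(n_items) if n_games[i][j])
            for i in range(n_items)
        ]
        s = sum(new)
        w = [x * n_items / s for x in new]  # fix the scale invariance
    return w
```

    With 8 wins for item 0 against 2 for item 1, the fitted strengths give a win probability of w0/(w0+w1) = 0.8, matching the empirical frequency, as the MLE must for a single pair.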

  13. Unbiased estimation of chloroplast number in mesophyll cells: advantage of a genuine three-dimensional approach

    PubMed Central

    Kubínová, Zuzana

    2014-01-01

    Chloroplast number per cell is a frequently examined quantitative anatomical parameter, often estimated by counting chloroplast profiles in two-dimensional (2D) sections of mesophyll cells. However, a mesophyll cell is a three-dimensional (3D) structure and this has to be taken into account when quantifying its internal structure. We compared 2D and 3D approaches to chloroplast counting from different points of view: (i) in practical measurements of mesophyll cells of Norway spruce needles, (ii) in a 3D model of a mesophyll cell with chloroplasts, and (iii) using a theoretical analysis. We applied, for the first time, the stereological method of an optical disector based on counting chloroplasts in stacks of spruce needle optical cross-sections acquired by confocal laser-scanning microscopy. This estimate was compared with counting chloroplast profiles in 2D sections from the same stacks of sections. Practical measurements of mesophyll cells, calculations performed in a 3D model of a cell with chloroplasts, and a theoretical analysis all showed that the 2D approach yielded biased results, with underestimation of up to 10-fold. We proved that the frequently used method of counting chloroplasts in a mesophyll cell by counting their profiles in 2D sections did not give correct results. We concluded that the present disector method can be efficiently used for unbiased estimation of chloroplast number per mesophyll cell. This should be the method of choice, especially in coniferous needles and leaves with mesophyll cells with lignified cell walls, where maceration methods are difficult or impossible to use. PMID:24336344
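
    The disector's counting rule, counting an object only in the section pair where it first appears, is easy to simulate along the optical axis. The sketch below reduces each "chloroplast" to its z-extent (object sizes and section spacing are illustrative, with spacing finer than the thinnest object, as the method requires) and contrasts the unbiased disector count with the biased single-section profile count.

```python
import random

def profiles_at(objects, z):
    """Indices of objects whose z-extent [z0, z1] intersects the section at z."""
    return {i for i, (z0, z1) in enumerate(objects) if z0 <= z <= z1}

def disector_count(objects, z_planes):
    """Count each object exactly once: in the first section pair where it
    appears in the upper section but not in the lower reference section."""
    total = 0
    for z_ref, z_look in zip(z_planes, z_planes[1:]):
        total += len(profiles_at(objects, z_look) - profiles_at(objects, z_ref))
    return total

rng = random.Random(5)
# 50 'chloroplasts': z-intervals of thickness 0.5 inside a cell of depth 10
objects = [(z0, z0 + 0.5) for z0 in (rng.uniform(0.0, 9.5) for _ in range(50))]
planes = [-0.2 + 0.2 * k for k in range(53)]  # sections spanning the cell

n_disector = disector_count(objects, planes)        # recovers all 50 objects
n_single_section = len(profiles_at(objects, 5.0))   # 2D profile count: far fewer
```

    A single 2D section only intersects the objects whose z-extent happens to cross it, so it recovers a small fraction of the per-cell total; summing disector counts over the whole stack counts every object exactly once, which is the unbiasedness property the abstract demonstrates.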

  14. The Efficiency of Split Panel Designs in an Analysis of Variance Model

    PubMed Central

    Wang, Wei-Guo; Liu, Hai-Jun

    2016-01-01

    We consider the efficiency of split panel designs in analysis of variance models, that is, the determination of the optimal proportion of cross-section series in the whole sample so as to minimize the variances of best linear unbiased estimators of linear combinations of the parameters. An orthogonal matrix is constructed to obtain a manageable expression for these variances. On this basis, we derive a theorem for analyzing split panel design efficiency irrespective of the interest and budget parameters. Additionally, the efficiency of an estimator based on the split panel relative to estimators based on a pure panel or a pure cross-section is presented. The analysis shows that the gains from a split panel design can be quite substantial. We further consider the efficiency of a split panel design under a given budget and transform the problem into a constrained nonlinear integer program. Specifically, an efficient algorithm is designed to solve this program. Moreover, we combine one-at-a-time designs and factorial designs to illustrate the algorithm's efficiency with an empirical example concerning monthly consumer expenditure on food in the Netherlands in 1985, and the efficient ranges of the algorithm parameters are given to ensure a good solution. PMID:27163447

  15. Optimal Asteroid Mass Determination from Planetary Range Observations: A Study of a Simplified Test Model

    NASA Technical Reports Server (NTRS)

    Kuchynka, P.; Laskar, J.; Fienga, A.

    2011-01-01

    Mars ranging observations have been available for the past 10 years with an accuracy of a few meters. Such precise measurements of the Earth-Mars distance provide valuable constraints on the masses of the asteroids perturbing both planets. Today more than 30 asteroid masses have thus been estimated from planetary ranging data (see [1] and [2]). Obtaining unbiased mass estimates is nevertheless difficult. Various systematic errors can be introduced by imperfect reduction of spacecraft tracking observations to planetary ranging data. The large number of asteroids and the limited a priori knowledge of their masses are also obstacles to parameter selection. Fitting the mass of a negligible perturber in a model, or conversely omitting a significant perturber, will induce important biases in the determined asteroid masses. In this communication, we investigate a simplified version of the mass determination problem. Instead of planetary ranging observations from spacecraft or radar data, we consider synthetic ranging observations generated with the INPOP [2] ephemeris for a test model containing 25000 asteroids. We then suggest a method for optimal parameter selection and estimation in this simplified framework.

  16. Two-compartment modeling of tissue microcirculation revisited.

    PubMed

    Brix, Gunnar; Salehi Ravesh, Mona; Griebel, Jürgen

    2017-05-01

    Conventional two-compartment modeling of tissue microcirculation is used for tracer kinetic analysis of dynamic contrast-enhanced (DCE) computed tomography or magnetic resonance imaging studies although it is well-known that the underlying assumption of an instantaneous mixing of the administered contrast agent (CA) in capillaries is far from being realistic. It was thus the aim of the present study to provide theoretical and computational evidence in favor of a conceptually alternative modeling approach that makes it possible to characterize the bias inherent to compartment modeling and, moreover, to approximately correct for it. Starting from a two-region distributed-parameter model that accounts for spatial gradients in CA concentrations within blood-tissue exchange units, a modified lumped two-compartment exchange model was derived. It has the same analytical structure as the conventional two-compartment model, but indicates that the apparent blood flow identifiable from measured DCE data is substantially overestimated, whereas the three other model parameters (i.e., the permeability-surface area product as well as the volume fractions of the plasma and interstitial distribution space) are unbiased. Furthermore, a simple formula was derived to approximately compute a bias-corrected flow from the estimates of the apparent flow and permeability-surface area product obtained by model fitting. To evaluate the accuracy of the proposed modeling and bias correction method, representative noise-free DCE curves were analyzed. They were simulated for 36 microcirculation and four input scenarios by an axially distributed reference model. As analytically proven, the considered two-compartment exchange model is structurally identifiable from tissue residue data. The apparent flow values estimated for the 144 simulated tissue/input scenarios were considerably biased. 
After bias correction, the deviations between estimated and actual parameter values were (11.2 ± 6.4)% (vs. (105 ± 21)% without correction) for the flow, (3.6 ± 6.1)% for the permeability-surface area product, (5.8 ± 4.9)% for the vascular volume and (2.5 ± 4.1)% for the interstitial volume, with individual deviations of more than 20% being exceptional and only marginal. Increasing the duration of CA administration had a statistically significant but opposite effect on the accuracy of the estimated flow (which declined) and the intravascular volume (which improved). Physiologically well-defined tissue parameters are structurally identifiable and accurately estimable from DCE data by the conceptually modified two-compartment model in combination with the bias correction. The accuracy of the bias-corrected flow is nearly comparable to that of the three other (theoretically unbiased) model parameters. Compared to conventional two-compartment modeling, this feature constitutes a major advantage for tracer kinetic analysis of both preclinical and clinical DCE imaging studies. © 2017 American Association of Physicists in Medicine.

  17. A one-step method for modelling longitudinal data with differential equations.

    PubMed

    Hu, Yueqin; Treinen, Raymond

    2018-04-06

    Differential equation models are frequently used to describe non-linear trajectories of longitudinal data. This study proposes a new approach to estimating the parameters in differential equation models. Instead of estimating derivatives from the observed data first and then fitting a differential equation to the derivatives, the new approach directly fits the analytic solution of a differential equation to the observed data, which simplifies the procedure and avoids the bias introduced by derivative estimation. A simulation study indicates that the analytic solutions of differential equations (ASDE) approach obtains unbiased estimates of parameters and their standard errors. Compared with approaches that estimate derivatives first, ASDE has smaller standard errors, greater statistical power and an accurate Type I error rate. Although ASDE yields biased estimates when the system undergoes a sudden phase change, the bias is not serious, and a solution is provided to address the phase problem. The ASDE method is illustrated and applied to a two-week study of consumers' shopping behaviour after a sales promotion, and to a set of public data tracking participants' grammatical facial expressions in sign language. R code for ASDE and recommendations for sample sizes and starting values are provided. Limitations and several possible extensions of ASDE are also discussed. © 2018 The British Psychological Society.
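    The fit-the-analytic-solution idea can be made concrete with the simplest possible case (this is an illustration of the principle, not the ASDE implementation, which is in R): the decay equation dy/dt = -k·y has the analytic solution y(t) = y0·e^(-kt), so taking logs makes the fit linear in (log y0, -k) and ordinary least squares recovers both parameters directly from the trajectory, with no derivative estimates at all.

```python
import math

def fit_exponential(ts, ys):
    """Fit y(t) = y0 * exp(-k * t) by least squares on log y."""
    logs = [math.log(y) for y in ys]
    n = len(ts)
    tbar = sum(ts) / n
    lbar = sum(logs) / n
    slope = (sum((t - tbar) * (l - lbar) for t, l in zip(ts, logs))
             / sum((t - tbar) ** 2 for t in ts))
    intercept = lbar - slope * tbar
    return math.exp(intercept), -slope  # (y0, k)

ts = [0.0, 0.5, 1.0, 1.5, 2.0]
ys = [5.0 * math.exp(-0.8 * t) for t in ts]   # noise-free trajectory, y0=5, k=0.8
y0, k = fit_exponential(ts, ys)
print(round(y0, 3), round(k, 3))  # → 5.0 0.8
```

On noise-free data the parameters are recovered exactly; a two-step method that first differenced the data to estimate dy/dt would already carry discretization bias at this step.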

  18. On the degrees of freedom of reduced-rank estimators in multivariate regression

    PubMed Central

    Mukherjee, A.; Chen, K.; Wang, N.; Zhu, J.

    2015-01-01

    Summary We study the effective degrees of freedom of a general class of reduced-rank estimators for multivariate regression in the framework of Stein's unbiased risk estimation. A finite-sample exact unbiased estimator is derived that admits a closed-form expression in terms of the thresholded singular values of the least-squares solution and hence is readily computable. The results continue to hold in the high-dimensional setting where both the predictor and the response dimensions may be larger than the sample size. The derived analytical form facilitates the investigation of theoretical properties and provides new insights into the empirical behaviour of the degrees of freedom. In particular, we examine the differences and connections between the proposed estimator and a commonly-used naive estimator. The use of the proposed estimator leads to efficient and accurate prediction risk estimation and model selection, as demonstrated by simulation studies and a data example. PMID:26702155

  19. Data assimilation in integrated hydrological modelling in the presence of observation bias

    NASA Astrophysics Data System (ADS)

    Rasmussen, J.; Madsen, H.; Jensen, K. H.; Refsgaard, J. C.

    2015-08-01

    The use of bias-aware Kalman filters for estimating and correcting observation bias in groundwater head observations is evaluated using both synthetic and real observations. In the synthetic test, groundwater head observations with a constant bias and unbiased stream discharge observations are assimilated in a catchment scale integrated hydrological model with the aim of updating stream discharge and groundwater head, as well as several model parameters relating to both stream flow and groundwater modeling. The Colored Noise Kalman filter (ColKF) and the Separate bias Kalman filter (SepKF) are tested and evaluated for correcting the observation biases. The study found that both methods were able to estimate most of the biases and that using any of the two bias estimation methods resulted in significant improvements over using a bias-unaware Kalman Filter. While the convergence of the ColKF was significantly faster than the convergence of the SepKF, a much larger ensemble size was required as the estimation of biases would otherwise fail. Real observations of groundwater head and stream discharge were also assimilated, resulting in improved stream flow modeling in terms of an increased Nash-Sutcliffe coefficient while no clear improvement in groundwater head modeling was observed. Both the ColKF and the SepKF tended to underestimate the biases, which resulted in drifting model behavior and sub-optimal parameter estimation, but both methods provided better state updating and parameter estimation than using a bias-unaware filter.

  20. Data assimilation in integrated hydrological modelling in the presence of observation bias

    NASA Astrophysics Data System (ADS)

    Rasmussen, Jørn; Madsen, Henrik; Høgh Jensen, Karsten; Refsgaard, Jens Christian

    2016-05-01

    The use of bias-aware Kalman filters for estimating and correcting observation bias in groundwater head observations is evaluated using both synthetic and real observations. In the synthetic test, groundwater head observations with a constant bias and unbiased stream discharge observations are assimilated in a catchment-scale integrated hydrological model with the aim of updating stream discharge and groundwater head, as well as several model parameters relating to both streamflow and groundwater modelling. The coloured noise Kalman filter (ColKF) and the separate-bias Kalman filter (SepKF) are tested and evaluated for correcting the observation biases. The study found that both methods were able to estimate most of the biases and that using any of the two bias estimation methods resulted in significant improvements over using a bias-unaware Kalman filter. While the convergence of the ColKF was significantly faster than the convergence of the SepKF, a much larger ensemble size was required as the estimation of biases would otherwise fail. Real observations of groundwater head and stream discharge were also assimilated, resulting in improved streamflow modelling in terms of an increased Nash-Sutcliffe coefficient while no clear improvement in groundwater head modelling was observed. Both the ColKF and the SepKF tended to underestimate the biases, which resulted in drifting model behaviour and sub-optimal parameter estimation, but both methods provided better state updating and parameter estimation than using a bias-unaware filter.
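    The bias-aware filtering idea in the two records above can be sketched with state augmentation (a simplified cousin of ColKF/SepKF, not the study's implementation): a constant observation bias b is appended to a 1-D random-walk state x, and the filter estimates (x, b) jointly from one unbiased observation stream (standing in for discharge) and one biased stream (standing in for groundwater head). All noise levels are illustrative.

```python
import random

def kf_update(x, b, P, y, h, r):
    """Scalar Kalman measurement update for state (x, b), observation row h."""
    Ph0 = h[0] * P[0][0] + h[1] * P[0][1]     # (P h^T) components
    Ph1 = h[0] * P[1][0] + h[1] * P[1][1]
    S = h[0] * Ph0 + h[1] * Ph1 + r           # innovation variance
    Kx, Kb = Ph0 / S, Ph1 / S                 # Kalman gain
    innov = y - (h[0] * x + h[1] * b)
    x, b = x + Kx * innov, b + Kb * innov
    hP0 = h[0] * P[0][0] + h[1] * P[1][0]     # (h P) row
    hP1 = h[0] * P[0][1] + h[1] * P[1][1]
    P = [[P[0][0] - Kx * hP0, P[0][1] - Kx * hP1],
         [P[1][0] - Kb * hP0, P[1][1] - Kb * hP1]]
    return x, b, P

random.seed(2)
truth, bias = 0.0, 0.5                         # true state and true observation bias
x, b, P = 0.0, 0.0, [[1.0, 0.0], [0.0, 1.0]]   # initial estimates and covariance
q, r = 0.01, 0.04                              # process and observation noise variances
for _ in range(300):
    truth += random.gauss(0, 0.1)
    P[0][0] += q                               # predict: x walks, b stays constant
    x, b, P = kf_update(x, b, P, truth + random.gauss(0, 0.2), (1.0, 0.0), r)         # unbiased obs
    x, b, P = kf_update(x, b, P, truth + bias + random.gauss(0, 0.2), (1.0, 1.0), r)  # biased obs
print(round(b, 2))  # bias estimate, near 0.5
```

The bias is identifiable here only because an unbiased observation type is assimilated alongside the biased one, which mirrors the role of the discharge observations in the study.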

  1. Decision rules for unbiased inventory estimates

    NASA Technical Reports Server (NTRS)

    Argentiero, P. D.; Koch, D.

    1979-01-01

    An efficient and accurate procedure for estimating inventories from remote sensing scenes is presented. In place of the conventional and expensive full dimensional Bayes decision rule, a one-dimensional feature extraction and classification technique was employed. It is shown that this efficient decision rule can be used to develop unbiased inventory estimates and that for large sample sizes typical of satellite derived remote sensing scenes, resulting accuracies are comparable or superior to more expensive alternative procedures. Mathematical details of the procedure are provided in the body of the report and in the appendix. Results of a numerical simulation of the technique using statistics obtained from an observed LANDSAT scene are included. The simulation demonstrates the effectiveness of the technique in computing accurate inventory estimates.

  2. A weighted least squares estimation of the polynomial regression model on paddy production in the area of Kedah and Perlis

    NASA Astrophysics Data System (ADS)

    Musa, Rosliza; Ali, Zalila; Baharum, Adam; Nor, Norlida Mohd

    2017-08-01

    The linear regression model assumes that all random error components are identically and independently distributed with constant variance. Hence, each data point provides equally precise information about the deterministic part of the total variation; in other words, the standard deviations of the error terms are constant over all values of the predictor variables. When the assumption of constant variance is violated, the ordinary least squares estimator of the regression coefficients loses its minimum-variance property in the class of linear unbiased estimators. Weighted least squares estimation is often used to maximize the efficiency of parameter estimation. A procedure that treats all of the data equally would give less precisely measured points more influence than they should have and highly precise points too little influence. Optimizing the weighted fitting criterion to find the parameter estimates allows the weights to determine the contribution of each observation to the final parameter estimates. This study used a polynomial model with weighted least squares estimation to investigate the paddy production of different paddy lots based on paddy cultivation characteristics and environmental characteristics in the area of Kedah and Perlis. The results indicated that the factors affecting paddy production are the mixture fertilizer application cycle, average temperature, the squared effect of average rainfall, the squared effect of pest and disease, the interaction between acreage and the amount of mixture fertilizer, the interaction between paddy variety and NPK fertilizer application cycle, and the interaction between pest and disease and NPK fertilizer application cycle.
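    The weighting mechanism can be sketched for a straight-line fit (a minimal illustration, not the paddy polynomial model): each observation gets a weight, typically the reciprocal of its error variance, so imprecisely measured points pull proportionally less on the parameter estimates.

```python
def wls_line(xs, ys, ws):
    """Weighted least squares for y = a + b*x via the 2x2 normal equations."""
    sw = sum(ws)
    swx = sum(w * x for w, x in zip(ws, xs))
    swy = sum(w * y for w, y in zip(ws, ys))
    swxx = sum(w * x * x for w, x in zip(ws, xs))
    swxy = sum(w * x * y for w, x, y in zip(ws, xs, ys))
    det = sw * swxx - swx * swx
    a = (swxx * swy - swx * swxy) / det   # intercept
    b = (sw * swxy - swx * swy) / det     # slope
    return a, b

xs = [0, 1, 2, 3, 4]
ys = [1, 3, 5, 7, 100]                        # last point is a wildly imprecise outlier
a_ols, b_ols = wls_line(xs, ys, [1] * 5)      # equal weights = ordinary least squares
a_wls, b_wls = wls_line(xs, ys, [1, 1, 1, 1, 1e-6])  # down-weight the imprecise point
print(round(b_ols, 1), round(b_wls, 2))       # OLS slope is dragged off; WLS recovers 2
```

With equal weights the one imprecise point dominates the slope; giving it a weight proportional to its (tiny) precision restores the fit, which is exactly the efficiency argument in the abstract.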

  3. Joint constraints on galaxy bias and σ{sub 8} through the N-pdf of the galaxy number density

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Arnalte-Mur, Pablo; Martínez, Vicent J.; Vielva, Patricio

    We present a full description of the N-probability density function of the galaxy number density fluctuations. This N-pdf is given in terms, on the one hand, of the cold dark matter correlations and, on the other hand, of the galaxy bias parameter. The method relies on the commonly adopted assumption that the dark matter density fluctuations follow a local non-linear transformation of the initial energy density perturbations. The N-pdf of the galaxy number density fluctuations allows for an optimal estimation of the bias parameter (e.g., via maximum-likelihood estimation, or Bayesian inference if there exists any a priori information on the bias parameter), and of those parameters defining the dark matter correlations, in particular its amplitude (σ{sub 8}). It also provides the proper framework to perform model selection between two competitive hypotheses. The parameter estimation capabilities of the N-pdf are proved on SDSS-like simulations (both ideal log-normal simulations and mocks obtained from the Las Damas simulations), showing that our estimator is unbiased. We apply our formalism to the 7th release of the SDSS main sample (for a volume-limited subset with absolute magnitudes M{sub r} ≤ −20). We obtain b̂ = 1.193 ± 0.074 and σ̄{sub 8} = 0.862 ± 0.080 for galaxy number density fluctuations in cells of size 30h{sup −1}Mpc. Different model selection criteria show that galaxy biasing is clearly favoured.

  4. Statistical guides to estimating the number of undiscovered mineral deposits: an example with porphyry copper deposits

    USGS Publications Warehouse

    Singer, Donald A.; Menzie, W.D.; Cheng, Qiuming; Bonham-Carter, G. F.

    2005-01-01

    Estimating numbers of undiscovered mineral deposits is a fundamental part of assessing mineral resources. Some statistical tools can act as guides to low variance, unbiased estimates of the number of deposits. The primary guide is that the estimates must be consistent with the grade and tonnage models. Another statistical guide is the deposit density (i.e., the number of deposits per unit area of permissive rock in well-explored control areas). Preliminary estimates and confidence limits of the number of undiscovered deposits in a tract of given area may be calculated using linear regression and refined using frequency distributions with appropriate parameters. A Poisson distribution leads to estimates having lower relative variances than the regression estimates and implies a random distribution of deposits. Coefficients of variation are used to compare uncertainties of negative binomial, Poisson, or MARK3 empirical distributions that have the same expected number of deposits as the deposit density. Statistical guides presented here allow simple yet robust estimation of the number of undiscovered deposits in permissive terranes. 
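    The deposit-density guide above reduces to simple Poisson arithmetic. The sketch below is a hedged numeric illustration, not the paper's assessment: the expected number of undiscovered deposits is the density observed in well-explored control areas times the permissive area, and a Poisson model supplies the spread; the density and area values are hypothetical.

```python
import math

density = 0.002        # deposits per km^2 of permissive rock (hypothetical)
area = 5000.0          # km^2 of permissive tract (hypothetical)
lam = density * area   # Poisson mean number of undiscovered deposits

def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Poisson(lam)."""
    return sum(math.exp(-lam) * lam ** i / math.factorial(i) for i in range(k + 1))

# smallest count n with CDF >= 0.9: a one-sided 90% upper bound on the number
upper90 = next(n for n in range(1000) if poisson_cdf(n, lam) >= 0.9)
cv = 1 / math.sqrt(lam)   # coefficient of variation of a Poisson count
print(lam, upper90, round(cv, 3))
```

The coefficient of variation 1/√λ is the quantity the paper uses to compare the Poisson against negative binomial or empirical alternatives with the same expected number of deposits.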

  5. NEW STUDIES OF URBAN FLOOD FREQUENCY IN THE SOUTHEASTERN UNITED STATES.

    USGS Publications Warehouse

    Sauer, Vernon B.

    1986-01-01

    Five reports dealing with flood magnitude and frequency in urban areas in the southeastern United States have been published during the past 2 years by the U. S. Geological Survey (USGS). These reports are based on data collected in Tampa and Tallahassee, Florida; Atlanta, Georgia; and several cities in Alabama and Tennessee. Each report contains regression equations useful for estimating flood peaks for selected recurrence intervals at ungauged urban sites. A nationwide study of urban flood characteristics by the USGS published in 1983 contains equations for estimating urban peak discharges for ungauged sites. At the time that the nationwide study was conducted, data from only 35 sites in the southeastern United States were available. The five new reports contain data for 88 additional sites. These new data show that the seven-parameter estimating equations developed in the nationwide study are unbiased and have prediction errors less than those described in the nationwide report.

  6. Evaluation of selection index: application to the choice of an indirect multitrait selection index for soybean breeding.

    PubMed

    Bouchez, A; Goffinet, B

    1990-02-01

    Selection indices can be used to predict one trait from information available on several traits in order to improve the prediction accuracy. Plant or animal breeders are interested in selecting only the best individuals, and need to compare the efficiency of different trait combinations in order to choose the index ensuring the best prediction quality for individual values. As the usual tools for index evaluation do not remain unbiased in all cases, we propose a robust way of evaluation by means of an estimator of the mean-square error of prediction (EMSEP). This estimator remains valid even when parameters are not known, as usually assumed, but are estimated. EMSEP is applied to the choice of an indirect multitrait selection index at the F5 generation of a classical breeding scheme for soybeans. Best predictions for precocity are obtained by means of indices using only part of the available information.

  7. Limitations to estimating bacterial cross-species transmission using genetic and genomic markers: inferences from simulation modeling

    PubMed Central

    Benavides, Julio A; Cross, Paul C; Luikart, Gordon; Creel, Scott

    2014-01-01

    Cross-species transmission (CST) of bacterial pathogens has major implications for human health, livestock, and wildlife management because it determines whether control actions in one species may have subsequent effects on other potential host species. The study of bacterial transmission has benefitted from methods measuring two types of genetic variation: variable number of tandem repeats (VNTRs) and single nucleotide polymorphisms (SNPs). However, it is unclear whether these data can distinguish between different epidemiological scenarios. We used a simulation model with two host species and known transmission rates (within and between species) to evaluate the utility of these markers for inferring CST. We found that CST estimates are biased for a wide range of parameters when based on VNTRs and a most parsimonious reconstructed phylogeny. However, estimates of CST rates lower than 5% can be achieved with relatively low bias using as few as 250 SNPs. CST estimates are sensitive to several parameters, including the number of mutations accumulated since introduction, stochasticity, the genetic difference of strains introduced, and the sampling effort. Our results suggest that, even with whole-genome sequences, unbiased estimates of CST will be difficult when sampling is limited, mutation rates are low, or for pathogens that were recently introduced. PMID:25469159

  8. Limitations to estimating bacterial cross-species transmission using genetic and genomic markers: inferences from simulation modeling

    USGS Publications Warehouse

    Julio Andre, Benavides; Cross, Paul C.; Luikart, Gordon; Scott, Creel

    2014-01-01

    Cross-species transmission (CST) of bacterial pathogens has major implications for human health, livestock, and wildlife management because it determines whether control actions in one species may have subsequent effects on other potential host species. The study of bacterial transmission has benefitted from methods measuring two types of genetic variation: variable number of tandem repeats (VNTRs) and single nucleotide polymorphisms (SNPs). However, it is unclear whether these data can distinguish between different epidemiological scenarios. We used a simulation model with two host species and known transmission rates (within and between species) to evaluate the utility of these markers for inferring CST. We found that CST estimates are biased for a wide range of parameters when based on VNTRs and a most parsimonious reconstructed phylogeny. However, estimates of CST rates lower than 5% can be achieved with relatively low bias using as few as 250 SNPs. CST estimates are sensitive to several parameters, including the number of mutations accumulated since introduction, stochasticity, the genetic difference of strains introduced, and the sampling effort. Our results suggest that, even with whole-genome sequences, unbiased estimates of CST will be difficult when sampling is limited, mutation rates are low, or for pathogens that were recently introduced.

  9. Treatment effects model for assessing disease management: measuring outcomes and strengthening program management.

    PubMed

    Wendel, Jeanne; Dumitras, Diana

    2005-06-01

    This paper describes an analytical methodology for obtaining statistically unbiased outcome estimates for programs in which participation decisions may be correlated with variables that impact outcomes. This methodology is particularly useful for intraorganizational program evaluations conducted for business purposes. In this situation, data are likely to be available for a population of managed care members who are eligible to participate in a disease management (DM) program, with some electing to participate while others eschew the opportunity. The most pragmatic analytical strategy for in-house evaluation of such programs is likely to be the pre-intervention/post-intervention design, in which the control group consists of people who were invited to participate in the DM program but declined the invitation. Regression estimates of program impacts may be statistically biased if factors that impact participation decisions are correlated with outcome measures. This paper describes an econometric procedure, the Treatment Effects model, developed to produce statistically unbiased estimates of program impacts in this type of situation. Two equations are estimated to (a) estimate the impacts of patient characteristics on decisions to participate in the program, and then (b) use this information to produce a statistically unbiased estimate of the impact of program participation on outcomes. This methodology is well-established in economics and econometrics, but has not been widely applied in the DM outcomes measurement literature; hence, this paper focuses on one illustrative application.
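    The selection problem that motivates the two-equation model can be seen in a toy simulation (this is a plain stratified adjustment for illustration, not the Treatment Effects estimator itself, and every coefficient is made up): sicker members opt into the hypothetical DM program more often, so a naive participants-vs-decliners comparison is biased, while conditioning on the variable that drives participation recovers the true effect.

```python
import random

random.seed(3)
rows = []
for _ in range(20000):
    severity = random.choice([0, 1])                   # drives both participation and cost
    treated = int(random.random() < (0.7 if severity else 0.3))
    cost = 2.0 * severity - 1.0 * treated + random.gauss(0, 0.5)  # true program effect: -1.0
    rows.append((severity, treated, cost))

def mean(v):
    return sum(v) / len(v)

# Naive comparison: participants vs decliners, ignoring who chooses to join.
naive = mean([c for s, t, c in rows if t]) - mean([c for s, t, c in rows if not t])
# Adjusted comparison: difference within each severity stratum, then averaged.
adjusted = mean([
    mean([c for s, t, c in rows if s == g and t]) -
    mean([c for s, t, c in rows if s == g and not t])
    for g in (0, 1)
])
print(round(naive, 2), round(adjusted, 2))  # naive understates the cost reduction
```

The naive estimate is badly attenuated because the treated group is sicker on average; the full Treatment Effects model handles the same confounding when the selection variable is only partially observed, via a jointly estimated participation equation.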

  10. Propensity score analysis with partially observed covariates: How should multiple imputation be used?

    PubMed

    Leyrat, Clémence; Seaman, Shaun R; White, Ian R; Douglas, Ian; Smeeth, Liam; Kim, Joseph; Resche-Rigon, Matthieu; Carpenter, James R; Williamson, Elizabeth J

    2017-01-01

    Inverse probability of treatment weighting is a popular propensity score-based approach to estimate marginal treatment effects in observational studies at risk of confounding bias. A major issue when estimating the propensity score is the presence of partially observed covariates. Multiple imputation is a natural approach to handle missing data on covariates: covariates are imputed and a propensity score analysis is performed in each imputed dataset to estimate the treatment effect. The treatment effect estimates from each imputed dataset are then combined to obtain an overall estimate. We call this method MIte. However, an alternative approach has been proposed, in which the propensity scores are combined across the imputed datasets (MIps). Therefore, there are remaining uncertainties about how to implement multiple imputation for propensity score analysis: (a) should we apply Rubin's rules to the inverse probability of treatment weighting treatment effect estimates or to the propensity score estimates themselves? (b) does the outcome have to be included in the imputation model? (c) how should we estimate the variance of the inverse probability of treatment weighting estimator after multiple imputation? We studied the consistency and balancing properties of the MIte and MIps estimators and performed a simulation study to empirically assess their performance for the analysis of a binary outcome. We also compared the performance of these methods to complete case analysis and the missingness pattern approach, which uses a different propensity score model for each pattern of missingness, and a third multiple imputation approach in which the propensity score parameters are combined rather than the propensity scores themselves (MIpar). 
Under a missing at random mechanism, complete case and missingness pattern analyses were biased in most cases for estimating the marginal treatment effect, whereas multiple imputation approaches were approximately unbiased as long as the outcome was included in the imputation model. Only MIte was unbiased in all the studied scenarios and Rubin's rules provided good variance estimates for MIte. The propensity score estimated in the MIte approach showed good balancing properties. In conclusion, when using multiple imputation in the inverse probability of treatment weighting context, MIte with the outcome included in the imputation model is the preferred approach.
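    The pooling step that distinguishes MIte can be sketched directly. In this hedged illustration, the IPTW treatment-effect estimate and its variance are assumed to have been computed in each imputed dataset already (the per-imputation numbers below are made up), and Rubin's rules combine them into one estimate with a variance that accounts for between-imputation spread.

```python
# Hypothetical IPTW treatment-effect estimates and variances, one per imputed dataset.
estimates = [0.42, 0.38, 0.45, 0.40, 0.41]
variances = [0.010, 0.012, 0.011, 0.010, 0.011]

m = len(estimates)
pooled = sum(estimates) / m                    # Rubin's pooled point estimate
within = sum(variances) / m                    # average within-imputation variance
between = sum((e - pooled) ** 2 for e in estimates) / (m - 1)
total_var = within + (1 + 1 / m) * between     # Rubin's total variance
print(round(pooled, 3), round(total_var, 4))   # → 0.412 0.0116
```

Applying Rubin's rules at this level, to the treatment-effect estimates rather than to the propensity scores, is exactly the MIte choice the simulation study favours.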

  11. On the estimability of parameters in undifferenced, uncombined GNSS network and PPP-RTK user models by means of S-system theory

    NASA Astrophysics Data System (ADS)

    Odijk, Dennis; Zhang, Baocheng; Khodabandeh, Amir; Odolinski, Robert; Teunissen, Peter J. G.

    2016-01-01

    The concept of integer ambiguity resolution-enabled Precise Point Positioning (PPP-RTK) relies on appropriate network information for the parameters that are common between the single-receiver user that applies and the network that provides this information. Most of the current methods for PPP-RTK are based on forming the ionosphere-free combination using dual-frequency Global Navigation Satellite System (GNSS) observations. These methods are therefore restrictive in the light of the development of new multi-frequency GNSS constellations, as well as from the point of view that the PPP-RTK user requires ionospheric corrections to obtain integer ambiguity resolution results based on short observation time spans. The method for PPP-RTK that is presented in this article does not have the above limitations, as it is based on the undifferenced, uncombined GNSS observation equations, thereby keeping all parameters in the model. Working with the undifferenced observation equations implies that the models are rank-deficient; not all parameters are unbiasedly estimable, but only combinations of them. By application of S-system theory, the model is made full rank by constraining a minimum set of parameters, or S-basis. The choice of this S-basis determines the estimability and the interpretation of the parameters that are transmitted to the PPP-RTK users. As this choice is not unique, one has to be very careful when comparing network solutions in different S-systems; in that case the S-transformation, which is provided by the S-system method, should be used to make the comparison. Knowing the estimability and interpretation of the parameters estimated by the network is shown to be crucial for a correct interpretation of the estimable PPP-RTK user parameters, among others the essential ambiguity parameters, whose integer property clearly follows from the interpretation of the satellite phase biases provided by the network. 
The flexibility of the S-system method is furthermore demonstrated by the fact that all models in this article are derived in multi-epoch mode, allowing dynamic model constraints to be incorporated on all parameters or subsets of them.

  12. Hidden Markov model for dependent mark loss and survival estimation

    USGS Publications Warehouse

    Laake, Jeffrey L.; Johnson, Devin S.; Diefenbach, Duane R.; Ternent, Mark A.

    2014-01-01

    Mark-recapture estimators assume no loss of marks to provide unbiased estimates of population parameters. We describe a hidden Markov model (HMM) framework that integrates a mark loss model with a Cormack–Jolly–Seber model for survival estimation. Mark loss can be estimated with single-marked animals as long as a sub-sample of animals has a permanent mark. Double-marking provides an estimate of mark loss assuming independence, but dependence can be modeled with a permanently marked sub-sample. We use a log-linear approach to include covariates for mark loss and dependence, which is more flexible than existing published methods for integrated models. The HMM approach is demonstrated with a dataset of black bears (Ursus americanus) with two ear tags, a subset of which were permanently marked with tattoos. The data were analyzed with and without the tattoos; dropping the tattoos resulted in estimates of survival that were reduced by 0.005–0.035 due to tag loss dependence that could not be modeled. We also analyzed the data with and without the tattoos using a single tag.

  13. Bias in error estimation when using cross-validation for model selection.

    PubMed

    Varma, Sudhir; Simon, Richard

    2006-02-23

    Cross-validation (CV) is an effective method for estimating the prediction error of a classifier. Some recent articles have proposed methods for optimizing classifiers by choosing classifier parameter values that minimize the CV error estimate. We have evaluated the validity of using the CV error estimate of the optimized classifier as an estimate of the true error expected on independent data. We used CV to optimize the classification parameters for two kinds of classifiers: Shrunken Centroids and Support Vector Machines (SVM). Random training datasets were created, with no difference in the distribution of the features between the two classes. Using these "null" datasets, we selected classifier parameter values that minimized the CV error estimate. 10-fold CV was used for Shrunken Centroids, while Leave-One-Out CV (LOOCV) was used for the SVM. Independent test data were created to estimate the true error. With "null" and "non-null" (with differential expression between the classes) data, we also tested a nested CV procedure, where an inner CV loop is used to tune the parameters while an outer CV is used to compute an estimate of the error. The CV error estimate for the classifier with the optimal parameters was found to be a substantially biased estimate of the true error that the classifier would incur on independent data. Even though there is no real difference between the two classes for the "null" datasets, the CV error estimate for Shrunken Centroids with the optimal parameters was less than 30% on 18.5% of simulated training datasets. For the SVM with optimal parameters, the estimated error rate was less than 30% on 38% of "null" datasets. Performance of the optimized classifiers on the independent test set was no better than chance. 
The nested CV procedure reduces the bias considerably and gives an estimate of the error that is very close to that obtained on the independent test set, for both Shrunken Centroids and SVM classifiers and for both "null" and "non-null" data distributions. We show that using CV to compute an error estimate for a classifier that has itself been tuned using CV gives a significantly biased estimate of the true error. Proper use of CV for estimating the true error of a classifier developed using a well-defined algorithm requires that all steps of the algorithm, including classifier parameter tuning, be repeated in each CV loop. A nested CV procedure provides an almost unbiased estimate of the true error.
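The selection bias and its nested-CV fix can be reproduced in a few lines. The toy single-feature threshold classifier below is illustrative only (it is not one of the classifiers used in the paper); the key contrast is tuning on all the data versus re-tuning inside every outer fold:

```python
import numpy as np

rng = np.random.default_rng(0)

def fit_predict(Xtr, ytr, Xte, j):
    """Toy classifier: threshold on feature j at the midpoint of class means."""
    m0, m1 = Xtr[ytr == 0, j].mean(), Xtr[ytr == 1, j].mean()
    pred = (Xte[:, j] > 0.5 * (m0 + m1)).astype(int)
    return pred if m1 > m0 else 1 - pred

def cv_error(X, y, j, k=5):
    """Plain k-fold CV error of the classifier that uses feature j."""
    folds = np.array_split(np.arange(len(y)), k)
    return float(np.mean([
        np.mean(fit_predict(X[np.setdiff1d(np.arange(len(y)), f)],
                            y[np.setdiff1d(np.arange(len(y)), f)],
                            X[f], j) != y[f])
        for f in folds]))

def tuned_cv_error(X, y):
    """Biased: tune j on ALL the data, then report that same CV error."""
    best_j = min(range(X.shape[1]), key=lambda j: cv_error(X, y, j))
    return cv_error(X, y, best_j)

def nested_cv_error(X, y, k=5):
    """Nearly unbiased: re-tune j inside every outer training fold."""
    folds = np.array_split(np.arange(len(y)), k)
    errs = []
    for f in folds:
        tr = np.setdiff1d(np.arange(len(y)), f)
        best_j = min(range(X.shape[1]),
                     key=lambda j: cv_error(X[tr], y[tr], j))
        errs.append(np.mean(fit_predict(X[tr], y[tr], X[f], best_j) != y[f]))
    return float(np.mean(errs))

# "Null" data: labels independent of features, so any classifier's true
# error is 0.5.
X = rng.normal(size=(40, 200))
y = np.repeat([0, 1], 20)
print("tuned CV estimate :", tuned_cv_error(X, y))   # typically optimistic
print("nested CV estimate:", nested_cv_error(X, y))  # typically near 0.5
```

With 200 candidate features and only 40 samples, picking the minimizing feature and reporting its own CV error mirrors the optimism the paper quantifies.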

  14. Mutually unbiased bases in six dimensions: The four most distant bases

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Raynal, Philippe; Lue Xin; Englert, Berthold-Georg

    2011-06-15

    We consider the average distance between four bases in six dimensions. The distance between two orthonormal bases vanishes when the bases are the same, and the distance reaches its maximal value of unity when the bases are unbiased. We perform a numerical search for the maximum average distance and find it to be strictly smaller than unity. This is strong evidence that no four mutually unbiased bases exist in six dimensions. We also provide a two-parameter family of three bases which, together with the canonical basis, reach the numerically found maximum of the average distance, and we conduct a detailed study of the structure of the extremal set of bases.

  15. A sampling strategy to estimate the area and perimeter of irregularly shaped planar regions

    Treesearch

    Timothy G. Gregoire; Harry T. Valentine

    1995-01-01

    The length of a randomly oriented ray emanating from an interior point of a planar region can be used to unbiasedly estimate the region's area and perimeter. Estimators and corresponding variance estimators under various selection strategies are presented.
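The area half of this result is easy to check numerically: for a ray at uniformly distributed angle theta with length r(theta), pi*r(theta)^2 is an unbiased estimator of the region's area (the perimeter estimator additionally needs the angle at which the ray meets the boundary, which is not sketched here). A check on an assumed example region, a unit disk viewed from an off-center interior point, approximating the expectation by a fine angular grid:

```python
import math

R, d = 1.0, 0.5  # unit disk, interior viewpoint 0.5 from the center (example)

def ray_length(theta):
    # Distance from the interior point (d, 0) to the circle in direction theta.
    return -d * math.cos(theta) + math.sqrt(R**2 - d**2 * math.sin(theta)**2)

# E[pi * r(theta)^2] over a uniform angle equals the region's area; a fine
# grid average approximates that expectation.
n = 100_000
est = sum(math.pi * ray_length(2 * math.pi * i / n) ** 2 for i in range(n)) / n
print(est, math.pi * R**2)  # both close to pi
```

The cross term in r(theta)^2 averages to zero over the full circle, which is why the off-center viewpoint does not bias the estimate.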

  16. Simple, Efficient Estimators of Treatment Effects in Randomized Trials Using Generalized Linear Models to Leverage Baseline Variables

    PubMed Central

    Rosenblum, Michael; van der Laan, Mark J.

    2010-01-01

    Models, such as logistic regression and Poisson regression models, are often used to estimate treatment effects in randomized trials. These models leverage information in variables collected before randomization in order to obtain more precise estimates of treatment effects. However, there is the danger that model misspecification will lead to bias. We show that certain easy-to-compute, model-based estimators are asymptotically unbiased even when the working model used is arbitrarily misspecified. Furthermore, these estimators are locally efficient. As a special case of our main result, we consider a simple Poisson working model containing only main terms; in this case, we prove that the maximum likelihood estimate of the coefficient corresponding to the treatment variable is an asymptotically unbiased estimator of the marginal log rate ratio, even when the working model is arbitrarily misspecified. This is the log-linear analog of ANCOVA for linear models. Our results demonstrate one application of targeted maximum likelihood estimation. PMID:20628636
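The Poisson special case can be illustrated by simulation. In the sketch below the data-generating model, coefficient values, and sample size are invented, and the outcome is deliberately not log-linear in the covariate, so the main-terms working model is misspecified; the Newton loop is a generic Poisson MLE, not the authors' code:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
t = rng.integers(0, 2, n)   # randomized binary treatment
z = rng.normal(size=n)      # baseline covariate
# Truth uses sin(z), so the working model log(mu) = b0 + b1*t + b2*z is wrong.
y = rng.poisson(np.exp(0.2 + 0.5 * t + np.sin(z)))

X = np.column_stack([np.ones(n), t, z])
beta = np.zeros(3)
for _ in range(25):  # Newton/IRLS iterations for the Poisson MLE
    mu = np.exp(X @ beta)
    beta += np.linalg.solve(X.T @ (X * mu[:, None]), X.T @ (y - mu))

# Because t is randomized independently of z, the true marginal log rate
# ratio is 0.5 here; the misspecified MLE coefficient still targets it.
marginal_log_rr = np.log(y[t == 1].mean() / y[t == 0].mean())
print(beta[1], marginal_log_rr)  # both near 0.5
```

This mirrors the abstract's claim for the main-terms working model; it does not demonstrate the local efficiency part of the result.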

  17. Validation of abundance estimates from mark–recapture and removal techniques for rainbow trout captured by electrofishing in small streams

    USGS Publications Warehouse

    Rosenberger, Amanda E.; Dunham, Jason B.

    2005-01-01

    Estimation of fish abundance in streams using the removal model or the Lincoln-Petersen mark-recapture model is a common practice in fisheries. These models produce misleading results if their assumptions are violated. We evaluated the assumptions of these two models via electrofishing of rainbow trout Oncorhynchus mykiss in central Idaho streams. For one-, two-, three-, and four-pass sampling effort in closed sites, we evaluated the influences of fish size and habitat characteristics on sampling efficiency and the accuracy of removal abundance estimates. We also examined the use of models to generate unbiased estimates of fish abundance through adjustment of total catch or biased removal estimates. Our results suggested that the assumptions of the mark-recapture model were satisfied and that abundance estimates based on this approach were unbiased. In contrast, the removal model assumptions were not met. Decreasing sampling efficiencies over removal passes resulted in underestimated population sizes and overestimated sampling efficiencies. This bias decreased, but was not eliminated, with increased sampling effort. Biased removal estimates based on different levels of effort were highly correlated with each other but were less correlated with unbiased mark-recapture estimates. Stream size decreased sampling efficiency, and stream size and instream wood increased the negative bias of removal estimates. We found that reliable estimates of population abundance could be obtained from models of sampling efficiency for different levels of effort. Validation of abundance estimates requires extra attention to routine sampling considerations but can help fisheries biologists avoid pitfalls associated with biased data and facilitate standardized comparisons among studies that employ different sampling methods.
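For concreteness, the two abundance models compared above have simple two-sample forms. A sketch with hypothetical catch counts (not the Idaho streams data); the Chapman form of the Lincoln-Petersen estimator and the standard two-pass removal estimator are both textbook results:

```python
def chapman(M, C, R):
    """Chapman's nearly unbiased form of the Lincoln-Petersen estimator.
    M fish marked on pass 1, C caught on pass 2, R of those recaptured."""
    return (M + 1) * (C + 1) / (R + 1) - 1

def removal_two_pass(c1, c2):
    """Two-pass removal estimate; assumes EQUAL capture probability on both
    passes -- the assumption the study found to be violated. Needs c1 > c2."""
    n_hat = c1 ** 2 / (c1 - c2)
    p_hat = 1 - c2 / c1  # implied per-pass capture probability
    return n_hat, p_hat

# Hypothetical catches for illustration:
print(chapman(M=60, C=55, R=30))      # ~109.19
n_hat, p_hat = removal_two_pass(c1=70, c2=30)
print(n_hat, p_hat)                   # 122.5, ~0.571
```

When capture probability declines between passes (as the study observed), c2 stays too high relative to a constant-efficiency model, so the removal n_hat is pulled downward, matching the underestimation reported above.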

  18. The lawful imprecision of human surface tilt estimation in natural scenes

    PubMed Central

    2018-01-01

    Estimating local surface orientation (slant and tilt) is fundamental to recovering the three-dimensional structure of the environment. It is unknown how well humans perform this task in natural scenes. Here, with a database of natural stereo-images having groundtruth surface orientation at each pixel, we find dramatic differences in human tilt estimation with natural and artificial stimuli. Estimates are precise and unbiased with artificial stimuli and imprecise and strongly biased with natural stimuli. An image-computable Bayes optimal model grounded in natural scene statistics predicts human bias, precision, and trial-by-trial errors without fitting parameters to the human data. The similarities between human and model performance suggest that the complex human performance patterns with natural stimuli are lawful, and that human visual systems have internalized local image and scene statistics to optimally infer the three-dimensional structure of the environment. These results generalize our understanding of vision from the lab to the real world. PMID:29384477

  19. The lawful imprecision of human surface tilt estimation in natural scenes.

    PubMed

    Kim, Seha; Burge, Johannes

    2018-01-31

    Estimating local surface orientation (slant and tilt) is fundamental to recovering the three-dimensional structure of the environment. It is unknown how well humans perform this task in natural scenes. Here, with a database of natural stereo-images having groundtruth surface orientation at each pixel, we find dramatic differences in human tilt estimation with natural and artificial stimuli. Estimates are precise and unbiased with artificial stimuli and imprecise and strongly biased with natural stimuli. An image-computable Bayes optimal model grounded in natural scene statistics predicts human bias, precision, and trial-by-trial errors without fitting parameters to the human data. The similarities between human and model performance suggest that the complex human performance patterns with natural stimuli are lawful, and that human visual systems have internalized local image and scene statistics to optimally infer the three-dimensional structure of the environment. These results generalize our understanding of vision from the lab to the real world. © 2018, Kim et al.

  20. Image denoising in mixed Poisson-Gaussian noise.

    PubMed

    Luisier, Florian; Blu, Thierry; Unser, Michael

    2011-03-01

    We propose a general methodology (PURE-LET) to design and optimize a wide class of transform-domain thresholding algorithms for denoising images corrupted by mixed Poisson-Gaussian noise. We express the denoising process as a linear expansion of thresholds (LET) that we optimize by relying on a purely data-adaptive unbiased estimate of the mean-squared error (MSE), derived in a non-Bayesian framework (PURE: Poisson-Gaussian unbiased risk estimate). We provide a practical approximation of this theoretical MSE estimate for the tractable optimization of arbitrary transform-domain thresholding. We then propose a pointwise estimator for undecimated filterbank transforms, which consists of subband-adaptive thresholding functions with signal-dependent thresholds that are globally optimized in the image domain. We finally demonstrate the potential of the proposed approach through extensive comparisons with state-of-the-art techniques that are specifically tailored to the estimation of Poisson intensities. We also present denoising results obtained on real images of low-count fluorescence microscopy.

  1. Estimating riparian understory vegetation cover with beta regression and copula models

    USGS Publications Warehouse

    Eskelson, Bianca N.I.; Madsen, Lisa; Hagar, Joan C.; Temesgen, Hailemariam

    2011-01-01

    Understory vegetation communities are critical components of forest ecosystems. As a result, the importance of modeling understory vegetation characteristics in forested landscapes has become more apparent. Abundance measures such as shrub cover are bounded between 0 and 1, exhibit heteroscedastic error variance, and are often subject to spatial dependence. These distributional features tend to be ignored when shrub cover data are analyzed. The beta distribution has been used successfully to describe the frequency distribution of vegetation cover. Beta regression models ignoring spatial dependence (BR) and accounting for spatial dependence (BRdep) were used to estimate percent shrub cover as a function of topographic conditions and overstory vegetation structure in riparian zones in western Oregon. The BR models showed poor explanatory power (pseudo-R2 ≤ 0.34) but outperformed ordinary least-squares (OLS) and generalized least-squares (GLS) regression models with logit-transformed response in terms of mean square prediction error and absolute bias. We introduce a copula (COP) model that is based on the beta distribution and accounts for spatial dependence. A simulation study was designed to illustrate the effects of incorrectly assuming normality, equal variance, and spatial independence. It showed that BR, BRdep, and COP models provide unbiased parameter estimates, whereas OLS and GLS models result in slightly biased estimates for two of the three parameters. On the basis of the simulation study, 93–97% of the GLS, BRdep, and COP confidence intervals covered the true parameters, whereas OLS and BR only resulted in 84–88% coverage, which demonstrated the superiority of GLS, BRdep, and COP over OLS and BR models in providing standard errors for the parameter estimates in the presence of spatial dependence.
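The boundedness and heteroscedasticity noted above follow directly from the mean-precision parameterization of the beta distribution used in beta regression; a small sketch (the precision value phi is an arbitrary example, not fitted to the cover data):

```python
import math

# Mean-precision parameterization used in beta regression:
#   Y ~ Beta(mu * phi, (1 - mu) * phi),  E[Y] = mu,  Var[Y] = mu(1-mu)/(1+phi).
# The variance shrinks toward zero near mu = 0 and mu = 1 -- exactly the
# heteroscedasticity that OLS on cover proportions ignores.
phi = 10.0  # example precision

def inv_logit(eta):
    """Logit link: map a linear predictor to a mean in (0, 1)."""
    return 1 / (1 + math.exp(-eta))

for eta in (-2.0, 0.0, 2.0):  # example linear-predictor values
    mu = inv_logit(eta)
    var = mu * (1 - mu) / (1 + phi)
    print(f"mu={mu:.3f}  var={var:.4f}")
```

The variance peaks at mu = 0.5 and falls off symmetrically, so a constant-variance model is misspecified for cover data concentrated near either boundary.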

  2. Poisson sampling - The adjusted and unadjusted estimator revisited

    Treesearch

    Michael S. Williams; Hans T. Schreuder; Gerardo H. Terrazas

    1998-01-01

    The prevailing assumption, that for Poisson sampling the adjusted estimator "Y-hat a" is always substantially more efficient than the unadjusted estimator "Y-hat u" , is shown to be incorrect. Some well known theoretical results are applicable since "Y-hat a" is a ratio-of-means estimator and "Y-hat u" a simple unbiased estimator...
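The unadjusted estimator's exact unbiasedness under Poisson sampling can be verified by brute-force enumeration on a toy population (values and inclusion probabilities invented for illustration); the adjusted ratio-of-means estimator trades this exactness for lower variance on typical samples, which is why neither uniformly dominates:

```python
from itertools import product

# Tiny population: values y_i and Poisson-sampling inclusion probabilities p_i.
y = [3.0, 7.0, 1.0, 9.0, 5.0]
p = [0.2, 0.5, 0.3, 0.6, 0.4]
total = sum(y)

# Enumerate all 2^5 possible samples; each unit enters independently with
# probability p_i, so every sample's probability is a product of Bernoullis.
expected = 0.0
for include in product([0, 1], repeat=len(y)):
    prob = 1.0
    for i, s in enumerate(include):
        prob *= p[i] if s else 1 - p[i]
    y_hat_u = sum(y[i] / p[i] for i in range(len(y)) if include[i])
    expected += prob * y_hat_u

print(expected, total)  # equal: the unadjusted estimator is exactly unbiased
```

The same enumeration applied to the adjusted estimator would require a convention for the empty sample, one sign of its only approximate unbiasedness.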

  3. Estimation After a Group Sequential Trial.

    PubMed

    Milanzi, Elasma; Molenberghs, Geert; Alonso, Ariel; Kenward, Michael G; Tsiatis, Anastasios A; Davidian, Marie; Verbeke, Geert

    2015-10-01

    Group sequential trials are one important instance of studies for which the sample size is not fixed a priori but rather takes one of a finite set of pre-specified values, dependent on the observed data. Much work has been devoted to the inferential consequences of this design feature. Molenberghs et al (2012) and Milanzi et al (2012) reviewed and extended the existing literature, focusing on a collection of seemingly disparate, but related, settings, namely completely random sample sizes, group sequential studies with deterministic and random stopping rules, incomplete data, and random cluster sizes. They showed that the ordinary sample average is a viable option for estimation following a group sequential trial, for a wide class of stopping rules and for random outcomes with a distribution in the exponential family. Their results are somewhat surprising in the sense that the sample average is not optimal, and further, there does not exist an optimal, or even unbiased, linear estimator. However, the sample average is asymptotically unbiased, both conditionally upon the observed sample size and marginalized over it. By exploiting ignorability they showed that the sample average is the conventional maximum likelihood estimator. They also showed that a conditional maximum likelihood estimator is finite-sample unbiased, but is less efficient than the sample average and has the larger mean squared error. Asymptotically, the sample average and the conditional maximum likelihood estimator are equivalent. This previous work is restricted, however, to the situation in which the random sample size can take only two values, N = n or N = 2n. In this paper, we consider the more practically useful setting of sample sizes in the finite set {n1, n2, …, nL}. It is shown that the sample average is then a justifiable estimator, in the sense that it follows from joint likelihood estimation, and it is consistent and asymptotically unbiased. 
We also show why simulations can give the false impression of bias in the sample average when considered conditional upon the sample size. The consequence is that no corrections need to be made to estimators following sequential trials. When small-sample bias is of concern, the conditional likelihood estimator provides a relatively straightforward modification to the sample average. Finally, it is shown that classical likelihood-based standard errors and confidence intervals can be applied, obviating the need for technical corrections.

  4. Estimating total suspended sediment yield with probability sampling

    Treesearch

    Robert B. Thomas

    1985-01-01

    The "Selection At List Time" (SALT) scheme controls sampling of concentration for estimating total suspended sediment yield. The probability of taking a sample is proportional to its estimated contribution to total suspended sediment discharge. This procedure gives unbiased estimates of total suspended sediment yield and the variance of the...
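SALT's unbiasedness rests on the same inverse-probability weighting as classic probability-proportional-to-size (PPS) sampling. A one-draw sketch with invented numbers (SALT's list-time selection mechanics are not reproduced here):

```python
# PPS-style weighting: select with probability proportional to an auxiliary
# "estimated contribution" x, then divide the observed value by its selection
# probability. Numbers below are hypothetical.
x = [2.0, 5.0, 1.0, 8.0]  # auxiliary: predicted contribution to discharge
y = [2.2, 4.8, 1.5, 7.9]  # actual contributions (close to, but not equal to, x)
p = [xi / sum(x) for xi in x]  # selection probability proportional to x

total = sum(y)
# Expectation of the weighted estimator y_k / p_k over one random draw:
expected = sum(p[k] * (y[k] / p[k]) for k in range(len(y)))
print(expected, total)  # equal: inverse-probability weighting is unbiased
```

The better the auxiliary x predicts y, the closer each y_k / p_k is to the total and the smaller the estimator's variance, which is the rationale for sampling proportional to estimated contribution.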

  5. Estimation and correction of different flavors of surface observation biases in ensemble Kalman filter

    NASA Astrophysics Data System (ADS)

    Lorente-Plazas, Raquel; Hacker, Josua P.; Collins, Nancy; Lee, Jared A.

    2017-04-01

    Several publications have shown the impact of assimilating surface observations for improving weather prediction inside the boundary layer as well as for the flow aloft. However, the assimilation of surface observations is often far from optimal due to the presence of both model and observation biases. The sources of these biases can be diverse: an instrumental offset, errors associated with comparing point-based observations to grid-cell averages, etc. To overcome this challenge, a method was developed using the ensemble Kalman filter. The approach consists of representing each observation bias as a parameter. These bias parameters are added to the forward operator, and they extend the state vector. As opposed to the observation bias estimation approaches most common in operational systems (e.g., for satellite radiances), the state vector and parameters are simultaneously updated by applying the Kalman filter equations to the augmented state. The method to estimate and correct the observation bias is evaluated using observing system simulation experiments (OSSEs) with the Weather Research and Forecasting (WRF) model. OSSEs are constructed for the conventional observation network including radiosondes, aircraft observations, atmospheric motion vectors, and surface observations. Three different kinds of biases are added to 2-meter temperature for synthetic METARs. From simplest to most sophisticated, the imposed biases are: (1) a spatially invariant bias, (2) a spatially varying bias proportional to topographic height differences between the model and the observations, and (3) a bias that is proportional to the temperature. The target region, characterized by complex terrain, is the western U.S. on a domain with 30-km grid spacing. Observations are assimilated every 3 hours using an 80-member ensemble during September 2012. Results demonstrate that the approach is able to estimate and correct the bias when it is spatially invariant (experiment 1). 
The more complex bias structures in experiments (2) and (3) are more difficult to estimate, but still possible. Estimating the parameter in experiments with unbiased observations results in spatial and temporal parameter variability about zero, and establishes a threshold on the accuracy of the parameter estimates in further experiments. When the observations are biased, the mean parameter value is close to the true bias, but the temporal and spatial variability in the parameter estimates is similar to that in the unbiased-observation experiments. The distributions are related to other errors in the forecasts, indicating that the parameters are absorbing some of the forecast error from other sources. In this presentation we elucidate the reasons for the resulting parameter estimates and their variability.
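The state-augmentation idea can be illustrated with an ordinary Kalman filter on a scalar toy problem (the study uses an ensemble filter with WRF; the dynamics, variances, and bias value below are invented). The observation bias is appended to the state, the forward operator is extended to include it, and one set of filter updates estimates both:

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy state augmentation: scalar AR(1) "atmosphere" x plus a constant
# observation bias b; we observe y = x + b + noise and estimate BOTH.
phi, q, r, b_true = 0.9, 0.5, 0.2, 1.5
F = np.array([[phi, 0.0], [0.0, 1.0]])  # the bias parameter persists
Q = np.diag([q, 1e-6])                  # tiny parameter noise keeps gain alive
H = np.array([[1.0, 1.0]])              # forward operator includes the bias

m = np.zeros(2)            # augmented mean [x, b]
P = np.diag([1.0, 4.0])    # vague prior on the bias
x = 0.0
for _ in range(2000):
    x = phi * x + rng.normal(scale=np.sqrt(q))         # truth
    yobs = x + b_true + rng.normal(scale=np.sqrt(r))   # biased observation
    m, P = F @ m, F @ P @ F.T + Q                      # forecast step
    S = H @ P @ H.T + r                                # innovation variance
    K = (P @ H.T) / S                                  # Kalman gain (2x1)
    m = m + (K * (yobs - H @ m)).ravel()               # joint update
    P = P - K @ H @ P

print("estimated bias:", m[1], "truth:", b_true)
```

The bias is identifiable here because the state dynamics are known and mean-reverting; a persistent offset in the innovations can only be explained by b, which is the same mechanism the augmented-state EnKF exploits.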

  6. Maximum likelihood: Extracting unbiased information from complex networks

    NASA Astrophysics Data System (ADS)

    Garlaschelli, Diego; Loffredo, Maria I.

    2008-07-01

    The choice of free parameters in network models is subjective, since it depends on what topological properties are being monitored. However, we show that the maximum likelihood (ML) principle indicates a unique, statistically rigorous parameter choice, associated with a well-defined topological feature. We then find that, if the ML condition is incompatible with the built-in parameter choice, network models turn out to be intrinsically ill defined or biased. To overcome this problem, we construct a class of safely unbiased models. We also propose an extension of these results that leads to the fascinating possibility of extracting, from topological data alone, the “hidden variables” underlying network organization, making them “no longer hidden.” We test our method on World Trade Web data, where we recover the empirical gross domestic product using only topological information.

  7. Analysis of conditional genetic effects and variance components in developmental genetics.

    PubMed

    Zhu, J

    1995-12-01

    A genetic model with additive-dominance effects and genotype x environment interactions is presented for quantitative traits with time-dependent measures. The genetic model for phenotypic means at time t conditional on phenotypic means measured at the previous time (t-1) is defined. Statistical methods are proposed for analyzing conditional genetic effects and conditional genetic variance components. Conditional variances can be estimated by the minimum norm quadratic unbiased estimation (MINQUE) method. An adjusted unbiased prediction (AUP) procedure is suggested for predicting conditional genetic effects. A worked example from cotton fruiting data is given for comparison of unconditional and conditional genetic variances and additive effects.

  8. Analysis of Conditional Genetic Effects and Variance Components in Developmental Genetics

    PubMed Central

    Zhu, J.

    1995-01-01

    A genetic model with additive-dominance effects and genotype x environment interactions is presented for quantitative traits with time-dependent measures. The genetic model for phenotypic means at time t conditional on phenotypic means measured at the previous time (t - 1) is defined. Statistical methods are proposed for analyzing conditional genetic effects and conditional genetic variance components. Conditional variances can be estimated by the minimum norm quadratic unbiased estimation (MINQUE) method. An adjusted unbiased prediction (AUP) procedure is suggested for predicting conditional genetic effects. A worked example from cotton fruiting data is given for comparison of unconditional and conditional genetic variances and additive effects. PMID:8601500

  9. Markov state models from short non-equilibrium simulations—Analysis and correction of estimation bias

    NASA Astrophysics Data System (ADS)

    Nüske, Feliks; Wu, Hao; Prinz, Jan-Hendrik; Wehmeyer, Christoph; Clementi, Cecilia; Noé, Frank

    2017-03-01

    Many state-of-the-art methods for the thermodynamic and kinetic characterization of large and complex biomolecular systems by simulation rely on ensemble approaches, where data from large numbers of relatively short trajectories are integrated. In this context, Markov state models (MSMs) are extremely popular because they can be used to compute stationary quantities and long-time kinetics from ensembles of short simulations, provided that these short simulations are in "local equilibrium" within the MSM states. However, in the 15 years since the inception of MSMs, it has been a matter of ongoing debate, and has not yet been resolved, how deviations from local equilibrium can be detected, whether these deviations induce a practical bias in MSM estimation, and how to correct for them. In this paper, we address these issues: We systematically analyze the estimation of MSMs from short non-equilibrium simulations, and we provide an expression for the error between unbiased transition probabilities and the expected estimate from many short simulations. We show that the unbiased MSM estimate can be obtained even from relatively short non-equilibrium simulations in the limit of long lag times and good discretization. Further, we exploit observable operator model (OOM) theory to derive an unbiased estimator for the MSM transition matrix that corrects for the effect of starting out of equilibrium, even when short lag times are used. Finally, we show how the OOM framework can be used to estimate the exact eigenvalues or relaxation time scales of the system without estimating an MSM transition matrix, which allows us to practically assess the discretization quality of the MSM. Applications to model systems and molecular dynamics simulation data of alanine dipeptide are included for illustration. The improved MSM estimator is implemented in PyEMMA version 2.3.
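The baseline against which the paper measures bias is the standard count-based MSM estimator at a fixed lag time: count observed transitions across many short trajectories and row-normalize. A minimal sketch (the transition matrix and trajectory counts are invented, and the OOM correction itself is not reproduced):

```python
import numpy as np

rng = np.random.default_rng(3)

# Ground-truth 3-state row-stochastic transition matrix used to simulate
# an ensemble of short trajectories started out of equilibrium (uniformly).
T_true = np.array([[0.90, 0.08, 0.02],
                   [0.10, 0.80, 0.10],
                   [0.02, 0.08, 0.90]])

def simulate(T, start, length):
    s, traj = start, [start]
    for _ in range(length - 1):
        s = rng.choice(3, p=T[s])
        traj.append(s)
    return traj

# Standard MSM estimate at lag 1: count transitions, normalize each row.
C = np.zeros((3, 3))
for _ in range(300):  # 300 short trajectories of 50 steps
    traj = simulate(T_true, rng.integers(3), 50)
    for a, b in zip(traj[:-1], traj[1:]):
        C[a, b] += 1
T_hat = C / C.sum(axis=1, keepdims=True)
print(np.round(T_hat, 3))

# Implied relaxation timescales from the non-stationary eigenvalues:
eig = np.sort(np.abs(np.linalg.eigvals(T_hat)))[::-1]
print("timescales:", -1 / np.log(eig[1:]))
```

In this toy the discretization is perfect (the process is exactly Markov on the three states), so the count estimator is consistent despite the non-equilibrium starts; the bias the paper analyzes arises when states are imperfectly discretized, which is what the OOM-based estimator corrects.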

  10. Evaluation of Bayesian estimation of a hidden continuous-time Markov chain model with application to threshold violation in water-quality indicators

    USGS Publications Warehouse

    Deviney, Frank A.; Rice, Karen; Brown, Donald E.

    2012-01-01

    Natural resource managers require information concerning the frequency, duration, and long-term probability of occurrence of water-quality indicator (WQI) violations of defined thresholds. The timing of these threshold crossings often is hidden from the observer, who is restricted to relatively infrequent observations. Here, a model for the hidden process is linked with a model for the observations, and the parameters describing duration, return period, and long-term probability of occurrence are estimated using Bayesian methods. A simulation experiment is performed to evaluate the approach under scenarios based on the equivalent of a total monitoring period of 5-30 years and an observation frequency of 1-50 observations per year. Given a constant threshold crossing rate, accuracy and precision of parameter estimates increased with longer total monitoring periods and more-frequent observations. Given a fixed monitoring period and observation frequency, accuracy and precision of parameter estimates increased with longer times between threshold crossings. For most cases where the long-term probability of being in violation is greater than 0.10, it was determined that at least 600 observations are needed to achieve precise estimates. An application of the approach is presented using 22 years of quasi-weekly observations of acid-neutralizing capacity from Deep Run, a stream in Shenandoah National Park, Virginia. The time series also was sub-sampled to simulate monthly and semi-monthly sampling protocols. Estimates of the long-term probability of violation were unbiased regardless of sampling frequency; however, the expected duration and return period were over-estimated using the sub-sampled time series relative to the full quasi-weekly time series.

  11. Controls on the physical properties of gas-hydrate-bearing sediments because of the interaction between gas hydrate and porous media

    USGS Publications Warehouse

    Lee, Myung W.; Collett, Timothy S.

    2005-01-01

    Physical properties of gas-hydrate-bearing sediments depend on the pore-scale interaction between gas hydrate and porous media as well as on the amount of gas hydrate present. Well log measurements such as proton nuclear magnetic resonance (NMR) relaxation and electromagnetic propagation tool (EPT) techniques depend primarily on the bulk volume of gas hydrate in the pore space, irrespective of the pore-scale interaction. However, elastic velocities and permeability depend on how gas hydrate is distributed in the pore space as well as on the amount of gas hydrate. Gas-hydrate saturations estimated from NMR and EPT measurements are free of adjustable parameters; thus, they are unbiased estimates of gas hydrate if the measurements are accurate. However, the amount of gas hydrate estimated from elastic velocities or electrical resistivities depends on many adjustable parameters and models related to the interaction of gas hydrate and porous media, so these estimates are model dependent and biased. NMR, EPT, elastic-wave velocity, electrical resistivity, and permeability measurements acquired in the Mallik 5L-38 well in the Mackenzie Delta, Canada, show that all of the well log evaluation techniques considered provide comparable gas-hydrate saturations in clean (low shale content) sandstone intervals with high gas-hydrate saturations. However, in shaly intervals, estimates from log measurements that depend on the pore-scale interaction between gas hydrate and host sediments are higher than estimates from measurements that depend on the bulk volume of gas hydrate.

  12. Using Audit Information to Adjust Parameter Estimates for Data Errors in Clinical Trials

    PubMed Central

    Shepherd, Bryan E.; Shaw, Pamela A.; Dodd, Lori E.

    2013-01-01

    Background Audits are often performed to assess the quality of clinical trial data, but beyond detecting fraud or sloppiness, the audit data are generally ignored. In earlier work using data from a non-randomized study, Shepherd and Yu (2011) developed statistical methods to incorporate audit results into study estimates, and demonstrated that audit data could be used to eliminate bias. Purpose In this manuscript we examine the usefulness of audit-based error-correction methods in clinical trial settings where a continuous outcome is of primary interest. Methods We demonstrate the bias of multiple linear regression estimates in general settings with an outcome that may have errors and a set of covariates of which some may have errors while others, including treatment assignment, are recorded correctly for all subjects. We study this bias under different assumptions, including independence between treatment assignment, covariates, and data errors (conceivable in a double-blinded randomized trial) and independence between treatment assignment and covariates but not data errors (possible in an unblinded randomized trial). We review moment-based estimators to incorporate the audit data and propose new multiple imputation estimators. The performance of the estimators is studied in simulations. Results When treatment is randomized and unrelated to data errors, estimates of the treatment effect using the original error-prone data (i.e., ignoring the audit results) are unbiased. In this setting, both moment and multiple imputation estimators incorporating audit data are more variable than standard analyses using the original data. In contrast, in settings where treatment is randomized but correlated with data errors, and in settings where treatment is not randomized, standard treatment effect estimates will be biased. And in all settings, parameter estimates for the original, error-prone covariates will be biased. 
Treatment and covariate effect estimates can be corrected by incorporating audit data using either the multiple imputation or moment-based approaches. Bias, precision, and coverage of confidence intervals improve as the audit size increases. Limitations The extent of bias and the performance of methods depend on the extent and nature of the error as well as the size of the audit. This work only considers methods for the linear model. Settings much different than those considered here need further study. Conclusions In randomized trials with continuous outcomes and treatment assignment independent of data errors, standard analyses of treatment effects will be unbiased and are recommended. However, if treatment assignment is correlated with data errors or other covariates, naive analyses may be biased. In these settings, and when covariate effects are of interest, approaches for incorporating audit results should be considered. PMID:22848072

  13. Semiparametric Bayesian analysis of gene-environment interactions with error in measurement of environmental covariates and missing genetic data.

    PubMed

    Lobach, Iryna; Mallick, Bani; Carroll, Raymond J

    2011-01-01

    Case-control studies are widely used to detect gene-environment interactions in the etiology of complex diseases. Many variables that are of interest to biomedical researchers are difficult to measure on an individual level, e.g. nutrient intake, cigarette smoking exposure, long-term toxic exposure. Measurement error causes bias in parameter estimates, thus masking key features of data and leading to loss of power and spurious/masked associations. We develop a Bayesian methodology for analysis of case-control studies for the case when measurement error is present in an environmental covariate and the genetic variable has missing data. This approach offers several advantages. It allows prior information to enter the model to make estimation and inference more precise. The environmental covariates measured exactly are modeled completely nonparametrically. Further, information about the probability of disease can be incorporated in the estimation procedure to improve quality of parameter estimates, which cannot be done in conventional case-control studies. A unique feature of the procedure under investigation is that the analysis is based on a pseudo-likelihood function, so conventional Bayesian techniques may not be technically correct. We propose an approach using Markov Chain Monte Carlo sampling as well as a computationally simple method based on an asymptotic posterior distribution. Simulation experiments demonstrated that our method produces parameter estimates that are nearly unbiased even for small sample sizes. An application of our method is illustrated using a population-based case-control study of the association between calcium intake and the risk of colorectal adenoma development.

  14. [Analytic methods for seed models with genotype x environment interactions].

    PubMed

    Zhu, J

    1996-01-01

    Genetic models with genotype effect (G) and genotype x environment interaction effect (GE) are proposed for analyzing generation means of seed quantitative traits in crops. The total genetic effect (G) is partitioned into seed direct genetic effect (G0), cytoplasm genetic effect (C), and maternal plant genetic effect (Gm). Seed direct genetic effect (G0) can be further partitioned into direct additive (A) and direct dominance (D) genetic components. Maternal genetic effect (Gm) can also be partitioned into maternal additive (Am) and maternal dominance (Dm) genetic components. The total genotype x environment interaction effect (GE) can also be partitioned into direct genetic by environment interaction effect (G0E), cytoplasm genetic by environment interaction effect (CE), and maternal genetic by environment interaction effect (GmE). G0E can be partitioned into direct additive by environment interaction (AE) and direct dominance by environment interaction (DE) genetic components. GmE can also be partitioned into maternal additive by environment interaction (AmE) and maternal dominance by environment interaction (DmE) genetic components. Partitions of genetic components are listed for parent, F1, F2 and backcrosses. A set of parents, their reciprocal F1 and F2 seeds is applicable for efficient analysis of seed quantitative traits. The MINQUE(0/1) method can be used for estimating variance and covariance components. Unbiased estimation for covariance components between two traits can also be obtained by the MINQUE(0/1) method. Random genetic effects in seed models are predictable by the Adjusted Unbiased Prediction (AUP) approach with the MINQUE(0/1) method. The jackknife procedure is suggested for estimation of sampling variances of estimated variance and covariance components and of predicted genetic effects, which can be further used in a t-test for parameters.
Unbiasedness and efficiency for estimating variance components and predicting genetic effects are tested by Monte Carlo simulations.
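The delete-one jackknife step suggested above can be sketched generically; here a simple unbiased sample variance stands in for a MINQUE(0/1) variance component (an assumption for illustration only), and the jackknife standard error feeds a t-statistic as in the paper:

```python
import numpy as np

rng = np.random.default_rng(1)
y = rng.normal(loc=5.0, scale=2.0, size=200)  # simulated trait values

def estimator(sample):
    # Any variance-component estimator could go here; an unbiased sample
    # variance stands in for a MINQUE(0/1) component in this sketch.
    return sample.var(ddof=1)

n = len(y)
theta_hat = estimator(y)

# Delete-one jackknife replicates and the jackknife variance formula.
reps = np.array([estimator(np.delete(y, i)) for i in range(n)])
theta_bar = reps.mean()
jack_var = (n - 1) / n * np.sum((reps - theta_bar) ** 2)
jack_se = np.sqrt(jack_var)

t_stat = theta_hat / jack_se  # t-test of H0: variance component = 0
```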

  15. Regression sampling: some results for resource managers and researchers

    Treesearch

    William G. O'Regan; Robert W. Boyd

    1974-01-01

    Regression sampling is widely used in natural resources management and research to estimate quantities of resources per unit area. This note brings together results found in the statistical literature in the application of this sampling technique. Conditional and unconditional estimators are listed and for each estimator, exact variances and unbiased estimators for the...

  16. Use of Empirical Estimates of Shrinkage in Multiple Regression: A Caution.

    ERIC Educational Resources Information Center

    Kromrey, Jeffrey D.; Hines, Constance V.

    1995-01-01

    The accuracy of four empirical techniques to estimate shrinkage in multiple regression was studied through Monte Carlo simulation. None of the techniques provided unbiased estimates of the population squared multiple correlation coefficient, but the normalized jackknife and bootstrap techniques demonstrated marginally acceptable performance with…
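For contrast with the empirical (resampling) techniques studied above, a minimal sketch of a formula-based shrinkage correction, the common Wherry-type adjusted R-squared, on simulated null data where the population squared multiple correlation is zero by construction:

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 60, 8
X = rng.normal(size=(n, p))
y = rng.normal(size=n)  # unrelated to X: population R^2 is 0

# Ordinary least squares with intercept; sample R^2 is inflated under the
# null because p predictors absorb chance variation.
Xc = np.column_stack([np.ones(n), X])
coef = np.linalg.lstsq(Xc, y, rcond=None)[0]
resid = y - Xc @ coef
r2 = 1 - resid.var() / y.var()

# Wherry-type adjustment, one of several formula-based shrinkage
# corrections (the study above compared empirical, resampling-based ones).
r2_adj = 1 - (1 - r2) * (n - 1) / (n - p - 1)
```

The adjusted value falls back near zero, illustrating the shrinkage that the empirical estimators above try to capture from the data themselves.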

  17. Estimating temporary emigration and breeding proportions using capture-recapture data with Pollock's robust design

    USGS Publications Warehouse

    Kendall, W.L.; Nichols, J.D.; Hines, J.E.

    1997-01-01

    Statistical inference for capture-recapture studies of open animal populations typically relies on the assumption that all emigration from the studied population is permanent. However, there are many instances in which this assumption is unlikely to be met. We define two general models for the process of temporary emigration, completely random and Markovian. We then consider effects of these two types of temporary emigration on Jolly-Seber (Seber 1982) estimators and on estimators arising from the full-likelihood approach of Kendall et al. (1995) to robust design data. Capture-recapture data arising from Pollock's (1982) robust design provide the basis for obtaining unbiased estimates of demographic parameters in the presence of temporary emigration and for estimating the probability of temporary emigration. We present a likelihood-based approach to dealing with temporary emigration that permits estimation under different models of temporary emigration and yields tests for completely random and Markovian emigration. In addition, we use the relationship between capture probability estimates based on closed and open models under completely random temporary emigration to derive three ad hoc estimators for the probability of temporary emigration, two of which should be especially useful in situations where capture probabilities are heterogeneous among individual animals. Ad hoc and full-likelihood estimators are illustrated for small mammal capture-recapture data sets. We believe that these models and estimators will be useful for testing hypotheses about the process of temporary emigration, for estimating demographic parameters in the presence of temporary emigration, and for estimating probabilities of temporary emigration. These latter estimates are frequently of ecological interest as indicators of animal movement and, in some sampling situations, as direct estimates of breeding probabilities and proportions.
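One way to see the logic of the ad hoc estimators mentioned above: under completely random temporary emigration, the capture probability estimated from an open model is deflated relative to the closed-model estimate, and their ratio recovers the emigration probability. A ratio-type sketch with hypothetical values (the exact estimator forms are given in the paper):

```python
# Hypothetical capture-probability estimates for the same primary period:
p_closed = 0.40  # closed-population model (conditions on animals present)
p_open = 0.28    # open (Jolly-Seber type) model (all marked animals)

# Ratio-type estimate of the temporary-emigration probability.
gamma_hat = 1 - p_open / p_closed
```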

  18. Estimating the theoretical semivariogram from finite numbers of measurements

    USGS Publications Warehouse

    Zheng, Li; Silliman, Stephen E.

    2000-01-01

    We investigate from a theoretical basis the impacts of the number, location, and correlation among measurement points on the quality of an estimate of the semivariogram. The unbiased nature of the semivariogram estimator γ̂Z(r) is first established for a general random process Z(x). The variance of γ̂Z(r) is then derived as a function of the sampling parameters (the number of measurements and their locations). In applying this function to the case of estimating the semivariograms of the transmissivity and the hydraulic head field, it is shown that the estimation error depends on the number of the data pairs, the correlation among the data pairs (which, in turn, are determined by the form of the underlying semivariogram γ(r)), the relative locations of the data pairs, and the separation distance at which the semivariogram is to be estimated. Thus design of an optimal sampling program for semivariogram estimation should include consideration of each of these factors. Further, the function derived for the variance of γ̂Z(r) is useful in determining the reliability of a semivariogram developed from a previously established sampling design.
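A minimal sketch of the classical (Matheron) semivariogram estimator on a 1-D transect; white noise is used so the true semivariogram is flat at the field variance, which makes the estimator's unbiasedness easy to check (an illustrative setup, not the paper's transmissivity example):

```python
import numpy as np

rng = np.random.default_rng(3)
# White-noise field on a 1-D transect: the true semivariogram is flat,
# gamma(r) = variance = 1 for every r > 0.
x = np.arange(500.0)
z = rng.normal(size=500)

def semivariogram(x, z, r, tol=0.5):
    # Classical estimator: half the average squared increment over all
    # pairs of points separated by (approximately) lag r.
    d = np.abs(x[:, None] - x[None, :])
    i, j = np.where((np.abs(d - r) <= tol) & (d > 0))
    keep = i < j  # count each pair once
    return 0.5 * np.mean((z[i[keep]] - z[j[keep]]) ** 2)

gammas = [semivariogram(x, z, r) for r in (1.0, 5.0, 10.0)]
```

The spread of these estimates around 1 reflects exactly the factors the abstract highlights: the number of pairs at each lag and the correlation among them.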

  19. Estimating unbiased economies of scale of HIV prevention projects: a case study of Avahan.

    PubMed

    Lépine, Aurélia; Vassall, Anna; Chandrashekar, Sudha; Blanc, Elodie; Le Nestour, Alexis

    2015-04-01

    Governments and donors are investing considerable resources in HIV prevention in order to scale up these services rapidly. Given the current economic climate, providers of HIV prevention services increasingly need to demonstrate that these investments offer good 'value for money'. One of the primary routes to achieve efficiency is to take advantage of economies of scale (a reduction in the average cost of a health service as provision scales up), yet empirical evidence on economies of scale is scarce. Methodologically, the estimation of economies of scale is hampered by several statistical issues preventing causal inference and thus making the estimation of economies of scale complex. In order to estimate unbiased economies of scale when scaling up HIV prevention services, we apply our analysis to one of the few HIV prevention programmes globally delivered at a large scale: the Indian Avahan initiative. We costed the project by collecting data from the 138 Avahan NGOs and the supporting partners in the first four years of its scale-up, between 2004 and 2007. We develop a parsimonious empirical model and apply a system Generalized Method of Moments (GMM) and fixed-effects Instrumental Variable (IV) estimators to estimate unbiased economies of scale. At the programme level, we find that, after controlling for the endogeneity of scale, the scale-up of Avahan has generated high economies of scale. Our findings suggest that average cost reductions per person reached are achievable when scaling up HIV prevention in low- and middle-income countries. Copyright © 2015 Elsevier Ltd. All rights reserved.

  20. A UNIFIED FRAMEWORK FOR VARIANCE COMPONENT ESTIMATION WITH SUMMARY STATISTICS IN GENOME-WIDE ASSOCIATION STUDIES.

    PubMed

    Zhou, Xiang

    2017-12-01

    Linear mixed models (LMMs) are among the most commonly used tools for genetic association studies. However, the standard method for estimating variance components in LMMs, the restricted maximum likelihood estimation method (REML), suffers from several important drawbacks: REML requires individual-level genotypes and phenotypes from all samples in the study, is computationally slow, and produces downward-biased estimates in case-control studies. To remedy these drawbacks, we present an alternative framework for variance component estimation, which we refer to as MQS. MQS is based on the method of moments (MoM) and the minimal norm quadratic unbiased estimation (MINQUE) criterion, and brings two seemingly unrelated methods, the renowned Haseman-Elston (HE) regression and the recent LD score regression (LDSC), into the same unified statistical framework. With this new framework, we provide an alternative but mathematically equivalent form of HE that allows for the use of summary statistics. We provide an exact estimation form of LDSC to yield unbiased and statistically more efficient estimates. A key feature of our method is its ability to pair marginal z-scores computed using all samples with SNP correlation information computed using a small random subset of individuals (or individuals from a proper reference panel), while capable of producing estimates that can be almost as accurate as if both quantities were computed using the full data. As a result, our method produces unbiased and statistically efficient estimates, and makes use of summary statistics, while it is computationally efficient for large data sets. Using simulations and applications to 37 phenotypes from 8 real data sets, we illustrate the benefits of our method for estimating and partitioning SNP heritability in population studies as well as for heritability estimation in family studies. Our method is implemented in the GEMMA software package, freely available at www.xzlab.org/software.html.
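A toy simulation of the Haseman-Elston regression mentioned above, the method-of-moments idea that MQS builds on: regressing pairwise phenotype products on genetic relatedness gives a slope that estimates SNP heritability (sample sizes and the seed are arbitrary choices for this sketch):

```python
import numpy as np

rng = np.random.default_rng(4)
n, m, h2 = 500, 1000, 0.5

# Standardized genotype matrix and genetic relatedness matrix (GRM).
G = rng.normal(size=(n, m))
G = (G - G.mean(axis=0)) / G.std(axis=0)
K = G @ G.T / m

# Phenotype with SNP heritability h2, standardized to unit variance.
u = rng.normal(scale=np.sqrt(h2 / m), size=m)
y = G @ u + rng.normal(scale=np.sqrt(1 - h2), size=n)
y = (y - y.mean()) / y.std()

# Haseman-Elston regression, a method-of-moments estimator: regress
# off-diagonal phenotype products y_i * y_j on K_ij; the slope
# estimates h2 because E[y_i * y_j] is approximately h2 * K_ij.
iu = np.triu_indices(n, k=1)
slope = np.polyfit(K[iu], np.outer(y, y)[iu], 1)[0]
```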

  1. Hybrid pathwise sensitivity methods for discrete stochastic models of chemical reaction systems.

    PubMed

    Wolf, Elizabeth Skubak; Anderson, David F

    2015-01-21

    Stochastic models are often used to help understand the behavior of intracellular biochemical processes. The most common such models are continuous time Markov chains (CTMCs). Parametric sensitivities, which are derivatives of expectations of model output quantities with respect to model parameters, are useful in this setting for a variety of applications. In this paper, we introduce a class of hybrid pathwise differentiation methods for the numerical estimation of parametric sensitivities. The new hybrid methods combine elements from the three main classes of procedures for sensitivity estimation and have a number of desirable qualities. First, the new methods are unbiased for a broad class of problems. Second, the methods are applicable to nearly any physically relevant biochemical CTMC model. Third, and as we demonstrate on several numerical examples, the new methods are quite efficient, particularly if one wishes to estimate the full gradient of parametric sensitivities. The methods are rather intuitive and utilize the multilevel Monte Carlo philosophy of splitting an expectation into separate parts and handling each in an efficient manner.

  2. Statistical properties of four effect-size measures for mediation models.

    PubMed

    Miočević, Milica; O'Rourke, Holly P; MacKinnon, David P; Brown, Hendricks C

    2018-02-01

    This project examined the performance of classical and Bayesian estimators of four effect size measures for the indirect effect in a single-mediator model and a two-mediator model. Compared to the proportion and ratio mediation effect sizes, standardized mediation effect-size measures were relatively unbiased and efficient in the single-mediator model and the two-mediator model. Percentile and bias-corrected bootstrap interval estimates of ab/s_Y and ab(s_X)/s_Y in the single-mediator model outperformed interval estimates of the proportion and ratio effect sizes in terms of power, Type I error rate, coverage, imbalance, and interval width. For the two-mediator model, standardized effect-size measures were superior to the proportion and ratio effect-size measures. Furthermore, it was found that Bayesian point and interval summaries of posterior distributions of standardized effect-size measures reduced excessive relative bias for certain parameter combinations. The standardized effect-size measures are the best effect-size measures for quantifying mediated effects.
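A sketch of the ab/s_Y standardized indirect effect with a percentile bootstrap interval for a single-mediator model (simulated data; the path values are illustrative, and 500 resamples keep the run quick):

```python
import numpy as np

rng = np.random.default_rng(5)
n = 1000
x = rng.normal(size=n)
m = 0.5 * x + rng.normal(size=n)            # a-path: M on X
y = 0.4 * m + 0.2 * x + rng.normal(size=n)  # b-path and direct effect

def ab_over_sy(x, m, y):
    # a: slope of M on X; b: coefficient of M in Y ~ 1 + M + X;
    # standardized indirect effect is a*b divided by the SD of Y.
    a = np.polyfit(x, m, 1)[0]
    X = np.column_stack([np.ones(len(x)), m, x])
    b = np.linalg.lstsq(X, y, rcond=None)[0][1]
    return a * b / y.std(ddof=1)

est = ab_over_sy(x, m, y)

# Percentile bootstrap interval for the standardized indirect effect.
boot = []
for _ in range(500):
    idx = rng.integers(0, n, size=n)
    boot.append(ab_over_sy(x[idx], m[idx], y[idx]))
lo, hi = np.percentile(boot, [2.5, 97.5])
```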

  3. Evaluation of the procedure 1A component of the 1980 US/Canada wheat and barley exploratory experiment

    NASA Technical Reports Server (NTRS)

    Chapman, G. M. (Principal Investigator); Carnes, J. G.

    1981-01-01

    Several techniques which use clusters generated by a new clustering algorithm, CLASSY, are proposed as alternatives to random sampling to obtain greater precision in crop proportion estimation: (1) Proportional Allocation/Relative Count Estimator (PA/RCE) uses proportional allocation of dots to clusters on the basis of cluster size and a relative count cluster-level estimate; (2) Proportional Allocation/Bayes Estimator (PA/BE) uses proportional allocation of dots to clusters and a Bayesian cluster-level estimate; and (3) Bayes Sequential Allocation/Bayesian Estimator (BSA/BE) uses sequential allocation of dots to clusters and a Bayesian cluster-level estimate. Clustering is an effective method for making proportion estimates. It is estimated that, to obtain the same precision with random sampling as obtained by the proportional sampling of 50 dots with an unbiased estimator, samples of 85 or 166 would need to be taken if dot sets with AI labels (integrated procedure) or ground truth labels, respectively, were input. Dot reallocation provides dot sets that are unbiased. It is recommended that these proportion estimation techniques be maintained, particularly the PA/BE, because it provides the greatest precision.

  4. Bayesian model selection: Evidence estimation based on DREAM simulation and bridge sampling

    NASA Astrophysics Data System (ADS)

    Volpi, Elena; Schoups, Gerrit; Firmani, Giovanni; Vrugt, Jasper A.

    2017-04-01

    Bayesian inference has found widespread application in Earth and Environmental Systems Modeling, providing an effective tool for prediction, data assimilation, parameter estimation, uncertainty analysis and hypothesis testing. Under multiple competing hypotheses, the Bayesian approach also provides an attractive alternative to traditional information criteria (e.g. AIC, BIC) for model selection. The key variable for Bayesian model selection is the evidence (or marginal likelihood), which is the normalizing constant in the denominator of Bayes' theorem; while it is fundamental for model selection, the evidence is not required for Bayesian inference. It is computed for each hypothesis (model) by averaging the likelihood function over the prior parameter distribution, rather than maximizing it as information criteria do; the larger a model's evidence, the more support it receives among a collection of hypotheses, as the simulated values assign relatively high probability density to the observed data. Hence, the evidence naturally acts as an Occam's razor, preferring simpler and more constrained models over the over-fitted ones selected by information criteria that incorporate only the likelihood maximum. Since it is not particularly easy to estimate the evidence in practice, Bayesian model selection via the marginal likelihood has not yet found mainstream use. We illustrate here the properties of a new estimator of the Bayesian model evidence, which provides robust and unbiased estimates of the marginal likelihood; the method is coined Gaussian Mixture Importance Sampling (GMIS). GMIS uses multidimensional numerical integration of the posterior parameter distribution via bridge sampling (a generalization of importance sampling) of a mixture distribution fitted to samples of the posterior distribution derived from the DREAM algorithm (Vrugt et al., 2008; 2009).
Some illustrative examples are presented to show the robustness and superiority of the GMIS estimator with respect to other commonly used approaches in the literature.

  5. Effects of tag loss on direct estimates of population growth rate

    USGS Publications Warehouse

    Rotella, J.J.; Hines, J.E.

    2005-01-01

    The temporal symmetry approach of R. Pradel can be used with capture-recapture data to produce retrospective estimates of a population's growth rate, lambda(i), and the relative contributions to lambda(i) from different components of the population. Direct estimation of lambda(i) provides an alternative to using population projection matrices to estimate asymptotic lambda and is seeing increased use. However, the robustness of direct estimates of lambda(i) to violations of several key assumptions has not yet been investigated. Here, we consider tag loss as a possible source of bias for scenarios in which the rate of tag loss is (1) the same for all marked animals in the population and (2) a function of tag age. We computed analytic approximations of the expected values for each of the parameter estimators involved in direct estimation and used those values to calculate bias and precision for each parameter estimator. Estimates of lambda(i) were robust to homogeneous rates of tag loss. When tag loss rates varied by tag age, bias occurred for some of the sampling situations evaluated, especially those with low capture probability, a high rate of tag loss, or both. For situations with low rates of tag loss and high capture probability, bias was low and often negligible. Estimates of contributions of demographic components to lambda(i) were not robust to tag loss. Tag loss reduced the precision of all estimates because tag loss results in fewer marked animals remaining available for estimation. Clearly tag loss should be prevented if possible, and should be considered in analyses of lambda(i), but tag loss does not necessarily preclude unbiased estimation of lambda(i).

  6. Forest inventory and stratified estimation: a cautionary note

    Treesearch

    John Coulston

    2008-01-01

    The Forest Inventory and Analysis (FIA) Program uses stratified estimation techniques to produce estimates of forest attributes. Stratification must be unbiased and stratification procedures should be examined to identify any potential bias. This note explains simple techniques for identifying potential bias, discriminating between sample bias and stratification bias,...

  7. Losing the rose tinted glasses: neural substrates of unbiased belief updating in depression

    PubMed Central

    Garrett, Neil; Sharot, Tali; Faulkner, Paul; Korn, Christoph W.; Roiser, Jonathan P.; Dolan, Raymond J.

    2014-01-01

    Recent evidence suggests that a state of good mental health is associated with biased processing of information that supports a positively skewed view of the future. Depression, on the other hand, is associated with unbiased processing of such information. Here, we use brain imaging in conjunction with a belief update task administered to clinically depressed patients and healthy controls to characterize brain activity that supports unbiased belief updating in clinically depressed individuals. Our results reveal that unbiased belief updating in depression is mediated by strong neural coding of estimation errors in response to both good news (in left inferior frontal gyrus and bilateral superior frontal gyrus) and bad news (in right inferior parietal lobule and right inferior frontal gyrus) regarding the future. In contrast, intact mental health was linked to a relatively attenuated neural coding of bad news about the future. These findings identify a neural substrate mediating the breakdown of biased updating in major depression disorder, which may be essential for mental health. PMID:25221492

  8. An evaluation of flow-stratified sampling for estimating suspended sediment loads

    Treesearch

    Robert B. Thomas; Jack Lewis

    1995-01-01

    Abstract - Flow-stratified sampling is a new method for sampling water quality constituents such as suspended sediment to estimate loads. As with selection-at-list-time (SALT) and time-stratified sampling, flow-stratified sampling is a statistical method requiring random sampling, and yielding unbiased estimates of load and variance. It can be used to estimate event...

  9. Combined Radar-Radiometer Surface Soil Moisture and Roughness Estimation

    NASA Technical Reports Server (NTRS)

    Akbar, Ruzbeh; Cosh, Michael H.; O'Neill, Peggy E.; Entekhabi, Dara; Moghaddam, Mahta

    2017-01-01

    A robust physics-based combined radar-radiometer, or Active-Passive, surface soil moisture and roughness estimation methodology is presented. Soil moisture and roughness retrieval is performed via optimization, i.e., minimization, of a joint objective function which constrains similar resolution radar and radiometer observations simultaneously. A data-driven and noise-dependent regularization term has also been developed to automatically regularize and balance corresponding radar and radiometer contributions to achieve optimal soil moisture retrievals. It is shown that in order to compensate for measurement and observation noise, as well as forward model inaccuracies, in combined radar-radiometer estimation surface roughness can be considered a free parameter. Extensive Monte-Carlo numerical simulations and assessment using field data have been performed to both evaluate the algorithm's performance and to demonstrate soil moisture estimation. Unbiased root mean squared errors (RMSE) range from 0.18 to 0.03 cm3/cm3 for two different land cover types of corn and soybean. In summary, in the context of soil moisture retrieval, the importance of consistent forward emission and scattering development is discussed and presented.

  10. The estimation of lower refractivity uncertainty from radar sea clutter using the Bayesian-MCMC method

    NASA Astrophysics Data System (ADS)

    Sheng, Zheng

    2013-02-01

    The estimation of lower atmospheric refractivity from radar sea clutter (RFC) is a complicated nonlinear optimization problem. This paper deals with the RFC problem in a Bayesian framework. It uses the unbiased Markov Chain Monte Carlo (MCMC) sampling technique, which can provide accurate posterior probability distributions of the estimated refractivity parameters by using an electromagnetic split-step fast Fourier transform terrain parabolic equation propagation model within a Bayesian inversion framework. In contrast to global optimization algorithms, the Bayesian-MCMC approach can obtain not only approximate solutions, but also the probability distributions of the solutions, that is, uncertainty analyses of the solutions. The Bayesian-MCMC algorithm is implemented on both simulated and real radar sea-clutter data; reference refractivity profiles for the real data are obtained using a helicopter. The inversion algorithm is assessed (i) by comparing the estimated refractivity profiles against the assumed simulation profiles and the helicopter sounding data, and (ii) by examining the one-dimensional (1D) and two-dimensional (2D) posterior probability distributions of the solutions.
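The MCMC machinery referred to above can be illustrated with a minimal random-walk Metropolis sampler on a toy one-parameter problem; a Gaussian mean with a flat prior stands in for the refractivity parameters (the real forward model, a parabolic-equation propagation code, is far more involved):

```python
import numpy as np

rng = np.random.default_rng(9)
data = rng.normal(loc=2.0, scale=1.0, size=100)  # stand-in observations

def log_post(mu):
    # Flat prior; Gaussian likelihood with known unit variance, so the
    # log-posterior is the log-likelihood up to a constant.
    return -0.5 * np.sum((data - mu) ** 2)

# Random-walk Metropolis: the chain's histogram approximates the
# posterior, giving the uncertainty analysis the abstract emphasizes.
chain = np.empty(5000)
mu = 0.0
for i in range(len(chain)):
    prop = mu + rng.normal(scale=0.3)
    if np.log(rng.uniform()) < log_post(prop) - log_post(mu):
        mu = prop  # accept the proposal
    chain[i] = mu
post = chain[1000:]  # discard burn-in
```

For this conjugate toy case the posterior mean equals the sample mean and the posterior standard deviation is 1/sqrt(n) = 0.1, so the chain is easy to validate.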

  11. Setting up a proper power spectral density (PSD) and autocorrelation analysis for material and process characterization

    NASA Astrophysics Data System (ADS)

    Rutigliani, Vito; Lorusso, Gian Francesco; De Simone, Danilo; Lazzarino, Frederic; Rispens, Gijsbert; Papavieros, George; Gogolides, Evangelos; Constantoudis, Vassilios; Mack, Chris A.

    2018-03-01

    Power spectral density (PSD) analysis is playing an increasingly critical role in the understanding of line-edge roughness (LER) and linewidth roughness (LWR) in a variety of applications across the industry. It is an essential step in obtaining an unbiased LWR estimate, as well as an extremely useful tool for process and material characterization. However, the PSD estimate can be affected by both random and systematic artifacts caused by image acquisition and measurement settings, which could irremediably alter its information content. In this paper, we report on the impact of various setting parameters (smoothing image-processing filters, pixel size, and SEM noise levels) on the PSD estimate. We also discuss the use of the PSD analysis tool in a variety of cases. Looking beyond the basic roughness estimate, we use PSD and autocorrelation analysis to characterize resist blur [1], as well as low- and high-frequency roughness content, and we apply this technique to guide the EUV material stack selection. Our results clearly indicate that, if properly used, the PSD methodology is a very sensitive tool for investigating material and process variations.
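A minimal sketch of the unbiased-roughness idea discussed above: white SEM noise adds a flat floor to the edge PSD, so estimating that floor from the high-frequency band and subtracting it before integrating removes the noise bias (synthetic edge, arbitrary units; the 0.4 band-edge frequency is an assumption of this toy setup, not a recommended setting):

```python
import numpy as np

rng = np.random.default_rng(6)
n = 1024

# Synthetic line edge: correlated roughness (boxcar-smoothed noise with
# unit variance by construction) plus white SEM measurement noise.
kernel = np.ones(16) / 4.0  # sum of squared weights = 1 -> edge var ~ 1
true_edge = np.convolve(rng.normal(size=n + 15), kernel, mode="valid")
measured = true_edge + rng.normal(scale=0.8, size=n)

# Periodogram normalized so that P.sum() / n equals the signal variance.
e = measured - measured.mean()
P = np.abs(np.fft.fft(e)) ** 2 / n
freqs = np.fft.fftfreq(n)

sigma_biased = e.std()  # includes the SEM-noise contribution

# White noise contributes a flat floor to every PSD bin; estimate it from
# the high-frequency band, where the physical roughness has rolled off.
floor = P[np.abs(freqs) > 0.4].mean()
sigma_unbiased = np.sqrt(max(P.sum() / n - floor, 0.0))
```

Here the biased estimate overshoots the true edge roughness (1 in these units) while the floor-subtracted estimate recovers it.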

  12. Optimal designs based on the maximum quasi-likelihood estimator

    PubMed Central

    Shen, Gang; Hyun, Seung Won; Wong, Weng Kee

    2016-01-01

    We use optimal design theory and construct locally optimal designs based on the maximum quasi-likelihood estimator (MqLE), which is derived under less stringent conditions than those required for the MLE method. We show that the proposed locally optimal designs are asymptotically as efficient as those based on the MLE when the error distribution is from an exponential family, and they perform as well as or better than optimal designs based on any other asymptotically linear unbiased estimators such as the least square estimator (LSE). In addition, we show that current algorithms for finding optimal designs can be directly used to find optimal designs based on the MqLE. As an illustrative application, we construct a variety of locally optimal designs based on the MqLE for the 4-parameter logistic (4PL) model and study their robustness properties to misspecifications in the model using asymptotic relative efficiency. The results suggest that optimal designs based on the MqLE can be easily generated and they are quite robust to mis-specification in the probability distribution of the responses. PMID:28163359

  13. Combined Radar-Radiometer Surface Soil Moisture and Roughness Estimation.

    PubMed

    Akbar, Ruzbeh; Cosh, Michael H; O'Neill, Peggy E; Entekhabi, Dara; Moghaddam, Mahta

    2017-07-01

    A robust physics-based combined radar-radiometer, or Active-Passive, surface soil moisture and roughness estimation methodology is presented. Soil moisture and roughness retrieval is performed via optimization, i.e., minimization, of a joint objective function which constrains similar resolution radar and radiometer observations simultaneously. A data-driven and noise-dependent regularization term has also been developed to automatically regularize and balance corresponding radar and radiometer contributions to achieve optimal soil moisture retrievals. It is shown that in order to compensate for measurement and observation noise, as well as forward model inaccuracies, in combined radar-radiometer estimation surface roughness can be considered a free parameter. Extensive Monte-Carlo numerical simulations and assessment using field data have been performed to both evaluate the algorithm's performance and to demonstrate soil moisture estimation. Unbiased root mean squared errors (RMSE) range from 0.18 to 0.03 cm3/cm3 for two different land cover types of corn and soybean. In summary, in the context of soil moisture retrieval, the importance of consistent forward emission and scattering development is discussed and presented.

  14. Data-driven nonlinear optimisation of a simple air pollution dispersion model generating high resolution spatiotemporal exposure

    NASA Astrophysics Data System (ADS)

    Yuval; Bekhor, Shlomo; Broday, David M.

    2013-11-01

    Spatially detailed estimation of exposure to air pollutants in the urban environment is needed for many air pollution epidemiological studies. To benefit studies of acute effects of air pollution, such exposure maps are required at high temporal resolution. This study introduces a nonlinear optimisation framework that produces high-resolution spatiotemporal exposure maps. An extensive traffic model output, serving as a proxy for traffic emissions, is fitted, via a nonlinear model embodying basic dispersion properties, to high temporal resolution routine observations of a traffic-related air pollutant. An optimisation problem is formulated and solved at each time point to recover the unknown model parameters. These parameters are then used to produce a detailed concentration map of the pollutant for the whole area covered by the traffic model. Repeating the process for multiple time points results in the spatiotemporal concentration field. The exposure at any location and for any span of time can then be computed by temporal integration of the concentration time series at selected receptor locations for the durations of desired periods. The methodology is demonstrated for NO2 exposure using the output of a traffic model for the greater Tel Aviv area, Israel, and the half-hourly monitoring and meteorological data from the local air quality network. A leave-one-out cross-validation resulted in simulated half-hourly concentrations that are almost unbiased compared to the observations, with a mean error (ME) of 5.2 ppb, normalised mean error (NME) of 32%, 78% of the simulated values within a factor of two (FAC2) of the observations, and a coefficient of determination (R2) of 0.6. Exposure estimates integrated over the whole study period are also unbiased compared with their corresponding observations, with ME of 2.5 ppb, NME of 18%, FAC2 of 100% and R2 of 0.62.
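The validation metrics quoted above can be computed as follows; definitions vary across studies, and here ME is taken as the mean absolute error and NME as the sum of absolute errors over the sum of observations (an assumption about the paper's exact conventions):

```python
import numpy as np

def eval_metrics(obs, sim):
    # ME: mean absolute error; NME: normalised mean error;
    # FAC2: fraction of simulations within a factor of two of observations;
    # R2: coefficient of determination of sim against obs.
    err = sim - obs
    me = np.mean(np.abs(err))
    nme = np.sum(np.abs(err)) / np.sum(obs)
    fac2 = np.mean((sim >= 0.5 * obs) & (sim <= 2.0 * obs))
    r2 = 1 - np.sum(err ** 2) / np.sum((obs - obs.mean()) ** 2)
    return me, nme, fac2, r2

# Usage on synthetic, nearly unbiased half-hourly concentrations (ppb).
rng = np.random.default_rng(7)
obs = rng.uniform(5, 40, size=500)
sim = obs + rng.normal(scale=4.0, size=500)
me, nme, fac2, r2 = eval_metrics(obs, sim)
```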

  15. Large biases in regression-based constituent flux estimates: causes and diagnostic tools

    USGS Publications Warehouse

    Hirsch, Robert M.

    2014-01-01

    It has been documented in the literature that, in some cases, widely used regression-based models can produce severely biased estimates of long-term mean river fluxes of various constituents. These models, estimated using sample values of concentration, discharge, and date, are used to compute estimated fluxes for a multiyear period at a daily time step. This study compares results of the LOADEST seven-parameter model, the LOADEST five-parameter model, and the Weighted Regressions on Time, Discharge, and Season (WRTDS) model using subsampling of six very large datasets to better understand this bias problem. The analysis considers sample datasets for dissolved nitrate and total phosphorus. The results show that LOADEST-7 and LOADEST-5, although they often produce very nearly unbiased results, can produce highly biased results. This study identifies three conditions that can give rise to these severe biases: (1) lack of fit of the log of concentration vs. log discharge relationship, (2) substantial differences in the shape of this relationship across seasons, and (3) severely heteroscedastic residuals. The WRTDS model is more resistant to the bias problem than the LOADEST models but is not immune to it. Understanding the causes of the bias problem is crucial to selecting an appropriate method for flux computations. Diagnostic tools for identifying the potential for bias problems are introduced, and strategies for resolving bias problems are described.

  16. Designing a Pediatric Study for an Antimalarial Drug by Using Information from Adults

    PubMed Central

    Jullien, Vincent; Samson, Adeline; Guedj, Jérémie; Kiechel, Jean-René; Zohar, Sarah; Comets, Emmanuelle

    2015-01-01

    The objectives of this study were to design a pharmacokinetic (PK) study by using information about adults and to evaluate the robustness of the recommended design through a case study of mefloquine. PK data about adults and children were available from two different randomized studies of the treatment of malaria with the same artesunate-mefloquine combination regimen. A recommended design for pediatric studies of mefloquine was optimized on the basis of an extrapolated model built from adult data through the following approach. (i) An adult PK model was built, and parameters were estimated by using the stochastic approximation expectation-maximization algorithm. (ii) Pediatric PK parameters were then obtained by adding allometry and maturation to the adult model. (iii) A D-optimal design for children was obtained with PFIM by assuming the extrapolated design. Finally, the robustness of the recommended design was evaluated in terms of the relative bias and relative standard errors (RSE) of the parameters in a simulation study with four different models and was compared to the empirical design used for the pediatric study. Combining PK modeling, extrapolation, and design optimization led to a design for children with five sampling times. PK parameters were well estimated by this design with low RSE. Although the extrapolated model did not predict the observed mefloquine concentrations in children very accurately, it allowed precise and unbiased estimates across various model assumptions, contrary to the empirical design. Using information from adult studies combined with allometry and maturation can help provide robust designs for pediatric studies. PMID:26711749

  17. Calibrating SALT: a sampling scheme to improve estimates of suspended sediment yield

    Treesearch

    Robert B. Thomas

    1986-01-01

    Abstract - SALT (Selection At List Time) is a variable probability sampling scheme that provides unbiased estimates of suspended sediment yield and its variance. SALT performs better than standard schemes, which cannot provide estimates of variance. Sampling probabilities are based on a sediment rating function, which promotes greater sampling intensity during periods of high...

  18. Minimum variance geographic sampling

    NASA Technical Reports Server (NTRS)

    Terrell, G. R. (Principal Investigator)

    1980-01-01

    Resource inventories require samples with geographical scatter, sometimes not as widely spaced as would be hoped. A simple model of correlation over distances is used to create a minimum variance unbiased estimate of population means. The fitting procedure is illustrated with data used to estimate Missouri corn acreage.

  19. Point estimation following two-stage adaptive threshold enrichment clinical trials.

    PubMed

    Kimani, Peter K; Todd, Susan; Renfro, Lindsay A; Stallard, Nigel

    2018-05-31

    Recently, several study designs incorporating treatment effect assessment in biomarker-based subpopulations have been proposed. Most statistical methodologies for such designs focus on the control of type I error rate and power. In this paper, we have developed point estimators for clinical trials that use the two-stage adaptive enrichment threshold design. The design consists of two stages, where in stage 1, patients are recruited in the full population. Stage 1 outcome data are then used to perform interim analysis to decide whether the trial continues to stage 2 with the full population or a subpopulation. The subpopulation is defined based on one of the candidate threshold values of a numerical predictive biomarker. To estimate treatment effect in the selected subpopulation, we have derived unbiased estimators, shrinkage estimators, and estimators that estimate bias and subtract it from the naive estimate. We have recommended one of the unbiased estimators. However, since none of the estimators dominated in all simulation scenarios based on both bias and mean squared error, an alternative strategy would be to use a hybrid estimator where the estimator used depends on the subpopulation selected. This would require a simulation study of plausible scenarios before the trial. © 2018 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd.

  20. Missing continuous outcomes under covariate dependent missingness in cluster randomised trials

    PubMed Central

    Diaz-Ordaz, Karla; Bartlett, Jonathan W

    2016-01-01

    Attrition is a common occurrence in cluster randomised trials, leading to missing outcome data. Two approaches for analysing such trials are cluster-level analysis and individual-level analysis. This paper compares the performance of unadjusted cluster-level analysis, baseline covariate adjusted cluster-level analysis and linear mixed model analysis, under baseline covariate dependent missingness in continuous outcomes, in terms of bias, average estimated standard error and coverage probability. The methods of complete records analysis and multiple imputation are used to handle the missing outcome data. We considered four scenarios, with the missingness mechanism and baseline covariate effect on outcome either the same or different between intervention groups. We show that both unadjusted cluster-level analysis and baseline covariate adjusted cluster-level analysis give unbiased estimates of the intervention effect only if both intervention groups have the same missingness mechanisms and there is no interaction between baseline covariate and intervention group. Linear mixed model and multiple imputation give unbiased estimates under all four considered scenarios, provided that an interaction of intervention and baseline covariate is included in the model when appropriate. Cluster mean imputation has been proposed as a valid approach for handling missing outcomes in cluster randomised trials. We show that cluster mean imputation only gives unbiased estimates when the missingness mechanism is the same between the intervention groups and there is no interaction between baseline covariate and intervention group. Multiple imputation shows overcoverage for a small number of clusters in each intervention group. PMID:27177885

  1. Missing continuous outcomes under covariate dependent missingness in cluster randomised trials.

    PubMed

    Hossain, Anower; Diaz-Ordaz, Karla; Bartlett, Jonathan W

    2017-06-01

    Attrition is a common occurrence in cluster randomised trials, leading to missing outcome data. Two approaches for analysing such trials are cluster-level analysis and individual-level analysis. This paper compares the performance of unadjusted cluster-level analysis, baseline covariate adjusted cluster-level analysis and linear mixed model analysis, under baseline covariate dependent missingness in continuous outcomes, in terms of bias, average estimated standard error and coverage probability. The methods of complete records analysis and multiple imputation are used to handle the missing outcome data. We considered four scenarios, with the missingness mechanism and baseline covariate effect on outcome either the same or different between intervention groups. We show that both unadjusted cluster-level analysis and baseline covariate adjusted cluster-level analysis give unbiased estimates of the intervention effect only if both intervention groups have the same missingness mechanisms and there is no interaction between baseline covariate and intervention group. Linear mixed model and multiple imputation give unbiased estimates under all four considered scenarios, provided that an interaction of intervention and baseline covariate is included in the model when appropriate. Cluster mean imputation has been proposed as a valid approach for handling missing outcomes in cluster randomised trials. We show that cluster mean imputation only gives unbiased estimates when the missingness mechanism is the same between the intervention groups and there is no interaction between baseline covariate and intervention group. Multiple imputation shows overcoverage for a small number of clusters in each intervention group.

  2. Nonlinear unbiased minimum-variance filter for Mars entry autonomous navigation under large uncertainties and unknown measurement bias.

    PubMed

    Xiao, Mengli; Zhang, Yongbo; Fu, Huimin; Wang, Zhihua

    2018-05-01

    A high-precision navigation algorithm is essential for future Mars pinpoint landing missions. The unknown inputs caused by large uncertainties in atmospheric density and aerodynamic coefficients, as well as unknown measurement biases, may cause large estimation errors in conventional Kalman filters. This paper proposes a derivative-free version of the nonlinear unbiased minimum-variance filter for Mars entry navigation. The filter addresses this problem by estimating the state and the unknown measurement biases simultaneously in a derivative-free manner, yielding a high-precision algorithm for Mars entry navigation. IMU/radio-beacon integrated navigation is introduced in the simulation, and the results show that, with or without radio blackout, the proposed filter achieves accurate state estimation, much better than the conventional unscented Kalman filter, demonstrating its suitability as a high-precision Mars entry navigation algorithm. Copyright © 2018 ISA. Published by Elsevier Ltd. All rights reserved.

  3. Unbiased estimation in seamless phase II/III trials with unequal treatment effect variances and hypothesis-driven selection rules.

    PubMed

    Robertson, David S; Prevost, A Toby; Bowden, Jack

    2016-09-30

    Seamless phase II/III clinical trials offer an efficient way to select an experimental treatment and perform confirmatory analysis within a single trial. However, combining the data from both stages in the final analysis can induce bias into the estimates of treatment effects. Methods for bias adjustment developed thus far have made restrictive assumptions about the design and selection rules followed. In order to address these shortcomings, we apply recent methodological advances to derive the uniformly minimum variance conditionally unbiased estimator for two-stage seamless phase II/III trials. Our framework allows for the precision of the treatment arm estimates to take arbitrary values, can be utilised for all treatments that are taken forward to phase III and is applicable when the decision to select or drop treatment arms is driven by a multiplicity-adjusted hypothesis testing procedure. © 2016 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd.

  4. Overlap between treatment and control distributions as an effect size measure in experiments.

    PubMed

    Hedges, Larry V; Olkin, Ingram

    2016-03-01

    The proportion π of treatment group observations that exceed the control group mean has been proposed as an effect size measure for experiments that randomly assign independent units into 2 groups. We give the exact distribution of a simple estimator of π based on the standardized mean difference and use it to study the small sample bias of this estimator. We also give the minimum variance unbiased estimator of π under 2 models, one in which the variance of the mean difference is known and one in which the variance is unknown. We show how to use the relation between the standardized mean difference and the overlap measure to compute confidence intervals for π and show that these results can be used to obtain unbiased estimators, large sample variances, and confidence intervals for 3 related effect size measures based on the overlap. Finally, we show how the effect size π can be used in a meta-analysis. (c) 2016 APA, all rights reserved.
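The record does not spell out the simple estimator of π, but the classical plug-in version maps the standardized mean difference d through the standard normal CDF, π̂ = Φ(d). A hedged sketch under that assumption; the function names are illustrative, and the small-sample bias correction and minimum variance unbiased estimators discussed in the abstract are not reproduced here:

```python
import math

def norm_cdf(x):
    # Standard normal CDF via the error function.
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def overlap_pi_hat(treat, control):
    """Plug-in estimator pi_hat = Phi(d), d the pooled standardized mean difference."""
    nt, nc = len(treat), len(control)
    mt = sum(treat) / nt
    mc = sum(control) / nc
    vt = sum((y - mt) ** 2 for y in treat) / (nt - 1)      # unbiased group variances
    vc = sum((y - mc) ** 2 for y in control) / (nc - 1)
    s_pooled = math.sqrt(((nt - 1) * vt + (nc - 1) * vc) / (nt + nc - 2))
    d = (mt - mc) / s_pooled
    return norm_cdf(d)
```

When the two groups have identical means, d = 0 and π̂ = 0.5, i.e. half the treatment observations are expected to exceed the control mean.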

  5. Network topological analysis reveals the functional cohesiveness for the newly discovered links by Yeast 2 Hybrid approach

    NASA Astrophysics Data System (ADS)

    Ghiassian, Susan; Pevzner, Sam; Rolland, Thomas; Tassan, Murat; Barabasi, Albert Laszlo; Vidal, Mark; CCNR, Northeastern University Collaboration; Dana Farber Cancer Institute Collaboration

    2014-03-01

    Protein-protein interaction (PPI) maps and interactomes are the blueprint of network medicine and systems biology and are being experimentally studied by different groups. Despite the wide usage of the Literature Curated Interactome (LCI), such sources are biased towards parameters such as highly studied proteins. The yeast two-hybrid (Y2H) method is a high-throughput experimental setup that screens proteins in an unbiased fashion. Current knowledge of protein interactions is far from complete: the data previously offered by the Y2H method (2005) is estimated to cover only 5% of all potential protein interactions, and this coverage has currently increased to 20% of what is known as the reference HI. In this work we compare the topological properties of the Y2H protein-protein interaction network with those of LCI and show that, although the two agree on some properties, Y2H exhibits a clearly unbiased nature of interaction selection. Most importantly, we assess the properties of the PPI network as it evolves with increasing coverage. We show that newly discovered interactions tend to connect proteins that were closer than average in the previous PPI release, reinforcing the modular structure of the PPI network. Furthermore, we show that some effects seen in the PPI network (as opposed to LCI) can be explained by its incompleteness.

  6. Piecewise SALT sampling for estimating suspended sediment yields

    Treesearch

    Robert B. Thomas

    1989-01-01

    A probability sampling method called SALT (Selection At List Time) has been developed for collecting and summarizing data on delivery of suspended sediment in rivers. It is based on sampling and estimating yield using a suspended-sediment rating curve for high discharges and simple random sampling for low flows. The method gives unbiased estimates of total yield and...

  7. Approaches to Incorporating Late Pretests in Experiments: Evaluation of Two Early Mathematics and Self-Regulation Interventions

    ERIC Educational Resources Information Center

    Unlu, Fatih; Layzer, Carolyn; Clements, Douglas; Sarama, Julie; Cook, David

    2013-01-01

    Many educational Randomized Controlled Trials (RCTs) collect baseline versions of outcome measures (pretests) to be used in the estimation of impacts at posttest. Although pretest measures are not necessary for unbiased impact estimates in well executed experimental studies, using them increases the precision of impact estimates and reduces sample…

  8. Differential estimates of southern flying squirrel (Glaucomys volans) population structure based on capture method

    Treesearch

    Kevin S. Laves; Susan C. Loeb

    2005-01-01

    It is commonly assumed that population estimates derived from trapping small mammals are accurate and unbiased, or that estimates derived from different capture methods are comparable. We captured southern flying squirrels (Glaucomys volans) using two methods to study their effect on red-cockaded woodpecker (Picoides borealis) reproductive success. Southern flying...

  9. How wet should be the reaction coordinate for ligand unbinding?

    PubMed

    Tiwary, Pratyush; Berne, B J

    2016-08-07

    We use a recently proposed method called Spectral Gap Optimization of Order Parameters (SGOOP) [P. Tiwary and B. J. Berne, Proc. Natl. Acad. Sci. U. S. A. 113, 2839 (2016)], to determine an optimal 1-dimensional reaction coordinate (RC) for the unbinding of a bucky-ball from a pocket in explicit water. This RC is estimated as a linear combination of the multiple available order parameters that collectively can be used to distinguish the various stable states relevant for unbinding. We pay special attention to determining and quantifying the degree to which water molecules should be included in the RC. Using SGOOP with under-sampled biased simulations, we predict that water plays a distinct role in the reaction coordinate for unbinding in the case when the ligand is sterically constrained to move along an axis of symmetry. This prediction is validated through extensive calculations of the unbinding times through metadynamics and by comparison through detailed balance with unbiased molecular dynamics estimate of the binding time. However when the steric constraint is removed, we find that the role of water in the reaction coordinate diminishes. Here instead SGOOP identifies a good one-dimensional RC involving various motional degrees of freedom.

  10. How wet should be the reaction coordinate for ligand unbinding?

    NASA Astrophysics Data System (ADS)

    Tiwary, Pratyush; Berne, B. J.

    2016-08-01

    We use a recently proposed method called Spectral Gap Optimization of Order Parameters (SGOOP) [P. Tiwary and B. J. Berne, Proc. Natl. Acad. Sci. U. S. A. 113, 2839 (2016)], to determine an optimal 1-dimensional reaction coordinate (RC) for the unbinding of a bucky-ball from a pocket in explicit water. This RC is estimated as a linear combination of the multiple available order parameters that collectively can be used to distinguish the various stable states relevant for unbinding. We pay special attention to determining and quantifying the degree to which water molecules should be included in the RC. Using SGOOP with under-sampled biased simulations, we predict that water plays a distinct role in the reaction coordinate for unbinding in the case when the ligand is sterically constrained to move along an axis of symmetry. This prediction is validated through extensive calculations of the unbinding times through metadynamics and by comparison through detailed balance with unbiased molecular dynamics estimate of the binding time. However when the steric constraint is removed, we find that the role of water in the reaction coordinate diminishes. Here instead SGOOP identifies a good one-dimensional RC involving various motional degrees of freedom.

  11. How wet should be the reaction coordinate for ligand unbinding?

    PubMed Central

    Tiwary, Pratyush; Berne, B. J.

    2016-01-01

    We use a recently proposed method called Spectral Gap Optimization of Order Parameters (SGOOP) [P. Tiwary and B. J. Berne, Proc. Natl. Acad. Sci. U. S. A. 113, 2839 (2016)], to determine an optimal 1-dimensional reaction coordinate (RC) for the unbinding of a bucky-ball from a pocket in explicit water. This RC is estimated as a linear combination of the multiple available order parameters that collectively can be used to distinguish the various stable states relevant for unbinding. We pay special attention to determining and quantifying the degree to which water molecules should be included in the RC. Using SGOOP with under-sampled biased simulations, we predict that water plays a distinct role in the reaction coordinate for unbinding in the case when the ligand is sterically constrained to move along an axis of symmetry. This prediction is validated through extensive calculations of the unbinding times through metadynamics and by comparison through detailed balance with unbiased molecular dynamics estimate of the binding time. However when the steric constraint is removed, we find that the role of water in the reaction coordinate diminishes. Here instead SGOOP identifies a good one-dimensional RC involving various motional degrees of freedom. PMID:27497545

  12. Reducing bias in survival under non-random temporary emigration

    USGS Publications Warehouse

    Peñaloza, Claudia L.; Kendall, William L.; Langtimm, Catherine Ann

    2014-01-01

    Despite intensive monitoring, temporary emigration from the sampling area can induce bias severe enough for managers to discard life-history parameter estimates toward the terminus of the time series (terminal bias). Under random temporary emigration, unbiased parameters can be estimated with CJS models. However, unmodeled Markovian temporary emigration causes bias in parameter estimates, and an unobservable state is required to model this type of emigration. The robust design is most flexible when modeling temporary emigration, and partial solutions to mitigate bias have been identified; nonetheless, there are conditions where terminal bias prevails. Long-lived species with high adult survival and highly variable non-random temporary emigration present terminal bias in survival estimates, despite being modeled with the robust design and suggested constraints. Because this bias is due to uncertainty about the fate of individuals that are undetected toward the end of the time series, solutions should involve using additional information on the survival status or location of these individuals at that time. Using simulation, we evaluated the performance of models that jointly analyze robust design data and an additional source of ancillary data (predictive covariate on temporary emigration, telemetry, dead recovery, or auxiliary resightings) in reducing terminal bias in survival estimates. The auxiliary resighting and predictive covariate models reduced terminal bias the most. Additional telemetry data were effective at reducing terminal bias only when individuals were tracked for a minimum of two years. The high adult survival of long-lived species made the joint model with recovery data ineffective at reducing terminal bias because of small-sample bias. The naïve constraint model (last and penultimate temporary emigration parameters made equal) was the least efficient, though still able to reduce terminal bias when compared to an unconstrained model.
Joint analysis of several sources of data improved parameter estimates and reduced terminal bias. Efforts to incorporate or acquire such data should be considered by researchers and wildlife managers, especially in the years leading up to status assessments of species of interest. Simulation modeling is a very cost effective method to explore the potential impacts of using different sources of data to produce high quality demographic data to inform management.

  13. Efficient Determination of Free Energy Landscapes in Multiple Dimensions from Biased Umbrella Sampling Simulations Using Linear Regression.

    PubMed

    Meng, Yilin; Roux, Benoît

    2015-08-11

    The weighted histogram analysis method (WHAM) is a standard protocol for postprocessing the information from biased umbrella sampling simulations to construct the potential of mean force with respect to a set of order parameters. By virtue of the WHAM equations, the unbiased density of state is determined by satisfying a self-consistent condition through an iterative procedure. While the method works very effectively when the number of order parameters is small, its computational cost grows rapidly in higher dimension. Here, we present a simple and efficient alternative strategy, which avoids solving the self-consistent WHAM equations iteratively. An efficient multivariate linear regression framework is utilized to link the biased probability densities of individual umbrella windows and yield an unbiased global free energy landscape in the space of order parameters. It is demonstrated with practical examples that free energy landscapes that are comparable in accuracy to WHAM can be generated at a small fraction of the cost.
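The record describes the idea (link the per-window biased densities and recover one global landscape without iterating the WHAM equations) but not Meng and Roux's exact regression formulation. As a toy illustration of least-squares stitching under that general idea: each window i yields a local free energy A_i(ξ) = -kT ln ρ_i^biased(ξ) - w_i(ξ), known only up to an offset f_i, and the offsets plus the global F(ξ) can be recovered in a single linear least-squares solve. All names below are illustrative, not the paper's API:

```python
import numpy as np

def stitch_windows(local_fes):
    """
    local_fes: list of dicts {bin_index: A_i(bin)}, each known up to a
    per-window constant f_i.  Solve min sum_i sum_b (A_i[b] + f_i - F[b])^2
    with f_0 fixed at 0, as one linear least-squares problem.
    Returns the stitched free energy F over all bins.
    """
    bins = sorted({b for w in local_fes for b in w})
    nb, nw = len(bins), len(local_fes)
    bidx = {b: j for j, b in enumerate(bins)}
    rows, rhs = [], []
    for i, w in enumerate(local_fes):
        for b, a in w.items():
            row = np.zeros(nb + nw - 1)
            row[bidx[b]] = 1.0            # coefficient of F[b]
            if i > 0:
                row[nb + i - 1] = -1.0    # coefficient of f_i (f_0 = 0)
            rows.append(row)
            rhs.append(a)
    sol, *_ = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)
    return {b: sol[bidx[b]] for b in bins}
```

Overlapping bins between neighbouring windows are what make the offsets identifiable, exactly as window overlap does in WHAM.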

  14. Efficient Determination of Free Energy Landscapes in Multiple Dimensions from Biased Umbrella Sampling Simulations Using Linear Regression

    PubMed Central

    2015-01-01

    The weighted histogram analysis method (WHAM) is a standard protocol for postprocessing the information from biased umbrella sampling simulations to construct the potential of mean force with respect to a set of order parameters. By virtue of the WHAM equations, the unbiased density of state is determined by satisfying a self-consistent condition through an iterative procedure. While the method works very effectively when the number of order parameters is small, its computational cost grows rapidly in higher dimension. Here, we present a simple and efficient alternative strategy, which avoids solving the self-consistent WHAM equations iteratively. An efficient multivariate linear regression framework is utilized to link the biased probability densities of individual umbrella windows and yield an unbiased global free energy landscape in the space of order parameters. It is demonstrated with practical examples that free energy landscapes that are comparable in accuracy to WHAM can be generated at a small fraction of the cost. PMID:26574437

  15. Fast and unbiased estimator of the time-dependent Hurst exponent.

    PubMed

    Pianese, Augusto; Bianchi, Sergio; Palazzo, Anna Maria

    2018-03-01

    We combine two existing estimators of the local Hurst exponent to improve both the goodness of fit and the computational speed of the algorithm. An application with simulated time series is implemented, and a Monte Carlo simulation is performed to provide evidence of the improvement.
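The record gives no formulas for the combined estimator. As background only, here is a minimal moment-scaling estimator of H for a series with self-similar increments (E|X[t+k] - X[t]|² ∝ k^{2H}); this is a generic textbook estimator, not the authors' combination:

```python
import math

def hurst_moment_estimator(x):
    """
    Variance-of-increments estimator: for self-similar increments the mean
    squared lag-k increment scales like k^(2H), so H ~ 0.5*log2(V2/V1),
    where Vk is the mean squared lag-k increment.
    """
    v1 = sum((x[t + 1] - x[t]) ** 2 for t in range(len(x) - 1)) / (len(x) - 1)
    v2 = sum((x[t + 2] - x[t]) ** 2 for t in range(len(x) - 2)) / (len(x) - 2)
    return 0.5 * math.log2(v2 / v1)
```

A time-dependent (local) version, as studied in the record, applies such an estimator in a sliding window over the series.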

  16. Fast and unbiased estimator of the time-dependent Hurst exponent

    NASA Astrophysics Data System (ADS)

    Pianese, Augusto; Bianchi, Sergio; Palazzo, Anna Maria

    2018-03-01

    We combine two existing estimators of the local Hurst exponent to improve both the goodness of fit and the computational speed of the algorithm. An application with simulated time series is implemented, and a Monte Carlo simulation is performed to provide evidence of the improvement.

  17. Empirical best linear unbiased prediction method for small areas with restricted maximum likelihood and bootstrap procedure to estimate the average of household expenditure per capita in Banjar Regency

    NASA Astrophysics Data System (ADS)

    Aminah, Agustin Siti; Pawitan, Gandhi; Tantular, Bertho

    2017-03-01

    So far, most of the data published by Statistics Indonesia (BPS), the provider of national statistics, are still limited to the district level. Sample sizes at smaller area levels are insufficient, so direct estimation of poverty indicators produces high standard errors, and analyses based on them are unreliable. To solve this problem, an estimation method that provides better accuracy by combining survey data with other auxiliary data is required. One method often used for this is Small Area Estimation (SAE). Among the many SAE methods is Empirical Best Linear Unbiased Prediction (EBLUP). The EBLUP method with the maximum likelihood (ML) procedure does not account for the loss of degrees of freedom due to estimating β with β̂. This drawback motivates the use of the restricted maximum likelihood (REML) procedure. This paper proposes EBLUP with the REML procedure for estimating poverty indicators by modeling the average household expenditure per capita, and implements a bootstrap procedure to calculate the MSE (mean squared error) in order to compare the accuracy of the EBLUP method with that of direct estimation. Results show that the EBLUP method reduced the MSE in small area estimation.
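The record names the model only; a common concrete instance is the area-level (Fay-Herriot) EBLUP, where each direct estimate is shrunk toward a regression synthetic estimate. A sketch under that assumption, with the REML estimation of the area-effect variance σ²_v omitted for brevity (it is passed in); the function name is illustrative:

```python
import numpy as np

def fay_herriot_eblup(y, X, psi, sigma2_v):
    """
    Area-level EBLUP: theta_i = gamma_i*y_i + (1 - gamma_i)*x_i'beta_gls,
    with gamma_i = sigma2_v / (sigma2_v + psi_i), psi_i the known sampling
    variances.  sigma2_v would normally be estimated by REML; here it is
    supplied directly for brevity.
    """
    y, X, psi = map(np.asarray, (y, X, psi))
    Vinv = 1.0 / (sigma2_v + psi)           # diagonal GLS weights
    XtV = X.T * Vinv
    beta = np.linalg.solve(XtV @ X, XtV @ y)  # GLS regression coefficients
    gamma = sigma2_v / (sigma2_v + psi)       # shrinkage factors
    return gamma * y + (1.0 - gamma) * (X @ beta)
```

Areas with large sampling variance ψ_i get small γ_i and are pulled strongly toward the synthetic regression estimate, which is how EBLUP reduces MSE relative to direct estimation.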

  18. A new peak detection algorithm for MALDI mass spectrometry data based on a modified Asymmetric Pseudo-Voigt model.

    PubMed

    Wijetunge, Chalini D; Saeed, Isaam; Boughton, Berin A; Roessner, Ute; Halgamuge, Saman K

    2015-01-01

    Mass Spectrometry (MS) is a ubiquitous analytical tool in biological research and is used to measure the mass-to-charge ratio of bio-molecules. Peak detection is the essential first step in MS data analysis. Precise estimation of peak parameters, such as peak summit location and peak area, is critical to identify underlying bio-molecules and to estimate their abundances accurately. We propose a new method to detect and quantify peaks in mass spectra. It uses dual-tree complex wavelet transformation along with Stein's unbiased risk estimator for spectra smoothing. Then, a new method, based on the modified Asymmetric Pseudo-Voigt (mAPV) model and hierarchical particle swarm optimization, is used for peak parameter estimation. Using simulated data, we demonstrated the benefit of using the mAPV model over Gaussian, Lorentz and Bi-Gaussian functions for MS peak modelling. The proposed mAPV model achieved the best fitting accuracy for asymmetric peaks, with lower percentage errors in peak summit location estimation, which were 0.17% to 4.46% less than those of the other models. It also outperformed the other models in peak area estimation, delivering lower percentage errors, which were about 0.7% less than its closest competitor - the Bi-Gaussian model. In addition, using data generated from a MALDI-TOF computer model, we showed that the proposed overall algorithm outperformed the existing methods mainly in terms of sensitivity. It achieved a sensitivity of 85%, compared to 77% and 71% for the two benchmark algorithms: the continuous wavelet transformation-based method and Cromwell, respectively. The proposed algorithm is particularly useful for peak detection and parameter estimation in MS data with overlapping peak distributions and asymmetric peaks. The algorithm is implemented using MATLAB and the source code is freely available at http://mapv.sourceforge.net.
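The exact mAPV functional form is not given in the record. For orientation, the standard symmetric pseudo-Voigt that it modifies is a weighted Gaussian/Lorentzian sum sharing one FWHM; asymmetric variants typically let the width differ on the two sides of the summit. A sketch of the symmetric base model only (names are illustrative, not the paper's code):

```python
import math

def pseudo_voigt(x, x0, fwhm, eta, amplitude=1.0):
    """
    Symmetric pseudo-Voigt: eta in [0, 1] is the Lorentzian fraction; both
    components share one full width at half maximum (fwhm) and peak at x0.
    An asymmetric variant would use a different fwhm for x < x0 and x > x0.
    """
    u = (x - x0) / fwhm
    gauss = math.exp(-4.0 * math.log(2.0) * u * u)   # peak-normalized Gaussian
    lorentz = 1.0 / (1.0 + 4.0 * u * u)              # peak-normalized Lorentzian
    return amplitude * (eta * lorentz + (1.0 - eta) * gauss)
```

By construction the profile equals the amplitude at the summit and half the amplitude at x0 ± fwhm/2, for any mixing fraction η.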

  19. A new peak detection algorithm for MALDI mass spectrometry data based on a modified Asymmetric Pseudo-Voigt model

    PubMed Central

    2015-01-01

    Background Mass Spectrometry (MS) is a ubiquitous analytical tool in biological research and is used to measure the mass-to-charge ratio of bio-molecules. Peak detection is the essential first step in MS data analysis. Precise estimation of peak parameters, such as peak summit location and peak area, is critical to identify underlying bio-molecules and to estimate their abundances accurately. We propose a new method to detect and quantify peaks in mass spectra. It uses dual-tree complex wavelet transformation along with Stein's unbiased risk estimator for spectra smoothing. Then, a new method, based on the modified Asymmetric Pseudo-Voigt (mAPV) model and hierarchical particle swarm optimization, is used for peak parameter estimation. Results Using simulated data, we demonstrated the benefit of using the mAPV model over Gaussian, Lorentz and Bi-Gaussian functions for MS peak modelling. The proposed mAPV model achieved the best fitting accuracy for asymmetric peaks, with lower percentage errors in peak summit location estimation, which were 0.17% to 4.46% less than those of the other models. It also outperformed the other models in peak area estimation, delivering lower percentage errors, which were about 0.7% less than its closest competitor - the Bi-Gaussian model. In addition, using data generated from a MALDI-TOF computer model, we showed that the proposed overall algorithm outperformed the existing methods mainly in terms of sensitivity. It achieved a sensitivity of 85%, compared to 77% and 71% for the two benchmark algorithms: the continuous wavelet transformation-based method and Cromwell, respectively. Conclusions The proposed algorithm is particularly useful for peak detection and parameter estimation in MS data with overlapping peak distributions and asymmetric peaks. The algorithm is implemented using MATLAB and the source code is freely available at http://mapv.sourceforge.net. PMID:26680279

  20. Unbiased survival estimates and evidence for skipped breeding opportunities in females

    USGS Publications Warehouse

    Muths, Erin L.; Scherer, Rick D.; Lambert, Brad A.

    2010-01-01

    Establishing the occurrence of temporary emigration not only reduces bias in estimates of survival probabilities but also provides information about expected breeding attempts by females, a critical element in understanding the ecology of an organism and the impacts of outside stressors and conservation actions.

  1. Empirically Driven Variable Selection for the Estimation of Causal Effects with Observational Data

    ERIC Educational Resources Information Center

    Keller, Bryan; Chen, Jianshen

    2016-01-01

    Observational studies are common in educational research, where subjects self-select or are otherwise non-randomly assigned to different interventions (e.g., educational programs, grade retention, special education). Unbiased estimation of a causal effect with observational data depends crucially on the assumption of ignorability, which specifies…

  2. Assessing Methods for Generalizing Experimental Impact Estimates to Target Populations

    ERIC Educational Resources Information Center

    Kern, Holger L.; Stuart, Elizabeth A.; Hill, Jennifer; Green, Donald P.

    2016-01-01

    Randomized experiments are considered the gold standard for causal inference because they can provide unbiased estimates of treatment effects for the experimental participants. However, researchers and policymakers are often interested in using a specific experiment to inform decisions about other target populations. In education research,…

  3. A comparison of selection at list time and time-stratified sampling for estimating suspended sediment loads

    Treesearch

    Robert B. Thomas; Jack Lewis

    1993-01-01

    Time-stratified sampling of sediment for estimating suspended load is introduced and compared to selection at list time (SALT) sampling. Both methods provide unbiased estimates of load and variance. The magnitude of the variance of the two methods is compared using five storm populations of suspended sediment flux derived from turbidity data. Under like conditions,...
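A minimal version of the time-stratified estimator described above: split the record into equal time strata, sample randomly within each stratum, and expand each stratum mean by the stratum size. The SALT comparison and the paper's variance estimators are not reproduced; the stratum layout here is an assumption:

```python
import random

def stratified_load_estimate(flux, n_strata, n_per_stratum, seed=0):
    """Unbiased total-load estimate under time-stratified sampling.
    A minimal sketch; the paper's SALT comparison and variance
    formulas are not shown, and equal strata are assumed."""
    rng = random.Random(seed)
    size = len(flux) // n_strata
    total = 0.0
    for h in range(n_strata):
        stratum = flux[h * size:(h + 1) * size]
        sample = rng.sample(stratum, n_per_stratum)  # without replacement
        total += len(stratum) * sum(sample) / n_per_stratum
    return total
```

Because each stratum mean is unbiased for its stratum, the expanded sum is unbiased for the total load regardless of how flux varies across strata.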

  4. Using small area estimation and Lidar-derived variables for multivariate prediction of forest attributes

    Treesearch

    F. Mauro; Vicente Monleon; H. Temesgen

    2015-01-01

    Small area estimation (SAE) techniques have been successfully applied in forest inventories to provide reliable estimates for domains where the sample size is small (i.e. small areas). Previous studies have explored the use of either Area Level or Unit Level Empirical Best Linear Unbiased Predictors (EBLUPs) in a univariate framework, modeling each variable of interest...

  5. Regression calibration for models with two predictor variables measured with error and their interaction, using instrumental variables and longitudinal data.

    PubMed

    Strand, Matthew; Sillau, Stefan; Grunwald, Gary K; Rabinovitch, Nathan

    2014-02-10

    Regression calibration provides a way to obtain unbiased estimators of fixed effects in regression models when one or more predictors are measured with error. Recent development of measurement error methods has focused on models that include interaction terms between measured-with-error predictors, and separately, methods for estimation in models that account for correlated data. In this work, we derive explicit and novel forms of regression calibration estimators and associated asymptotic variances for longitudinal models that include interaction terms, when data from instrumental and unbiased surrogate variables are available but not the actual predictors of interest. The longitudinal data are fit using linear mixed models that contain random intercepts and account for serial correlation and unequally spaced observations. The motivating application involves a longitudinal study of exposure to two pollutants (predictors) - outdoor fine particulate matter and cigarette smoke - and their association in interactive form with levels of a biomarker of inflammation, leukotriene E4 (LTE4, outcome) in asthmatic children. Because the exposure concentrations could not be directly observed, we used measurements from a fixed outdoor monitor and urinary cotinine concentrations as instrumental variables, and we used concentrations of fine ambient particulate matter and cigarette smoke measured with error by personal monitors as unbiased surrogate variables. We applied the derived regression calibration methods to estimate coefficients of the unobserved predictors and their interaction, allowing for direct comparison of toxicity of the different pollutants. We used simulations to verify accuracy of inferential methods based on asymptotic theory. Copyright © 2013 John Wiley & Sons, Ltd.

  6. A New Model for Acquiescence at the Interface of Psychometrics and Cognitive Psychology.

    PubMed

    Plieninger, Hansjörg; Heck, Daniel W

    2018-05-29

    When measuring psychological traits, one has to consider that respondents often show content-unrelated response behavior in answering questionnaires. To disentangle the target trait and two such response styles, extreme responding and midpoint responding, Böckenholt (2012a) developed an item response model based on a latent processing tree structure. We propose a theoretically motivated extension of this model to also measure acquiescence, the tendency to agree with both regular and reversed items. Substantively, our approach builds on multinomial processing tree (MPT) models that are used in cognitive psychology to disentangle qualitatively distinct processes. Accordingly, the new model for response styles assumes a mixture distribution of affirmative responses, which are either determined by the underlying target trait or by acquiescence. In order to estimate the model parameters, we rely on Bayesian hierarchical estimation of MPT models. In simulations, we show that the model provides unbiased estimates of response styles and the target trait, and we compare the new model and Böckenholt's model in a recovery study. An empirical example from personality psychology is used for illustrative purposes.
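The mixture idea can be illustrated with a toy response-probability function: with some probability the respondent acquiesces regardless of item content, otherwise agreement follows the target trait. This is a hypothetical simplification, not the paper's full processing-tree model:

```python
def p_agree(trait, acq, reversed_item=False):
    """Toy mixture of affirmative responses: with probability acq the
    respondent acquiesces (agrees regardless of content); otherwise the
    response follows the target trait. A hypothetical simplification of
    the latent processing-tree structure, not the paper's full model."""
    trait_agree = (1.0 - trait) if reversed_item else trait
    return acq + (1.0 - acq) * trait_agree
```

In this toy model, summing the agreement rates on a regular and a reversed item gives 1 + acq independently of the trait, which is the intuition behind using reversed items to separate acquiescence from content.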

  7. Hybrid pathwise sensitivity methods for discrete stochastic models of chemical reaction systems

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wolf, Elizabeth Skubak, E-mail: ewolf@saintmarys.edu; Anderson, David F., E-mail: anderson@math.wisc.edu

    2015-01-21

    Stochastic models are often used to help understand the behavior of intracellular biochemical processes. The most common such models are continuous time Markov chains (CTMCs). Parametric sensitivities, which are derivatives of expectations of model output quantities with respect to model parameters, are useful in this setting for a variety of applications. In this paper, we introduce a class of hybrid pathwise differentiation methods for the numerical estimation of parametric sensitivities. The new hybrid methods combine elements from the three main classes of procedures for sensitivity estimation and have a number of desirable qualities. First, the new methods are unbiased for a broad class of problems. Second, the methods are applicable to nearly any physically relevant biochemical CTMC model. Third, and as we demonstrate on several numerical examples, the new methods are quite efficient, particularly if one wishes to estimate the full gradient of parametric sensitivities. The methods are rather intuitive and utilize the multilevel Monte Carlo philosophy of splitting an expectation into separate parts and handling each in an efficient manner.

  8. Automated daily processing of more than 1000 ground-based GPS receivers for studying intense ionospheric storms

    NASA Technical Reports Server (NTRS)

    Komjathy, Attila; Sparks, Lawrence; Wilson, Brian D.; Mannucci, Anthony J.

    2005-01-01

    To take advantage of the vast amount of GPS data, researchers use a number of techniques to estimate satellite and receiver interfrequency biases and the total electron content (TEC) of the ionosphere. Most techniques estimate vertical ionospheric structure and, simultaneously, hardware-related biases treated as nuisance parameters. These methods are often limited to 200 GPS receivers and use a sequential least squares or Kalman filter approach. The biases are later removed from the measurements to obtain unbiased TEC. In our approach to calibrating GPS receiver and transmitter interfrequency biases we take advantage of all available GPS receivers using a new processing algorithm based on the Global Ionospheric Mapping (GIM) software developed at the Jet Propulsion Laboratory. This new capability is designed to estimate receiver biases for all stations. We solve for the instrumental biases by modeling the ionospheric delay and removing it from the observation equation using precomputed GIM maps. The precomputed GIM maps rely on 200 globally distributed GPS receivers to establish the “background” used to model the ionosphere at the remaining 800 GPS sites.

  9. Parametric adaptive filtering and data validation in the bar GW detector AURIGA

    NASA Astrophysics Data System (ADS)

    Ortolan, A.; Baggio, L.; Cerdonio, M.; Prodi, G. A.; Vedovato, G.; Vitale, S.

    2002-04-01

    We report on our experience gained in the signal processing of the resonant GW detector AURIGA. Signal amplitude and arrival time are estimated by means of a matched-adaptive Wiener filter. The detector noise, entering in the filter set-up, is modelled as a parametric ARMA process; to account for slow non-stationarity of the noise, the ARMA parameters are estimated on an hourly basis. Setting up an unbiased Wiener filter requires the separation of time spans with 'almost Gaussian' noise from non-Gaussian and/or strongly non-stationary time spans. The separation algorithm consists basically of a variance estimate with the Chauvenet convergence method and a threshold on the kurtosis index. The subsequent validation of data is strictly connected with the separation procedure: in fact, by injecting a large number of artificial GW signals into the 'almost Gaussian' part of the AURIGA data stream, we have demonstrated that the effective probability distributions of the signal-to-noise ratio, the χ² statistic and the time of arrival are those expected.
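The 'almost Gaussian' screening step can be sketched with a plain excess-kurtosis test. This is an illustrative stand-in: the AURIGA pipeline's Chauvenet variance estimate and its detector-specific threshold are not reproduced here:

```python
import numpy as np

def excess_kurtosis(x):
    """Sample excess kurtosis (zero for Gaussian data)."""
    x = np.asarray(x, dtype=float)
    d = x - x.mean()
    return float((d ** 4).mean() / (d ** 2).mean() ** 2 - 3.0)

def is_almost_gaussian(x, kurt_threshold=0.5):
    """Flag a data span as 'almost Gaussian' when its excess kurtosis is
    small. Illustrative stand-in for AURIGA's separation step; the actual
    threshold value is detector-specific and assumed here."""
    return abs(excess_kurtosis(x)) < kurt_threshold
```

Heavy-tailed spans (e.g. Laplace-distributed noise, excess kurtosis 3) are rejected, while Gaussian spans pass.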

  10. Catastrophic photometric redshift errors: Weak-lensing survey requirements

    DOE PAGES

    Bernstein, Gary; Huterer, Dragan

    2010-01-11

    We study the sensitivity of weak lensing surveys to the effects of catastrophic redshift errors - cases where the true redshift is misestimated by a significant amount. To compute the biases in cosmological parameters, we adopt an efficient linearized analysis where the redshift errors are directly related to shifts in the weak lensing convergence power spectra. We estimate the number N_spec of unbiased spectroscopic redshifts needed to determine the catastrophic error rate well enough that biases in cosmological parameters are below statistical errors of weak lensing tomography. While the straightforward estimate of N_spec is ~10^6, we find that using only the photometric redshifts with z ≤ 2.5 leads to a drastic reduction in N_spec to ~30,000 while negligibly increasing statistical errors in dark energy parameters. Therefore, the size of spectroscopic survey needed to control catastrophic errors is similar to that previously deemed necessary to constrain the core of the z_s-z_p distribution. We also study the efficacy of the recent proposal to measure redshift errors by cross-correlation between the photo-z and spectroscopic samples. We find that this method requires ~10% a priori knowledge of the bias and stochasticity of the outlier population, and is also easily confounded by lensing magnification bias. The cross-correlation method is therefore unlikely to supplant the need for a complete spectroscopic redshift survey of the source population.

  11. Joint Bayesian Estimation of Quasar Continua and the Lyα Forest Flux Probability Distribution Function

    NASA Astrophysics Data System (ADS)

    Eilers, Anna-Christina; Hennawi, Joseph F.; Lee, Khee-Gan

    2017-08-01

    We present a new Bayesian algorithm making use of Markov Chain Monte Carlo sampling that allows us to simultaneously estimate the unknown continuum level of each quasar in an ensemble of high-resolution spectra, as well as their common probability distribution function (PDF) for the transmitted Lyα forest flux. This fully automated PDF-regulated continuum fitting method models the unknown quasar continuum with a linear principal component analysis (PCA) basis, with the PCA coefficients treated as nuisance parameters. The method allows one to estimate parameters governing the thermal state of the intergalactic medium (IGM), such as the slope of the temperature-density relation, γ − 1, while marginalizing out continuum uncertainties in a fully Bayesian way. Using realistic mock quasar spectra created from a simplified semi-numerical model of the IGM, we show that this method recovers the underlying quasar continua to a precision of ≃7% and ≃10% at z = 3 and z = 5, respectively. Given the number of principal component spectra, this is comparable to the underlying accuracy of the PCA model itself. Most importantly, we show that we can achieve a nearly unbiased estimate of the slope γ − 1 of the IGM temperature-density relation with a precision of ±8.6% at z = 3 and ±6.1% at z = 5, for an ensemble of ten mock high-resolution quasar spectra. Applying this method to real quasar spectra and comparing to a more realistic IGM model from hydrodynamical simulations would enable precise measurements of the thermal and cosmological parameters governing the IGM, albeit with somewhat larger uncertainties, given the increased flexibility of the model.

  12. TRUE MASSES OF RADIAL-VELOCITY EXOPLANETS

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Brown, Robert A., E-mail: rbrown@stsci.edu

    We study the task of estimating the true masses of known radial-velocity (RV) exoplanets by means of direct astrometry on coronagraphic images to measure the apparent separation between exoplanet and host star. Initially, we assume perfect knowledge of the RV orbital parameters and that all errors are due to photon statistics. We construct design reference missions (DRMs) for four missions currently under study at NASA: EXO-S and WFIRST-S, with external star shades for starlight suppression, and EXO-C and WFIRST-C, with internal coronagraphs. These DRMs reveal extreme scheduling constraints due to the combination of solar and anti-solar pointing restrictions, photometric and obscurational completeness, image blurring due to orbital motion, and the “nodal effect,” which is the independence of apparent separation and inclination when the planet crosses the plane of the sky through the host star. Next, we address the issue of nonzero uncertainties in RV orbital parameters by investigating their impact on the observations of 21 single-planet systems. Except for two—GJ 676 A b and 16 Cyg B b, which are observable only by the star-shade missions—we find that current uncertainties in orbital parameters generally prevent accurate, unbiased estimation of true planetary mass. For the coronagraphs, WFIRST-C and EXO-C, the most likely number of good estimators of true mass is currently zero. For the star shades, EXO-S and WFIRST-S, the most likely numbers of good estimators are three and four, respectively, including GJ 676 A b and 16 Cyg B b. We expect that uncertain orbital elements currently undermine all potential programs of direct imaging and spectroscopy of RV exoplanets.

  13. Fourier band-power E/B-mode estimators for cosmic shear

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Becker, Matthew R.; Rozo, Eduardo

    We introduce new Fourier band-power estimators for cosmic shear data analysis and E/B-mode separation. We consider both the case where one performs E/B-mode separation and the case where one does not. The resulting estimators have several nice properties which make them ideal for cosmic shear data analysis. First, they can be written as linear combinations of the binned cosmic shear correlation functions. Second, they account for the survey window function in real space. Third, they are unbiased by shape noise since they do not use correlation function data at zero separation. Fourth, the band-power window functions in Fourier space are compact and largely non-oscillatory. Fifth, they can be used to construct band-power estimators with very efficient data compression properties. In particular, we find that all of the information on the parameters Ω_m, σ_8 and n_s in the shear correlation functions in the range of ~10-400 arcmin for a single tomographic bin can be compressed into only three band-power estimates. Finally, we can achieve these rates of data compression while excluding small-scale information where the modelling of the shear correlation functions and power spectra is very difficult. Given these desirable properties, these estimators will be very useful for cosmic shear data analysis.

  14. Comparison of Random Forest and Parametric Imputation Models for Imputing Missing Data Using MICE: A CALIBER Study

    PubMed Central

    Shah, Anoop D.; Bartlett, Jonathan W.; Carpenter, James; Nicholas, Owen; Hemingway, Harry

    2014-01-01

    Multivariate imputation by chained equations (MICE) is commonly used for imputing missing data in epidemiologic research. The “true” imputation model may contain nonlinearities which are not included in default imputation models. Random forest imputation is a machine learning technique which can accommodate nonlinearities and interactions and does not require a particular regression model to be specified. We compared parametric MICE with a random forest-based MICE algorithm in 2 simulation studies. The first study used 1,000 random samples of 2,000 persons drawn from the 10,128 stable angina patients in the CALIBER database (Cardiovascular Disease Research using Linked Bespoke Studies and Electronic Records; 2001–2010) with complete data on all covariates. Variables were artificially made “missing at random,” and the bias and efficiency of parameter estimates obtained using different imputation methods were compared. Both MICE methods produced unbiased estimates of (log) hazard ratios, but random forest was more efficient and produced narrower confidence intervals. The second study used simulated data in which the partially observed variable depended on the fully observed variables in a nonlinear way. Parameter estimates were less biased using random forest MICE, and confidence interval coverage was better. This suggests that random forest imputation may be useful for imputing complex epidemiologic data sets in which some patients have missing data. PMID:24589914

  15. Comparison of random forest and parametric imputation models for imputing missing data using MICE: a CALIBER study.

    PubMed

    Shah, Anoop D; Bartlett, Jonathan W; Carpenter, James; Nicholas, Owen; Hemingway, Harry

    2014-03-15

    Multivariate imputation by chained equations (MICE) is commonly used for imputing missing data in epidemiologic research. The "true" imputation model may contain nonlinearities which are not included in default imputation models. Random forest imputation is a machine learning technique which can accommodate nonlinearities and interactions and does not require a particular regression model to be specified. We compared parametric MICE with a random forest-based MICE algorithm in 2 simulation studies. The first study used 1,000 random samples of 2,000 persons drawn from the 10,128 stable angina patients in the CALIBER database (Cardiovascular Disease Research using Linked Bespoke Studies and Electronic Records; 2001-2010) with complete data on all covariates. Variables were artificially made "missing at random," and the bias and efficiency of parameter estimates obtained using different imputation methods were compared. Both MICE methods produced unbiased estimates of (log) hazard ratios, but random forest was more efficient and produced narrower confidence intervals. The second study used simulated data in which the partially observed variable depended on the fully observed variables in a nonlinear way. Parameter estimates were less biased using random forest MICE, and confidence interval coverage was better. This suggests that random forest imputation may be useful for imputing complex epidemiologic data sets in which some patients have missing data.
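A single chained-equations step for one variable can be sketched as follows. Parametric MICE uses a regression model like this linear one; the random-forest variant compared above swaps the linear model for a forest. Function names and the noise scheme are illustrative, not the paper's implementation:

```python
import numpy as np

def impute_linear(y, missing, predictors, seed=1):
    """One MICE step for a single variable: regress the observed part of y
    on the complete predictor(s), then fill missing entries with fitted
    values plus noise resampled from the residuals. A minimal sketch of
    parametric MICE; random-forest MICE replaces this linear model with a
    forest. (Names are illustrative, not from the paper.)"""
    rng = np.random.default_rng(seed)
    A = np.column_stack([np.ones(len(predictors)), predictors])
    beta, *_ = np.linalg.lstsq(A[~missing], y[~missing], rcond=None)
    resid = y[~missing] - A[~missing] @ beta
    filled = y.copy()
    filled[missing] = A[missing] @ beta + rng.choice(resid, missing.sum())
    return filled
```

Full MICE cycles such a step over every partially observed variable until convergence, and repeats the whole process to produce multiple imputed data sets.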

  16. Accounting for selection and correlation in the analysis of two-stage genome-wide association studies.

    PubMed

    Robertson, David S; Prevost, A Toby; Bowden, Jack

    2016-10-01

    The problem of selection bias has long been recognized in the analysis of two-stage trials, where promising candidates are selected in stage 1 for confirmatory analysis in stage 2. To efficiently correct for bias, uniformly minimum variance conditionally unbiased estimators (UMVCUEs) have been proposed for a wide variety of trial settings, but where the population parameter estimates are assumed to be independent. We relax this assumption and derive the UMVCUE in the multivariate normal setting with an arbitrary known covariance structure. One area of application is the estimation of odds ratios (ORs) when combining a genome-wide scan with a replication study. Our framework explicitly accounts for correlated single nucleotide polymorphisms, as might occur due to linkage disequilibrium. We illustrate our approach on the measurement of the association between 11 genetic variants and the risk of Crohn's disease, as reported in Parkes and others (2007, Sequence variants in the autophagy gene IRGM and multiple other replicating loci contribute to Crohn's disease susceptibility, Nat. Genet. 39(7): 830-832), and show that the estimated ORs can vary substantially if both selection and correlation are taken into account. © The Author 2016. Published by Oxford University Press.
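The selection bias that a UMVCUE corrects can be demonstrated with a toy two-stage simulation: reporting the stage-1 winner's estimate inflates it, while an independent stage-2 replicate stays unbiased. Hypothetical numbers only; this motivates, but does not implement, the UMVCUE construction:

```python
import numpy as np

def selection_bias_demo(mu=0.2, sigma=1.0, n_sim=100_000, seed=0):
    """Winner's-curse illustration: two candidates share the same true
    effect mu; stage 1 reports the larger of two noisy estimates, which
    is biased upward, while an independent stage-2 replicate of the
    winner remains unbiased. Toy numbers, not the UMVCUE itself."""
    rng = np.random.default_rng(seed)
    stage1 = rng.normal(mu, sigma, (n_sim, 2))
    selected = stage1.max(axis=1)          # estimate reported after selection
    stage2 = rng.normal(mu, sigma, n_sim)  # fresh confirmatory data
    return selected.mean(), stage2.mean()
```

For equal true effects, the selected stage-1 estimate is inflated by about sigma/√π on average, which is exactly the kind of bias conditional estimators are built to remove.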

  17. Cell shape characterization and classification with discrete Fourier transforms and self-organizing maps.

    PubMed

    Kriegel, Fabian L; Köhler, Ralf; Bayat-Sarmadi, Jannike; Bayerl, Simon; Hauser, Anja E; Niesner, Raluca; Luch, Andreas; Cseresnyes, Zoltan

    2018-03-01

    Cells in their natural environment often exhibit complex kinetic behavior and radical adjustments of their shapes. This enables them to accommodate to short- and long-term changes in their surroundings under physiological and pathological conditions. Intravital multi-photon microscopy is a powerful tool to record this complex behavior. Traditionally, cell behavior is characterized by tracking the cells' movements, which yields numerous parameters describing the spatiotemporal characteristics of cells. Cells can be classified according to their tracking behavior using all or a subset of these kinetic parameters. This categorization can be supported by the a priori knowledge of experts. While such an approach provides an excellent starting point for analyzing complex intravital imaging data, faster methods are required for automated and unbiased characterization. In addition to their kinetic behavior, the 3D shape of these cells also provides essential clues about the cells' status and functionality. New approaches that include the study of cell shapes as well may also allow the discovery of correlations amongst the track- and shape-describing parameters. In the current study, we examine the applicability of a set of Fourier components produced by Discrete Fourier Transform (DFT) as a tool for more efficient and less biased classification of complex cell shapes. By carrying out a number of 3D-to-2D projections of surface-rendered cells, the applied method reduces the more complex 3D shape characterization to a series of 2D DFTs. The resulting shape factors are used to train a Self-Organizing Map (SOM), which provides an unbiased estimate for the best clustering of the data, thereby characterizing groups of cells according to their shape. We propose and demonstrate that such shape characterization is a powerful addition to, or a replacement for, kinetic analysis. This would make it especially useful in situations where live kinetic imaging is less practical or not possible at all. © 2017 International Society for Advancement of Cytometry.
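The DFT shape-factor idea can be sketched for a single 2D outline: encode the boundary points as complex numbers and keep normalized harmonic magnitudes, which are invariant to translation, scale and rotation. This is a common construction; the paper's exact normalization and its 3D-to-2D projection step are not reproduced:

```python
import numpy as np

def fourier_shape_factors(boundary_xy, n_coeffs=8):
    """Shape factors from the DFT of a closed 2D outline encoded as complex
    numbers x + iy. Subtracting the centroid removes translation; keeping
    magnitudes removes rotation and starting point; dividing by the first
    harmonic removes scale. (A common construction; the paper's exact
    normalization and 3D-to-2D projections are not reproduced.)"""
    z = boundary_xy[:, 0] + 1j * boundary_xy[:, 1]
    mags = np.abs(np.fft.fft(z - z.mean()))[1:n_coeffs + 1]
    return mags / mags[0]
```

For a circle the first normalized factor is 1 and all higher harmonics vanish; deviations of the higher factors from zero quantify shape complexity, and the resulting vectors can be fed to a SOM for clustering.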

  18. Breathing dynamics based parameter sensitivity analysis of hetero-polymeric DNA

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Talukder, Srijeeta; Sen, Shrabani; Chaudhury, Pinaki, E-mail: pinakc@rediffmail.com

    We study the parameter sensitivity of hetero-polymeric DNA within the purview of DNA breathing dynamics. The degree of correlation between the mean bubble size and the model parameters is estimated for this purpose for three different DNA sequences. The analysis leads us to a better understanding of the sequence-dependent nature of the breathing dynamics of hetero-polymeric DNA. Out of the 14 model parameters for DNA stability in the statistical Poland-Scheraga approach, the hydrogen bond interaction ε_hb(AT) for an AT base pair and the ring factor ξ turn out to be the most sensitive parameters. In addition, the stacking interaction ε_st(TA-TA) for a TA-TA nearest-neighbor pair of base pairs is found to be the most sensitive one among all stacking interactions. Moreover, we also establish that the nature of stacking interaction has a deciding effect on the DNA breathing dynamics, not the number of times a particular stacking interaction appears in a sequence. We show that the sensitivity analysis can be used as an effective measure to guide a stochastic optimization technique to find the kinetic rate constants related to the dynamics, as opposed to the case where the rate constants are measured using the conventional unbiased way of optimization.

  19. Estimation of treatment effects in all-comers randomized clinical trials with a predictive marker.

    PubMed

    Choai, Yuki; Matsui, Shigeyuki

    2015-03-01

    Recent advances in genomics and biotechnologies have accelerated the development of molecularly targeted treatments and accompanying markers to predict treatment responsiveness. However, it is common at the initiation of a definitive phase III clinical trial that there is no compelling biological basis or early trial data for a candidate marker regarding its capability in predicting treatment effects. In this case, it is reasonable to include all patients as eligible for randomization, but to plan for prospective subgroup analysis based on the marker. One analysis plan in such all-comers designs is the so-called fallback approach that first tests for overall treatment efficacy and then proceeds to testing in a biomarker-positive subgroup if the first test is not significant. In this approach, owing to the adaptive nature of the analysis and a correlation between the two tests, a bias will arise in estimating the treatment effect in the biomarker-positive subgroup after a non-significant first overall test. In this article, we formulate the bias function and show a difficulty in obtaining unbiased estimators for a whole range of an associated parameter. To address this issue, we propose bias-corrected estimation methods, including those based on an approximation of the bias function under a bounded range of the parameter using polynomials. We also provide an interval estimation method based on a bivariate doubly truncated normal distribution. Simulation experiments demonstrated a success in bias reduction. Application to a phase III trial for lung cancer is provided. © 2014, The International Biometric Society.
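The conditional bias described above can be reproduced in a toy normal model: the subgroup estimate is reported only after a non-significant overall test, and the correlation between the two statistics drags the reported estimate below the true subgroup effect. Effect sizes and thresholds here are hypothetical:

```python
import numpy as np

def fallback_bias_demo(delta_pos=0.3, frac_pos=0.5, n=400, n_sim=20_000, seed=2):
    """Toy normal model of the fallback analysis (hypothetical effect
    sizes): the biomarker-positive subgroup estimate is reported only when
    the overall test is non-significant. Because the subgroup and overall
    statistics are correlated, conditioning pulls the reported subgroup
    estimate below its true value delta_pos."""
    rng = np.random.default_rng(seed)
    n_pos = int(n * frac_pos)
    pos = rng.normal(delta_pos, 1.0, (n_sim, n_pos))  # biomarker-positive
    neg = rng.normal(0.0, 1.0, (n_sim, n - n_pos))    # no effect otherwise
    d_pos = pos.mean(axis=1)                          # subgroup estimate
    z_overall = (pos.sum(axis=1) + neg.sum(axis=1)) / np.sqrt(n)
    not_significant = z_overall < 1.645               # one-sided 5% test
    return float(d_pos[not_significant].mean())
```

With these toy numbers the conditional mean falls well below the true subgroup effect of 0.3, which is the bias the paper's corrected and interval estimators are designed to remove.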

  20. Computational tools for exact conditional logistic regression.

    PubMed

    Corcoran, C; Mehta, C; Patel, N; Senchaudhuri, P

    Logistic regression analyses are often challenged by the inability of unconditional likelihood-based approximations to yield consistent, valid estimates and p-values for model parameters. This can be due to sparseness or separability in the data. Conditional logistic regression, though useful in such situations, can also be computationally infeasible when the sample size or number of explanatory covariates is large. We review recent developments that allow efficient approximate conditional inference, including Monte Carlo sampling and saddlepoint approximations. We demonstrate through real examples that these methods enable the analysis of significantly larger and more complex data sets. We find in this investigation that for these moderately large data sets Monte Carlo seems a better alternative, as it provides unbiased estimates of the exact results and can be executed in less CPU time than can the single saddlepoint approximation. Moreover, the double saddlepoint approximation, while computationally the easiest to obtain, offers little practical advantage. It produces unreliable results and cannot be computed when a maximum likelihood solution does not exist. Copyright 2001 John Wiley & Sons, Ltd.

  1. zBEAMS: a unified solution for supernova cosmology with redshift uncertainties

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Roberts, Ethan; Lochner, Michelle; Bassett, Bruce A.

    Supernova cosmology without spectra will be an important component of future surveys such as LSST. This lack of supernova spectra results in uncertainty in the redshifts which, if ignored, leads to significantly biased estimates of cosmological parameters. Here we present a hierarchical Bayesian formalism, zBEAMS, that addresses this problem by marginalising over the unknown or uncertain supernova redshifts to produce unbiased cosmological estimates that are competitive with supernova data with spectroscopically confirmed redshifts. zBEAMS provides a unified treatment of both photometric redshifts and host galaxy misidentification (occurring due to chance galaxy alignments or faint hosts), effectively correcting the inevitable contamination in the Hubble diagram. Like its predecessor BEAMS, our formalism also takes care of non-Ia supernova contamination by marginalising over the unknown supernova type. We illustrate this technique with simulations of supernovae with photometric redshifts and host galaxy misidentification. A novel feature of the photometric redshift case is the important role played by the redshift distribution of the supernovae.

  2. The Atacama Cosmology Telescope: Calibration with the Wilkinson Microwave Anisotropy Probe Using Cross-Correlations

    NASA Technical Reports Server (NTRS)

    Hajian, Amir; Acquaviva, Viviana; Ade, Peter A. R.; Aguirre, Paula; Amiri, Mandana; Appel, John William; Barrientos, L. Felipe; Battistelli, Elia S.; Bond, John R.; Brown, Ben; hide

    2011-01-01

    We present a new calibration method based on cross-correlations with the Wilkinson Microwave Anisotropy Probe (WMAP) and apply it to data from the Atacama Cosmology Telescope (ACT). ACT's observing strategy and mapmaking procedure allow an unbiased reconstruction of the modes in the maps over a wide range of multipoles. By directly matching the ACT maps to WMAP observations in the multipole range 400 < ℓ < 1000, we determine the absolute calibration with an uncertainty of 2% in temperature. The precise measurement of the calibration error directly impacts the uncertainties in the cosmological parameters estimated from the ACT power spectra. We also present a combined map based on ACT and WMAP data that has a high signal-to-noise ratio over a wide range of multipoles.

  3. Multiple linear regression to estimate time-frequency electrophysiological responses in single trials

    PubMed Central

    Hu, L.; Zhang, Z.G.; Mouraux, A.; Iannetti, G.D.

    2015-01-01

    Transient sensory, motor, or cognitive events not only elicit phase-locked event-related potentials (ERPs) in the ongoing electroencephalogram (EEG), but also induce non-phase-locked modulations of ongoing EEG oscillations. These modulations can be detected when single-trial waveforms are analysed in the time-frequency domain, and consist of stimulus-induced decreases (event-related desynchronization, ERD) or increases (event-related synchronization, ERS) of synchrony in the activity of the underlying neuronal populations. ERD and ERS reflect changes in the parameters that control oscillations in neuronal networks and, depending on the frequency at which they occur, represent neuronal mechanisms involved in cortical activation, inhibition and binding. ERD and ERS are commonly estimated by averaging the time-frequency decomposition of single trials. However, across-trial averaging discards their trial-to-trial variability, which can reflect physiologically important information. Here, we aim to (1) develop novel approaches to explore single-trial parameters (including latency, frequency and magnitude) of ERP/ERD/ERS; (2) disclose the relationship between estimated single-trial parameters and other experimental factors (e.g., perceived intensity). We found that (1) stimulus-elicited ERP/ERD/ERS can be correctly separated using principal component analysis (PCA) decomposition with Varimax rotation on the single-trial time-frequency distributions; (2) time-frequency multiple linear regression with dispersion term (TF-MLRd) enhances the signal-to-noise ratio of ERP/ERD/ERS in single trials, and provides an unbiased estimation of their latency, frequency, and magnitude at the single-trial level; (3) these estimates can be meaningfully correlated with each other and with other experimental factors at the single-trial level (e.g., perceived stimulus intensity and ERP magnitude). 
The methods described in this article allow exploring fully non-phase-locked stimulus-induced cortical oscillations, obtaining single-trial estimate of response latency, frequency, and magnitude. This permits within-subject statistical comparisons, correlation with pre-stimulus features, and integration of simultaneously-recorded EEG and fMRI. PMID:25665966
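The core of the single-trial regression idea can be sketched in a few lines: regress each trial's waveform on a response template (plus a baseline regressor) to recover a per-trial magnitude. This is a minimal illustration with synthetic data, not the authors' TF-MLRd implementation, which additionally models latency and frequency dispersion in the time-frequency domain.

```python
import numpy as np

rng = np.random.default_rng(0)
n_trials, n_time = 50, 200
t = np.linspace(0.0, 1.0, n_time)
template = np.exp(-0.5 * ((t - 0.4) / 0.05) ** 2)  # assumed canonical response shape

true_amp = rng.uniform(0.5, 2.0, n_trials)         # per-trial magnitudes to recover
trials = true_amp[:, None] * template[None, :]
trials += 0.1 * rng.standard_normal((n_trials, n_time))

# Design matrix: response template plus a constant baseline regressor
X = np.column_stack([template, np.ones(n_time)])
beta, *_ = np.linalg.lstsq(X, trials.T, rcond=None)  # beta has shape (2, n_trials)
est_amp = beta[0]

print(round(float(np.corrcoef(true_amp, est_amp)[0, 1]), 3))
```

With low noise the estimated per-trial magnitudes track the true ones almost perfectly; the interest in practice lies in correlating these estimates with trial-level covariates such as perceived intensity.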

  4. On estimation in k-tree sampling

    Treesearch

    Christoph Kleinn; Frantisek Vilcko

    2007-01-01

    The plot design known as k-tree sampling involves taking the k nearest trees from a selected sample point as sample trees. While this plot design is very practical and easily applied in the field for moderate values of k, unbiased estimation remains a problem. In this article, we give a brief introduction to the...

  5. Complex sample survey estimation in static state-space

    Treesearch

    Raymond L. Czaplewski

    2010-01-01

    Increased use of remotely sensed data is a key strategy adopted by the Forest Inventory and Analysis Program. However, multiple sensor technologies require complex sampling units and sampling designs. The Recursive Restriction Estimator (RRE) accommodates this complexity. It is a design-consistent Empirical Best Linear Unbiased Prediction for the state-vector, which...

  6. EVALUATING PROBABILITY SAMPLING STRATEGIES FOR ESTIMATING REDD COUNTS: AN EXAMPLE WITH CHINOOK SALMON (Oncorhynchus tshawytscha)

    EPA Science Inventory

    Precise, unbiased estimates of population size are an essential tool for fisheries management. For a wide variety of salmonid fishes, redd counts from a sample of reaches are commonly used to monitor annual trends in abundance. Using a 9-year time series of georeferenced censuses...

  7. Actuarial considerations of medical malpractice evaluations in M&As.

    PubMed

    Frese, Richard C

    2014-11-01

    To best project an actuarial estimate for medical malpractice exposure for a merger and acquisition, an organization's leaders should consider, among other factors: how to support an unbiased actuarial estimation; the experience of the actuary; the full picture of the organization's malpractice coverage; the potential for future loss development; and frequency and severity trends.

  8. Intermediate-term forecasting of aftershocks from an early aftershock sequence: Bayesian and ensemble forecasting approaches

    NASA Astrophysics Data System (ADS)

    Omi, Takahiro; Ogata, Yosihiko; Hirata, Yoshito; Aihara, Kazuyuki

    2015-04-01

    Because aftershock occurrences can cause significant seismic risks for a considerable time after the main shock, prospective forecasting of the intermediate-term aftershock activity as soon as possible is important. The epidemic-type aftershock sequence (ETAS) model with the maximum likelihood estimate effectively reproduces general aftershock activity, including secondary or higher-order aftershocks, and can be employed for the forecasting. However, because we cannot always expect accurate parameter estimation from incomplete early aftershock data in which many events are missing, such forecasting using only a single estimated parameter set (plug-in forecasting) can frequently perform poorly. Therefore, we propose Bayesian forecasting that combines the forecasts by the ETAS model under the various probable parameter sets given the data. By conducting forecasting tests of 1-month aftershock activity based on the first 1 day of data after the main shock, as an example of early intermediate-term forecasting, we show that the Bayesian forecasting performs better than the plug-in forecasting on average in terms of the log-likelihood score. Furthermore, to improve forecasting of large aftershocks, we apply a nonparametric (NP) model using magnitude data during the learning period and compare its forecasting performance with that of the Gutenberg-Richter (G-R) formula. We show that the NP forecast performs better than the G-R formula in some cases but worse in others. Therefore, robust forecasting can be obtained by employing an ensemble forecast that combines the two complementary forecasts. Our proposed method is useful for a stable, unbiased intermediate-term assessment of aftershock probabilities.
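The advantage of combining forecasts over probable parameter sets, rather than trusting a single point estimate, can be illustrated with a toy Poisson forecast scored by log-likelihood, as in the abstract. The candidate rates, posterior weights, and "true" rate below are invented for the sketch; a real application would use ETAS intensities integrated over the forecast window.

```python
import numpy as np
from math import lgamma, log

rng = np.random.default_rng(1)

def log_poisson_pmf(k, lam):
    return -lam + k * log(lam) - lgamma(k + 1)

# Invented candidate parameter sets (expected counts in the forecast window)
# and posterior weights given incomplete early data; not fitted ETAS values.
rates = np.array([4.0, 6.0, 9.0])
weights = np.array([0.2, 0.5, 0.3])

true_rate = 9.0                    # suppose incomplete data misled the point estimate
obs = rng.poisson(true_rate, 500)  # simulated counts in the forecast window

# Plug-in forecast: only the single highest-weight parameter set is used
best = rates[np.argmax(weights)]
plugin = np.mean([log_poisson_pmf(k, best) for k in obs])

# Bayesian forecast: posterior-weighted mixture over all candidate sets
bayes = np.mean([log(sum(w * np.exp(log_poisson_pmf(k, r))
                         for w, r in zip(weights, rates))) for k in obs])
print(round(float(plugin), 3), round(float(bayes), 3))
```

When the point estimate is off, the mixture's heavier coverage of plausible rates yields a better average log-likelihood score, which is the sense in which the Bayesian forecast wins "on average" even though it can lose on individual outcomes.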

  9. Breeding of Acrocomia aculeata using genetic diversity parameters and correlations to select accessions based on vegetative, phenological, and reproductive characteristics.

    PubMed

    Coser, S M; Motoike, S Y; Corrêa, T R; Pires, T P; Resende, M D V

    2016-10-17

    Macaw palm (Acrocomia aculeata) is a promising species for use in biofuel production, and establishing breeding programs is important for the development of commercial plantations. The aim of the present study was to analyze genetic diversity, verify correlations between traits, estimate genetic parameters, and select different accessions of A. aculeata in the Macaw Palm Germplasm Bank located at Universidade Federal de Viçosa, to develop a breeding program for this species. Accessions were selected based on precocity (PREC), total spathe (TS), diameter at breast height (DBH), height of the first spathe (HFS), and canopy area (CA). The traits were evaluated in 52 accessions during the 2012/2013 season and analyzed by restricted maximum likelihood/best linear unbiased prediction (REML/BLUP) procedures. Genetic diversity analysis resulted in the formation of four groups by Tocher's clustering method. The correlation analysis showed that indirect and early selection was possible for the traits PREC and DBH. Estimated genetic parameters strengthened the genetic variability verified by cluster analysis. Narrow-sense heritability was classified as moderate (PREC, TS, and CA) to high (HFS and DBH), indicating strong genetic control of the traits and success in obtaining genetic gains by selection. Accuracy values were classified as moderate (PREC and CA) to high (TS, HFS, and DBH), reinforcing the success of the selection process. Selection of accessions for PREC, TS, and HFS by the rank-average method permits selection gains of over 100%, emphasizing the successful use of the accessions in breeding programs and in obtaining superior genotypes for commercial plantations.

  10. Gene–Environment Correlation: Difficulties and a Natural Experiment–Based Strategy

    PubMed Central

    Li, Jiang; Liu, Hexuan; Guo, Guang

    2013-01-01

    Objectives. We explored how gene–environment correlations can result in endogenous models, how natural experiments can protect against this threat, and whether unbiased estimates from natural experiments are generalizable to other contexts. Methods. We compared a natural experiment, the College Roommate Study, which measured genes and behaviors of college students and their randomly assigned roommates in a southern public university, with observational data from the National Longitudinal Study of Adolescent Health in 2008. We predicted exposure to exercising peers using genetic markers and estimated environmental effects on alcohol consumption. A mixed-linear model estimated the variance in alcohol consumption attributable to genetic markers and to differences across peer environments. Results. Peer exercise environment was associated with respondent genotype in observational data, but not in the natural experiment. The effects of peer drinking and the presence of a general gene–environment interaction were similar between data sets. Conclusions. Natural experiments, like random roommate assignment, could protect against potential bias introduced by gene–environment correlations. When combined with representative observational data, unbiased and generalizable causal effects could be estimated. PMID:23927502

  11. Combined Estimation of Hydrogeologic Conceptual Model and Parameter Uncertainty

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Meyer, Philip D.; Ye, Ming; Neuman, Shlomo P.

    2004-03-01

    The objective of the research described in this report is the development and application of a methodology for comprehensively assessing the hydrogeologic uncertainties involved in dose assessment, including uncertainties associated with conceptual models, parameters, and scenarios. This report describes and applies a statistical method to quantitatively estimate the combined uncertainty in model predictions arising from conceptual model and parameter uncertainties. The method relies on model averaging to combine the predictions of a set of alternative models. Implementation is driven by the available data. When there is minimal site-specific data the method can be carried out with prior parameter estimates based on generic data and subjective prior model probabilities. For sites with observations of system behavior (and optionally data characterizing model parameters), the method uses model calibration to update the prior parameter estimates and model probabilities based on the correspondence between model predictions and site observations. The set of model alternatives can contain both simplified and complex models, with the requirement that all models be based on the same set of data. The method was applied to the geostatistical modeling of air permeability at a fractured rock site. Seven alternative variogram models of log air permeability were considered to represent data from single-hole pneumatic injection tests in six boreholes at the site. Unbiased maximum likelihood estimates of variogram and drift parameters were obtained for each model. Standard information criteria provided an ambiguous ranking of the models, which would not justify selecting one of them and discarding all others as is commonly done in practice. Instead, some of the models were eliminated based on their negligibly small updated probabilities and the rest were used to project the measured log permeabilities by kriging onto a rock volume containing the six boreholes. 
These four projections, and associated kriging variances, were averaged using the posterior model probabilities as weights. Finally, cross-validation was conducted by eliminating from consideration all data from one borehole at a time, repeating the above process, and comparing the predictive capability of the model-averaged result with that of each individual model. Using two quantitative measures of comparison, the model-averaged result was superior to any individual geostatistical model of log permeability considered.
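The posterior-weighted averaging step can be written down compactly: the model-averaged prediction is the weighted mean of the per-model predictions, and the model-averaged variance adds the between-model spread to the weighted kriging variances. The numbers below are illustrative, not values from the study.

```python
import numpy as np

# Hypothetical kriging outputs at one location from three retained variogram
# models: predictions, kriging variances, and posterior model probabilities.
mu = np.array([-13.2, -12.8, -13.0])   # model-specific log-permeability predictions
var = np.array([0.30, 0.25, 0.40])     # model-specific kriging variances
w = np.array([0.5, 0.3, 0.2])          # posterior model probabilities (sum to 1)

mu_avg = np.sum(w * mu)                # model-averaged prediction
# Total variance: weighted within-model variance plus between-model spread
var_avg = np.sum(w * (var + (mu - mu_avg) ** 2))

print(round(float(mu_avg), 3), round(float(var_avg), 4))
```

The between-model term is what distinguishes model averaging from simply picking the best model: disagreement among the retained models inflates the reported uncertainty.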

  12. Comparison of estimates of hardwood bole volume using importance sampling, the centroid method, and some taper equations

    Treesearch

    Harry V., Jr. Wiant; Michael L. Spangler; John E. Baumgras

    2002-01-01

    Various taper systems and the centroid method were compared to unbiased volume estimates made by importance sampling for 720 hardwood trees selected throughout the state of West Virginia. Only the centroid method consistently gave volume estimates that did not differ significantly from those made by importance sampling, although some taper equations did well for most...

  13. Unbiased mean direction of paleomagnetic data and better estimate of paleolatitude

    NASA Astrophysics Data System (ADS)

    Hatakeyama, T.; Shibuya, H.

    2010-12-01

    In paleomagnetism, when we obtain only paleodirection data without paleointensities, we calculate Fisher-mean directions (I, D) and Fisher-mean VGP positions as the description of the mean field. However, Kono (1997) and Hatakeyama and Kono (2001) indicated that these averaged directions are not unbiased estimates of the mean direction derived from the time-averaged field (TAF). Hatakeyama and Kono (2002) calculated TAF and paleosecular variation (PSV) models for the past 5 My, accounting for the biases due to the averaging of nonlinear functions, such as the summation of unit vectors in the Fisher statistics. Here we show a zonal TAF model based on the Hatakeyama and Kono TAF model. Moreover, we introduce the bias angles induced by PSV in the mean direction and a method for determining true paleolatitudes, representative of the TAF, from paleodirections. This method will help tectonic studies, especially the estimation of accurate paleolatitudes in middle-latitude regions.
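For reference, the Fisher mean direction discussed above is the normalised vector sum of the unit vectors implied by each declination/inclination pair; the nonlinearity of this summation is the source of the bias the abstract describes. A minimal sketch with made-up directions:

```python
import numpy as np

def fisher_mean(dec_deg, inc_deg):
    """Fisher mean: normalise the vector sum of the unit vectors given by
    declination/inclination pairs (paleomagnetic x-north, y-east, z-down)."""
    dec, inc = np.radians(dec_deg), np.radians(inc_deg)
    x = np.cos(inc) * np.cos(dec)
    y = np.cos(inc) * np.sin(dec)
    z = np.sin(inc)
    R = np.sqrt(x.sum() ** 2 + y.sum() ** 2 + z.sum() ** 2)  # resultant length
    mean_dec = np.degrees(np.arctan2(y.sum(), x.sum())) % 360.0
    mean_inc = np.degrees(np.arcsin(z.sum() / R))
    kappa = (len(dec_deg) - 1) / (len(dec_deg) - R)          # precision estimate
    return mean_dec, mean_inc, kappa

# Invented directions clustered around dec ~ 0, inc ~ 59 degrees
dec = np.array([350.0, 5.0, 10.0, 355.0])
inc = np.array([55.0, 60.0, 58.0, 62.0])
print(fisher_mean(dec, inc))
```

Because the mean inclination comes from the arcsine of a normalised sum rather than a linear average, secular variation about the TAF shallows or steepens it, which is exactly the bias that must be corrected to recover true paleolatitudes.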

  14. Capture-Recapture Estimators in Epidemiology with Applications to Pertussis and Pneumococcal Invasive Disease Surveillance

    PubMed Central

    Braeye, Toon; Verheagen, Jan; Mignon, Annick; Flipse, Wim; Pierard, Denis; Huygen, Kris; Schirvel, Carole; Hens, Niel

    2016-01-01

    Introduction Surveillance networks are often neither exhaustive nor completely complementary. In such situations, capture-recapture methods can be used for incidence estimation. The choice of estimator and their robustness with respect to the homogeneity and independence assumptions are, however, not well documented. Methods We investigated the performance of five different capture-recapture estimators in a simulation study. Eight different scenarios were used to detect and combine case information. The scenarios increasingly violated the assumptions of independence of samples and homogeneity of detection probabilities. Belgian datasets on invasive pneumococcal disease (IPD) and pertussis provided motivating examples. Results No estimator was unbiased in all scenarios. The performance of the parametric estimators depended on how much of the dependency and heterogeneity was correctly modelled. Model building was limited by parameter estimability, the availability of additional information (e.g. covariates) and the possibilities inherent to the method. In the most complex scenario, methods that allowed for detection probabilities conditional on previous detections estimated the total population size within a 20–30% error range. Parametric estimators remained stable if individual data sources lost up to 50% of their data. The investigated non-parametric methods were more susceptible to data loss and their performance was linked to the dependence between samples; overestimating in scenarios with little dependence, underestimating in others. Issues with parameter estimability made it impossible to model all suggested relations between samples for the IPD and pertussis datasets. For IPD, the estimates of the Belgian incidence for cases aged 50 years and older ranged from 44 to 58/100,000 in 2010. The estimates for pertussis (all ages, Belgium, 2014) ranged from 24.2 to 30.8/100,000. 
Conclusion We encourage the use of capture-recapture methods, but epidemiologists should preferably include datasets for which the underlying dependency structure is not too complex, a priori investigate this structure, compensate for it within the model and interpret the results with the remaining unmodelled heterogeneity in mind. PMID:27529167
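As a minimal concrete example of two-source capture-recapture (far simpler than the five estimators compared in the study), Chapman's nearly unbiased estimator combines the counts from two lists and their overlap. The counts below are invented:

```python
def chapman_estimate(n1, n2, m):
    """Chapman's nearly unbiased two-sample capture-recapture estimator.

    n1, n2: cases detected by each source; m: cases detected by both.
    Assumes independent sources and homogeneous detection probabilities,
    the assumptions whose violation the simulation study explores.
    """
    return (n1 + 1) * (n2 + 1) / (m + 1) - 1

# Invented counts for two surveillance systems
n_hat = chapman_estimate(120, 150, 45)
print(round(n_hat, 1))
```

When the two sources are positively dependent (e.g., the same severe cases are easy for both to find), the overlap m is inflated and estimators of this form underestimate the true population, which is why the authors stress investigating the dependency structure first.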

  15. Testing assumptions for unbiased estimation of survival of radiomarked harlequin ducks

    USGS Publications Warehouse

    Esler, Daniel N.; Mulcahy, Daniel M.; Jarvis, Robert L.

    2000-01-01

    Unbiased estimates of survival based on individuals outfitted with radiotransmitters require meeting the assumptions that radios do not affect survival, and animals for which the radio signal is lost have the same survival probability as those for which fate is known. In most survival studies, researchers have made these assumptions without testing their validity. We tested these assumptions by comparing interannual recapture rates (and, by inference, survival) between radioed and unradioed adult female harlequin ducks (Histrionicus histrionicus), and for radioed females, between right-censored birds (i.e., those for which the radio signal was lost during the telemetry monitoring period) and birds with known fates. We found that recapture rates of birds equipped with implanted radiotransmitters (21.6 ± 3.0%; x̄ ± SE) were similar to unradioed birds (21.7 ± 8.6%), suggesting that radios did not affect survival. Recapture rates also were similar between right-censored (20.6 ± 5.1%) and known-fate individuals (22.1 ± 3.8%), suggesting that missing birds were not subject to differential mortality. We also determined that capture and handling resulted in short-term loss of body mass for both radioed and unradioed females and that this effect was more pronounced for radioed birds (the difference between groups was 15.4 ± 7.1 g). However, no difference existed in body mass after recapture 1 year later. Our study suggests that implanted radios are an unbiased method for estimating survival of harlequin ducks and likely other species under similar circumstances.
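The comparison of recapture rates between groups can be checked with a standard two-proportion z-test. This minimal sketch uses the abstract's recapture proportions (21.6% vs. 21.7%) with invented group sizes, since the record reports means with standard errors rather than raw counts:

```python
from math import erf, sqrt

def two_proportion_z(p1, n1, p2, n2):
    """Two-sided two-proportion z-test with pooled variance; returns (z, p-value)."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Normal CDF via erf: Phi(x) = 0.5 * (1 + erf(x / sqrt(2)))
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p_value

# Proportions from the abstract; group sizes are hypothetical
z, p = two_proportion_z(0.216, 185, 0.217, 23)
print(round(z, 2), round(p, 3))
```

A large p-value here is consistent with the study's conclusion: no detectable difference in recapture (and, by inference, survival) between radioed and unradioed birds.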

  16. A comparison of the weights-of-evidence method and probabilistic neural networks

    USGS Publications Warehouse

    Singer, Donald A.; Kouda, Ryoichi

    1999-01-01

    The need to integrate large quantities of digital geoscience information to classify locations as mineral deposits or nondeposits has been met by the weights-of-evidence method in many situations. Widespread selection of this method may be more the result of its ease of use and interpretation rather than comparisons with alternative methods. A comparison of the weights-of-evidence method to probabilistic neural networks is performed here with data from Chisel Lake-Andeson Lake, Manitoba, Canada. Each method is designed to estimate the probability of belonging to learned classes where the estimated probabilities are used to classify the unknowns. Using these data, significantly lower classification error rates were observed for the neural network, not only when test and training data were the same (0.02 versus 23%), but also when validation data, not used in any training, were used to test the efficiency of classification (0.7 versus 17%). Despite these data containing too few deposits, these tests of this set of data demonstrate the neural network's ability at making unbiased probability estimates and lower error rates when measured by number of polygons or by the area of land misclassified. For both methods, independent validation tests are required to ensure that estimates are representative of real-world results. Results from the weights-of-evidence method demonstrate a strong bias where most errors are barren areas misclassified as deposits. The weights-of-evidence method is based on Bayes rule, which requires independent variables in order to make unbiased estimates. The chi-square test for independence indicates no significant correlations among the variables in the Chisel Lake–Andeson Lake data. However, the expected number of deposits test clearly demonstrates that these data violate the independence assumption. 
Other, independent simulations with three variables show that using variables with correlations of 1.0 can double the expected number of deposits as can correlations of −1.0. Studies done in the 1970s on methods that use Bayes rule show that moderate correlations among attributes seriously affect estimates and even small correlations lead to increases in misclassifications. Adverse effects have been observed with small to moderate correlations when only six to eight variables were used. Consistent evidence of upward biased probability estimates from multivariate methods founded on Bayes rule must be of considerable concern to institutions and governmental agencies where unbiased estimates are required. In addition to increasing the misclassification rate, biased probability estimates make classification into deposit and nondeposit classes an arbitrary subjective decision. The probabilistic neural network has no problem dealing with correlated variables—its performance depends strongly on having a thoroughly representative training set. Probabilistic neural networks or logistic regression should receive serious consideration where unbiased estimates are required. The weights-of-evidence method would serve to estimate thresholds between anomalies and background and for exploratory data analysis.

  17. Double sampling to estimate density and population trends in birds

    USGS Publications Warehouse

    Bart, Jonathan; Earnst, Susan L.

    2002-01-01

    We present a method for estimating density of nesting birds based on double sampling. The approach involves surveying a large sample of plots using a rapid method such as uncorrected point counts, variable circular plot counts, or the recently suggested double-observer method. A subsample of those plots is also surveyed using intensive methods to determine actual density. The ratio of the mean count on those plots (using the rapid method) to the mean actual density (as determined by the intensive searches) is used to adjust results from the rapid method. The approach works well when results from the rapid method are highly correlated with actual density. We illustrate the method with three years of shorebird surveys from the tundra in northern Alaska. In the rapid method, surveyors covered ~10 ha h⁻¹ and surveyed each plot a single time. The intensive surveys involved three thorough searches, required ~3 h ha⁻¹, and took 20% of the study effort. Surveyors using the rapid method detected an average of 79% of birds present. That detection ratio was used to convert the index obtained in the rapid method into an essentially unbiased estimate of density. Trends estimated from several years of data would also be essentially unbiased. Other advantages of double sampling are that (1) the rapid method can be changed as new methods become available, (2) domains can be compared even if detection rates differ, (3) total population size can be estimated, and (4) valuable ancillary information (e.g. nest success) can be obtained on intensive plots with little additional effort. We suggest that double sampling be used to test the assumption that rapid methods, such as variable circular plot and double-observer methods, yield density estimates that are essentially unbiased. The feasibility of implementing double sampling in a range of habitats needs to be evaluated.
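The double-sampling adjustment itself is a simple ratio correction: the mean rapid count divided by the detection ratio estimated on the intensive subsample. A sketch with invented plot counts (the study's actual detection ratio was about 79%):

```python
import numpy as np

# Invented counts, not the Alaska shorebird data: rapid counts on all plots,
# plus paired rapid counts and intensive (true) densities on a subsample.
rapid_all = np.array([8, 5, 11, 7, 9, 6, 10, 4, 12, 7])  # birds/plot, rapid method
rapid_sub = np.array([8, 11, 6, 12])                      # rapid counts, intensive plots
true_sub = np.array([10, 14, 8, 15])                      # intensive-search densities

detection_ratio = rapid_sub.mean() / true_sub.mean()      # fraction of birds detected
adjusted_density = rapid_all.mean() / detection_ratio     # bias-corrected density index

print(round(float(detection_ratio), 3), round(float(adjusted_density), 2))
```

The correction inflates the rapid-method index to compensate for undetected birds; its reliability rests on the rapid counts being strongly correlated with true density on the intensive plots.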

  18. Unbiased contaminant removal for 3D galaxy power spectrum measurements

    NASA Astrophysics Data System (ADS)

    Kalus, B.; Percival, W. J.; Bacon, D. J.; Samushia, L.

    2016-11-01

    We assess and develop techniques to remove contaminants when calculating the 3D galaxy power spectrum. We separate the process into three separate stages: (I) removing the contaminant signal, (II) estimating the uncontaminated cosmological power spectrum and (III) debiasing the resulting estimates. For (I), we show that removing the best-fitting contaminant (mode subtraction) and setting the contaminated components of the covariance to be infinite (mode deprojection) are mathematically equivalent. For (II), performing a quadratic maximum likelihood (QML) estimate after mode deprojection gives an optimal unbiased solution, although it requires the manipulation of large N_mode^2 matrices (N_mode being the total number of modes), which is unfeasible for recent 3D galaxy surveys. Measuring a binned average of the modes for (II) as proposed by Feldman, Kaiser & Peacock (FKP) is faster and simpler, but is sub-optimal and gives rise to a biased solution. We present a method to debias the resulting FKP measurements that does not require any large matrix calculations. We argue that the sub-optimality of the FKP estimator compared with the QML estimator, caused by contaminants, is less severe than that commonly ignored due to the survey window.
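Stage (I), mode subtraction, amounts to projecting out the best-fitting multiple of a known contaminant template. This toy example does that on a synthetic 1D "map"; the real pipeline works with 3D survey modes and, for multiple templates, a joint least-squares fit:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 256
signal = rng.standard_normal(n)                   # stand-in for the cosmological signal
template = np.sin(2 * np.pi * np.arange(n) / n)   # assumed known contaminant template
data = signal + 5.0 * template                    # contaminated observation

# Mode subtraction: remove the best-fitting multiple of the template
amp = (template @ data) / (template @ template)
cleaned = data - amp * template

print(round(float(amp), 2))
```

After subtraction the cleaned map is exactly orthogonal to the template, which is the projection property underlying the equivalence with mode deprojection (where the same component is instead given infinite variance in the covariance).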

  19. Can We Spin Straw Into Gold? An Evaluation of Immigrant Legal Status Imputation Approaches

    PubMed Central

    Van Hook, Jennifer; Bachmeier, James D.; Coffman, Donna; Harel, Ofer

    2014-01-01

    Researchers have developed logical, demographic, and statistical strategies for imputing immigrants’ legal status, but these methods have never been empirically assessed. We used Monte Carlo simulations to test whether, and under what conditions, legal status imputation approaches yield unbiased estimates of the association of unauthorized status with health insurance coverage. We tested five methods under a range of missing data scenarios. Logical and demographic imputation methods yielded biased estimates across all missing data scenarios. Statistical imputation approaches yielded unbiased estimates only when unauthorized status was jointly observed with insurance coverage; when this condition was not met, these methods overestimated insurance coverage for unauthorized relative to legal immigrants. We next showed how bias can be reduced by incorporating prior information about unauthorized immigrants. Finally, we demonstrated the utility of the best-performing statistical method for increasing power. We used it to produce state/regional estimates of insurance coverage among unauthorized immigrants in the Current Population Survey, a data source that contains no direct measures of immigrants’ legal status. We conclude that commonly employed legal status imputation approaches are likely to produce biased estimates, but data and statistical methods exist that could substantially reduce these biases. PMID:25511332

  20. Fitting a defect non-linear model with or without prior, distinguishing nuclear reaction products as an example.

    PubMed

    Helgesson, P; Sjöstrand, H

    2017-11-01

    Fitting a parametrized function to data is important for many researchers and scientists. If the model is non-linear and/or defect, it is not trivial to do correctly and to include an adequate uncertainty analysis. This work presents how the Levenberg-Marquardt algorithm for non-linear generalized least squares fitting can be used with a prior distribution for the parameters and how it can be combined with Gaussian processes to treat model defects. An example, where three peaks in a histogram are to be distinguished, is carefully studied. In particular, the probability r1 for a nuclear reaction to end up in one out of two overlapping peaks is studied. Synthetic data are used to investigate effects of linearizations and other assumptions. For perfect Gaussian peaks, it is seen that the estimated parameters are distributed close to the truth with good covariance estimates. This assumes that the method is applied correctly; for example, prior knowledge should be implemented using a prior distribution and not by assuming that some parameters are perfectly known (if they are not). It is also important to update the data covariance matrix using the fit if the uncertainties depend on the expected value of the data (e.g., for Poisson counting statistics or relative uncertainties). If a model defect is added to the peaks, such that their shape is unknown, a fit which assumes perfect Gaussian peaks becomes unable to reproduce the data, and the results for r1 become biased. It is, however, seen that it is possible to treat the model defect with a Gaussian process with a covariance function tailored for the situation, with hyper-parameters determined by leave-one-out cross validation. The resulting estimates for r1 are virtually unbiased, and the uncertainty estimates agree very well with the underlying uncertainty.
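The fitting scheme described here, non-linear least squares with the prior entering as extra pseudo-observations, can be sketched with a hand-rolled Levenberg-Marquardt loop on a single synthetic Gaussian peak. The peak parameters, noise level, and prior below are invented; the paper's full method also handles correlated data covariances and Gaussian-process model defects.

```python
import numpy as np

rng = np.random.default_rng(4)
x = np.linspace(-3.0, 3.0, 61)

def model(p):
    a, mu = p                    # amplitude and centre of a unit-width Gaussian peak
    return a * np.exp(-0.5 * (x - mu) ** 2)

y = model(np.array([2.0, 0.3])) + 0.05 * rng.standard_normal(x.size)
sigma_y = 0.05

# Invented prior on the centre: mu ~ N(0.25, 0.1); no prior on the amplitude
mu_prior, sigma_prior = 0.25, 0.1

def residuals(p):
    # Weighted data residuals plus the prior as one extra pseudo-observation
    return np.concatenate([(y - model(p)) / sigma_y,
                           [(mu_prior - p[1]) / sigma_prior]])

def jacobian(p, eps=1e-6):
    J = np.empty((x.size + 1, p.size))
    for j in range(p.size):
        dp = np.zeros(p.size)
        dp[j] = eps
        J[:, j] = (residuals(p + dp) - residuals(p - dp)) / (2 * eps)
    return J

# Bare-bones Levenberg-Marquardt iteration with fixed damping, for brevity
p = np.array([1.0, 0.0])
damping = 1e-3
for _ in range(20):
    r, J = residuals(p), jacobian(p)
    p = p - np.linalg.solve(J.T @ J + damping * np.eye(p.size), J.T @ r)
print(np.round(p, 3))
```

Appending the prior as a weighted residual is equivalent to maximizing the posterior under a Gaussian prior, which is the "prior distribution, not fixed parameters" recommendation the abstract makes explicit.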

  2. Biases in Metallicity Measurements from Global Galaxy Spectra: The Effects of Flux Weighting and Diffuse Ionized Gas Contamination

    NASA Astrophysics Data System (ADS)

    Sanders, Ryan L.; Shapley, Alice E.; Zhang, Kai; Yan, Renbin

    2017-12-01

    Galaxy metallicity scaling relations provide a powerful tool for understanding galaxy evolution, but obtaining unbiased global galaxy gas-phase oxygen abundances requires proper treatment of the various line-emitting sources within spectroscopic apertures. We present a model framework that treats galaxies as ensembles of H II and diffuse ionized gas (DIG) regions of varying metallicities. These models are based upon empirical relations between line ratios and electron temperature for H II regions, and DIG strong-line ratio relations from SDSS-IV MaNGA IFU data. Flux-weighting effects and DIG contamination can significantly affect properties inferred from global galaxy spectra, biasing metallicity estimates by more than 0.3 dex in some cases. We use observationally motivated inputs to construct a model matched to typical local star-forming galaxies, and quantify the biases in strong-line ratios, electron temperatures, and direct-method metallicities as inferred from global galaxy spectra relative to the median values of the H II region distributions in each galaxy. We also provide a generalized set of models that can be applied to individual galaxies or galaxy samples in atypical regions of parameter space. We use these models to correct for the effects of flux-weighting and DIG contamination in the local direct-method mass-metallicity and fundamental metallicity relations, and in the mass-metallicity relation based on strong-line metallicities. Future photoionization models of galaxy line emission need to include DIG emission and represent galaxies as ensembles of emitting regions with varying metallicity, instead of as single H II regions with effective properties, in order to obtain unbiased estimates of key underlying physical properties.
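The direction of the flux-weighting bias can be seen in a deliberately oversimplified toy: if brighter line-emitting regions are systematically more metal-poor, a luminosity-weighted global estimate sits below the median of the region distribution. All numbers below are invented, and in real spectra the bias enters through line ratios rather than a linear average of abundances:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 1000
# Invented ensemble: HII-region oxygen abundances and line luminosities,
# with brighter regions skewing toward lower metallicity
logOH = rng.normal(8.6, 0.15, n)                           # 12 + log(O/H) per region
lum = rng.lognormal(0.0, 1.0, n) * 10 ** (-(logOH - 8.6))  # brightness anti-correlates

median_Z = np.median(logOH)                          # median of the region distribution
flux_weighted_Z = np.sum(lum * logOH) / np.sum(lum)  # what a global estimate leans toward

print(round(float(median_Z), 3), round(float(flux_weighted_Z), 3))
```

Even this crude weighting shifts the global value several hundredths of a dex low, illustrating why the authors model galaxies as ensembles of regions rather than single effective HII regions.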

  3. Network topology and parameter estimation: from experimental design methods to gene regulatory network kinetics using a community based approach

    PubMed Central

    2014-01-01

    Background Accurate estimation of parameters of biochemical models is required to characterize the dynamics of molecular processes. This problem is intimately linked to identifying the most informative experiments for accomplishing such tasks. While significant progress has been made, effective experimental strategies for parameter identification and for distinguishing among alternative network topologies remain unclear. We approached these questions in an unbiased manner using a unique community-based approach in the context of the DREAM initiative (Dialogue for Reverse Engineering Assessment of Methods). We created an in silico test framework under which participants could probe a network with hidden parameters by requesting a range of experimental assays; results of these experiments were simulated according to a model of network dynamics only partially revealed to participants. Results We proposed two challenges; in the first, participants were given the topology and underlying biochemical structure of a 9-gene regulatory network and were asked to determine its parameter values. In the second challenge, participants were given an incomplete topology with 11 genes and asked to find three missing links in the model. In both challenges, a budget was provided to buy experimental data generated in silico with the model and mimicking the features of different common experimental techniques, such as microarrays and fluorescence microscopy. Data could be bought at any stage, allowing participants to implement an iterative loop of experiments and computation. Conclusions A total of 19 teams participated in this competition. The results suggest that the combination of state-of-the-art parameter estimation and a varied set of experimental methods using a few datasets, mostly fluorescence imaging data, can accurately determine parameters of biochemical models of gene regulation. 
However, the task is considerably more difficult if the gene network topology is not completely defined, as in challenge 2. Importantly, we found that aggregating independent parameter predictions and network topology across submissions creates a solution that can be better than the one from the best-performing submission. PMID:24507381

  4. Estimation of distribution overlap of urn models.

    PubMed

    Hampton, Jerrad; Lladser, Manuel E

    2012-01-01

    A classical problem in statistics is estimating the expected coverage of a sample, which has had applications in gene expression, microbial ecology, optimization, and even numismatics. Here we consider a related extension of this problem to random samples of two discrete distributions. Specifically, we estimate what we call the dissimilarity probability of a sample, i.e., the probability of a draw from one distribution not being observed in [Formula: see text] draws from another distribution. We show our estimator of dissimilarity to be a [Formula: see text]-statistic and a uniformly minimum variance unbiased estimator of dissimilarity over the largest appropriate range of [Formula: see text]. Furthermore, despite the non-Markovian nature of our estimator when applied sequentially over [Formula: see text], we show it converges uniformly in probability to the dissimilarity parameter, and we present criteria when it is approximately normally distributed and admits a consistent jackknife estimator of its variance. As proof of concept, we analyze V35 16S rRNA data to discern between various microbial environments. Other potential applications concern any situation where dissimilarity of two discrete distributions may be of interest. For instance, in SELEX experiments, each urn could represent a random RNA pool and each draw a possible solution to a particular binding site problem over that pool. The dissimilarity of these pools is then related to the probability of finding binding site solutions in one pool that are absent in the other.
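
When both distributions are known, the dissimilarity probability described above has the closed form Σ_i p_i (1 − q_i)^n. A plain Monte Carlo check of that quantity (not the paper's U-statistic estimator, which works from sample data alone) can be sketched as:

```python
import random

def dissimilarity_exact(p, q, n):
    """Exact dissimilarity: probability that one draw from p is absent
    from n independent draws from q, i.e. sum_i p_i * (1 - q_i)**n."""
    return sum(pi * (1.0 - qi) ** n for pi, qi in zip(p, q))

def dissimilarity_mc(p, q, n, trials=100_000, seed=0):
    """Monte Carlo estimate of the same probability, for sanity checking."""
    rng = random.Random(seed)
    support = list(range(len(p)))
    hits = 0
    for _ in range(trials):
        x = rng.choices(support, weights=p, k=1)[0]
        if x not in rng.choices(support, weights=q, k=n):
            hits += 1
    return hits / trials
```

For p = q uniform on four symbols and n = 3, the exact value is 4 · 0.25 · 0.75³ = 0.421875, and the simulation agrees to Monte Carlo accuracy.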

  5. Space-Time Data Fusion

    NASA Technical Reports Server (NTRS)

    Braverman, Amy; Nguyen, Hai; Olsen, Edward; Cressie, Noel

    2011-01-01

Space-time Data Fusion (STDF) is a methodology for combining heterogeneous remote sensing data to optimally estimate the true values of a geophysical field of interest, and obtain uncertainties for those estimates. The input data sets may have different observing characteristics, including different footprints, spatial resolutions and fields of view, orbit cycles, biases, and noise characteristics. Despite these differences, all observed data can be linked to the underlying field, and therefore to each other, by a statistical model. Differences in footprints and other geometric characteristics are accounted for by parameterizing pixel-level remote sensing observations as spatial integrals of true field values lying within pixel boundaries, plus measurement error. Both spatial and temporal correlations in the true field and in the observations are estimated and incorporated through the use of a space-time random effects (STRE) model. Once the model's parameters are estimated, we use them to derive expressions for optimal (minimum mean squared error and unbiased) estimates of the true field at any arbitrary location of interest, computed from the observations. Standard errors of these estimates are also produced, allowing confidence intervals to be constructed. The procedure is carried out on a fine spatial grid to approximate a continuous field. We demonstrate STDF by applying it to the problem of estimating CO2 concentration in the lower atmosphere using data from the Atmospheric Infrared Sounder (AIRS) and the Japanese Greenhouse Gases Observing Satellite (GOSAT) over one year for the continental US.
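
In the simplest case of independent, unbiased measurements of a single scalar, the "minimum mean squared error and unbiased" combination underlying such fusion reduces to inverse-variance weighting; a minimal sketch (illustrative only, not the STRE model):

```python
def blue_combine(obs, var):
    """Minimum-MSE linear unbiased (BLUE) combination of independent,
    unbiased measurements of the same quantity: weights proportional to
    1/variance. Returns the fused estimate and its variance."""
    w = [1.0 / v for v in var]
    total = sum(w)
    est = sum(wi * yi for wi, yi in zip(w, obs)) / total
    return est, 1.0 / total
```

Combining a precise and a noisy observation (variances 1 and 4) pulls the fused estimate toward the precise one, and the fused variance (0.8) is smaller than either input variance.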

  6. On fixed-area plot sampling for downed coarse woody debris

    Treesearch

    Jeffrey H. Gove; Paul C. Van Deusen

    2011-01-01

The use of fixed-area plots for sampling downed coarse woody debris is reviewed. A set of clearly defined protocols for two previously described methods is established and a new method, which we call the 'sausage' method, is developed. All methods (protocols) are shown to be unbiased for volume estimation, but not necessarily for estimation of population...

  7. Computation of rainfall erosivity from daily precipitation amounts.

    PubMed

    Beguería, Santiago; Serrano-Notivoli, Roberto; Tomas-Burguera, Miquel

    2018-10-01

Rainfall erosivity is an important parameter in many erosion models, and the EI30 defined by the Universal Soil Loss Equation is one of the best known erosivity indices. One issue with this and other erosivity indices is that they require continuous breakpoint, or high-frequency time interval, precipitation data. These data are rare in comparison to more common medium-frequency data, such as the daily precipitation records commonly kept by many national and regional weather services. Devising methods for computing estimates of rainfall erosivity from daily precipitation data that are comparable to those obtained from high-frequency data is, therefore, highly desirable. Here we present a method for producing such estimates, based on optimal regression tools such as the Gamma generalised linear model and universal kriging. Unlike other methods, this approach produces EI30 estimates that are unbiased and very close to the observed values, especially when aggregated at the annual level. We illustrate the method with a case study comprising more than 1500 high-frequency precipitation records across Spain. Although the original records have a short span (the mean length is around 10 years), computation of spatially distributed upscaling parameters offers the possibility to compute high-resolution climatologies of the EI30 index based on currently available, long-span, daily precipitation databases. Copyright © 2018 Elsevier B.V. All rights reserved.
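
As a sketch of the model family named here (not the authors' code; data and names are illustrative), a Gamma GLM with log link can be fitted by iteratively reweighted least squares. For this particular family/link pair the IRLS weights are constant, so each iteration is a plain least-squares solve on the working response:

```python
import numpy as np

def gamma_glm_log(X, y, n_iter=100):
    """Gamma GLM with log link via IRLS. With variance function V(mu) = mu^2
    and g'(mu) = 1/mu, the IRLS weights equal 1, so each step is an ordinary
    least-squares solve on the working response z = eta + (y - mu)/mu."""
    X = np.column_stack([np.ones(len(y)), np.asarray(X)])  # add intercept
    beta = np.zeros(X.shape[1])
    beta[0] = np.log(y.mean())                             # start at mean model
    for _ in range(n_iter):
        eta = X @ beta
        mu = np.exp(eta)
        z = eta + (y - mu) / mu                            # working response
        beta, *_ = np.linalg.lstsq(X, z, rcond=None)
    return beta
```

On simulated Gamma data with log-mean 1 + 2x, the fit recovers the coefficients to sampling accuracy.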

  8. An accessible method for implementing hierarchical models with spatio-temporal abundance data

    USGS Publications Warehouse

    Ross, Beth E.; Hooten, Melvin B.; Koons, David N.

    2012-01-01

    A common goal in ecology and wildlife management is to determine the causes of variation in population dynamics over long periods of time and across large spatial scales. Many assumptions must nevertheless be overcome to make appropriate inference about spatio-temporal variation in population dynamics, such as autocorrelation among data points, excess zeros, and observation error in count data. To address these issues, many scientists and statisticians have recommended the use of Bayesian hierarchical models. Unfortunately, hierarchical statistical models remain somewhat difficult to use because of the necessary quantitative background needed to implement them, or because of the computational demands of using Markov Chain Monte Carlo algorithms to estimate parameters. Fortunately, new tools have recently been developed that make it more feasible for wildlife biologists to fit sophisticated hierarchical Bayesian models (i.e., Integrated Nested Laplace Approximation, ‘INLA’). We present a case study using two important game species in North America, the lesser and greater scaup, to demonstrate how INLA can be used to estimate the parameters in a hierarchical model that decouples observation error from process variation, and accounts for unknown sources of excess zeros as well as spatial and temporal dependence in the data. Ultimately, our goal was to make unbiased inference about spatial variation in population trends over time.

  9. Mixed models for selection of Jatropha progenies with high adaptability and yield stability in Brazilian regions.

    PubMed

    Teodoro, P E; Bhering, L L; Costa, R D; Rocha, R B; Laviola, B G

    2016-08-19

The aim of this study was to estimate genetic parameters via mixed models and simultaneously select Jatropha progenies grown in three regions of Brazil that combine high adaptability and stability. From a previous phenotypic selection, three progeny tests were installed in 2008 in the municipalities of Planaltina-DF (Midwest), Nova Porteirinha-MG (Southeast), and Pelotas-RS (South). We evaluated 18 half-sib families in a randomized block design with three replications. Genetic parameters were estimated using restricted maximum likelihood/best linear unbiased prediction. Selection was based on the harmonic mean of the relative performance of genetic values method under three strategies: 1) performance in each environment (with interaction effect); 2) performance in the mean environment (without interaction effect); and 3) simultaneous selection for grain yield, stability, and adaptability. The accuracy obtained (91%) reveals excellent experimental quality and consequently safety and credibility in the selection of superior progenies for grain yield. The gain from selecting the best five progenies was more than 20%, regardless of the selection strategy. Thus, based on the three selection strategies used in this study, progenies 4, 11, and 3 (selected in all environments and the mean environment and by adaptability and phenotypic stability methods) are the most suitable for growing in the three regions evaluated.

  10. Improved False Discovery Rate Estimation Procedure for Shotgun Proteomics.

    PubMed

    Keich, Uri; Kertesz-Farkas, Attila; Noble, William Stafford

    2015-08-07

    Interpreting the potentially vast number of hypotheses generated by a shotgun proteomics experiment requires a valid and accurate procedure for assigning statistical confidence estimates to identified tandem mass spectra. Despite the crucial role such procedures play in most high-throughput proteomics experiments, the scientific literature has not reached a consensus about the best confidence estimation methodology. In this work, we evaluate, using theoretical and empirical analysis, four previously proposed protocols for estimating the false discovery rate (FDR) associated with a set of identified tandem mass spectra: two variants of the target-decoy competition protocol (TDC) of Elias and Gygi and two variants of the separate target-decoy search protocol of Käll et al. Our analysis reveals significant biases in the two separate target-decoy search protocols. Moreover, the one TDC protocol that provides an unbiased FDR estimate among the target PSMs does so at the cost of forfeiting a random subset of high-scoring spectrum identifications. We therefore propose the mix-max procedure to provide unbiased, accurate FDR estimates in the presence of well-calibrated scores. The method avoids biases associated with the two separate target-decoy search protocols and also avoids the propensity for target-decoy competition to discard a random subset of high-scoring target identifications.
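
For context, the standard target-decoy competition estimate that the paper builds on computes, at a given score threshold, the ratio of decoy to target matches, with a +1 correction that keeps the estimate conservative; the mix-max refinement proposed here is more involved. A minimal sketch (function name is illustrative):

```python
def tdc_fdr(target_scores, decoy_scores, threshold):
    """Standard target-decoy competition FDR estimate at a score threshold:
    count decoy and target matches at or above the threshold and return
    (n_decoy + 1) / n_target, capped at 1."""
    n_target = sum(s >= threshold for s in target_scores)
    n_decoy = sum(s >= threshold for s in decoy_scores)
    if n_target == 0:
        return 0.0
    return min(1.0, (n_decoy + 1) / n_target)
```

At threshold 7, four targets and one decoy pass, giving an estimated FDR of (1 + 1)/4 = 0.5.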

  11. How Accurate Are Infrared Luminosities from Monochromatic Photometric Extrapolation?

    NASA Astrophysics Data System (ADS)

    Lin, Zesen; Fang, Guanwen; Kong, Xu

    2016-12-01

    Template-based extrapolations from only one photometric band can be a cost-effective method to estimate the total infrared (IR) luminosities ({L}{IR}) of galaxies. By utilizing multi-wavelength data that covers across 0.35-500 μm in GOODS-North and GOODS-South fields, we investigate the accuracy of this monochromatic extrapolated {L}{IR} based on three IR spectral energy distribution (SED) templates out to z˜ 3.5. We find that the Chary & Elbaz template provides the best estimate of {L}{IR} in Herschel/Photodetector Array Camera and Spectrometer (PACS) bands, while the Dale & Helou template performs best in Herschel/Spectral and Photometric Imaging Receiver (SPIRE) bands. To estimate {L}{IR}, we suggest that extrapolations from the available longest wavelength PACS band based on the Chary & Elbaz template can be a good estimator. Moreover, if the PACS measurement is unavailable, extrapolations from SPIRE observations but based on the Dale & Helou template can also provide a statistically unbiased estimate for galaxies at z≲ 2. The emission with a rest-frame 10-100 μm range of IR SED can be well described by all three templates, but only the Dale & Helou template shows a nearly unbiased estimate of the emission of the rest-frame submillimeter part.

  12. Improved False Discovery Rate Estimation Procedure for Shotgun Proteomics

    PubMed Central

    2016-01-01

    Interpreting the potentially vast number of hypotheses generated by a shotgun proteomics experiment requires a valid and accurate procedure for assigning statistical confidence estimates to identified tandem mass spectra. Despite the crucial role such procedures play in most high-throughput proteomics experiments, the scientific literature has not reached a consensus about the best confidence estimation methodology. In this work, we evaluate, using theoretical and empirical analysis, four previously proposed protocols for estimating the false discovery rate (FDR) associated with a set of identified tandem mass spectra: two variants of the target-decoy competition protocol (TDC) of Elias and Gygi and two variants of the separate target-decoy search protocol of Käll et al. Our analysis reveals significant biases in the two separate target-decoy search protocols. Moreover, the one TDC protocol that provides an unbiased FDR estimate among the target PSMs does so at the cost of forfeiting a random subset of high-scoring spectrum identifications. We therefore propose the mix-max procedure to provide unbiased, accurate FDR estimates in the presence of well-calibrated scores. The method avoids biases associated with the two separate target-decoy search protocols and also avoids the propensity for target-decoy competition to discard a random subset of high-scoring target identifications. PMID:26152888

  13. Unbiased free energy estimates in fast nonequilibrium transformations using Gaussian mixtures

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Procacci, Piero

    2015-04-21

In this paper, we present an improved method for obtaining unbiased estimates of the free energy difference between two thermodynamic states using the work distribution measured in nonequilibrium driven experiments connecting these states. The method is based on the assumption that any observed work distribution is given by a mixture of Gaussian distributions, whose normal components are identical in either direction of the nonequilibrium process, with weights regulated by the Crooks theorem. Using the prototypical example of the driven unfolding/folding of deca-alanine, we show that the predicted behavior of the forward and reverse work distributions, assuming a combination of only two Gaussian components with Crooks-derived weights, explains surprisingly well the striking asymmetry in the observed distributions at fast pulling speeds. The proposed methodology opens the way for a perfectly parallel implementation of Jarzynski-based free energy calculations in complex systems.
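
Two standard estimators set the baseline for the Gaussian-mixture method: the Jarzynski exponential average, and the second-cumulant formula ΔF = ⟨W⟩ − βVar(W)/2, which is exact for a single Gaussian work distribution (the building block of the mixture). A sketch in reduced units with β = 1 (function names are illustrative):

```python
import math
import random

def jarzynski_free_energy(work, beta=1.0):
    """Jarzynski estimator dF = -(1/beta) * ln< exp(-beta*W) >, computed
    with a log-sum-exp shift for numerical stability."""
    a = [-beta * w for w in work]
    m = max(a)
    log_mean = m + math.log(sum(math.exp(ai - m) for ai in a) / len(a))
    return -log_mean / beta

def gaussian_free_energy(work, beta=1.0):
    """Second-cumulant estimate dF = <W> - beta*Var(W)/2; exact when the
    work distribution is a single Gaussian."""
    n = len(work)
    mean = sum(work) / n
    var = sum((w - mean) ** 2 for w in work) / (n - 1)
    return mean - beta * var / 2.0
```

For work values drawn from N(5, 1) in units of kT, the true free energy difference is 5 − 1/2 = 4.5, and both estimators recover it from a large sample.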

  14. Best Linear Unbiased Prediction (BLUP) for regional yield trials: a comparison to additive main effects and multiplicative interaction (AMMI) analysis.

    PubMed

    Piepho, H P

    1994-11-01

Multilocation trials are often used to analyse the adaptability of genotypes in different environments and to find for each environment the genotype that is best adapted, i.e., highest yielding in that environment. For this purpose, it is of interest to obtain a reliable estimate of the mean yield of a cultivar in a given environment. This article compares two different statistical estimation procedures for this task: the Additive Main Effects and Multiplicative Interaction (AMMI) analysis and Best Linear Unbiased Prediction (BLUP). A modification of a cross-validation procedure commonly used with AMMI is suggested for trials that are laid out as a randomized complete block design. The use of these procedures is exemplified using five faba bean datasets from German registration trials. BLUP was found to outperform AMMI in four of five faba bean datasets.
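
The shrinkage idea behind BLUP can be sketched for the simplest balanced one-way random-effects case (a toy illustration, not the full multilocation AMMI/BLUP comparison):

```python
def blup_genotype_effects(means, grand_mean, var_g, var_e, reps):
    """BLUP of genotype effects in a balanced one-way random-effects model:
    each genotype's deviation from the grand mean is shrunk toward zero by
    the reliability h2 = var_g / (var_g + var_e / reps). Low replication or
    high error variance means stronger shrinkage."""
    shrink = var_g / (var_g + var_e / reps)
    return [shrink * (m - grand_mean) for m in means]
```

With genotypic and error variances of 4 and four replicates, the reliability is 4/(4 + 1) = 0.8, so observed deviations of ±2 shrink to predicted effects of ±1.6.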

  15. A Multiphase Model for the Intracluster Medium

    NASA Technical Reports Server (NTRS)

    Nagai, Daisuke; Sulkanen, Martin E.; Evrard, August E.

    1999-01-01

Constraints on the clustered mass density of the universe derived from the observed population mean intracluster gas fraction of x-ray clusters may be biased by reliance on a single-phase assumption for the thermodynamic structure of the intracluster medium (ICM). We propose a descriptive model for multiphase structure in which a spherically symmetric ICM contains isobaric density perturbations with a radially dependent variance. Fixing the x-ray emission and emission-weighted temperature, we explore two independently observable signatures of the model in the parameter space. For bremsstrahlung-dominated emission, the central Sunyaev-Zel'dovich (SZ) decrement in the multiphase case is increased over the single-phase case, and multiphase x-ray spectra in the range 0.1-20 keV are flatter in the continuum and exhibit stronger low-energy emission lines than their single-phase counterpart. We quantify these effects for a fiducial 10^8 K cluster and demonstrate how the combination of SZ and x-ray spectroscopy can be used to identify a preferred location in the plane of the model parameter space. From these parameters the correct value of the mean intracluster gas fraction in the multiphase model results, allowing an unbiased estimate of clustered mass density to be recovered.

  16. EVOLUTION OF THE VELOCITY-DISPERSION FUNCTION OF LUMINOUS RED GALAXIES: A HIERARCHICAL BAYESIAN MEASUREMENT

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Shu Yiping; Bolton, Adam S.; Dawson, Kyle S.

    2012-04-15

We present a hierarchical Bayesian determination of the velocity-dispersion function of approximately 430,000 massive luminous red galaxies observed at relatively low spectroscopic signal-to-noise ratio (S/N ≈ 3-5 per 69 km s^-1) by the Baryon Oscillation Spectroscopic Survey (BOSS) of the Sloan Digital Sky Survey III. We marginalize over spectroscopic redshift errors, and use the full velocity-dispersion likelihood function for each galaxy to make a self-consistent determination of the velocity-dispersion distribution parameters as a function of absolute magnitude and redshift, correcting as well for the effects of broadband magnitude errors on our binning. Parameterizing the distribution at each point in the luminosity-redshift plane with a log-normal form, we detect significant evolution in the width of the distribution toward higher intrinsic scatter at higher redshifts. Using a subset of deep re-observations of BOSS galaxies, we demonstrate that our distribution-parameter estimates are unbiased regardless of spectroscopic S/N. We also show through simulation that our method introduces no systematic parameter bias with redshift. We highlight the advantage of the hierarchical Bayesian method over frequentist 'stacking' of spectra, and illustrate how our measured distribution parameters can be adopted as informative priors for velocity-dispersion measurements from individual noisy spectra.

  17. Bayesian Model Selection in Geophysics: The evidence

    NASA Astrophysics Data System (ADS)

    Vrugt, J. A.

    2016-12-01

Bayesian inference has found widespread application and use in science and engineering to reconcile Earth system models with data, including prediction in space (interpolation), prediction in time (forecasting), assimilation of observations and deterministic/stochastic model output, and inference of the model parameters. By Bayes' theorem, the posterior probability, P(H|D), of a hypothesis, H, given the data, D, is equal to the product of its prior probability, P(H), and likelihood, L(H|D), divided by a normalization constant, P(D). In geophysics, the hypothesis, H, often constitutes a description (parameterization) of the subsurface for some entity of interest (e.g. porosity, moisture content). The normalization constant, P(D), is not required for inference of the subsurface structure, yet is of great value for model selection. Unfortunately, it is not particularly easy to estimate P(D) in practice. Here, I will introduce the various building blocks of a general-purpose method which provides robust and unbiased estimates of the evidence, P(D). This method uses multi-dimensional numerical integration of the posterior (parameter) distribution. I will then illustrate this new estimator by application to three competing subsurface models (hypotheses) using GPR travel time data from the South Oyster Bacterial Transport Site in Virginia, USA. The three subsurface models differ in their treatment of the porosity distribution and use (a) horizontal layering with fixed layer thicknesses, (b) vertical layering with fixed layer thicknesses, and (c) a multi-Gaussian field. The results of the new estimator are compared against the brute-force Monte Carlo method and the Laplace-Metropolis method.
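
In one dimension the evidence P(D) = ∫ L(θ) p(θ) dθ can be computed by direct quadrature and checked against a closed form. The Gaussian likelihood/Gaussian prior case below is a standard sanity check (not the estimator proposed in the talk): the exact evidence is a normal density with the variances added.

```python
import math

def normal_pdf(x, mu, sd):
    return math.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))

def evidence_quadrature(y, sigma, tau, lo=-60.0, hi=60.0, n=24001):
    """Evidence P(D) = integral of likelihood * prior over the parameter,
    by trapezoidal quadrature, for a 1-D Gaussian likelihood N(y | theta,
    sigma^2) with Gaussian prior theta ~ N(0, tau^2)."""
    h = (hi - lo) / (n - 1)
    total = 0.0
    for i in range(n):
        theta = lo + i * h
        f = normal_pdf(y, theta, sigma) * normal_pdf(theta, 0.0, tau)
        total += 0.5 * f if i in (0, n - 1) else f
    return total * h
```

For this conjugate pair the exact answer is N(y | 0, sigma² + tau²), so the quadrature can be verified to high precision.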

  18. Exploring activity-driven network with biased walks

    NASA Astrophysics Data System (ADS)

    Wang, Yan; Wu, Ding Juan; Lv, Fang; Su, Meng Long

We investigate the concurrent dynamics of biased random walks and the activity-driven network, where the preferential transition probability is in terms of the edge-weighting parameter. We also obtain the analytical expressions for the stationary distribution and the coverage function in directed and undirected networks, all of which depend on the weight parameter. By appropriately adjusting this parameter, a more effective search strategy can be obtained compared with the unbiased random walk, whether in directed or undirected networks. Network weights thus play a significant role in the diffusion process.

  19. Gap-filling methods to impute eddy covariance flux data by preserving variance.

    NASA Astrophysics Data System (ADS)

    Kunwor, S.; Staudhammer, C. L.; Starr, G.; Loescher, H. W.

    2015-12-01

To represent carbon dynamics, in terms of the exchange of CO2 between the terrestrial ecosystem and the atmosphere, eddy covariance (EC) data have been collected using eddy flux towers at various sites across the globe for more than two decades. However, measurements from EC data are missing for various reasons: precipitation, routine maintenance, or lack of vertical turbulence. In order to obtain estimates of net ecosystem exchange of carbon dioxide (NEE) with high precision and accuracy, robust gap-filling methods to impute missing data are required. While the methods used so far have provided robust estimates of the mean value of NEE, little attention has been paid to preserving the variance structures embodied by the flux data. Preserving the variance of these data will provide unbiased and precise estimates of NEE over time, which mimic natural fluctuations. We used a non-linear regression approach with moving windows of different lengths (15, 30, and 60 days) to estimate non-linear regression parameters for one year of flux data from a longleaf pine site at the Joseph Jones Ecological Research Center. We used as our base the Michaelis-Menten and Van't Hoff functions. We assessed the potential physiological drivers of these parameters with linear models using micrometeorological predictors. We then used a parameter prediction approach to refine the non-linear gap-filling equations based on micrometeorological conditions. This provides us an opportunity to incorporate additional variables, such as vapor pressure deficit (VPD) and volumetric water content (VWC), into the equations. Our preliminary results indicate that improvements in gap-filling can be gained with a 30-day moving window with additional micrometeorological predictors (as indicated by lower root mean square error (RMSE) of the predicted values of NEE).
Our next steps are to use these parameter predictions from moving windows to gap-fill the data, with and without incorporating the potential driver variables of the traditionally used parameters. We will then compare the predicted values from these methods against 'traditional' gap-filling methods (using 12 fixed monthly windows) to assess how well variance is preserved. Further, this method will be applied to impute artificially created gaps to analyze whether variance is preserved.
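
A minimal sketch of the Michaelis-Menten fitting step (brute-force least squares over a parameter grid, standing in for the nonlinear regression described above; all names, grids, and data are illustrative):

```python
import numpy as np

def fit_michaelis_menten(par, nee, alphas, betas):
    """Brute-force least-squares fit of a Michaelis-Menten light-response
    curve, NEE = alpha * PAR * beta / (alpha * PAR + beta), over candidate
    grids for the initial slope alpha and the plateau beta. Returns the
    best (alpha, beta) pair and its sum of squared errors."""
    best_a, best_b, best_sse = None, None, float("inf")
    for a in alphas:
        for b in betas:
            pred = a * par * b / (a * par + b)
            sse = float(((nee - pred) ** 2).sum())
            if sse < best_sse:
                best_a, best_b, best_sse = a, b, sse
    return best_a, best_b, best_sse
```

On noiseless synthetic data the grid search recovers the generating parameters exactly; in practice one would refine the grid or switch to a gradient-based optimizer within each moving window.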

  20. [The application of stereology in radiology imaging and cell biology fields].

    PubMed

    Hu, Na; Wang, Yan; Feng, Yuanming; Lin, Wang

    2012-08-01

Stereology is an interdisciplinary method for 3D morphological study, developed from mathematics and morphology. It is widely used in medical image analysis and cell biology studies. Because of its unbiased, simple, fast, reliable, and non-invasive characteristics, stereology has been widely used in biomedical areas for quantitative analysis and statistics, such as histology, pathology, and medical imaging. Because stereological parameters differ distinctly across pathologies, many researchers have in recent years applied stereological methods for quantitative analysis, for example of the condition of cancer cells, tumor grade, disease development, and patient prognosis. This paper describes the stereological concept and estimation methods, illustrates the applications of stereology in the fields of CT imaging, MRI imaging, and cell biology, and finally reflects on the universality, superiority, and reliability of stereology.

  1. Experiences from the testing of a theory for modelling groundwater flow in heterogeneous media

    USGS Publications Warehouse

    Christensen, S.; Cooley, R.L.

    2002-01-01

    Usually, small-scale model error is present in groundwater modelling because the model only represents average system characteristics having the same form as the drift and small-scale variability is neglected. These errors cause the true errors of a regression model to be correlated. Theory and an example show that the errors also contribute to bias in the estimates of model parameters. This bias originates from model nonlinearity. In spite of this bias, predictions of hydraulic head are nearly unbiased if the model intrinsic nonlinearity is small. Individual confidence and prediction intervals are accurate if the t-statistic is multiplied by a correction factor. The correction factor can be computed from the true error second moment matrix, which can be determined when the stochastic properties of the system characteristics are known.

  2. Experience gained in testing a theory for modelling groundwater flow in heterogeneous media

    USGS Publications Warehouse

    Christensen, S.; Cooley, R.L.

    2002-01-01

    Usually, small-scale model error is present in groundwater modelling because the model only represents average system characteristics having the same form as the drift, and small-scale variability is neglected. These errors cause the true errors of a regression model to be correlated. Theory and an example show that the errors also contribute to bias in the estimates of model parameters. This bias originates from model nonlinearity. In spite of this bias, predictions of hydraulic head are nearly unbiased if the model intrinsic nonlinearity is small. Individual confidence and prediction intervals are accurate if the t-statistic is multiplied by a correction factor. The correction factor can be computed from the true error second moment matrix, which can be determined when the stochastic properties of the system characteristics are known.

  3. Quantum Theory of Three-Dimensional Superresolution Using Rotating-PSF Imagery

    NASA Astrophysics Data System (ADS)

    Prasad, S.; Yu, Z.

The inverse of the quantum Fisher information (QFI) matrix (and extensions thereof) provides the ultimate lower bound on the variance of any unbiased estimation of a parameter from statistical data, whether of intrinsically quantum mechanical or classical character. We calculate the QFI for Poisson-shot-noise-limited imagery using the rotating PSF, which can localize and resolve point sources fully in all three dimensions. We also propose an experimental approach, based on the use of computer-generated holograms and projective measurements, to realize the QFI-limited variance for the problem of super-resolving a closely spaced pair of point sources at a highly reduced photon cost. The paper presents a preliminary analysis of the quantum-limited three-dimensional (3D) pair optical super-resolution (OSR) problem, with potential applications to astronomical imaging and 3D space-debris localization.
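
The bound invoked here is the multi-parameter quantum Cramér-Rao inequality; in standard notation, for N independent detections it reads

```latex
\operatorname{Cov}(\hat{\boldsymbol{\theta}}) \;\succeq\; \frac{1}{N}\, F_Q^{-1},
\qquad
[F_Q]_{ij} \;=\; \operatorname{Re}\,\operatorname{Tr}\!\left[\rho_{\theta}\,\tfrac{1}{2}\{L_i, L_j\}\right],
```

where the symmetric logarithmic derivatives L_i are defined implicitly by ∂ρ_θ/∂θ_i = ½(L_i ρ_θ + ρ_θ L_i), and ⪰ denotes the matrix (Loewner) ordering. This is the general form of the bound; the paper's extensions for the specific rotating-PSF estimation problem go beyond this sketch.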

  4. Multifractals embedded in short time series: An unbiased estimation of probability moment

    NASA Astrophysics Data System (ADS)

    Qiu, Lu; Yang, Tianguang; Yin, Yanhua; Gu, Changgui; Yang, Huijie

    2016-12-01

An exact estimation of probability moments is the basis for several essential concepts, such as multifractals, the Tsallis entropy, and the transfer entropy. By means of approximation theory we propose a new method, called factorial-moment-based estimation of probability moments. Theoretical prediction and computational results show that it provides an unbiased estimation of probability moments of continuous order. Calculations on a probability redistribution model verify that it can extract multifractal behaviors exactly from several hundred recordings. Its power in monitoring the evolution of scaling behaviors is exemplified by two empirical cases, i.e., the gait time series for fast, normal, and slow trials of a healthy volunteer, and the closing price series of the Shanghai stock market. Using short time series of several hundred points, a comparison with well-established tools displays significant advantages of its performance over the other methods. The factorial-moment-based estimation can correctly evaluate scaling behaviors over a scale range about three generations wider than the multifractal detrended fluctuation analysis and the basic estimation. The estimation of the partition function given by the wavelet transform modulus maxima has unacceptable fluctuations. Besides the scaling invariance focused on in the present paper, the proposed factorial moment of continuous order can find various uses, such as detecting nonextensive behaviors of a complex system and reconstructing the causality relationship network between elements of a complex system.
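
For integer order q the factorial-moment idea admits an exactly unbiased estimator: for k ~ Binomial(n, p), E[k(k−1)⋯(k−q+1)] = n(n−1)⋯(n−q+1) p^q. A sketch for integer q only (the paper's contribution is the extension to continuous order; names are illustrative):

```python
import random

def falling(x, q):
    """Falling factorial x * (x - 1) * ... * (x - q + 1)."""
    out = 1
    for j in range(q):
        out *= x - j
    return out

def unbiased_prob_moment(counts, n, q):
    """Unbiased estimate of sum_i p_i**q from multinomial counts of n draws,
    using the factorial-moment identity E[falling(k, q)] = falling(n, q) * p**q."""
    return sum(falling(k, q) for k in counts) / falling(n, q)
```

Averaged over repeated samples, the estimator matches the true moment; e.g. for p = (0.5, 0.3, 0.2) and q = 2 the target is 0.25 + 0.09 + 0.04 = 0.38.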

  5. Evaluating probability sampling strategies for estimating redd counts: an example with Chinook salmon (Oncorhynchus tshawytscha)

    Treesearch

    Jean-Yves Courbois; Stephen L. Katz; Daniel J. Isaak; E. Ashley Steel; Russell F. Thurow; A. Michelle Wargo Rub; Tony Olsen; Chris E. Jordan

    2008-01-01

    Precise, unbiased estimates of population size are an essential tool for fisheries management. For a wide variety of salmonid fishes, redd counts from a sample of reaches are commonly used to monitor annual trends in abundance. Using a 9-year time series of georeferenced censuses of Chinook salmon (Oncorhynchus tshawytscha) redds from central Idaho,...

  6. Systematic sampling of discrete and continuous populations: sample selection and the choice of estimator

    Treesearch

    Harry T. Valentine; David L. R. Affleck; Timothy G. Gregoire

    2009-01-01

    Systematic sampling is easy, efficient, and widely used, though it is not generally recognized that a systematic sample may be drawn from the population of interest with or without restrictions on randomization. The restrictions or the lack of them determine which estimators are unbiased, when using the sampling design as the basis for inference. We describe the...

  7. The Misspecification of the Covariance Structures in Multilevel Models for Single-Case Data: A Monte Carlo Simulation Study

    ERIC Educational Resources Information Center

    Moeyaert, Mariola; Ugille, Maaike; Ferron, John M.; Beretvas, S. Natasha; Van den Noortgate, Wim

    2016-01-01

    The impact of misspecifying covariance matrices at the second and third levels of the three-level model is evaluated. Results indicate that ignoring existing covariance has no effect on the treatment effect estimate. In addition, the between-case variance estimates are unbiased when covariance is either modeled or ignored. If the research interest…

  8. Empirical Bayes Estimation of Semi-parametric Hierarchical Mixture Models for Unbiased Characterization of Polygenic Disease Architectures

    PubMed Central

    Nishino, Jo; Kochi, Yuta; Shigemizu, Daichi; Kato, Mamoru; Ikari, Katsunori; Ochi, Hidenori; Noma, Hisashi; Matsui, Kota; Morizono, Takashi; Boroevich, Keith A.; Tsunoda, Tatsuhiko; Matsui, Shigeyuki

    2018-01-01

    Genome-wide association studies (GWAS) suggest that the genetic architecture of complex diseases consists of unexpectedly numerous variants with small effect sizes. However, the polygenic architectures of many diseases have not been well characterized due to a lack of simple and fast methods for unbiased estimation of the underlying proportion of disease-associated variants and their effect-size distribution. Applying empirical Bayes estimation of semi-parametric hierarchical mixture models to GWAS summary statistics, we confirmed that schizophrenia was extremely polygenic [~40% of independent genome-wide SNPs are risk variants, most with odds ratios (OR) near 1.03], whereas rheumatoid arthritis was less polygenic (~4 to 8% risk variants, with a significant portion reaching OR = 1.05 to 1.1). For rheumatoid arthritis, stratified estimations revealed that expression quantitative trait loci in blood explained a large portion of the genetic variance, and that low- and high-frequency derived alleles were prone to be risk and protective, respectively, suggesting a predominance of deleterious-risk and advantageous-protective mutations. Despite their genetic correlation, the effect-size distributions for schizophrenia and bipolar disorder differed across allele frequency. These analyses distinguished disease polygenic architectures and provided clues to etiological differences among complex diseases. PMID:29740473

  9. Life-history traits and effective population size in species with overlapping generations revisited: the importance of adult mortality.

    PubMed

    Waples, R S

    2016-10-01

    The relationship between life-history traits and the key eco-evolutionary parameters effective population size (Ne) and Ne/N is revisited for iteroparous species with overlapping generations, with a focus on the annual rate of adult mortality (d). Analytical methods based on populations with arbitrarily long adult lifespans are used to evaluate the influence of d on Ne, Ne/N and the factors that determine these parameters: adult abundance (N), generation length (T), age at maturity (α), the ratio of variance to mean reproductive success in one season by individuals of the same age (φ) and lifetime variance in reproductive success of individuals in a cohort (Vk•). Although the resulting estimators of N, T and Vk• are upwardly biased for species with short adult lifespans, the estimate of Ne/N is largely unbiased because biases in T are compensated for by biases in Vk• and N. For the first time, the contrasting effects of T and Vk• on Ne and Ne/N are jointly considered with respect to d and φ. A simple function of d and α based on the assumption of constant vital rates is shown to be a robust predictor (R² = 0.78) of Ne/N in an empirical data set of life tables for 63 animal and plant species with diverse life histories. Results presented here should provide important context for interpreting the surge of genetically based estimates of Ne that has been fueled by the genomics revolution.

  10. Heritability of rectal temperature and genetic correlations with production and reproduction traits in dairy cattle.

    PubMed

    Dikmen, S; Cole, J B; Null, D J; Hansen, P J

    2012-06-01

    Genetic selection for body temperature during heat stress might be a useful approach to reduce the magnitude of heat stress effects on production and reproduction. The objectives of the study were to estimate the genetic parameters of rectal temperature (RT) in dairy cows housed in freestall barns under heat stress conditions and to determine the genetic and phenotypic correlations of RT with other traits. Afternoon RT was measured in a total of 1,695 lactating Holstein cows sired by 509 bulls during the summer in North Florida. Genetic parameters were estimated with Gibbs sampling, and breeding values were obtained as best linear unbiased predictions using an animal model. The heritability of RT was estimated to be 0.17 ± 0.13. Predicted transmitting abilities for RT changed by 0.0068 ± 0.0020°C/yr from birth year 2002 to 2008. Approximate genetic correlations between RT and 305-d milk, fat, and protein yields, productive life, and net merit were significant and positive, whereas approximate genetic correlations between RT and somatic cell count score and daughter pregnancy rate were significant and negative. Rectal temperature during heat stress has moderate heritability, but its genetic correlations with economically important traits mean that selection for RT could lower productivity unless methods are used to identify genes affecting RT that do not adversely affect other traits of economic importance. Copyright © 2012 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.

  11. Genetic parameters and prediction of genotypic values for root quality traits in cassava using REML/BLUP.

    PubMed

    Oliveira, E J; Santana, F A; Oliveira, L A; Santos, V S

    2014-08-28

    The aim of this study was to estimate the genetic parameters and predict the genotypic values of root quality traits in cassava (Manihot esculenta Crantz) using restricted maximum likelihood (REML) and best linear unbiased prediction (BLUP). A total of 471 cassava accessions were evaluated over two years of cultivation. The evaluated traits included amylose content (AML), root dry matter content (DMC), cyanogenic compounds (CyC), and starch yield (StYi). Estimates of individual broad-sense heritability were low for AML (h²g = 0.07 ± 0.02), medium for StYi and DMC, and high for CyC. The heritability of AML was substantially improved when based on accession means (h²m = 0.28), indicating that strategies such as increasing the number of replications can be used to increase selective efficiency. In general, the observed genotypic values were very close to the predicted average of the improved population, most likely due to the high accuracy (>0.90), especially for DMC, CyC, and StYi. Relative to the overall mean of the genotypic values, gains from selecting the 30 best genotypes for each trait were 4.8% (upward selection) and 3.2% (downward selection) for AML, increases of 10.75% for DMC and 74.62% for StYi, and a decrease of 89.60% for CyC. Genotypic correlations between the quality traits of the cassava roots were generally favorable, although low in magnitude. The REML/BLUP method was adequate for estimating genetic parameters and predicting genotypic values, making it useful for cassava breeding.

  12. Moderating the Covariance Between Family Member’s Substance Use Behavior

    PubMed Central

    Eaves, Lindon J.; Neale, Michael C.

    2014-01-01

    Twin and family studies implicitly assume that the covariation between family members remains constant across differences in age between the members of the family. However, age-specificity in gene expression for shared environmental factors could generate higher correlations between family members who are more similar in age. Cohort effects (cohort × genotype or cohort × common environment) could have the same effects, and both potentially reduce effect sizes estimated in genome-wide association studies where the subjects are heterogeneous in age. In this paper we describe a model in which the covariance between twins and non-twin siblings is moderated as a function of age difference. We describe the details of the model and simulate data using a variety of different parameter values to demonstrate that model fitting returns unbiased parameter estimates. Power analyses are then conducted to estimate the sample sizes required to detect the effects of moderation in a design of twins and siblings. Finally, the model is applied to data on cigarette smoking. We find that (1) the model effectively recovers the simulated parameters, (2) the power is relatively low and therefore requires large sample sizes before small to moderate effect sizes can be found reliably, and (3) the genetic covariance between siblings for smoking behavior decays very rapidly. Result 3 implies that, e.g., genome-wide studies of smoking behavior that use individuals assessed at different ages, or belonging to different birth-year cohorts may have had substantially reduced power to detect effects of genotype on cigarette use. It also implies that significant special twin environmental effects can be explained by age-moderation in some cases. This effect likely contributes to the missing heritability paradox. PMID:24647834

  13. Simultaneous grouping pursuit and feature selection over an undirected graph*

    PubMed Central

    Zhu, Yunzhang; Shen, Xiaotong; Pan, Wei

    2013-01-01

    In high-dimensional regression, grouping pursuit and feature selection have their own merits while complementing each other in battling the curse of dimensionality. To seek a parsimonious model, we perform simultaneous grouping pursuit and feature selection over an arbitrary undirected graph, with each node corresponding to one predictor. When the corresponding nodes are reachable from each other over the graph, regression coefficients whose absolute values are the same or close can be grouped. This is motivated by gene network analysis, where genes tend to work in groups according to their biological functionalities. Through a nonconvex penalty, we develop a computational strategy and analyze the proposed method. Theoretical analysis indicates that the proposed method reconstructs the oracle estimator, that is, the unbiased least squares estimator given the true grouping, leading to consistent reconstruction of grouping structures and informative features, as well as to optimal parameter estimation. Simulation studies suggest that the method combines the benefit of grouping pursuit with that of feature selection, and compares favorably against its competitors in selection accuracy and predictive performance. An application to eQTL data is used to illustrate the methodology, where a network is incorporated into the analysis through an undirected graph. PMID:24098061

  14. Validation of ERS-1 environmental data products

    NASA Technical Reports Server (NTRS)

    Goodberlet, Mark A.; Swift, Calvin T.; Wilkerson, John C.

    1994-01-01

    Evaluation of the launch-version algorithms used by the European Space Agency (ESA) to derive wind-field and ocean-wave estimates from measurements of sensors aboard the European Remote Sensing satellite, ERS-1, has been accomplished through comparison of the derived parameters with coincident measurements made by 24 open-ocean buoys maintained by the National Oceanic and Atmospheric Administration (NOAA). During the period from November 1, 1991, through February 28, 1992, databases with 577 and 485 pairs of coincident sensor/buoy wind and wave measurements were collected for the Active Microwave Instrument (AMI) and the Radar Altimeter (RA), respectively. Based on these data, algorithm retrieval accuracy is estimated to be plus or minus 4 m/s for AMI wind speed, plus or minus 3 m/s for RA wind speed and plus or minus 0.6 m for RA wave height. After removing 180-degree ambiguity errors, the AMI wind-direction retrieval accuracy was estimated at plus or minus 28 degrees. All of the ERS-1 wind and wave retrievals are relatively unbiased. These results should be viewed as interim since improved algorithms are under development; as final versions are implemented, additional assessments should be conducted to complete the validation.

  15. Inference about density and temporary emigration in unmarked populations

    USGS Publications Warehouse

    Chandler, Richard B.; Royle, J. Andrew; King, David I.

    2011-01-01

    Few species are distributed uniformly in space, and populations of mobile organisms are rarely closed with respect to movement, yet many models of density rely upon these assumptions. We present a hierarchical model allowing inference about the density of unmarked populations subject to temporary emigration and imperfect detection. The model can be fit to data collected using a variety of standard survey methods such as repeated point counts in which removal sampling, double-observer sampling, or distance sampling is used during each count. Simulation studies demonstrated that parameter estimators are unbiased when temporary emigration is either "completely random" or is determined by the size and location of home ranges relative to survey points. We also applied the model to repeated removal sampling data collected on Chestnut-sided Warblers (Dendroica pensylvanica) in the White Mountain National Forest, USA. The density estimate from our model, 1.09 birds/ha, was similar to an estimate of 1.11 birds/ha produced by an intensive spot-mapping effort. Our model is also applicable when processes other than temporary emigration affect the probability of being available for detection, such as in studies using cue counts. Functions to implement the model have been added to the R package unmarked.

  16. Estimating Seven Coefficients of Pairwise Relatedness Using Population-Genomic Data

    PubMed Central

    Ackerman, Matthew S.; Johri, Parul; Spitze, Ken; Xu, Sen; Doak, Thomas G.; Young, Kimberly; Lynch, Michael

    2017-01-01

    Population structure can be described by genotypic-correlation coefficients between groups of individuals, the most basic of which are the pairwise relatedness coefficients between any two individuals. There are nine pairwise relatedness coefficients in the most general model, and we show that these can be reduced to seven coefficients for biallelic loci. Although all nine coefficients can be estimated from pedigrees, six coefficients have been beyond empirical reach. We provide a numerical optimization procedure that estimates all seven reduced coefficients from population-genomic data. Simulations show that the procedure is nearly unbiased, even at 3× coverage, and errors in five of the seven coefficients are statistically uncorrelated. The remaining two coefficients have a negative correlation of errors, but their sum provides an unbiased assessment of the overall correlation of heterozygosity between two individuals. Application of these new methods to four populations of the freshwater crustacean Daphnia pulex reveals the occurrence of half siblings in our samples, as well as a number of identical individuals that are likely obligately asexual clone mates. Statistically significant negative estimates of these pairwise relatedness coefficients, including inbreeding coefficients that were typically negative, underscore the difficulties that arise when interpreting genotypic correlations as estimates of the probability that alleles are identical by descent. PMID:28341647

  17. Quality control and gap-filling of PM10 daily mean concentrations with the best linear unbiased estimator.

    PubMed

    Sozzi, R; Bolignano, A; Ceradini, S; Morelli, M; Petenko, I; Argentini, S

    2017-10-15

    According to the European Directive 2008/50/EC, air quality assessment consists of measuring concentration fields and evaluating the mean, the number of exceedances, etc., of chemical species dangerous to human health. The measurements provided by an air-quality ground-based monitoring network are the main information source, but the availability of these data is often limited by several technical and operational problems. In this paper, the best linear unbiased estimator (BLUE) is proposed to validate the pollutant concentration values and to fill the gaps in the measurement time series collected by a monitoring network. The BLUE algorithm is tested using the daily mean concentrations of particulate matter with aerodynamic diameter less than 10 μm (PM10) measured by the air quality monitoring sensors operating in the Lazio Region, Italy. The comparison between the estimated and measured data shows an error comparable with the measurement uncertainty. Owing to its simplicity and reliability, the BLUE will be used in the routine quality-test procedures of the Lazio air quality monitoring network.
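
    The core of any BLUE is an inverse-variance-weighted combination: weights sum to one (so the estimate stays unbiased) and are chosen to minimize variance. A minimal sketch with hypothetical station error variances (not the Lazio network's actual covariance model):

```python
import numpy as np

rng = np.random.default_rng(2)

# BLUE of a common mean mu from K independent, unbiased measurements with
# known (heteroscedastic) variances: weights proportional to 1/sigma_k**2.
# The weights sum to 1 (unbiasedness) and minimize the estimator variance.
mu = 40.0                                # "true" daily mean PM10, hypothetical
sigma = np.array([2.0, 5.0, 10.0])       # per-station measurement std devs
w = (1 / sigma**2) / np.sum(1 / sigma**2)

trials = 50_000
y = mu + sigma * rng.standard_normal((trials, 3))
blue = y @ w                # BLUE combination per trial
naive = y.mean(axis=1)      # unweighted average for comparison

print(blue.var(), naive.var())   # BLUE variance is the smaller of the two
```

    In a real gap-filling application the weights would come from the estimated spatial covariance between stations rather than independent variances, but the unbiasedness-plus-minimum-variance structure is the same.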

  18. Some comments on Anderson and Pospahala's correction of bias in line transect sampling

    USGS Publications Warehouse

    Anderson, D.R.; Burnham, K.P.; Chain, B.R.

    1980-01-01

    ANDERSON and POSPAHALA (1970) investigated the estimation of wildlife population size using the belt or line transect sampling method and devised a correction for bias, thus leading to an estimator with interesting characteristics. This work was given a uniform mathematical framework in BURNHAM and ANDERSON (1976). In this paper we show that the ANDERSON-POSPAHALA estimator is optimal in the sense of being the (unique) best linear unbiased estimator within the class of estimators which are linear combinations of cell frequencies, provided certain assumptions are met.

  19. Quantitative estimation of climatic parameters from vegetation data in North America by the mutual climatic range technique

    USGS Publications Warehouse

    Anderson, Katherine H.; Bartlein, Patrick J.; Strickland, Laura E.; Pelltier, Richard T.; Thompson, Robert S.; Shafer, Sarah L.

    2012-01-01

    The mutual climatic range (MCR) technique is perhaps the most widely used method for estimating past climatic parameters from fossil assemblages, largely because it can be conducted on a simple list of the taxa present in an assemblage. When applied to plant macrofossil data, this unweighted approach (MCRun) will frequently identify a large range for a given climatic parameter where the species in an assemblage can theoretically live together. To narrow this range, we devised a new weighted approach (MCRwt) that employs information from the modern relations between climatic parameters and plant distributions to lessen the influence of the "tails" of the distributions of the climatic data associated with the taxa in an assemblage. To assess the performance of the MCR approaches, we applied them to a set of modern climatic data and plant distributions on a 25-km grid for North America, and compared observed and estimated climatic values for each grid point. In general, MCRwt was superior to MCRun in providing smaller anomalies, less bias, and better correlations between observed and estimated values. However, by the same measures, the results of Modern Analog Technique (MAT) approaches were superior to MCRwt. Although this might be reason to favor MAT approaches, they are based on assumptions that may not be valid for paleoclimatic reconstructions, including that: 1) the absence of a taxon from a fossil sample is meaningful, 2) plant associations were largely unaffected by past changes in either levels of atmospheric carbon dioxide or in the seasonal distributions of solar radiation, and 3) plant associations of the past are adequately represented on the modern landscape. To illustrate the application of these MCR and MAT approaches to paleoclimatic reconstructions, we applied them to a Pleistocene paleobotanical assemblage from the western United States. 
From our examinations of the estimates of modern and past climates from vegetation assemblages, we conclude that the MCRun technique provides reliable and unbiased estimates of the ranges of possible climatic conditions that can reasonably be associated with these assemblages. The application of MCRwt and MAT approaches can further constrain these estimates and may provide a systematic way to assess uncertainty. The data sets required for MCR analyses in North America are provided in a parallel publication.

  20. An Efficient Test for Gene-Environment Interaction in Generalized Linear Mixed Models with Family Data.

    PubMed

    Mazo Lopera, Mauricio A; Coombes, Brandon J; de Andrade, Mariza

    2017-09-27

    Gene-environment (GE) interaction has important implications in the etiology of complex diseases that are caused by a combination of genetic factors and environmental variables. Several authors have developed GE analysis in the context of independent subjects or longitudinal data using a gene-set. In this paper, we propose to analyze GE interaction for discrete and continuous phenotypes in family studies by incorporating the relatedness among the relatives for each family into a generalized linear mixed model (GLMM) and by using a gene-based variance component test. In addition, we deal with collinearity problems arising from linkage disequilibrium among single nucleotide polymorphisms (SNPs) by considering their coefficients as random effects under the null model estimation. We show that the best linear unbiased predictor (BLUP) of such random effects in the GLMM is equivalent to the ridge regression estimator. This equivalence provides a simple method to estimate the ridge penalty parameter in comparison to other computationally demanding estimation approaches based on cross-validation schemes. We evaluated the proposed test using simulation studies and applied it to real data from the Baependi Heart Study consisting of 76 families. Using our approach, we identified an interaction between BMI and the Peroxisome Proliferator Activated Receptor Gamma (PPARG) gene associated with diabetes.
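
    The BLUP-ridge equivalence noted above can be checked numerically: in a pure random-effects model, the BLUP of the SNP effects and the ridge estimator with penalty λ = σe²/σu² coincide (by the Woodbury identity). A minimal sketch with simulated data (hypothetical dimensions and variance components, not the Baependi study's model):

```python
import numpy as np

rng = np.random.default_rng(3)

# In y = Z u + e with u ~ N(0, s2u I) and e ~ N(0, s2e I), the BLUP of the
# random SNP effects u equals the ridge estimator with penalty lam = s2e/s2u.
n, m = 50, 8
Z = rng.standard_normal((n, m))   # stand-in genotype design matrix
y = rng.standard_normal(n)        # stand-in phenotype
s2u, s2e = 0.5, 1.0
lam = s2e / s2u

# Ridge form: (Z'Z + lam I)^{-1} Z'y
ridge = np.linalg.solve(Z.T @ Z + lam * np.eye(m), Z.T @ y)

# BLUP form: s2u Z'(s2u Z Z' + s2e I)^{-1} y  (equal by the Woodbury identity)
blup = s2u * Z.T @ np.linalg.solve(s2u * Z @ Z.T + s2e * np.eye(n), y)

print(np.allclose(ridge, blup))   # True
```

    This is why estimating (σu², σe²) in the GLMM fixes the ridge penalty without cross-validation.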

  1. A Smoluchowski model of crystallization dynamics of small colloidal clusters

    NASA Astrophysics Data System (ADS)

    Beltran-Villegas, Daniel J.; Sehgal, Ray M.; Maroudas, Dimitrios; Ford, David M.; Bevan, Michael A.

    2011-10-01

    We investigate the dynamics of colloidal crystallization in a 32-particle system at a fixed value of interparticle depletion attraction that produces coexisting fluid and solid phases. Free energy landscapes (FELs) and diffusivity landscapes (DLs) are obtained as coefficients of 1D Smoluchowski equations using as order parameters either the radius of gyration or the average crystallinity. FELs and DLs are estimated by fitting the Smoluchowski equations to Brownian dynamics (BD) simulations using either linear fits to locally initiated trajectories or global fits to unbiased trajectories using Bayesian inference. The resulting FELs are compared to Monte Carlo Umbrella Sampling results. The accuracy of the FELs and DLs for modeling colloidal crystallization dynamics is evaluated by comparing mean first-passage times from BD simulations with analytical predictions using the FEL and DL models. While the 1D models accurately capture dynamics near the free energy minimum fluid and crystal configurations, predictions near the transition region are not quantitatively accurate. A preliminary investigation of ensemble averaged 2D order parameter trajectories suggests that 2D models are required to capture crystallization dynamics in the transition region.
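
    Given a fitted FEL F(x) and DL D(x), a 1D Smoluchowski model predicts mean first-passage times through a standard double integral. A hedged numerical sketch with a free-diffusion sanity check (the paper's fitted landscapes are not reproduced here):

```python
import numpy as np

# Mean first-passage time predicted by a 1D Smoluchowski model with free
# energy F(x) (in units of kT) and diffusivity D(x): reflecting boundary at
# a, absorbing boundary at b,
#   T(x0) = int_{x0}^{b} dy exp(F(y))/D(y) * int_{a}^{y} dz exp(-F(z)).
def mfpt(F, D, a, b, x0, n=4000):
    z = np.linspace(a, b, n)
    dz = z[1] - z[0]
    inner = np.cumsum(np.exp(-F(z))) * dz          # int_a^y exp(-F) dz
    integrand = np.exp(F(z)) / D(z) * inner
    return np.sum(integrand[z >= x0]) * dz

# Sanity check: free diffusion over [0, L] gives T = L**2 / (2 D).
L, D0 = 1.0, 0.25
T = mfpt(lambda x: 0 * x, lambda x: 0 * x + D0, 0.0, L, 0.0)
print(T, L**2 / (2 * D0))   # ~2.0 vs 2.0
```

    Comparing such analytical predictions against first-passage times measured directly in BD simulations is the consistency test the abstract describes.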

  2. Estimation after classification using lot quality assurance sampling: corrections for curtailed sampling with application to evaluating polio vaccination campaigns.

    PubMed

    Olives, Casey; Valadez, Joseph J; Pagano, Marcello

    2014-03-01

    Our objectives are to assess the bias incurred when curtailment of Lot Quality Assurance Sampling (LQAS) is ignored, to present unbiased estimators, to consider the impact of cluster sampling by simulation, and to apply our method to published polio immunization data from Nigeria. We present estimators of coverage when using two kinds of curtailed LQAS strategies: semicurtailed and curtailed. We study the proposed estimators with independent and clustered data, using three field-tested LQAS designs for assessing polio vaccination coverage with samples of size 60 and decision rules of 9, 21 and 33, and compare them to biased maximum likelihood estimators. Lastly, we present estimates of polio vaccination coverage from previously published data in 20 local government authorities (LGAs) from five Nigerian states. Simulations illustrate substantial bias if one ignores the curtailed sampling design, whereas the proposed estimators show no bias. Clustering does not affect the bias of these estimators, although standard errors show signs of inflation as clustering increases. Neither sampling strategy nor LQAS design influences the estimates of polio vaccination coverage in the 20 Nigerian LGAs. When coverage is low, semicurtailed LQAS strategies considerably reduce the sample size required to make a decision; curtailed LQAS designs further reduce the sample size when coverage is high. The results presented dispel the misconception that curtailed LQAS data are unsuitable for estimation. These findings augment the utility of LQAS as a tool for monitoring vaccination efforts by demonstrating that unbiased estimation using curtailed designs is not only possible but also comes with a reduced sample size. © 2014 John Wiley & Sons Ltd.
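
    The bias from ignoring curtailment is easy to reproduce: under a hypothetical one-sided stopping rule that halts sampling once the classification is settled, the naive proportion systematically overestimates coverage. The sketch below illustrates only this bias with made-up numbers; it does not implement the paper's corrected estimators.

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical one-sided curtailed design: stop sampling as soon as t = 34
# vaccinated subjects are found (the classification is then settled);
# otherwise continue to the full n = 60. The naive proportion
# successes/sampled ignores the stopping rule and is biased upward.
p, n, t, trials = 0.8, 60, 34, 200_000
draws = rng.random((trials, n)) < p
cum = np.cumsum(draws, axis=1)
reached = cum >= t
stop = np.where(reached.any(axis=1), reached.argmax(axis=1) + 1, n)
successes = cum[np.arange(trials), stop - 1]
naive = successes / stop

print(naive.mean(), p)   # naive mean exceeds p: curtailment bias
```

    Stopping on a success means the last observation is always a success, which inflates the plug-in proportion; the corrected estimators in the paper remove exactly this effect.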

  3. Experimental and computational correlation of fracture parameters KIc, JIc, and GIc for unimodular and bimodular graphite components

    NASA Astrophysics Data System (ADS)

    Bhushan, Awani; Panda, S. K.

    2018-05-01

    The influence of bimodularity (different stress–strain behaviour in tension and compression) on the fracture behaviour of graphite specimens has been studied with the fracture toughness (KIc), critical J-integral (JIc) and critical strain energy release rate (GIc) as the characterizing parameters. The bimodularity index (the ratio of the tensile Young's modulus to the compressive Young's modulus) of the graphite specimens has been obtained from normalized tensile and compression test data. Single-edge-notch bend (SENB) testing of pre-cracked specimens from the same lot has been carried out per ASTM standard D7779-11 to determine the peak load and the critical fracture parameters KIc, GIc and JIc, using digital image correlation measurements of crack-opening displacements. Weibull weakest-link theory has been used to evaluate the mean peak load, Weibull modulus and goodness of fit, employing the two-parameter least-squares method (LIN2) and the biased (MLE2-B) and unbiased (MLE2-U) maximum likelihood estimators. The stress-dependent elasticity problem of three-dimensional crack progression for the bimodular graphite components has been solved as an iterative finite element procedure. The crack-characterizing parameters, critical stress intensity factor and critical strain energy release rate, have been estimated with the help of a Weibull distribution plot of peak load versus cumulative probability of failure. Experimental and computational fracture parameters have been compared qualitatively to assess the significance of bimodularity. The bimodular influence on the fracture behaviour of SENB graphite is reflected in the experimentally evaluated GIc values only, which differ from the calculated JIc values. The numerically evaluated bimodular 3D J-integral is close to the GIc value, whereas the unimodular 3D J-value is nearer to the JIc value. The significant difference between the unimodular JIc and the bimodular GIc indicates that GIc should be considered the standard fracture parameter for bimodular brittle specimens.

  4. Some New Results on Grubbs’ Estimators.

    DTIC Science & Technology

    1983-06-01

    GRUBBS' ESTIMATORS. Dennis A. Brindley and Ralph A. Bradley. Consider a two-way classification with n rows and r columns and the usual model of analysis of variance... except that the error components of the model may have heterogeneous variances, by columns. Grubbs provided unbiased estimators of these column variances that depend... of observations yij, i = 1, ..., n, j = 1, ..., r, and the model yij = μi + βj + eij, (1) where μi represents the mean response of row i, βj represents

  5. Suitability of the line intersect method for sampling hardwood logging residues

    Treesearch

    A. Jeff Martin

    1976-01-01

    The line intersect method of sampling logging residues was tested in Appalachian hardwoods and was found to provide unbiased estimates of the volume of residue in cubic feet per acre. Thirty-two chains of sample line were established on each of sixteen 1-acre plots on cutover areas in a variety of conditions. Estimates from these samples were then compared to actual...

  6. A method to deconvolve stellar rotational velocities II. The probability distribution function via Tikhonov regularization

    NASA Astrophysics Data System (ADS)

    Christen, Alejandra; Escarate, Pedro; Curé, Michel; Rial, Diego F.; Cassetti, Julia

    2016-10-01

    Aims: Knowing the distribution of stellar rotational velocities is essential for understanding stellar evolution. Because we measure only the projected rotational speed v sin i, we need to solve an ill-posed problem, given by a Fredholm integral of the first kind, to recover the "true" rotational velocity distribution. Methods: After discretization of the Fredholm integral, we apply the Tikhonov regularization method to obtain the probability distribution function of stellar rotational velocities directly. We propose a simple and straightforward procedure to determine the Tikhonov parameter, and we use Monte Carlo simulations to show that the Tikhonov method yields a consistent and asymptotically unbiased estimator. Results: The method is applied to a sample of cluster stars, with confidence intervals obtained using a bootstrap method. Our results are in close agreement with those obtained using the Lucy method for recovering the probability density distribution of rotational velocities; furthermore, the Lucy estimate lies inside our confidence interval. Conclusions: Tikhonov regularization is a highly robust method that deconvolves the rotational velocity probability density function from a sample of v sin i data directly, without the need for any convergence criterion.

  7. Retention of riveted aluminum leg bands by wild turkeys

    USGS Publications Warehouse

    Diefenbach, Duane R.; Vreeland, Wendy C.; Casalena, Mary Jo; Schiavone, Michael V.

    2016-01-01

In order for mark–recapture models to provide unbiased estimates of population parameters, it is critical that uniquely identifying tags or marks are not lost. We double-banded male and female wild turkeys with aluminum rivet bands and estimated the probability that a bird would be recovered with both bands <1–225 wk since banding (mean = 51.2 wk, SD = 44.0). We found that 100% of females (n = 37) were recovered with both bands. For males, we recovered 6 of 188 turkeys missing a rivet band, for a retention probability of 0.984 (95% CI = 0.96–0.99). If male turkeys are double-banded with rivet bands, the probability of recovering a turkey without any marks is <0.001. We failed to detect a change in band retention over time or differences between adults and juveniles. Given the low cost and high retention rates of aluminum rivet bands, we believe they are an effective marking technique for wild turkeys and, for most studies, will minimize any concern about the assumption that marks are not lost.
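The abstract's headline figure follows from one line of arithmetic: if each band is retained with probability 0.984 and losses are independent (an assumption made here for the check), the chance that a double-banded male loses both marks is (1 − r)²:

```python
r = 0.984                   # reported per-band retention probability for males
p_both_lost = (1 - r) ** 2  # assumes the two bands are lost independently
print(p_both_lost)          # ~0.000256, consistent with the stated < 0.001
```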

  8. A flexible cure rate model with dependent censoring and a known cure threshold.

    PubMed

    Bernhardt, Paul W

    2016-11-10

We propose a flexible cure rate model that accommodates different censoring distributions for the cured and uncured groups and also allows for some individuals to be observed as cured when their survival time exceeds a known threshold. We model the survival times for the uncured group using an accelerated failure time model with errors distributed according to the seminonparametric distribution, potentially truncated at a known threshold. We suggest a straightforward extension of the usual expectation-maximization algorithm approach for obtaining estimates in cure rate models to accommodate the cure threshold and dependent censoring. We additionally suggest a likelihood ratio test for testing for the presence of dependent censoring in the proposed cure rate model. We show through numerical studies that our model has desirable properties and leads to approximately unbiased parameter estimates in a variety of scenarios. To demonstrate how our method performs in practice, we analyze data from a bone marrow transplantation study and a liver transplant study. Copyright © 2016 John Wiley & Sons, Ltd.

  9. Spectral X-ray Radiography for Safeguards at Nuclear Fuel Fabrication Facilities: A Feasibility Study

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gilbert, Andrew J.; McDonald, Benjamin S.; Smith, Leon E.

The methods currently used by the International Atomic Energy Agency to account for nuclear materials at fuel fabrication facilities are time consuming and require in-field chemistry and operation by experts. Spectral X-ray radiography, along with advanced inverse algorithms, is an alternative inspection that could be completed noninvasively, without any in-field chemistry, with inspection times of tens of seconds. The proposed inspection system and algorithms are presented here. The inverse algorithm uses total variation regularization and adaptive regularization parameter selection with the unbiased predictive risk estimator. Performance of the system is quantified with simulated X-ray inspection data, and sensitivity of the output is tested against various inspection system instabilities. Material quantification from a fully-characterized inspection system is shown to be very accurate, with biases on nuclear material estimations of < 0.02%. It is shown that the results are sensitive to variations in the fuel powder sample density and detector pixel gain, which increase biases to 1%. Options to mitigate these inaccuracies are discussed.
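What makes the unbiased predictive risk estimator (UPRE) "unbiased" is that, for a fixed regularization parameter, its expectation equals the predictive risk (1/n)·E‖A x_λ − A x_true‖². A small Monte Carlo check for plain Tikhonov filtering; the operator, signal, and noise level are invented here, and the paper itself pairs UPRE with total-variation regularization rather than Tikhonov:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 60
t = np.linspace(0.0, 1.0, n)
A = np.exp(-((t[:, None] - t[None, :]) ** 2) / 0.01)   # blurring operator
x_true = np.sin(2 * np.pi * t)
sigma, lam = 0.05, 1e-2

U, s, _ = np.linalg.svd(A)
filt = s**2 / (s**2 + lam)            # Tikhonov filter factors of H_lam
mu = U.T @ (A @ x_true)               # noise-free data in the SVD basis

upre_vals, risks = [], []
for _ in range(2000):
    beta = mu + sigma * rng.standard_normal(n)         # noisy data, SVD basis
    resid = np.sum(((1 - filt) * beta) ** 2)           # ||(I - H_lam) b||^2
    upre_vals.append(resid / n + 2 * sigma**2 * filt.sum() / n - sigma**2)
    risks.append(np.sum((filt * beta - mu) ** 2) / n)  # realized predictive loss

print(np.mean(upre_vals), np.mean(risks))              # the two averages agree
```

Minimizing UPRE over a grid of parameter values is then the adaptive selection rule the abstract refers to.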

  10. Sampling considerations for disease surveillance in wildlife populations

    USGS Publications Warehouse

    Nusser, S.M.; Clark, W.R.; Otis, D.L.; Huang, L.

    2008-01-01

    Disease surveillance in wildlife populations involves detecting the presence of a disease, characterizing its prevalence and spread, and subsequent monitoring. A probability sample of animals selected from the population and corresponding estimators of disease prevalence and detection provide estimates with quantifiable statistical properties, but this approach is rarely used. Although wildlife scientists often assume probability sampling and random disease distributions to calculate sample sizes, convenience samples (i.e., samples of readily available animals) are typically used, and disease distributions are rarely random. We demonstrate how landscape-based simulation can be used to explore properties of estimators from convenience samples in relation to probability samples. We used simulation methods to model what is known about the habitat preferences of the wildlife population, the disease distribution, and the potential biases of the convenience-sample approach. Using chronic wasting disease in free-ranging deer (Odocoileus virginianus) as a simple illustration, we show that using probability sample designs with appropriate estimators provides unbiased surveillance parameter estimates but that the selection bias and coverage errors associated with convenience samples can lead to biased and misleading results. We also suggest practical alternatives to convenience samples that mix probability and convenience sampling. For example, a sample of land areas can be selected using a probability design that oversamples areas with larger animal populations, followed by harvesting of individual animals within sampled areas using a convenience sampling method.
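The contrast the authors draw can be reproduced with a deliberately crude simulation (the landscape, clustering model, and sampling weights below are invented; the paper's simulations are landscape-based and far richer): when disease clusters near access points and sampling effort does too, the convenience-sample prevalence estimate is biased while the probability sample is not.

```python
import numpy as np

rng = np.random.default_rng(42)
N = 100_000
dist = rng.uniform(0.0, 10.0, N)            # distance of each animal from a road (km)
p_disease = 0.30 * np.exp(-dist)            # clustered disease: prevalence decays with distance
infected = rng.random(N) < p_disease
true_prev = infected.mean()

# Probability sample: a simple random sample of animals.
srs = rng.choice(N, 500, replace=False)
prev_srs = infected[srs].mean()

# Convenience sample: animals near the road are far more likely to be sampled.
w = np.exp(-dist)
conv = rng.choice(N, 500, replace=False, p=w / w.sum())
prev_conv = infected[conv].mean()

print(true_prev, prev_srs, prev_conv)       # the convenience estimate is biased upward
```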

11. Reliability estimation of an N-M-cold-standby redundancy system in a multicomponent stress-strength model with generalized half-logistic distribution

    NASA Astrophysics Data System (ADS)

    Liu, Yiming; Shi, Yimin; Bai, Xuchao; Zhan, Pei

    2018-01-01

In this paper, we study the estimation of the reliability of a multicomponent system, named the N-M-cold-standby redundancy system, based on a progressive Type-II censored sample. In the system, there are N subsystems consisting of M statistically independent, identically distributed strength components, and only one of these subsystems works under the impact of stresses at a time while the others remain as standbys. Whenever the working subsystem fails, one of the standbys takes its place. The system fails when all subsystems have failed. It is supposed that the underlying distributions of random strength and stress both belong to the generalized half-logistic distribution with different shape parameters. The reliability of the system is estimated using both classical and Bayesian statistical inference. The uniformly minimum variance unbiased estimator and the maximum likelihood estimator for the reliability of the system are derived. Under a squared error loss function, the exact expression of the Bayes estimator for the reliability of the system is developed by using the Gauss hypergeometric function. The asymptotic confidence interval and corresponding coverage probabilities are derived based on both the Fisher and the observed information matrices. The approximate highest probability density credible interval is constructed by using the Monte Carlo method. Monte Carlo simulations are performed to compare the performances of the proposed reliability estimators. A real data set is also analyzed for an illustration of the findings.

  12. A cross-correlation-based estimate of the galaxy luminosity function

    NASA Astrophysics Data System (ADS)

    van Daalen, Marcel P.; White, Martin

    2018-06-01

    We extend existing methods for using cross-correlations to derive redshift distributions for photometric galaxies, without using photometric redshifts. The model presented in this paper simultaneously yields highly accurate and unbiased redshift distributions and, for the first time, redshift-dependent luminosity functions, using only clustering information and the apparent magnitudes of the galaxies as input. In contrast to many existing techniques for recovering unbiased redshift distributions, the output of our method is not degenerate with the galaxy bias b(z), which is achieved by modelling the shape of the luminosity bias. We successfully apply our method to a mock galaxy survey and discuss improvements to be made before applying our model to real data.

  13. ESTIMATING TREATMENT EFFECTS ON HEALTHCARE COSTS UNDER EXOGENEITY: IS THERE A ‘MAGIC BULLET’?

    PubMed Central

    Polsky, Daniel; Manning, Willard G.

    2011-01-01

Methods for estimating average treatment effects, under the assumption of no unmeasured confounders, include regression models; propensity score adjustments using stratification, weighting, or matching; and doubly robust estimators (a combination of both). Researchers continue to debate about the best estimator for outcomes such as health care cost data, as they are usually characterized by an asymmetric distribution and heterogeneous treatment effects. Challenges in finding the right specifications for regression models are well documented in the literature. Propensity score estimators are proposed as alternatives for overcoming these challenges. Using simulations, we find that in moderate-size samples (n = 5000), balancing on propensity scores that are estimated from saturated specifications can balance the covariate means across treatment arms but fails to balance higher-order moments and covariances amongst covariates. Therefore, even though a formal model for outcomes is not required, propensity score estimators, unlike regression models, can be inefficient at best and biased at worst for health care cost data. Our simulation study, designed to take a ‘proof by contradiction’ approach, proves that no one estimator can be considered the best under all data generating processes for outcomes such as costs. The inverse-propensity weighted estimator is most likely to be unbiased under alternate data generating processes but is prone to bias under misspecification of the propensity score model and is inefficient compared to an unbiased regression estimator. Our results show that there are no ‘magic bullets’ when it comes to estimating treatment effects in health care costs. Care should be taken before naively applying any one estimator to estimate average treatment effects in these data. We illustrate the performance of alternative methods in a cost dataset on breast cancer treatment. PMID:22199462
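The behavior of the inverse-propensity weighted (IPW) estimator on skewed, cost-like outcomes can be sketched as follows (the data-generating process, effect size, and use of the true propensity score are illustrative simplifications; in practice the propensity would itself be estimated, e.g., by logistic regression):

```python
import numpy as np

rng = np.random.default_rng(7)
n = 20_000
x = rng.standard_normal(n)                  # confounder (e.g., illness severity)
p = 1 / (1 + np.exp(-x))                    # true propensity of treatment
t = rng.random(n) < p
base = np.exp(7 + 0.5 * x + 0.5 * rng.standard_normal(n))   # skewed baseline cost
y = base + 1000 * t                         # true average treatment effect = 1000

naive = y[t].mean() - y[~t].mean()          # confounded comparison, biased upward

# IPW estimator: weight each arm by the inverse probability of its assignment.
ate_ipw = np.mean(t * y / p) - np.mean((1 - t) * y / (1 - p))
print(round(naive), round(ate_ipw))         # IPW lands near the true effect of 1000
```

The same estimator becomes biased once the propensity model is misspecified, which is the trade-off the abstract highlights.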

  14. Distinguishing Antimicrobial Models with Different Resistance Mechanisms via Population Pharmacodynamic Modeling

    PubMed Central

    Jacobs, Matthieu; Grégoire, Nicolas; Couet, William; Bulitta, Jurgen B.

    2016-01-01

Semi-mechanistic pharmacokinetic-pharmacodynamic (PK-PD) modeling is increasingly used for antimicrobial drug development and optimization of dosage regimens, but systematic simulation-estimation studies to distinguish between competing PD models are lacking. This study compared the ability of static and dynamic in vitro infection models to distinguish between models with different resistance mechanisms and to support accurate and precise parameter estimation. Monte Carlo simulations (MCS) were performed for models with one susceptible bacterial population without (M1) or with a resting stage (M2), a one-population model with adaptive resistance (M5), models with pre-existing susceptible and resistant populations without (M3) or with (M4) inter-conversion, and a model with two pre-existing populations with adaptive resistance (M6). For each model, 200 datasets of the total bacterial population were simulated over 24h using static antibiotic concentrations (256-fold concentration range) or over 48h under dynamic conditions (dosing every 12h; elimination half-life: 1h). Twelve hundred random datasets (each containing 20 curves for static or four curves for dynamic conditions) were generated by bootstrapping. Each dataset was estimated by all six models via population PD modeling to compare bias and precision. For M1 and M3, most parameter estimates had small bias (<10%) and good precision (imprecision <30%). However, the parameters for adaptive resistance and inter-conversion in M2, M4, M5 and M6 showed substantial bias and imprecision under both static and dynamic conditions. For datasets that only contained viable counts of the total population, common statistical criteria and diagnostic plots did not support sound identification of the true resistance mechanism. 
Therefore, it seems advisable to quantify resistant bacteria and characterize their MICs and resistance mechanisms to support extended simulations and translate from in vitro experiments to animal infection models and ultimately patients. PMID:26967893

  15. Blinded sample size re-estimation in three-arm trials with 'gold standard' design.

    PubMed

    Mütze, Tobias; Friede, Tim

    2017-10-15

In this article, we study blinded sample size re-estimation in the 'gold standard' design with an internal pilot study for normally distributed outcomes. The 'gold standard' design is a three-arm clinical trial design that includes an active and a placebo control in addition to an experimental treatment. We focus on the absolute margin approach to hypothesis testing in three-arm trials, in which the non-inferiority of the experimental treatment and the assay sensitivity are assessed by pairwise comparisons. We compare several blinded sample size re-estimation procedures in a simulation study assessing operating characteristics including power and type I error. We find that sample size re-estimation based on the popular one-sample variance estimator results in overpowered trials. Moreover, sample size re-estimation based on unbiased variance estimators such as the Xing-Ganju variance estimator results in underpowered trials, as is expected, because an overestimation of the variance, and thus of the sample size, is in general required for the re-estimation procedure to eventually meet the target power. To overcome this problem, we propose an inflation factor for the sample size re-estimation with the Xing-Ganju variance estimator and show that this approach results in adequately powered trials. Because of favorable features of the Xing-Ganju variance estimator, such as unbiasedness and a distribution independent of the group means, the inflation factor does not depend on the nuisance parameter and, therefore, can be calculated prior to a trial. Moreover, we prove that sample size re-estimation based on the Xing-Ganju variance estimator does not bias the effect estimate. Copyright © 2017 John Wiley & Sons, Ltd.
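The overestimation driven by the lumped one-sample variance estimator is easy to reproduce (arm means, n, and sigma below are hypothetical; the Xing-Ganju estimator itself is not implemented here):

```python
import numpy as np

rng = np.random.default_rng(3)
sigma = 1.0
n_per_arm = 2000
means = [0.0, 0.5, 1.0]        # experimental, active control, placebo (hypothetical)
blinded = np.concatenate([rng.normal(m, sigma, n_per_arm) for m in means])

one_sample_var = blinded.var(ddof=1)   # pools all arms, ignoring group structure
print(one_sample_var)                  # > sigma**2: between-arm spread inflates it
```

Re-estimating the sample size from this inflated variance yields overpowered trials, which is exactly the behavior the simulation study reports.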

  16. Software for the grouped optimal aggregation technique

    NASA Technical Reports Server (NTRS)

    Brown, P. M.; Shaw, G. W. (Principal Investigator)

    1982-01-01

The grouped optimal aggregation technique produces minimum variance, unbiased estimates of acreage and production for countries, zones (states), or any designated collection of acreage strata. It uses yield predictions, historical acreage information, and direct acreage estimates from satellite data. The acreage strata are grouped in such a way that the ratio model over historical acreage provides a smaller variance than if the model were applied to each individual stratum. An optimal weighting matrix based on historical acreages provides the link between incomplete direct acreage estimates and the total, current acreage estimate.

  17. A comparison of observation-level random effect and Beta-Binomial models for modelling overdispersion in Binomial data in ecology & evolution.

    PubMed

    Harrison, Xavier A

    2015-01-01

    Overdispersion is a common feature of models of biological data, but researchers often fail to model the excess variation driving the overdispersion, resulting in biased parameter estimates and standard errors. Quantifying and modeling overdispersion when it is present is therefore critical for robust biological inference. One means to account for overdispersion is to add an observation-level random effect (OLRE) to a model, where each data point receives a unique level of a random effect that can absorb the extra-parametric variation in the data. Although some studies have investigated the utility of OLRE to model overdispersion in Poisson count data, studies doing so for Binomial proportion data are scarce. Here I use a simulation approach to investigate the ability of both OLRE models and Beta-Binomial models to recover unbiased parameter estimates in mixed effects models of Binomial data under various degrees of overdispersion. In addition, as ecologists often fit random intercept terms to models when the random effect sample size is low (<5 levels), I investigate the performance of both model types under a range of random effect sample sizes when overdispersion is present. Simulation results revealed that the efficacy of OLRE depends on the process that generated the overdispersion; OLRE failed to cope with overdispersion generated from a Beta-Binomial mixture model, leading to biased slope and intercept estimates, but performed well for overdispersion generated by adding random noise to the linear predictor. Comparison of parameter estimates from an OLRE model with those from its corresponding Beta-Binomial model readily identified when OLRE were performing poorly due to disagreement between effect sizes, and this strategy should be employed whenever OLRE are used for Binomial data to assess their reliability. Beta-Binomial models performed well across all contexts, but showed a tendency to underestimate effect sizes when modelling non-Beta-Binomial data. 
Finally, both OLRE and Beta-Binomial models performed poorly when models contained <5 levels of the random intercept term, especially for estimating variance components, and this effect appeared independent of total sample size. These results suggest that OLRE are a useful tool for modelling overdispersion in Binomial data, but that they do not perform well in all circumstances and researchers should take care to verify the robustness of parameter estimates of OLRE models.
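The kind of overdispersion the Beta-Binomial model targets can be generated in a few lines (the parameters are arbitrary; this only illustrates the variance inflation, not the mixed-model fitting in the study):

```python
import numpy as np

rng = np.random.default_rng(11)
n_trials, n_obs = 20, 5000
binom = rng.binomial(n_trials, 0.3, n_obs)   # plain Binomial, p fixed at 0.3

# Beta-Binomial: each observation gets its own p drawn from a Beta with the
# same mean (3/(3+7) = 0.3), adding extra-Binomial variation.
p_i = rng.beta(3, 7, n_obs)
betabin = rng.binomial(n_trials, p_i)

print(binom.var(), betabin.var())            # Beta-Binomial variance is much larger
```

A model that ignores this extra variation (plain Binomial GLM) would report standard errors that are too small, which is the bias the article warns about.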

  18. Clear: Composition of Likelihoods for Evolve and Resequence Experiments.

    PubMed

    Iranmehr, Arya; Akbari, Ali; Schlötterer, Christian; Bafna, Vineet

    2017-06-01

    The advent of next generation sequencing technologies has made whole-genome and whole-population sampling possible, even for eukaryotes with large genomes. With this development, experimental evolution studies can be designed to observe molecular evolution "in action" via evolve-and-resequence (E&R) experiments. Among other applications, E&R studies can be used to locate the genes and variants responsible for genetic adaptation. Most existing literature on time-series data analysis often assumes large population size, accurate allele frequency estimates, or wide time spans. These assumptions do not hold in many E&R studies. In this article, we propose a method-composition of likelihoods for evolve-and-resequence experiments (Clear)-to identify signatures of selection in small population E&R experiments. Clear takes whole-genome sequences of pools of individuals as input, and properly addresses heterogeneous ascertainment bias resulting from uneven coverage. Clear also provides unbiased estimates of model parameters, including population size, selection strength, and dominance, while being computationally efficient. Extensive simulations show that Clear achieves higher power in detecting and localizing selection over a wide range of parameters, and is robust to variation of coverage. We applied the Clear statistic to multiple E&R experiments, including data from a study of adaptation of Drosophila melanogaster to alternating temperatures and a study of outcrossing yeast populations, and identified multiple regions under selection with genome-wide significance. Copyright © 2017 by the Genetics Society of America.

  19. Estimations of ABL fluxes and other turbulence parameters from Doppler lidar data

    NASA Technical Reports Server (NTRS)

    Gal-Chen, Tzvi; Xu, Mei; Eberhard, Wynn

    1989-01-01

Techniques for extracting boundary layer parameters from measurements of a short-pulse CO2 Doppler lidar are described. The measurements are those collected during the First International Satellite Land Surface Climatology Project (ISLSCP) Field Experiment (FIFE). By continuously operating the lidar for about an hour, stable statistics of the radial velocities can be extracted. Assuming that the turbulence is horizontally homogeneous, the mean wind, its standard deviations, and the momentum fluxes were estimated. Spectral analysis of the radial velocities is also performed, from which, by examining the amplitude of the power spectrum in the inertial range, the kinetic energy dissipation was deduced. Finally, using the statistical form of the Navier-Stokes equations, the surface heat flux is derived as the residual balance between the vertical gradient of the third moment of the vertical velocity and the kinetic energy dissipation. Combining many measurements would normally reduce the error, provided that the error is unbiased and uncorrelated. The nature of some of the algorithms, however, is such that biased and correlated errors may be generated even though the raw measurements are not. Data processing procedures were developed that eliminate bias and minimize error correlation. Once bias and error correlations are accounted for, the large sample size is shown to reduce the errors substantially. The principal features of the derived turbulence statistics for two cases studied are presented.

  20. A hidden-process model for estimating prespawn mortality using carcass survey data

    USGS Publications Warehouse

    DeWeber, J. Tyrell; Peterson, James T.; Sharpe, Cameron; Kent, Michael L.; Colvin, Michael E.; Schreck, Carl B.

    2017-01-01

    After returning to spawning areas, adult Pacific salmon Oncorhynchus spp. often die without spawning successfully, which is commonly referred to as prespawn mortality. Prespawn mortality reduces reproductive success and can thereby hamper conservation, restoration, and reintroduction efforts. The primary source of information used to estimate prespawn mortality is collected through carcass surveys, but estimation can be difficult with these data due to imperfect detection and carcasses with unknown spawning status. To facilitate unbiased estimation of prespawn mortality and associated uncertainty, we developed a hidden-process mark–recovery model to estimate prespawn mortality rates from carcass survey data while accounting for imperfect detection and unknown spawning success. We then used the model to estimate prespawn mortality and identify potential associated factors for 3,352 adult spring Chinook Salmon O. tshawytscha that were transported above Foster Dam on the South Santiam River (Willamette River basin, Oregon) from 2009 to 2013. Estimated prespawn mortality was relatively low (≤13%) in most years (interannual mean = 28%) but was especially high (74%) in 2013. Variation in prespawn mortality estimates among outplanted groups of fish within each year was also very high, and some of this variation was explained by a trend toward lower prespawn mortality among fish that were outplanted later in the year. Numerous efforts are being made to monitor and, when possible, minimize prespawn mortality in salmon populations; this model can be used to provide unbiased estimates of spawning success that account for unknown fate and imperfect detection, which are common to carcass survey data.

  1. One-shot estimate of MRMC variance: AUC.

    PubMed

    Gallas, Brandon D

    2006-03-01

    One popular study design for estimating the area under the receiver operating characteristic curve (AUC) is the one in which a set of readers reads a set of cases: a fully crossed design in which every reader reads every case. The variability of the subsequent reader-averaged AUC has two sources: the multiple readers and the multiple cases (MRMC). In this article, we present a nonparametric estimate for the variance of the reader-averaged AUC that is unbiased and does not use resampling tools. The one-shot estimate is based on the MRMC variance derived by the mechanistic approach of Barrett et al. (2005), as well as the nonparametric variance of a single-reader AUC derived in the literature on U statistics. We investigate the bias and variance properties of the one-shot estimate through a set of Monte Carlo simulations with simulated model observers and images. The different simulation configurations vary numbers of readers and cases, amounts of image noise and internal noise, as well as how the readers are constructed. We compare the one-shot estimate to a method that uses the jackknife resampling technique with an analysis of variance model at its foundation (Dorfman et al. 1992). The name one-shot highlights that resampling is not used. The one-shot and jackknife estimators behave similarly, with the one-shot being marginally more efficient when the number of cases is small. We have derived a one-shot estimate of the MRMC variance of AUC that is based on a probabilistic foundation with limited assumptions, is unbiased, and compares favorably to an established estimate.
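The single-reader building block, the nonparametric AUC as a two-sample U statistic (the Mann-Whitney form), can be written directly; the scores below are made up:

```python
import numpy as np

def auc_u_statistic(scores_neg, scores_pos):
    """Fraction of (signal-absent, signal-present) score pairs correctly
    ordered, counting ties as 1/2 -- the Mann-Whitney form of the AUC."""
    neg = np.asarray(scores_neg, float)[:, None]
    pos = np.asarray(scores_pos, float)[None, :]
    return np.mean((pos > neg) + 0.5 * (pos == neg))

print(auc_u_statistic([0.1, 0.4, 0.35], [0.8, 0.6, 0.9, 0.7]))  # → 1.0
```

The one-shot MRMC estimator then builds the variance of the reader-averaged AUC from moments of such U statistics across readers and cases, without resampling.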

  2. Identification of a parametric, discrete-time model of ankle stiffness.

    PubMed

    Guarin, Diego L; Jalaleddini, Kian; Kearney, Robert E

    2013-01-01

Dynamic ankle joint stiffness defines the relationship between the position of the ankle and the torque acting about it and can be separated into intrinsic and reflex components. Under stationary conditions, intrinsic stiffness can be described by a linear second-order system, while reflex stiffness is described by a Hammerstein system whose input is delayed velocity. Given that reflex and intrinsic torque cannot be measured separately, there has been much interest in the development of system identification techniques to separate them analytically. To date, most methods have been nonparametric, and as a result there is no direct link between the estimated parameters and those of the stiffness model. This paper presents a novel algorithm for identification of a discrete-time model of ankle stiffness. Through simulations we show that the algorithm gives unbiased results even in the presence of large, non-white noise. Application of the method to experimental data demonstrates that it produces results consistent with previous findings.
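For contrast with the paper's approach, here is generic least-squares identification of a second-order discrete-time (ARX) model on simulated data. The parameter values are arbitrary, and ordinary least squares is unbiased here only because the added noise is white — precisely the assumption the paper's algorithm is designed to relax:

```python
import numpy as np

# Simulate y[k] = a1*y[k-1] + a2*y[k-2] + b0*u[k] + e[k] with white noise e.
rng = np.random.default_rng(5)
N = 5000
u = rng.standard_normal(N)                 # input, e.g., ankle position perturbation
a1, a2, b0 = 1.5, -0.7, 0.5                # stable second-order dynamics
y = np.zeros(N)
for k in range(2, N):
    y[k] = a1 * y[k-1] + a2 * y[k-2] + b0 * u[k] + 0.01 * rng.standard_normal()

Phi = np.column_stack([y[1:-1], y[:-2], u[2:]])   # regressor matrix
theta, *_ = np.linalg.lstsq(Phi, y[2:], rcond=None)
print(theta)                               # ≈ [1.5, -0.7, 0.5] for white noise
```

Under colored noise this same estimator becomes biased, which motivates noise-robust schemes such as the one the paper proposes.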

  3. Generation and evaluation of an ultra-high-field atlas with applications in DBS planning

    NASA Astrophysics Data System (ADS)

    Wang, Brian T.; Poirier, Stefan; Guo, Ting; Parrent, Andrew G.; Peters, Terry M.; Khan, Ali R.

    2016-03-01

Purpose: Deep brain stimulation (DBS) is a common treatment for Parkinson's disease (PD) and involves the use of brain atlases or intrinsic landmarks to estimate the location of target deep brain structures, such as the subthalamic nucleus (STN) and the globus pallidus pars interna (GPi). However, these structures can be difficult to localize with conventional clinical magnetic resonance imaging (MRI), and thus targeting can be prone to error. Ultra-high-field imaging at 7T can clearly resolve these structures, so atlases built from such data have the potential to improve targeting accuracy. Methods: T1- and T2-weighted images of 12 healthy control subjects were acquired using a 7T MR scanner. These images were then used with groupwise registration to generate an unbiased average template with T1w and T2w contrast. Deep brain structures were manually labelled in each subject by two raters, and rater reliability was assessed. We compared the use of this unbiased atlas with two other methods of atlas-based segmentation (single-template and multi-template) for subthalamic nucleus (STN) segmentation on 7T MRI data. We also applied this atlas to clinical DBS data acquired at 1.5T to evaluate its efficacy for DBS target localization as compared to using a standard atlas. Results: The unbiased templates provide superb detail of subcortical structures. One-way ANOVA tests show that the unbiased template is significantly (p < 0.05) more accurate than a single template in atlas-based segmentation and DBS target localization tasks. Conclusion: The generated unbiased averaged templates provide better visualization of deep brain nuclei and an increase in accuracy over single-template and lower-field-strength atlases.

  4. Unbiased reduced density matrices and electronic properties from full configuration interaction quantum Monte Carlo.

    PubMed

    Overy, Catherine; Booth, George H; Blunt, N S; Shepherd, James J; Cleland, Deidre; Alavi, Ali

    2014-12-28

    Properties that are necessarily formulated within pure (symmetric) expectation values are difficult to calculate for projector quantum Monte Carlo approaches, but are critical in order to compute many of the important observable properties of electronic systems. Here, we investigate an approach for the sampling of unbiased reduced density matrices within the full configuration interaction quantum Monte Carlo dynamic, which requires only small computational overheads. This is achieved via an independent replica population of walkers in the dynamic, sampled alongside the original population. The resulting reduced density matrices are free from systematic error (beyond those present via constraints on the dynamic itself) and can be used to compute a variety of expectation values and properties, with rapid convergence to an exact limit. A quasi-variational energy estimate derived from these density matrices is proposed as an accurate alternative to the projected estimator for multiconfigurational wavefunctions, while its variational property could potentially lend itself to accurate extrapolation approaches in larger systems.

  5. Efficient Stochastic Rendering of Static and Animated Volumes Using Visibility Sweeps.

    PubMed

    von Radziewsky, Philipp; Kroes, Thomas; Eisemann, Martin; Eisemann, Elmar

    2017-09-01

Stochastically solving the rendering integral (particularly visibility) is the de-facto standard for physically-based light transport, but it is computationally expensive, especially when displaying heterogeneous volumetric data. In this work, we present efficient techniques to speed up the rendering process via a novel visibility-estimation method in concert with an unbiased importance sampling (involving environmental lighting and visibility inside the volume), filtering, and update techniques for both static and animated scenes. Our major contributions include a progressive estimate of partial occlusions based on a fast sweeping-plane algorithm. These occlusions are stored in an octahedral representation, which can be conveniently transformed into a quadtree-based hierarchy suited for a joint importance sampling. Further, we propose sweep-space filtering, which suppresses the occurrence of fireflies, and investigate different update schemes for animated scenes. Our technique is unbiased, requires little precomputation, is highly parallelizable, and is applicable to various volume data sets, dynamic transfer functions, animated volumes, and changing environmental lighting.

  6. Unbiased reduced density matrices and electronic properties from full configuration interaction quantum Monte Carlo

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Overy, Catherine; Blunt, N. S.; Shepherd, James J.

    2014-12-28

Properties that are necessarily formulated within pure (symmetric) expectation values are difficult to calculate for projector quantum Monte Carlo approaches, but are critical in order to compute many of the important observable properties of electronic systems. Here, we investigate an approach for the sampling of unbiased reduced density matrices within the full configuration interaction quantum Monte Carlo dynamic, which requires only small computational overheads. This is achieved via an independent replica population of walkers in the dynamic, sampled alongside the original population. The resulting reduced density matrices are free from systematic error (beyond those present via constraints on the dynamic itself) and can be used to compute a variety of expectation values and properties, with rapid convergence to an exact limit. A quasi-variational energy estimate derived from these density matrices is proposed as an accurate alternative to the projected estimator for multiconfigurational wavefunctions, while its variational property could potentially lend itself to accurate extrapolation approaches in larger systems.

  7. Statistical properties of the normalized ice particle size distribution

    NASA Astrophysics Data System (ADS)

    Delanoë, Julien; Protat, Alain; Testud, Jacques; Bouniol, Dominique; Heymsfield, A. J.; Bansemer, A.; Brown, P. R. A.; Forbes, R. M.

    2005-05-01

    Testud et al. (2001) have recently developed a formalism, known as the "normalized particle size distribution (PSD)", which consists of scaling the diameter and concentration axes in such a way that the normalized PSDs are independent of water content and mean volume-weighted diameter. In this paper we investigate the statistical properties of the normalized PSD for the particular case of ice clouds, which are known to play a crucial role in the Earth's radiation balance. To do so, an extensive database of airborne in situ microphysical measurements has been constructed. A remarkable stability in shape of the normalized PSD is obtained. The impact of using a single analytical shape to represent all PSDs in the database is estimated through an error analysis on the instrumental (radar reflectivity and attenuation) and cloud (ice water content, effective radius, terminal fall velocity of ice crystals, visible extinction) properties. This resulted in a roughly unbiased estimate of the instrumental and cloud parameters, with small standard deviations ranging from 5 to 12%. This error is found to be roughly independent of the temperature range. This stability in shape and its single analytical approximation imply that two parameters are now sufficient to describe any normalized PSD in ice clouds: the intercept parameter N*0 and the mean volume-weighted diameter Dm. Statistical relationships (parameterizations) between N*0 and Dm have then been evaluated in order to further reduce the number of unknowns. It has been shown that a parameterization of N*0 and Dm by temperature could not be envisaged to retrieve the cloud parameters. Nevertheless, Dm-T and mean maximum dimension diameter-T parameterizations have been derived and compared to the parameterization of Kristjánsson et al. (2000) currently used to characterize particle size in climate models. The new parameterization generally produces larger particle sizes at any temperature than that of Kristjánsson et al. (2000). These new parameterizations are believed to better represent particle size at global scale, owing to the better representativeness of the in situ microphysical database used to derive them. We then evaluated the potential of a direct N*0-Dm relationship. While the model parameterized by temperature produces strong errors on the cloud parameters, the N*0-Dm model parameterized by radar reflectivity produces accurate cloud parameters (less than 3% bias and 16% standard deviation). This result implies that the cloud parameters can be estimated from the estimate of only one parameter of the normalized PSD (N*0 or Dm) and a radar reflectivity measurement.
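
One common definition of the mean volume-weighted diameter Dm is the ratio of the fourth to the third moment of the PSD. A minimal numerical sketch under that assumption, using an exponential PSD for which Dm = 4/lambda analytically; the distribution and slope value are illustrative only, not from the paper's database.

```python
import numpy as np

# Moment-ratio definition: Dm = M4 / M3, where Mk is the k-th moment
# of the PSD N(D). Exponential PSD chosen purely for illustration;
# for it, Dm = 4 / lam analytically.
lam = 2.0                             # slope parameter [1/mm] (assumed)
D = np.linspace(1e-4, 20.0, 200_000)  # diameter grid [mm]
N = np.exp(-lam * D)                  # N(D), up to a constant factor

# On a uniform grid the spacing cancels in the moment ratio.
M3 = np.sum(D ** 3 * N)
M4 = np.sum(D ** 4 * N)
Dm = M4 / M3                          # should be close to 4 / lam = 2.0
```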

  8. Missing data and multiple imputation in clinical epidemiological research.

    PubMed

    Pedersen, Alma B; Mikkelsen, Ellen M; Cronin-Fenton, Deirdre; Kristensen, Nickolaj R; Pham, Tra My; Pedersen, Lars; Petersen, Irene

    2017-01-01

    Missing data are ubiquitous in clinical epidemiological research. Individuals with missing data may differ from those with no missing data in terms of the outcome of interest and prognosis in general. Missing data are often categorized into the following three types: missing completely at random (MCAR), missing at random (MAR), and missing not at random (MNAR). In clinical epidemiological research, missing data are seldom MCAR. Missing data can constitute considerable challenges in the analyses and interpretation of results and can potentially weaken the validity of results and conclusions. A number of methods have been developed for dealing with missing data. These include complete-case analyses, missing indicator method, single value imputation, and sensitivity analyses incorporating worst-case and best-case scenarios. If applied under the MCAR assumption, some of these methods can provide unbiased but often less precise estimates. Multiple imputation is an alternative method to deal with missing data, which accounts for the uncertainty associated with missing data. Multiple imputation is implemented in most statistical software under the MAR assumption and provides unbiased and valid estimates of associations based on information from the available data. The method affects not only the coefficient estimates for variables with missing data but also the estimates for other variables with no missing data.
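
The MCAR case described above can be checked in a small simulation: when missingness is independent of the data, a complete-case analysis stays unbiased and merely loses precision. A minimal sketch with simulated data (not a real clinical data set):

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate a simple data set: outcome y depends linearly on x with
# true slope 2.0.
n = 100_000
x = rng.normal(size=n)
y = 2.0 * x + rng.normal(size=n)

# MCAR: delete 30% of y completely at random (missingness is
# independent of both x and y).
miss = rng.random(n) < 0.30
y_obs = np.where(miss, np.nan, y)

# Complete-case analysis: fit the slope using only the observed rows.
keep = ~np.isnan(y_obs)
slope_cc = np.polyfit(x[keep], y_obs[keep], 1)[0]

# Under MCAR the complete-case slope is unbiased (close to 2.0),
# just less precise than the full-data fit.
slope_full = np.polyfit(x, y, 1)[0]
```

Under MAR or MNAR missingness the same complete-case fit would generally be biased, which is why multiple imputation under MAR is preferred in practice.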

  9. Missing data and multiple imputation in clinical epidemiological research

    PubMed Central

    Pedersen, Alma B; Mikkelsen, Ellen M; Cronin-Fenton, Deirdre; Kristensen, Nickolaj R; Pham, Tra My; Pedersen, Lars; Petersen, Irene

    2017-01-01

    Missing data are ubiquitous in clinical epidemiological research. Individuals with missing data may differ from those with no missing data in terms of the outcome of interest and prognosis in general. Missing data are often categorized into the following three types: missing completely at random (MCAR), missing at random (MAR), and missing not at random (MNAR). In clinical epidemiological research, missing data are seldom MCAR. Missing data can constitute considerable challenges in the analyses and interpretation of results and can potentially weaken the validity of results and conclusions. A number of methods have been developed for dealing with missing data. These include complete-case analyses, missing indicator method, single value imputation, and sensitivity analyses incorporating worst-case and best-case scenarios. If applied under the MCAR assumption, some of these methods can provide unbiased but often less precise estimates. Multiple imputation is an alternative method to deal with missing data, which accounts for the uncertainty associated with missing data. Multiple imputation is implemented in most statistical software under the MAR assumption and provides unbiased and valid estimates of associations based on information from the available data. The method affects not only the coefficient estimates for variables with missing data but also the estimates for other variables with no missing data. PMID:28352203

  10. Scanning linear estimation: improvements over region of interest (ROI) methods

    NASA Astrophysics Data System (ADS)

    Kupinski, Meredith K.; Clarkson, Eric W.; Barrett, Harrison H.

    2013-03-01

    In tomographic medical imaging, a signal activity is typically estimated by summing voxels from a reconstructed image. We introduce an alternative estimation scheme that operates on the raw projection data and offers a substantial improvement, as measured by the ensemble mean-square error (EMSE), when compared to using voxel values from a maximum-likelihood expectation-maximization (MLEM) reconstruction. The scanning-linear (SL) estimator operates on the raw projection data and is derived as a special case of maximum-likelihood estimation with a series of approximations to make the calculation tractable. The approximated likelihood accounts for background randomness, measurement noise and variability in the parameters to be estimated. When signal size and location are known, the SL estimate of signal activity is unbiased, i.e. the average estimate equals the true value. By contrast, unpredictable bias arising from the null functions of the imaging system affects standard algorithms that operate on reconstructed data. The SL method is demonstrated for two different tasks: (1) simultaneously estimating a signal’s size, location and activity; (2) for a fixed signal size and location, estimating activity. Noisy projection data are realistically simulated using measured calibration data from the multi-module multi-resolution small-animal SPECT imaging system. For both tasks, the same set of images is reconstructed using the MLEM algorithm (80 iterations), and the average and maximum values within the region of interest (ROI) are calculated for comparison. This comparison shows dramatic improvements in EMSE for the SL estimates. To show that the bias in ROI estimates affects not only absolute values but also relative differences, such as those used to monitor the response to therapy, the activity estimation task is repeated for three different signal sizes.

  11. Assessing mediation using marginal structural models in the presence of confounding and moderation

    PubMed Central

    Coffman, Donna L.; Zhong, Wei

    2012-01-01

    This paper presents marginal structural models (MSMs) with inverse propensity weighting (IPW) for assessing mediation. Generally, individuals are not randomly assigned to levels of the mediator. Therefore, confounders of the mediator and outcome may exist that limit causal inferences, a goal of mediation analysis. Either regression adjustment or IPW can be used to take confounding into account, but IPW has several advantages. Regression adjustment of even one confounder of the mediator and outcome that has been influenced by treatment results in biased estimates of the direct effect (i.e., the effect of treatment on the outcome that does not go through the mediator). One advantage of IPW is that it can properly adjust for this type of confounding, assuming there are no unmeasured confounders. Further, we illustrate that IPW estimation provides unbiased estimates of all effects when there is a baseline moderator variable that interacts with the treatment, when there is a baseline moderator variable that interacts with the mediator, and when the treatment interacts with the mediator. IPW estimation also provides unbiased estimates of all effects in the presence of non-randomized treatments. In addition, for testing mediation we propose a test of the null hypothesis of no mediation. Finally, we illustrate this approach with an empirical data set in which the mediator is continuous, as is often the case in psychological research. PMID:22905648
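
The core IPW step described above can be sketched in a few lines: weight each individual by the inverse probability of the exposure level actually received, then compare weighted means. A minimal simulation with a known propensity score; the data-generating values are illustrative, not from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

# Confounder c affects both treatment assignment and outcome.
c = rng.normal(size=n)
p_treat = 1.0 / (1.0 + np.exp(-c))            # true propensity score
t = (rng.random(n) < p_treat).astype(float)   # non-randomized treatment

# True causal effect of treatment on the outcome is 1.0.
y = 1.0 * t + 2.0 * c + rng.normal(size=n)

# Naive difference in means is confounded (biased upward) by c.
naive = y[t == 1].mean() - y[t == 0].mean()

# IPW: weight treated units by 1/p and untreated units by 1/(1-p),
# then compare the weighted means (normalized/Hajek form).
w = t / p_treat + (1 - t) / (1 - p_treat)
ipw = (np.sum(w * t * y) / np.sum(w * t)
       - np.sum(w * (1 - t) * y) / np.sum(w * (1 - t)))
```

In practice the propensity score is unknown and estimated (e.g. by logistic regression); the validity of the IPW estimate then also rests on that model and on the no-unmeasured-confounders assumption stated in the abstract.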

  12. Assessing mediation using marginal structural models in the presence of confounding and moderation.

    PubMed

    Coffman, Donna L; Zhong, Wei

    2012-12-01

    This article presents marginal structural models with inverse propensity weighting (IPW) for assessing mediation. Generally, individuals are not randomly assigned to levels of the mediator. Therefore, confounders of the mediator and outcome may exist that limit causal inferences, a goal of mediation analysis. Either regression adjustment or IPW can be used to take confounding into account, but IPW has several advantages. Regression adjustment of even one confounder of the mediator and outcome that has been influenced by treatment results in biased estimates of the direct effect (i.e., the effect of treatment on the outcome that does not go through the mediator). One advantage of IPW is that it can properly adjust for this type of confounding, assuming there are no unmeasured confounders. Further, we illustrate that IPW estimation provides unbiased estimates of all effects when there is a baseline moderator variable that interacts with the treatment, when there is a baseline moderator variable that interacts with the mediator, and when the treatment interacts with the mediator. IPW estimation also provides unbiased estimates of all effects in the presence of nonrandomized treatments. In addition, for testing mediation we propose a test of the null hypothesis of no mediation. Finally, we illustrate this approach with an empirical data set in which the mediator is continuous, as is often the case in psychological research. PsycINFO Database Record (c) 2013 APA, all rights reserved.

  13. Two-stage sequential sampling: A neighborhood-free adaptive sampling procedure

    USGS Publications Warehouse

    Salehi, M.; Smith, D.R.

    2005-01-01

    Designing an efficient sampling scheme for a rare and clustered population is a challenging area of research. Adaptive cluster sampling, which has been shown to be viable for such a population, is based on sampling a neighborhood of units around a unit that meets a specified condition. However, the edge units produced by sampling neighborhoods have proven to limit the efficiency and applicability of adaptive cluster sampling. We propose a sampling design that is adaptive in the sense that the final sample depends on observed values, but it avoids the use of neighborhoods and the sampling of edge units. Unbiased estimators of population total and its variance are derived using Murthy's estimator. The modified two-stage sampling design is easy to implement and can be applied to a wider range of populations than adaptive cluster sampling. We evaluate the proposed sampling design by simulating sampling of two real biological populations and an artificial population for which the variable of interest took the value either 0 or 1 (e.g., indicating presence and absence of a rare event). We show that the proposed sampling design is more efficient than conventional sampling in nearly all cases. The approach used to derive estimators (Murthy's estimator) opens the door for unbiased estimators to be found for similar sequential sampling designs. ?? 2005 American Statistical Association and the International Biometric Society.
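
Murthy's estimator is specific to the proposed two-stage design, but the underlying idea of design-unbiasedness can be illustrated with the simpler Horvitz-Thompson estimator under simple random sampling without replacement, applied to a rare, clustered 0/1 population like the paper's artificial example. All population values below are invented.

```python
import numpy as np

rng = np.random.default_rng(2)

# A rare, clustered 0/1 population (e.g. presence/absence of a rare
# event): one cluster of 30 presences among 1000 units.
N = 1_000
pop = np.zeros(N)
pop[100:130] = 1.0
true_total = pop.sum()

# Horvitz-Thompson estimator under simple random sampling without
# replacement: every unit has inclusion probability n/N, so
# sum(y_i) * N / n is design-unbiased for the population total.
n = 100
def ht_total(sample_values, n=n, N=N):
    return sample_values.sum() * N / n

# Average the estimator over many independent samples; by design-
# unbiasedness the mean approaches the true total.
reps = 5_000
estimates = np.array([
    ht_total(pop[rng.choice(N, size=n, replace=False)])
    for _ in range(reps)
])
mean_est = estimates.mean()
```

Individual estimates vary a lot for such a clustered population, which is exactly why adaptive designs like the paper's can be more efficient while keeping unbiasedness.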

  14. Evaluation of Statistical Methodologies Used in U. S. Army Ordnance and Explosive Work

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ostrouchov, G

    2000-02-14

    Oak Ridge National Laboratory was tasked by the U.S. Army Engineering and Support Center (Huntsville, AL) to evaluate the mathematical basis of existing software tools used to assist the Army with the characterization of sites potentially contaminated with unexploded ordnance (UXO). These software tools are collectively known as SiteStats/GridStats. The first purpose of the software is to guide sampling of underground anomalies to estimate a site's UXO density. The second purpose is to delineate areas of homogeneous UXO density that can be used in the formulation of response actions. It was found that SiteStats/GridStats does adequately guide the sampling so that the UXO density estimator for a sector is unbiased. However, the software's techniques for delineation of homogeneous areas perform less well than visual inspection, which is frequently used to override the software in the overall sectorization methodology. The main problems with the software lie in the criteria used to detect nonhomogeneity and those used to recommend the number of homogeneous subareas. SiteStats/GridStats is not a decision-making tool in the classical sense. Although it does provide information to decision makers, it does not require a decision based on that information. SiteStats/GridStats provides information that is supplemented by visual inspections, land-use plans, and risk estimates prior to making any decisions. Although the sector UXO density estimator is unbiased regardless of UXO density variation within a sector, its variability increases with increased sector density variation. For this reason, the current practice of visual inspection of individual sampled grid densities (as provided by SiteStats/GridStats) is necessary to ensure approximate homogeneity, particularly at sites with medium to high UXO density. 
Together with SiteStats/GridStats override capabilities, this provides a sufficient mechanism for homogeneous sectorization and thus yields representative UXO density estimates. Objections raised by various parties to the use of a numerical "discriminator" in SiteStats/GridStats likely arose because the statistical technique concerned is customarily applied for a different purpose, and because of poor documentation. The "discriminator" in SiteStats/GridStats is a "tuning parameter" for the sampling process, and it affects the precision of the grid density estimates through changes in the required sample size. Sector characterization in terms of a map showing contour lines of constant UXO density, with an expressed uncertainty or confidence level, is recommended as a better basis for remediation decisions than a sector UXO density point estimate. A number of spatial density estimation techniques could be adapted to the UXO density estimation problem.

  15. Spectral Rate Theory for Two-State Kinetics

    NASA Astrophysics Data System (ADS)

    Prinz, Jan-Hendrik; Chodera, John D.; Noé, Frank

    2014-02-01

    Classical rate theories often fail in cases where the observable(s) or order parameter(s) used is a poor reaction coordinate or the observed signal is deteriorated by noise, such that no clear separation between reactants and products is possible. Here, we present a general spectral two-state rate theory for ergodic dynamical systems in thermal equilibrium that explicitly takes into account how the system is observed. The theory allows the systematic estimation errors made by standard rate theories to be understood and quantified. We also elucidate the connection of spectral rate theory with the popular Markov state modeling approach for molecular simulation studies. An optimal rate estimator is formulated that gives robust and unbiased results even for poor reaction coordinates and can be applied to both computer simulations and single-molecule experiments. No definition of a dividing surface is required. Another result of the theory is a model-free definition of the reaction coordinate quality. The reaction coordinate quality can be bounded from below by the directly computable observation quality, thus providing a measure allowing the reaction coordinate quality to be optimized by tuning the experimental setup. Additionally, the respective partial probability distributions can be obtained for the reactant and product states along the observed order parameter, even when these strongly overlap. The effects of both filtering (averaging) and uncorrelated noise are also examined. The approach is demonstrated on numerical examples and experimental single-molecule force-probe data of the p5ab RNA hairpin and the apo-myoglobin protein at low pH, focusing here on the case of two-state kinetics.

  16. Construction of mutually unbiased bases with cyclic symmetry for qubit systems

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Seyfarth, Ulrich; Ranade, Kedar S.

    2011-10-15

    For the complete estimation of arbitrary unknown quantum states by measurements, the use of mutually unbiased bases has been well established in theory and experiment for the past 20 years. However, most constructions of these bases make heavy use of abstract algebra and the mathematical theory of finite rings and fields, and no simple and generally accessible construction is available. This is particularly true in the case of a system composed of several qubits, which is arguably the most important case in quantum information science and quantum computation. In this paper, we close this gap by providing a simple and straightforward method for the construction of mutually unbiased bases in the case of a qubit register. We show that our construction is also accessible to experiments, since only Hadamard and controlled-phase gates are needed, which are available in most practical realizations of a quantum computer. Moreover, our scheme possesses the optimal scaling possible, i.e., the number of gates scales only linearly in the number of qubits.
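
The defining property is easy to verify numerically in the smallest case: for a single qubit (d = 2), the eigenbases of the Pauli Z, X, and Y operators form a complete set of d + 1 = 3 mutually unbiased bases, every cross-basis overlap satisfying |⟨a|b⟩|² = 1/d. A sketch of that single-qubit check only, not the paper's multi-qubit cyclic construction:

```python
import numpy as np

# The three mutually unbiased bases for a single qubit (d = 2):
# the eigenbases of the Pauli Z, X and Y operators.
d = 2
z_basis = [np.array([1, 0]), np.array([0, 1])]
x_basis = [np.array([1, 1]) / np.sqrt(2), np.array([1, -1]) / np.sqrt(2)]
y_basis = [np.array([1, 1j]) / np.sqrt(2), np.array([1, -1j]) / np.sqrt(2)]
bases = [z_basis, x_basis, y_basis]

def mutually_unbiased(b1, b2, d=d):
    # Two orthonormal bases are mutually unbiased iff every overlap
    # satisfies |<a|b>|^2 = 1/d. np.vdot conjugates its first argument.
    return all(
        np.isclose(abs(np.vdot(a, b)) ** 2, 1.0 / d)
        for a in b1 for b in b2
    )

all_pairs_unbiased = all(
    mutually_unbiased(bases[i], bases[j])
    for i in range(3) for j in range(i + 1, 3)
)
```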

  17. Counts of galaxy clusters as cosmological probes: the impact of baryonic physics

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Balaguera-Antolínez, Andrés; Porciani, Cristiano, E-mail: abalan@astro.uni-bonn.de, E-mail: porciani@astro.uni-bonn.de

    2013-04-01

    The halo mass function from N-body simulations of collisionless matter is generally used to retrieve cosmological parameters from observed counts of galaxy clusters. This neglects the observational fact that the baryonic mass fraction in clusters is a random variable that, on average, increases with the total mass (within an overdensity of 500). Considering a mock catalog that includes tens of thousands of galaxy clusters, as expected from the forthcoming generation of surveys, we show that the effect of a varying baryonic mass fraction will be observable with high statistical significance. The net effect is a change in the overall normalization of the cluster mass function and a milder modification of its shape. Our results indicate the necessity of taking into account baryonic corrections to the mass function if one wants to obtain unbiased estimates of the cosmological parameters from data of this quality. We introduce the formalism necessary to accomplish this goal. Our discussion is based on the conditional probability of finding a given value of the baryonic mass fraction for clusters of fixed total mass. Finally, we show that combining information from the cluster counts with measurements of the baryonic mass fraction in a small subsample of clusters (including only a few tens of objects) will nearly optimally constrain the cosmological parameters.

  18. The case test-negative design for studies of the effectiveness of influenza vaccine in inpatient settings.

    PubMed

    Foppa, Ivo M; Ferdinands, Jill M; Chaves, Sandra S; Haber, Michael J; Reynolds, Sue B; Flannery, Brendan; Fry, Alicia M

    2016-12-01

    The test-negative design (TND) to evaluate influenza vaccine effectiveness is based on patients seeking care for acute respiratory infection, with those who test positive for influenza serving as cases and the test-negatives serving as controls. This design has not been validated for the inpatient setting, where selection bias might differ from the outpatient setting. We derived mathematical expressions for vaccine effectiveness (VE) against laboratory-confirmed influenza hospitalizations and used numerical simulations to verify theoretical results, exploring expected biases under various scenarios. We explored meaningful interpretations of VE estimates from inpatient TND studies. VE estimates from inpatient TND studies capture the vaccine-mediated protection of the source population against laboratory-confirmed influenza hospitalizations. If vaccination does not modify disease severity, these estimates are equivalent to VE against influenza virus infection. If individuals with chronic cardiopulmonary disease are enrolled because of a non-infectious exacerbation, biased (too high) VE estimates will result. If chronic cardiopulmonary disease status is adjusted for accurately, the VE estimates will be unbiased. If chronic cardiopulmonary illness cannot be adequately characterized, excluding these individuals may provide unbiased VE estimates. The inpatient TND offers logistical advantages and can provide valid estimates of influenza VE. If highly vaccinated patients with respiratory exacerbation of chronic cardiopulmonary conditions are eligible for study inclusion, biased VE estimates will result unless this group is well characterized and the analysis can adequately adjust for it. Otherwise, such groups of subjects should be excluded from the analysis. © The Author 2016. Published by Oxford University Press on behalf of the International Epidemiological Association.
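
In a TND, vaccine effectiveness is typically estimated as one minus the odds ratio of vaccination among test-positive cases versus test-negative controls. A minimal sketch with hypothetical counts (not data from the study):

```python
# Test-negative design: among patients tested for influenza,
# test-positives are cases and test-negatives are controls.
# VE is estimated as 1 - OR, where OR is the odds ratio of
# vaccination among cases vs. controls. All counts below are
# hypothetical, for illustration only.
vax_cases, unvax_cases = 40, 160          # test-positive patients
vax_controls, unvax_controls = 300, 500   # test-negative patients

odds_ratio = (vax_cases / unvax_cases) / (vax_controls / unvax_controls)
ve = 1.0 - odds_ratio                     # vaccine effectiveness
```

With these counts the odds ratio is about 0.42, giving VE of roughly 58%; in a real analysis the odds ratio would come from a logistic model adjusting for confounders such as the chronic cardiopulmonary status discussed above.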

  19. Modeling Disease Vector Occurrence when Detection Is Imperfect: Infestation of Amazonian Palm Trees by Triatomine Bugs at Three Spatial Scales

    PubMed Central

    Abad-Franch, Fernando; Ferraz, Gonçalo; Campos, Ciro; Palomeque, Francisco S.; Grijalva, Mario J.; Aguilar, H. Marcelo; Miles, Michael A.

    2010-01-01

    Background Failure to detect a disease agent or vector where it actually occurs constitutes a serious drawback in epidemiology. In the pervasive situation where no sampling technique is perfect, the explicit analytical treatment of detection failure becomes a key step in the estimation of epidemiological parameters. We illustrate this approach with a study of Attalea palm tree infestation by Rhodnius spp. (Triatominae), the most important vectors of Chagas disease (CD) in northern South America. Methodology/Principal Findings The probability of detecting triatomines in infested palms is estimated by repeatedly sampling each palm. This knowledge is used to derive an unbiased estimate of the biologically relevant probability of palm infestation. We combine maximum-likelihood analysis and information-theoretic model selection to test the relationships between environmental covariates and infestation of 298 Amazonian palm trees over three spatial scales: region within Amazonia, landscape, and individual palm. Palm infestation estimates are high (40–60%) across regions, and well above the observed infestation rate (24%). Detection probability is higher (∼0.55 on average) in the richest-soil region than elsewhere (∼0.08). Infestation estimates are similar in forest and rural areas, but lower in urban landscapes. Finally, individual palm covariates (accumulated organic matter and stem height) explain most of infestation rate variation. Conclusions/Significance Individual palm attributes appear as key drivers of infestation, suggesting that CD surveillance must incorporate local-scale knowledge and that peridomestic palm tree management might help lower transmission risk. Vector populations are probably denser in rich-soil sub-regions, where CD prevalence tends to be higher; this suggests a target for research on broad-scale risk mapping. 
Landscape-scale effects indicate that palm triatomine populations can endure deforestation in rural areas, but become rarer in heavily disturbed urban settings. Our methodological approach has wide application in infectious disease research; by improving eco-epidemiological parameter estimation, it can also significantly strengthen vector surveillance-control strategies. PMID:20209149
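
The correction for imperfect detection follows a standard occupancy-model identity: with K repeat surveys and per-survey detection probability p, an infested palm yields at least one detection with probability p* = 1 - (1 - p)^K, so dividing the observed infestation rate by p* removes the downward bias. A sketch with invented parameter values, not the study's estimates:

```python
import numpy as np

rng = np.random.default_rng(3)

# Per-survey detection probability p, K repeat surveys per palm,
# and true infestation probability psi (all values hypothetical).
p, K, true_psi = 0.3, 4, 0.5
p_star = 1 - (1 - p) ** K        # prob. an infested palm is detected

# Simulate M palms: infested palms produce Binomial(K, p) detections,
# uninfested palms produce none.
M = 100_000
infested = rng.random(M) < true_psi
detections = rng.binomial(K, p, size=M) * infested

observed_rate = (detections > 0).mean()   # biased low by (1 - p_star)
psi_hat = observed_rate / p_star          # detection-corrected estimate
```

The naive observed rate underestimates infestation (here roughly 0.38 versus a true 0.5), while the corrected estimate recovers it, mirroring the gap between the 24% observed and 40-60% estimated infestation in the study.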

  20. Prediction and measurement results of radiation damage to CMOS devices on board spacecraft

    NASA Technical Reports Server (NTRS)

    Stassinopoulos, E. G.; Danchenko, V.; Cliff, R. A.; Sing, M.; Brucker, G. J.; Ohanian, R. S.

    1977-01-01

    Final results from the CMOS Radiation Effects Measurement (CREM) experiment flown on Explorer 55 are presented and discussed, based on about 15 months of observations and measurements. Conclusions are given relating to long-range annealing, effects of operating temperature on semiconductor performance in space, biased and unbiased P-MOS device degradation, unbiased n-channel device performance, changes in device transconductance, and the difference in ionization efficiency between Co-60 gamma rays and 1-Mev Van de Graaff electrons. The performance of devices in a heavily shielded electronic subsystem box within the spacecraft is evaluated and compared. Environment models and computational methods and their impact on device-degradation estimates are being reviewed to determine whether they permit cost-effective design of spacecraft.

  1. Recovery of Native Genetic Background in Admixed Populations Using Haplotypes, Phenotypes, and Pedigree Information – Using Cika Cattle as a Case Breed

    PubMed Central

    Simčič, Mojca; Smetko, Anamarija; Sölkner, Johann; Seichter, Doris; Gorjanc, Gregor; Kompan, Dragomir; Medugorac, Ivica

    2015-01-01

    The aim of this study was to obtain unbiased estimates of the diversity parameters, the population history, and the degree of admixture in Cika cattle, which represent local admixed breeds at risk of extinction that undergo challenging conservation programs. Genetic analyses were performed on genome-wide Single Nucleotide Polymorphism (SNP) Illumina BovineSNP50 array data of 76 Cika animals and 531 animals from 14 reference populations. To obtain unbiased estimates we used short haplotypes spanning four markers instead of single SNPs, to avoid the ascertainment bias of the BovineSNP50 array. Genome-wide haplotypes combined with partial pedigree and type trait classification show the potential to improve identification of purebred animals with a low degree of admixture. Phylogenetic analyses demonstrated the unique genetic identity of Cika animals. The genetic distance matrix, presented as a rooted Neighbour-Net, suggested a long and broad phylogenetic connection between Cika and Pinzgauer. Unsupervised clustering performed by the admixture analysis and a two-dimensional presentation of the genetic distances between individuals also suggest that Cika is a distinct breed despite being similar in appearance to Pinzgauer. Animals identified as the most purebred could be used as a nucleus for the recovery of the native genetic background in the current admixed population. The results show that local well-adapted strains, which have never been intensively managed and differentiated into specific breeds, exhibit large haplotype diversity. They suggest that a conservation and recovery approach relying not exclusively on the search for the original native genetic background but rather on the identification and removal of common introgressed haplotypes would be more powerful. 
Successful implementation of such an approach should be based on combining phenotype, pedigree, and genome-wide haplotype data of the breed of interest with a spectrum of reference breeds that potentially have had a direct or indirect historical contribution to the genetic makeup of the breed of interest. PMID:25923207

  2. Intrinsic Atomic Orbitals: An Unbiased Bridge between Quantum Theory and Chemical Concepts.

    PubMed

    Knizia, Gerald

    2013-11-12

    Modern quantum chemistry can make quantitative predictions on an immense array of chemical systems. However, the interpretation of those predictions is often complicated by the complex wave function expansions used. Here we show that an exceptionally simple algebraic construction allows for defining atomic core and valence orbitals, polarized by the molecular environment, which can exactly represent self-consistent field wave functions. This construction provides an unbiased and direct connection between quantum chemistry and empirical chemical concepts, and can be used, for example, to calculate the nature of bonding in molecules, in chemical terms, from first principles. In particular, we find consistency with electronegativities (χ), C 1s core-level shifts, resonance substituent parameters (σR), Lewis structures, and oxidation states of transition-metal complexes.

  3. Implications of a wavelength-dependent PSF for weak lensing measurements

    NASA Astrophysics Data System (ADS)

    Eriksen, Martin; Hoekstra, Henk

    2018-07-01

    The convolution of galaxy images by the point spread function (PSF) is the dominant source of bias for weak gravitational lensing studies, and an accurate estimate of the PSF is required to obtain unbiased shape measurements. The PSF estimate for a galaxy depends on its spectral energy distribution (SED), because the instrumental PSF is generally a function of the wavelength. In this paper we explore various approaches to determine the resulting `effective' PSF using broad-band data. Considering the Euclid mission as a reference, we find that standard SED template fitting methods result in biases that depend on source redshift, although this may be remedied if the algorithms can be optimized for this purpose. Using a machine learning algorithm we show that, at least in principle, the required accuracy can be achieved with the current survey parameters. It is also possible to account for the correlations between photometric redshift and PSF estimates that arise from the use of the same photometry. We explore the impact of errors in photometric calibration, errors in the assumed wavelength dependence of the PSF model, and limitations of the adopted template libraries. Our results indicate that the required accuracy for Euclid can be achieved using the data that are planned to determine photometric redshifts.

  4. Implications of a wavelength dependent PSF for weak lensing measurements.

    NASA Astrophysics Data System (ADS)

    Eriksen, Martin; Hoekstra, Henk

    2018-05-01

    The convolution of galaxy images by the point-spread function (PSF) is the dominant source of bias for weak gravitational lensing studies, and an accurate estimate of the PSF is required to obtain unbiased shape measurements. The PSF estimate for a galaxy depends on its spectral energy distribution (SED), because the instrumental PSF is generally a function of the wavelength. In this paper we explore various approaches to determine the resulting `effective' PSF using broad-band data. Considering the Euclid mission as a reference, we find that standard SED template fitting methods result in biases that depend on source redshift, although this may be remedied if the algorithms can be optimised for this purpose. Using a machine-learning algorithm we show that, at least in principle, the required accuracy can be achieved with the current survey parameters. It is also possible to account for the correlations between photometric redshift and PSF estimates that arise from the use of the same photometry. We explore the impact of errors in photometric calibration, errors in the assumed wavelength dependence of the PSF model and limitations of the adopted template libraries. Our results indicate that the required accuracy for Euclid can be achieved using the data that are planned to determine photometric redshifts.

  5. Job Search on the Internet, E-Recruitment, and Labor Market Outcomes

    DTIC Science & Technology

    2010-01-01

    …United States offered the option of telecommuting at least one day a week, according to a 2008 survey by the… To obtain an unbiased estimate of  , which is the effect of using the Internet, this study uses two

  6. Capture-recapture survival models taking account of transients

    USGS Publications Warehouse

    Pradel, R.; Hines, J.E.; Lebreton, J.D.; Nichols, J.D.

    1997-01-01

    The presence of transient animals, common enough in natural populations, invalidates the estimation of survival by traditional capture-recapture (CR) models designed for the study of residents only. Also, the study of transience is interesting in itself. We thus develop here a class of CR models to describe the presence of transients. In order to assess the merits of this approach, we examine the bias of the traditional survival estimators in the presence of transients in relation to the power of different tests for detecting transients. We also compare the relative efficiency of an ad hoc approach to dealing with transients that leaves out the first observation of each animal. We then study a real example using the lazuli bunting (Passerina amoena) and, in conclusion, discuss the design of an experiment aimed at the estimation of transience. In practice, the presence of transients is easily detected whenever the risk of bias is high. The ad hoc approach, which yields unbiased estimates for residents only, is satisfactory in a time-dependent context but poorly efficient when parameters are constant. The example shows that intermediate situations between strict 'residence' and strict 'transience' may exist in certain studies. Yet, most of the time, if the study design takes into account the expected length of stay of a transient, it should be possible to efficiently separate the two categories of animals.
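
    The bias mechanism described above can be sketched in a few lines of simulation (all rates below are made-up illustration values, not estimates from the paper): a naive return rate over all first captures is diluted by transients, while the ad hoc device of dropping each animal's first observation conditions on re-observed animals, which are residents by construction.

```python
import random

# Illustrative sketch, not the authors' full CR model: three capture occasions.
# Residents survive each interval with probability PHI and, if alive, are
# detected with probability P; transients leave immediately after first capture.
random.seed(42)
PHI, P, TRANSIENT_FRAC, N = 0.7, 0.8, 0.4, 200_000

first_returns = 0    # seen again on occasion 2 (naive numerator)
seen2_count = 0      # animals whose first observation can be dropped
second_returns = 0   # of those, seen again on occasion 3
for _ in range(N):
    resident = random.random() >= TRANSIENT_FRAC
    alive2 = resident and random.random() < PHI
    seen2 = alive2 and random.random() < P
    alive3 = alive2 and random.random() < PHI
    seen3 = alive3 and random.random() < P
    first_returns += seen2
    if seen2:                 # the ad hoc trick: keep only re-observed animals,
        seen2_count += 1      # which cannot be transients
        second_returns += seen3

naive_return = first_returns / N             # diluted by transients
adhoc_return = second_returns / seen2_count  # estimates PHI * P = 0.56
print(naive_return, adhoc_return)
```

    With these values the naive return rate converges to (1 - 0.4) * 0.56 = 0.336, visibly below the resident return rate of 0.56 recovered by the ad hoc estimator.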

  7. On the Bias-Amplifying Effect of Near Instruments in Observational Studies

    ERIC Educational Resources Information Center

    Steiner, Peter M.; Kim, Yongnam

    2014-01-01

    In contrast to randomized experiments, the estimation of unbiased treatment effects from observational data requires an analysis that conditions on all confounding covariates. Conditioning on covariates can be done via standard parametric regression techniques or nonparametric matching like propensity score (PS) matching. The regression or…

  8. An Unbiased Estimate of Global Interrater Agreement

    ERIC Educational Resources Information Center

    Cousineau, Denis; Laurencelle, Louis

    2017-01-01

    Assessing global interrater agreement is difficult as most published indices are affected by the presence of mixtures of agreements and disagreements. A previously proposed method was shown to be specifically sensitive to global agreement, excluding mixtures, but also negatively biased. Here, we propose two alternatives in an attempt to find what…

  9. Essays on Policy Evaluation with Endogenous Adoption

    ERIC Educational Resources Information Center

    Gentile, Elisabetta

    2011-01-01

    Over the last decade, experimental and quasi-experimental methods have been favored by researchers in empirical economics, as they provide unbiased causal estimates. However, when implementing a program, it is often not possible to randomly assign subjects to treatment, leading to a possible endogeneity bias. This dissertation consists of two…

  10. APPLICATION OF A MULTIPURPOSE UNEQUAL-PROBABILITY STREAM SURVEY IN THE MID-ATLANTIC COASTAL PLAIN

    EPA Science Inventory

    A stratified random sample with unequal-probability selection was used to design a multipurpose survey of headwater streams in the Mid-Atlantic Coastal Plain. Objectives for data from the survey include unbiased estimates of regional stream conditions, and adequate coverage of un...
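
    The abstract does not state the estimator used, but the standard device for unbiased totals under unequal-probability selection is the Horvitz-Thompson estimator, sum(y_i / pi_i) over sampled units. A toy sketch (made-up stream-condition values; Poisson sampling, i.e. independent Bernoulli inclusions, is assumed so unbiasedness can be checked exactly by enumeration):

```python
from itertools import product

y = [10.0, 20.0, 30.0, 40.0]   # hypothetical stream-condition values
pi = [0.2, 0.4, 0.6, 0.8]      # unequal inclusion probabilities

def ht_total(sample_ids):
    """Horvitz-Thompson estimate of the population total from a sample."""
    return sum(y[i] / pi[i] for i in sample_ids)

# Enumerate all 2^4 possible samples with their probabilities; the expected
# value of the HT estimator equals the true total exactly (design unbiasedness).
expected = 0.0
for flags in product([0, 1], repeat=4):
    prob = 1.0
    for i, f in enumerate(flags):
        prob *= pi[i] if f else (1 - pi[i])
    expected += prob * ht_total([i for i, f in enumerate(flags) if f])

print(expected, sum(y))  # both equal 100.0
```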

  11. Design unbiased estimation in line intersect sampling using segmented transects

    Treesearch

    David L.R. Affleck; Timothy G. Gregoire; Harry T. Valentine

    2005-01-01

    In many applications of line intersect sampling, transects consist of multiple, connected segments in a prescribed configuration. The relationship between the transect configuration and the selection probability of a population element is illustrated, and a consistent sampling protocol, applicable to populations composed of arbitrarily shaped elements, is proposed. It...

  12. Flood characteristics of urban watersheds in the United States

    USGS Publications Warehouse

    Sauer, Vernon B.; Thomas, W.O.; Stricker, V.A.; Wilson, K.V.

    1983-01-01

    A nationwide study of flood magnitude and frequency in urban areas was made for the purpose of reviewing available literature, compiling an urban flood data base, and developing methods of estimating urban floodflow characteristics in ungaged areas. The literature review contains synopses of 128 recent publications related to urban floodflow. A data base of 269 gaged basins in 56 cities and 31 States, including Hawaii, contains a wide variety of topographic and climatic characteristics, land-use variables, indices of urbanization, and flood-frequency estimates. Three sets of regression equations were developed to estimate flood discharges for ungaged sites for recurrence intervals of 2, 5, 10, 25, 50, 100, and 500 years. Two sets of regression equations are based on seven independent parameters and the third is based on three independent parameters. The only difference in the two sets of seven-parameter equations is the use of basin lag time in one and lake and reservoir storage in the other. Of primary importance in these equations is an independent estimate of the equivalent rural discharge for the ungaged basin. The equations adjust the equivalent rural discharge to an urban condition. The primary adjustment factor, or index of urbanization, is the basin development factor, a measure of the extent of development of the drainage system in the basin. This measure includes evaluations of storm drains (sewers), channel improvements, and curb-and-gutter streets. The basin development factor is statistically very significant and offers a simple and effective way of accounting for drainage development and runoff response in urban areas. Percentage of impervious area is also included in the seven-parameter equations as an additional measure of urbanization and apparently accounts for increased runoff volumes. 
This factor is not highly significant for large floods, which supports the generally held concept that imperviousness is not a dominant factor when soils become more saturated during large storms. Other parameters in the seven-parameter equations include drainage area size, channel slope, rainfall intensity, lake and reservoir storage, and basin lag time. These factors are all statistically significant and provide logical indices of basin conditions. The three-parameter equations include only the three most significant parameters: rural discharge, basin-development factor, and drainage area size. All three sets of regression equations provide unbiased estimates of urban flood frequency. The seven-parameter regression equations without basin lag time have average standard errors of regression varying from ±37 percent for the 5-year flood to ±44 percent for the 100-year flood and ±49 percent for the 500-year flood. The other two sets of regression equations have similar accuracy. Several tests for bias, sensitivity, and hydrologic consistency are included which support the conclusion that the equations are useful throughout the United States. All estimating equations were developed from data collected on drainage basins where temporary in-channel storage, due to highway embankments, was not significant. Consequently, estimates made with these equations do not account for the reducing effect of this temporary detention storage.

  13. Nanoslit cavity plasmonic modes and built-in fields enhance the CW THz radiation in an unbiased antennaless photomixers array.

    PubMed

    Mohammad-Zamani, Mohammad Javad; Neshat, Mohammad; Moravvej-Farshi, Mohammad Kazem

    2016-01-15

    A new generation of unbiased antennaless CW terahertz (THz) photomixer emitter arrays, made of asymmetric metal-semiconductor-metal (MSM) gratings with a subwavelength pitch and operating in the optical near-field regime, is proposed. We take advantage of size effects in near-field optics and electrostatics to demonstrate the possibility of enhancing the THz power by 4 orders of magnitude, compared to a similar unbiased antennaless array of the same size that operates in the far-field regime. We show that, with an appropriate choice of grating parameters in such THz sources, the first plasmonic resonant cavity mode in the nanoslit between two adjacent MSMs can enhance the optical near-field absorption and, hence, the generation of photocarriers under the slit in the active medium. These photocarriers, in turn, are accelerated by the large built-in electric field sustained under the nanoslits by two dissimilar Schottky barriers to create the desired large THz power, which is mainly radiated downward. The proposed structure can be tuned over a broadband frequency range of 0.1-3 THz, with output power increasing with frequency.

  14. A Bayesian evidence synthesis approach to estimate disease prevalence in hard-to-reach populations: hepatitis C in New York City.

    PubMed

    Tan, Sarah; Makela, Susanna; Heller, Daliah; Konty, Kevin; Balter, Sharon; Zheng, Tian; Stark, James H

    2018-06-01

    Existing methods to estimate the prevalence of chronic hepatitis C (HCV) in New York City (NYC) are limited in scope and fail to assess hard-to-reach subpopulations with highest risk such as injecting drug users (IDUs). To address these limitations, we employ a Bayesian multi-parameter evidence synthesis model to systematically combine multiple sources of data, account for bias in certain data sources, and provide unbiased HCV prevalence estimates with associated uncertainty. Our approach improves on previous estimates by explicitly accounting for injecting drug use and including data from high-risk subpopulations such as the incarcerated, and is more inclusive, utilizing ten NYC data sources. In addition, we derive two new equations to allow age at first injecting drug use data for former and current IDUs to be incorporated into the Bayesian evidence synthesis, a first for this type of model. Our estimated overall HCV prevalence as of 2012 among NYC adults aged 20-59 years is 2.78% (95% CI 2.61-2.94%), which represents between 124,900 and 140,000 chronic HCV cases. These estimates suggest that HCV prevalence in NYC is higher than previously indicated from household surveys (2.2%) and the surveillance system (2.37%), and that HCV transmission is increasing among young injecting adults in NYC. An ancillary benefit from our results is an estimate of current IDUs aged 20-59 in NYC: 0.58% or 27,600 individuals. Copyright © 2018 Elsevier B.V. All rights reserved.

  15. Errors in the estimation of the variance: implications for multiple-probability fluctuation analysis.

    PubMed

    Saviane, Chiara; Silver, R Angus

    2006-06-15

    Synapses play a crucial role in information processing in the brain. Amplitude fluctuations of synaptic responses can be used to extract information about the mechanisms underlying synaptic transmission and its modulation. In particular, multiple-probability fluctuation analysis can be used to estimate the number of functional release sites, the mean probability of release and the amplitude of the mean quantal response from fits of the relationship between the variance and mean amplitude of postsynaptic responses, recorded at different probabilities. To determine these quantal parameters, calculate their uncertainties and the goodness-of-fit of the model, it is important to weight the contribution of each data point in the fitting procedure. We therefore investigated the errors associated with measuring the variance by determining the best estimators of the variance of the variance and have used simulations of synaptic transmission to test their accuracy and reliability under different experimental conditions. For central synapses, which generally have a low number of release sites, the amplitude distribution of synaptic responses is not normal, thus the use of a theoretical variance of the variance based on the normal assumption is not a good approximation. However, appropriate estimators can be derived for the population and for limited sample sizes using a more general expression that involves higher moments and introducing unbiased estimators based on the h-statistics. Our results are likely to be relevant for various applications of fluctuation analysis when few channels or release sites are present.
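
    A small sketch of the quantities involved (symbols and numerical values are illustrative, not taken from the paper): the sampling variance of the sample variance s² depends on the fourth central moment μ4, and reduces to the familiar normal-theory expression 2σ⁴/(n − 1) only when μ4 = 3σ⁴, which is the "normal assumption" the abstract warns against for non-normal synaptic amplitude distributions.

```python
import numpy as np

def var_of_sample_variance(mu4, sigma2, n):
    """General Var(s^2) = (mu4 - sigma^4 * (n - 3) / (n - 1)) / n."""
    return (mu4 - sigma2**2 * (n - 3) / (n - 1)) / n

# The h-statistic h2 is just the familiar unbiased variance estimator.
x = np.array([1.0, 2.0, 2.5, 4.0, 5.5])
h2 = np.sum((x - x.mean())**2) / (len(x) - 1)
assert np.isclose(h2, np.var(x, ddof=1))

# Under normality mu4 = 3*sigma^4, and the general formula collapses to the
# textbook result 2*sigma^4 / (n - 1).
sigma2, n = 1.3**2, 10
assert np.isclose(var_of_sample_variance(3 * sigma2**2, sigma2, n),
                  2 * sigma2**2 / (n - 1))
print("checks passed")
```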

  16. LENSED: a code for the forward reconstruction of lenses and sources from strong lensing observations

    NASA Astrophysics Data System (ADS)

    Tessore, Nicolas; Bellagamba, Fabio; Metcalf, R. Benton

    2016-12-01

    Robust modelling of strong lensing systems is fundamental to exploit the information they contain about the distribution of matter in galaxies and clusters. In this work, we present LENSED, a new code which performs forward parametric modelling of strong lenses. LENSED takes advantage of a massively parallel ray-tracing kernel to perform the necessary calculations on a modern graphics processing unit (GPU). This makes the precise rendering of the background lensed sources much faster, and allows the simultaneous optimization of tens of parameters for the selected model. With a single run, the code is able to obtain the full posterior probability distribution for the lens light, the mass distribution and the background source at the same time. LENSED is first tested on mock images which reproduce realistic space-based observations of lensing systems. In this way, we show that it is able to recover unbiased estimates of the lens parameters, even when the sources do not follow exactly the assumed model. Then, we apply it to a subsample of the Sloan Lens ACS Survey lenses, in order to demonstrate its use on real data. The results generally agree with the literature, and highlight the flexibility and robustness of the algorithm.

  17. Structure of the Large Magellanic Cloud from near infrared magnitudes of red clump stars

    NASA Astrophysics Data System (ADS)

    Subramanian, S.; Subramaniam, A.

    2013-04-01

    Context. The structural parameters of the disk of the Large Magellanic Cloud (LMC) are estimated. Aims: We used the JH photometric data of red clump (RC) stars from the Magellanic Cloud Point Source Catalog (MCPSC), obtained with the InfraRed Survey Facility (IRSF), to estimate the structural parameters of the LMC disk, such as the inclination, i, and the position angle of the line of nodes (PAlon), φ. Methods: The observed LMC region is divided into several sub-regions, and stars in each region are cross-identified with the optically identified RC stars to obtain their near-infrared magnitudes. The peak values of the H magnitude and (J - H) colour of the observed RC distribution are obtained by fitting a profile to the distributions and by taking the average magnitude and colour of the RC stars in the most populated bin. The dereddened peak H0 magnitude of the RC stars in each sub-region is then obtained from the peak values of the H magnitude and (J - H) colour of the observed RC distribution. The right ascension (RA), declination (Dec), and relative distance from the centre of each sub-region are converted into x, y, and z Cartesian coordinates. A weighted least-squares plane-fitting method is applied to these x, y, z data to estimate the structural parameters of the LMC disk. Results: An intrinsic (J - H)0 colour of 0.40 ± 0.03 mag in the Simultaneous three-colour InfraRed Imager for Unbiased Survey (SIRIUS) IRSF filter system is estimated for the RC stars in the LMC, and a reddening map based on the (J - H) colour of the RC stars is presented. When the peaks of the RC distribution were identified by averaging, an inclination of 25°.7 ± 1°.6 and a PAlon = 141°.5 ± 4°.5 were obtained. We estimate a distance modulus of μ = 18.47 ± 0.1 mag to the LMC. Extra-planar features both in front of and behind the fitted plane are identified; they match the optically identified extra-planar features. The bar of the LMC is found to be part of the disk within 500 pc.
    Conclusions: The estimates of the structural parameters are found to be independent of the photometric bands used for the analysis. The radial variation of the structural parameters is also studied. We find that the inner disk, within ~3°.0, is less inclined and has a larger PAlon than the outer disk. Our estimates are compared with literature values, and possible reasons for the small discrepancies found are discussed.

  18. Nonlinear Computerized Methodology. A. Angle of Arrival Estimation. B. Data Modeling and Identification

    DTIC Science & Technology

    1991-06-10

    essentially in the Wigner-Ville distribution (WVD). A preliminary analysis indicates that the simple operation of autoconvolution can enhance spectral...many troublesome cases as a supplement to MUSIC (and its adaptations) and as a simple alternative to (or representation of) the Wigner-Ville... WVD is a time-frequency distribution which provides an unbiased spectrum estimate by W(t, w) = ∫ h(u) x*(t − u/2) x(t + u/2) e^(−iwu) du, where the

  19. Ways to improve your correlation functions

    NASA Technical Reports Server (NTRS)

    Hamilton, A. J. S.

    1993-01-01

    This paper describes a number of ways to improve on the standard method for measuring the two-point correlation function of large scale structure in the Universe. Issues addressed are: (1) the problem of the mean density, and how to solve it; (2) how to estimate the uncertainty in a measured correlation function; (3) minimum variance pair weighting; (4) unbiased estimation of the selection function when magnitudes are discrete; and (5) analytic computation of angular integrals in background pair counts.
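
    One concrete estimator from this paper's family of improvements is ξ = DD·RR/DR² − 1, built from normalized data-data, random-random, and data-random pair counts. A minimal 1-D toy version (illustrative positions and separation scale only):

```python
import numpy as np

def pair_fraction(a, b, r, cross=False):
    """Fraction of (cross-)pairs separated by less than r, for 1-D positions."""
    d = np.abs(a[:, None] - b[None, :])
    if cross:
        return np.mean(d < r)           # all ordered cross pairs
    iu = np.triu_indices(len(a), k=1)   # distinct unordered pairs only
    return np.mean(d[iu] < r)

def xi_hamilton(data, rand, r):
    dd = pair_fraction(data, data, r)
    rr = pair_fraction(rand, rand, r)
    dr = pair_fraction(data, rand, r, cross=True)
    return dd * rr / dr**2 - 1.0

# Sanity check: if the "data" are statistically identical to the random
# catalogue, the estimated correlation should be near zero.
rng = np.random.default_rng(1)
data = rng.uniform(0, 1, 2000)
rand = rng.uniform(0, 1, 2000)
xi = xi_hamilton(data, rand, r=0.1)
print(xi)  # close to 0 for uncorrelated points
```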

  20. Sparse Recovery via Differential Inclusions

    DTIC Science & Technology

    2014-07-01

    2242. [Wai09] Martin J. Wainwright, Sharp thresholds for high-dimensional and noisy sparsity recovery using l1-constrained quadratic programming...solution, (1.11) β_t = 0 if t < 1/y, and β_t = y(1 − e^(−κ(t − 1/y))) otherwise, which converges to the unbiased Bregman ISS estimator exponentially fast. Let us...are not given the support set S, so the following two properties are used to evaluate the performance of an estimator β̂. 1. Model selection

  1. A New Model for the Estimation of Cell Proliferation Dynamics Using CFSE Data

    DTIC Science & Technology

    2011-08-20

    cells, and hence into the resulting division and death rates. Alternatively, we propose that there is information to be learned not only from...meaningful estimation of population proliferation and death rates in a manner which is unbiased and mechanistically sound. Significantly, this new model is...change in permitting the dependence of the proliferation and death rates (α and β) and the label loss rate (v) on both time t and measured FI x. This

  2. Effects of sample size on estimates of population growth rates calculated with matrix models.

    PubMed

    Fiske, Ian J; Bruna, Emilio M; Bolker, Benjamin M

    2008-08-28

    Matrix models are widely used to study the dynamics and demography of populations. An important but overlooked issue is how the number of individuals sampled influences estimates of the population growth rate (lambda) calculated with matrix models. Even unbiased estimates of vital rates do not ensure unbiased estimates of lambda: Jensen's inequality implies that even when the estimates of the vital rates are accurate, small sample sizes lead to biased estimates of lambda due to increased sampling variance. We investigated whether sampling variability and the distribution of sampling effort among size classes lead to biases in estimates of lambda. Using data from a long-term field study of plant demography, we simulated the effects of sampling variance by drawing vital rates and calculating lambda for increasingly larger populations drawn from a total population of 3842 plants. We then compared these estimates of lambda with those based on the entire population and calculated the resulting bias. Finally, we conducted a review of the literature to determine the sample sizes typically used when parameterizing matrix models used to study plant demography. We found significant bias at small sample sizes when survival was low (survival = 0.5), and that sampling with a more realistic inverse-J-shaped population structure exacerbated this bias. However, our simulations also demonstrate that these biases rapidly become negligible with increasing sample sizes or as survival increases. For many of the sample sizes used in demographic studies, matrix models are probably robust to the biases resulting from sampling variance of vital rates. However, this conclusion may depend on the structure of populations or the distribution of sampling effort in ways that are unexplored. We suggest more intensive sampling of populations when individual survival is low, and greater sampling of stages with high elasticities.
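
    The Jensen's-inequality effect described here is easy to reproduce with a toy 2x2 projection matrix (made-up vital rates, not the study's data): lambda is the dominant eigenvalue, and since it depends concavely on the survival rate here, noisy small-sample survival estimates pull the mean lambda estimate below its true value.

```python
import numpy as np

# Toy stage-structured projection matrix: stage-2 fecundity 2.0, stage-1
# survival/transition 0.5. Lambda is the dominant eigenvalue.
A = np.array([[0.0, 2.0],
              [0.5, 0.0]])
lam = max(abs(np.linalg.eigvals(A)))
print(lam)  # 1.0: the population exactly replaces itself

# Now estimate the survival rate 0.5 from only n = 5 individuals, recompute
# lambda each time, and average. Here lambda = sqrt(2 * s), a concave function
# of s, so by Jensen's inequality the mean estimate falls below 1.0.
rng = np.random.default_rng(0)
n = 5
lams = []
for _ in range(10_000):
    s_hat = rng.binomial(n, 0.5) / n
    lams.append(max(abs(np.linalg.eigvals(np.array([[0.0, 2.0],
                                                    [s_hat, 0.0]])))))
print(np.mean(lams))  # below 1.0
```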

  3. Quantifying the accretion of hyperphosphorylated tau in the locus coeruleus and dorsal raphe nucleus: the pathological building blocks of early Alzheimer's disease.

    PubMed

    Ehrenberg, A J; Nguy, A K; Theofilas, P; Dunlop, S; Suemoto, C K; Di Lorenzo Alho, A T; Leite, R P; Diehl Rodriguez, R; Mejia, M B; Rüb, U; Farfel, J M; de Lucena Ferretti-Rebustini, R E; Nascimento, C F; Nitrini, R; Pasquallucci, C A; Jacob-Filho, W; Miller, B; Seeley, W W; Heinsen, H; Grinberg, L T

    2017-08-01

    Hyperphosphorylated tau neuronal cytoplasmic inclusions (ht-NCI) are the best protein correlate of clinical decline in Alzheimer's disease (AD). Qualitative evidence identifies ht-NCI accumulating in the isodendritic core before the entorhinal cortex. Here, we used unbiased stereology to quantify ht-NCI burden in the locus coeruleus (LC) and dorsal raphe nucleus (DRN), aiming to characterize the impact of AD pathology in these nuclei with a focus on early stages. We utilized unbiased stereology in a sample of 48 well-characterized subjects enriched for controls and early AD stages. ht-NCI counts were estimated in 60-μm-thick sections immunostained for p-tau throughout LC and DRN. Data were integrated with unbiased estimates of LC and DRN neuronal populations for a subset of cases. In Braak stage 0, 7.9% and 2.6% of neurons in LC and DRN, respectively, harbour ht-NCIs. Although the number of ht-NCI+ neurons significantly increased, by about 1.9×, from Braak stage 0 to I in LC (P = 0.02), we failed to detect any significant difference between Braak stages I and II. Also, the number of ht-NCI+ neurons remained stable in DRN across stages 0 to II. Finally, the differential susceptibility to tau inclusions among nuclear subdivisions was more notable in LC than in DRN. LC and DRN neurons exhibited ht-NCIs during precortical AD stages. ht-NCI burden increases along AD progression in both nuclei, but quantitative changes in LC precede those in DRN. © 2017 British Neuropathological Society.

  4. The geostatistical approach for structural and stratigraphic framework analysis of offshore NW Bonaparte Basin, Australia

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wahid, Ali, E-mail: ali.wahid@live.com; Salim, Ahmed Mohamed Ahmed, E-mail: mohamed.salim@petronas.com.my; Yusoff, Wan Ismail Wan, E-mail: wanismail-wanyusoff@petronas.com.my

    2016-02-01

    Geostatistics, or the statistical approach, is based on studies of temporal and spatial trends, which depend upon spatial relationships to model known information of variable(s) at unsampled locations. The statistical technique known as kriging was used for petrophysical and facies analysis; it assumes a spatial relationship to model the geological continuity between the known data and the unknown, producing a single best guess of the unknown. Kriging, also known as an optimal interpolation technique, generates the best linear unbiased estimate of each horizon. The idea is to construct a numerical model of the lithofacies and rock properties that honors the available data, and further to integrate it with interpreted seismic sections, a tectonostratigraphy chart with a short-term sea-level curve, and the regional tectonics of the study area to find the structural and stratigraphic growth history of the NW Bonaparte Basin. Using the kriging technique, models were built that help to estimate different parameters, such as horizons, facies, and porosities, in the study area. Variograms were used to identify the spatial relationships between data, which help to determine the depositional history of the North West (NW) Bonaparte Basin.
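
    As a hedged sketch of the technique named in the abstract (not the authors' 3-D workflow, and with made-up 1-D values), ordinary kriging solves a small linear system whose unit-sum weight constraint is what delivers the "best linear unbiased" property for an unknown constant mean; it also interpolates the data exactly at sampled locations.

```python
import numpy as np

def ok_predict(xs, zs, x0, sill=1.0, corr_range=2.0):
    """Ordinary kriging prediction at x0 with an exponential covariance model."""
    cov = lambda h: sill * np.exp(-np.abs(h) / corr_range)
    n = len(xs)
    # Bordered system: kriging weights plus a Lagrange multiplier enforcing
    # sum(weights) = 1, which makes the estimator unbiased.
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = cov(xs[:, None] - xs[None, :])
    A[n, n] = 0.0
    b = np.ones(n + 1)
    b[:n] = cov(xs - x0)
    w = np.linalg.solve(A, b)
    return w[:n] @ zs

xs = np.array([0.0, 1.0, 3.0, 4.5])
zs = np.array([0.20, 0.35, 0.10, 0.25])   # e.g. porosity at sampled wells
print(ok_predict(xs, zs, 1.0))  # exact interpolation at a data point: 0.35
print(ok_predict(xs, zs, 2.0))  # smooth estimate between wells
```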

  5. Introduction to symposium on unmeasured heterogeneity in school transition models

    PubMed Central

    Mare, Robert D.

    2014-01-01

    Researchers have used models of school transitions for over 30 years to describe inequality of educational opportunity and have contributed a number of important refinements and extensions. School transition models have the complication that the estimated effects of family background on the probability of continuing in school are affected by differential attrition on unobserved factors at earlier stages of schooling. The articles in this symposium present a variety of useful approaches to unobserved heterogeneity in school transition models. Investigators who use these approaches should attend to several issues: (1) models for school transitions may be used both descriptively (and are not therefore subject to any well-defined “bias”) and as tools for causal inference. (2) The concept of bias presupposes an underlying experiment, structural model, or population model that would, in principle, define the corresponding unbiased parameters – yet these underlying models are difficult to specify for school transition models. (3) Unobserved determinants of whether individuals make school transitions may be both exogenous and endogenous with respect to the observed regressors in the model. Without a model of how unobserved heterogeneity arises, attempted “corrections” for unmeasured heterogeneity may yield misleading estimates of the effects of measured determinants of school continuation. PMID:24882915

  6. Introduction to symposium on unmeasured heterogeneity in school transition models.

    PubMed

    Mare, Robert D

    2011-09-01

    Researchers have used models of school transitions for over 30 years to describe inequality of educational opportunity and have contributed a number of important refinements and extensions. School transition models have the complication that the estimated effects of family background on the probability of continuing in school are affected by differential attrition on unobserved factors at earlier stages of schooling. The articles in this symposium present a variety of useful approaches to unobserved heterogeneity in school transition models. Investigators who use these approaches should attend to several issues: (1) models for school transitions may be used both descriptively (and are not therefore subject to any well-defined "bias") and as tools for causal inference. (2) The concept of bias presupposes an underlying experiment, structural model, or population model that would, in principle, define the corresponding unbiased parameters - yet these underlying models are difficult to specify for school transition models. (3) Unobserved determinants of whether individuals make school transitions may be both exogenous and endogenous with respect to the observed regressors in the model. Without a model of how unobserved heterogeneity arises, attempted "corrections" for unmeasured heterogeneity may yield misleading estimates of the effects of measured determinants of school continuation.

  7. Demonstration of an all-optical feed-forward delay line buffer using the quadratic Stark effect and two-photon absorption in an SOA.

    PubMed

    Soto, Horacio; Tong, Miriam A; Domínguez, Juan C; Muraoka, Ramón

    2017-09-04

    We have inserted into an unbiased semiconductor optical amplifier (SOA) a powerful control beam, with photon energy slightly smaller than the band gap of its active region, to excite two-photon absorption and the quadratic Stark effect. For the available SOA, we estimated that these phenomena generated a nonlinear absorption coefficient β = -865 cm/GW and induced an appreciable birefringence inside the amplifier waveguide, which significantly modified the polarization state of a probe beam. Based on these effects, we have experimentally demonstrated the operation of an all-optical buffer, using an 80 Gb/s optical pulse comb as well as an unbiased SOA, which was therefore devoid of amplified spontaneous emission and pattern effects.

  8. Unbiased Causal Inference from an Observational Study: Results of a Within-Study Comparison

    ERIC Educational Resources Information Center

    Pohl, Steffi; Steiner, Peter M.; Eisermann, Jens; Soellner, Renate; Cook, Thomas D.

    2009-01-01

    Adjustment methods such as propensity scores and analysis of covariance are often used for estimating treatment effects in nonexperimental data. Shadish, Clark, and Steiner used a within-study comparison to test how well these adjustments work in practice. They randomly assigned participating students to a randomized or nonrandomized experiment.…

  9. ACCOUNTING FOR THE ENDOGENEITY OF HEALTH AND ENVIRONMENTAL TOBACCO SMOKE EXPOSURE IN CHILDREN: AN APPLICATION TO CONTINUOUS LUNG FUNCTION

    EPA Science Inventory

    The goal of this study is to estimate an unbiased exposure effect of environmental tobacco smoke (ETS) exposure on children's continuous lung function. A majority of the evidence from health studies suggests that ETS exposure in early life contributes significantly to childhood ...

  10. Multi-Armed RCTs: A Design-Based Framework. NCEE 2017-4027

    ERIC Educational Resources Information Center

    Schochet, Peter Z.

    2017-01-01

    Design-based methods have recently been developed as a way to analyze data from impact evaluations of interventions, programs, and policies (Imbens and Rubin, 2015; Schochet, 2015, 2016). The estimators are derived using the building blocks of experimental designs with minimal assumptions, and are unbiased and normally distributed in large samples…

  11. Bias and Efficiency in Structural Equation Modeling: Maximum Likelihood versus Robust Methods

    ERIC Educational Resources Information Center

    Zhong, Xiaoling; Yuan, Ke-Hai

    2011-01-01

    In the structural equation modeling literature, the normal-distribution-based maximum likelihood (ML) method is most widely used, partly because the resulting estimator is claimed to be asymptotically unbiased and most efficient. However, this may not hold when data deviate from normal distribution. Outlying cases or nonnormally distributed data,…

  12. On Measuring and Reducing Selection Bias with a Quasi-Doubly Randomized Preference Trial

    ERIC Educational Resources Information Center

    Joyce, Ted; Remler, Dahlia K.; Jaeger, David A.; Altindag, Onur; O'Connell, Stephen D.; Crockett, Sean

    2017-01-01

    Randomized experiments provide unbiased estimates of treatment effects, but are costly and time consuming. We demonstrate how a randomized experiment can be leveraged to measure selection bias by conducting a subsequent observational study that is identical in every way except that subjects choose their treatment--a quasi-doubly randomized…

  13. Mutually unbiased product bases for multiple qudits

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    McNulty, Daniel; Pammer, Bogdan; Weigert, Stefan

    We investigate the interplay between mutual unbiasedness and product bases for multiple qudits of possibly different dimensions. A product state of such a system is shown to be mutually unbiased to a product basis only if each of its factors is mutually unbiased to all the states which occur in the corresponding factors of the product basis. This result implies both a tight limit on the number of mutually unbiased product bases which the system can support and a complete classification of mutually unbiased product bases for multiple qubits or qutrits. In addition, only maximally entangled states can be mutually unbiased to a maximal set of mutually unbiased product bases.
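    The defining condition of mutual unbiasedness, |<e_i|f_j>|^2 = 1/d for all basis pairs, is easy to check numerically. A minimal sketch (illustrative, not from the paper) verifying it for a single qutrit using the computational and discrete-Fourier bases:

```python
import cmath

d = 3  # single-qudit dimension (a qutrit here)

# Discrete-Fourier basis vectors f_j[i] = exp(2*pi*1j*i*j/d)/sqrt(d)
fourier = [[cmath.exp(2j * cmath.pi * i * j / d) / d ** 0.5 for i in range(d)]
           for j in range(d)]
# Computational basis vectors e_k
comp = [[1.0 if i == k else 0.0 for i in range(d)] for k in range(d)]

def overlap2(u, v):
    """Squared magnitude of the inner product <u|v>."""
    return abs(sum(a.conjugate() * b for a, b in zip(u, v))) ** 2

# Mutual unbiasedness: every overlap equals 1/d
ok = all(abs(overlap2(e, f) - 1.0 / d) < 1e-12 for e in comp for f in fourier)
```

    For product systems, as the abstract notes, this condition must hold factor by factor.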

  14. A statistical approach to quasi-extinction forecasting.

    PubMed

    Holmes, Elizabeth Eli; Sabo, John L; Viscido, Steven Vincent; Fagan, William Fredric

    2007-12-01

    Forecasting population decline to a certain critical threshold (the quasi-extinction risk) is one of the central objectives of population viability analysis (PVA), and such predictions figure prominently in the decisions of major conservation organizations. In this paper, we argue that accurate forecasting of a population's quasi-extinction risk does not necessarily require knowledge of the underlying biological mechanisms. Because of the stochastic and multiplicative nature of population growth, the ensemble behaviour of population trajectories converges to common statistical forms across a wide variety of stochastic population processes. This paper provides a theoretical basis for this argument. We show that the quasi-extinction surfaces of a variety of complex stochastic population processes (including age-structured, density-dependent and spatially structured populations) can be modelled by a simple stochastic approximation: the stochastic exponential growth process overlaid with Gaussian errors. Using simulated and real data, we show that this model can be estimated with 20-30 years of data and can provide relatively unbiased quasi-extinction risk estimates with confidence intervals considerably smaller than (0,1). This was found to be true even for simulated data derived from some of the noisiest population processes (density-dependent feedback, species interactions and strong age-structure cycling). A key advantage of statistical models is that their parameters and the uncertainty of those parameters can be estimated from time series data using standard statistical methods. In contrast, for most species of conservation concern, biologically realistic models must often be specified rather than estimated because of the limited data available for all the various parameters.
Biologically realistic models will always have a prominent place in PVA for evaluating specific management options which affect a single segment of a population, a single demographic rate, or different geographic areas. However, for forecasting quasi-extinction risk, statistical models that are based on the convergent statistical properties of population processes offer many advantages over biologically realistic models.
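    The stochastic-exponential-growth approximation the authors advocate is straightforward to fit. The sketch below (simulated log-abundance series with illustrative parameter values, not the paper's data) estimates the drift and variance from year-to-year log differences and evaluates the classic diffusion-approximation probability of a log-decline of at least d within T years (the Dennis et al. 1991 first-passage formula):

```python
import math
import random

random.seed(5)

# Simulate a declining log-abundance series: drift mu, Gaussian noise sigma
years, mu_true, sigma_true = 25, -0.02, 0.1
log_n = [math.log(1000.0)]
for _ in range(years):
    log_n.append(log_n[-1] + mu_true + random.gauss(0.0, sigma_true))

# Parameter estimation from first differences of the log counts
diffs = [b - a for a, b in zip(log_n, log_n[1:])]
mu_hat = sum(diffs) / len(diffs)
var_hat = sum((x - mu_hat) ** 2 for x in diffs) / (len(diffs) - 1)

def phi(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def quasi_extinction_prob(mu, s2, d, T):
    """P(log-decline of d or more within T steps) for drifting Brownian motion."""
    s = math.sqrt(s2 * T)
    return (phi((-d - mu * T) / s)
            + math.exp(-2.0 * mu * d / s2) * phi((-d + mu * T) / s))

# e.g. risk of a ten-fold decline within 50 years
risk = quasi_extinction_prob(mu_hat, var_hat, math.log(10.0), 50.0)
```

    With d = 0 the formula returns 1 exactly, a useful sanity check.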

  15. Bayesian State-Space Modelling of Conventional Acoustic Tracking Provides Accurate Descriptors of Home Range Behavior in a Small-Bodied Coastal Fish Species

    PubMed Central

    Alós, Josep; Palmer, Miquel; Balle, Salvador; Arlinghaus, Robert

    2016-01-01

    State-space models (SSM) are increasingly applied in studies involving biotelemetry-generated positional data because they are able to estimate movement parameters from positions that are unobserved or have been observed with non-negligible observational error. Popular telemetry systems in marine coastal fish consist of arrays of omnidirectional acoustic receivers, which generate a multivariate time-series of detection events across the tracking period. Here we report a novel Bayesian fitting of an SSM application that couples mechanistic movement properties within a home range (a specific case of random walk weighted by an Ornstein-Uhlenbeck process) with a model of observational error typical for data obtained from acoustic receiver arrays. We explored the performance and accuracy of the approach through simulation modelling and extensive sensitivity analyses of the effects of various configurations of movement properties and time-steps among positions. Model results show an accurate and unbiased estimation of the movement parameters, and in most cases the simulated movement parameters were properly retrieved. Only in extreme situations (when fast swimming speeds are combined with pooling the number of detections over long time-steps) did the model produce some bias that needs to be accounted for in field applications. Our method was subsequently applied to real acoustic tracking data collected from a small marine coastal fish species, the pearly razorfish, Xyrichtys novacula. The Bayesian SSM we present here constitutes an alternative for those accustomed to the Bayesian way of reasoning. Our Bayesian SSM can be easily adapted and generalized to any species, thereby allowing studies in freely roaming animals on the ecological and evolutionary consequences of home ranges and territory establishment, both in fishes and in other taxa. PMID:27119718

  16. Narrow-sense heritability estimation of complex traits using identity-by-descent information.

    PubMed

    Evans, Luke M; Tahmasbi, Rasool; Jones, Matt; Vrieze, Scott I; Abecasis, Gonçalo R; Das, Sayantan; Bjelland, Douglas W; de Candia, Teresa R; Yang, Jian; Goddard, Michael E; Visscher, Peter M; Keller, Matthew C

    2018-03-28

    Heritability is a fundamental parameter in genetics. Traditional estimates based on family or twin studies can be biased due to shared environmental or non-additive genetic variance. Alternatively, those based on genotyped or imputed variants typically underestimate narrow-sense heritability contributed by rare or otherwise poorly tagged causal variants. Identical-by-descent (IBD) segments of the genome share all variants between pairs of chromosomes except new mutations that have arisen since the last common ancestor. Therefore, relating phenotypic similarity to degree of IBD sharing among classically unrelated individuals is an appealing approach to estimating the near full additive genetic variance while possibly avoiding biases that can occur when modeling close relatives. We applied an IBD-based approach (GREML-IBD) to estimate heritability in unrelated individuals using phenotypic simulation with thousands of whole-genome sequences across a range of stratification, polygenicity levels, and the minor allele frequencies of causal variants (CVs). In simulations, the IBD-based approach produced unbiased heritability estimates, even when CVs were extremely rare, although precision was low. However, population stratification and non-genetic familial environmental effects shared across generations led to strong biases in IBD-based heritability. We used data on two traits in ~120,000 people from the UK Biobank to demonstrate that, depending on the trait and possible confounding environmental effects, GREML-IBD can be applied to very large genetic datasets to infer the contribution of very rare variants lost using other methods. However, we observed apparent biases in these real data, suggesting that more work may be required to understand and mitigate factors that influence IBD-based heritability estimates.

  17. An approach to checking case-crossover analyses based on equivalence with time-series methods.

    PubMed

    Lu, Yun; Symons, James Morel; Geyh, Alison S; Zeger, Scott L

    2008-03-01

    The case-crossover design has been increasingly applied to epidemiologic investigations of acute adverse health effects associated with ambient air pollution. The correspondence of the design to that of matched case-control studies makes it inferentially appealing for epidemiologic studies. Case-crossover analyses generally use conditional logistic regression modeling. This technique is equivalent to time-series log-linear regression models when there is a common exposure across individuals, as in air pollution studies. Previous methods for obtaining unbiased estimates for case-crossover analyses have assumed that time-varying risk factors are constant within reference windows. In this paper, we rely on the connection between case-crossover and time-series methods to illustrate model-checking procedures from log-linear model diagnostics for time-stratified case-crossover analyses. Additionally, we compare the relative performance of the time-stratified case-crossover approach to time-series methods under 3 simulated scenarios representing different temporal patterns of daily mortality associated with air pollution in Chicago, Illinois, during 1995 and 1996. Whenever a model-be it time-series or case-crossover-fails to account appropriately for fluctuations in time that confound the exposure, the effect estimate will be biased. It is therefore important to perform model-checking in time-stratified case-crossover analyses rather than assume the estimator is unbiased.

  18. Estimating cell populations

    NASA Technical Reports Server (NTRS)

    White, B. S.; Castleman, K. R.

    1981-01-01

    An important step in the diagnosis of a cervical cytology specimen is estimating the proportions of the various cell types present. This is usually done with a cell classifier, the error rates of which can be expressed as a confusion matrix. We show how to use the confusion matrix to obtain an unbiased estimate of the desired proportions. We show that the mean square error of this estimate depends on a 'befuddlement matrix' derived from the confusion matrix, and how this, in turn, leads to a figure of merit for cell classifiers. Finally, we work out the two-class problem in detail and present examples to illustrate the theory.
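    The correction described above amounts to inverting the confusion matrix: the observed class proportions are the true proportions multiplied by the matrix of classification probabilities, so multiplying by its inverse recovers an unbiased estimate. A minimal two-class sketch with hypothetical error rates:

```python
# Hypothetical 2x2 confusion matrix: C[i][j] = P(classified as j | true class i)
C = [[0.9, 0.1],
     [0.2, 0.8]]
p_true = [0.3, 0.7]  # true cell-type proportions (for the demonstration)

# Proportions the classifier would report, on average: p_obs = p_true * C
p_obs = [sum(p_true[i] * C[i][j] for i in range(2)) for j in range(2)]

# Unbiased correction: multiply by the inverse of C (closed form for 2x2)
det = C[0][0] * C[1][1] - C[0][1] * C[1][0]
Cinv = [[C[1][1] / det, -C[0][1] / det],
        [-C[1][0] / det, C[0][0] / det]]
p_hat = [sum(p_obs[i] * Cinv[i][j] for i in range(2)) for j in range(2)]
```

    In practice p_obs would come from classifying a specimen, and the quality of the correction degrades as C approaches singularity, which is what the paper's "befuddlement matrix" quantifies.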

  19. Noncommuting observables in quantum detection and estimation theory

    NASA Technical Reports Server (NTRS)

    Helstrom, C. W.

    1972-01-01

    Basing decisions and estimates on simultaneous approximate measurements of noncommuting observables in a quantum receiver is shown to be equivalent to measuring commuting projection operators on a larger Hilbert space than that of the receiver itself. The quantum-mechanical Cramer-Rao inequalities derived from right logarithmic derivatives and symmetrized logarithmic derivatives of the density operator are compared, and it is shown that the latter give superior lower bounds on the error variances of individual unbiased estimates of arrival time and carrier frequency of a coherent signal. For a suitably weighted sum of the error variances of simultaneous estimates of these, the former yield the superior lower bound under some conditions.

  20. Simulation of design-unbiased point-to-particle sampling compared to alternatives on plantation rows

    Treesearch

    Thomas B. Lynch; David Hamlin; Mark J. Ducey

    2016-01-01

    Total quantities of tree attributes can be estimated in plantations by sampling on plantation rows using several methods. At random sample points on a row, either fixed row lengths or variable row lengths with a fixed number of sample trees can be assessed. Ratio of means or mean of ratios estimators can be developed for the fixed number of trees option but are not...

  1. Cosmographic analysis with Chebyshev polynomials

    NASA Astrophysics Data System (ADS)

    Capozziello, Salvatore; D'Agostino, Rocco; Luongo, Orlando

    2018-05-01

    The limits of standard cosmography are here revised addressing the problem of error propagation during statistical analyses. To do so, we propose the use of Chebyshev polynomials to parametrize cosmic distances. In particular, we demonstrate that building up rational Chebyshev polynomials significantly reduces error propagation with respect to standard Taylor series. This technique provides unbiased estimations of the cosmographic parameters and performs significantly better than previous numerical approximations. To figure this out, we compare rational Chebyshev polynomials with Padé series. In addition, we theoretically evaluate the convergence radius of the (1,1) Chebyshev rational polynomial and compare it with the convergence radii of Taylor and Padé approximations. We thus focus on regions in which convergence of Chebyshev rational functions is better than standard approaches. With this recipe, as high-redshift data are employed, rational Chebyshev polynomials remain highly stable and enable one to derive highly accurate analytical approximations of Hubble's rate in terms of the cosmographic series. Finally, we check our theoretical predictions by setting bounds on cosmographic parameters through Monte Carlo integration techniques, based on the Metropolis-Hastings algorithm. We apply our technique to high-redshift cosmic data, using the Joint Light-curve Analysis supernovae sample and the most recent versions of Hubble parameter and baryon acoustic oscillation measurements. We find that cosmography with Taylor series fails to be predictive with the aforementioned data sets, while the Chebyshev approach turns out to be much more stable.

  2. When Unbiased Probabilistic Learning Is Not Enough: Acquiring a Parametric System of Metrical Phonology

    ERIC Educational Resources Information Center

    Pearl, Lisa S.

    2011-01-01

    Parametric systems have been proposed as models of how humans represent knowledge about language, motivated in part as a way to explain children's rapid acquisition of linguistic knowledge. Given this, it seems reasonable to examine if children with knowledge of parameters could in fact acquire the adult system from the data available to them.…

  3. CARES/PC - CERAMICS ANALYSIS AND RELIABILITY EVALUATION OF STRUCTURES

    NASA Technical Reports Server (NTRS)

    Szatmary, S. A.

    1994-01-01

    The beneficial properties of structural ceramics include their high-temperature strength, light weight, hardness, and corrosion and oxidation resistance. For advanced heat engines, ceramics have demonstrated functional abilities at temperatures well beyond the operational limits of metals. This is offset by the fact that ceramic materials tend to be brittle. When a load is applied, their lack of significant plastic deformation causes the material to crack at microscopic flaws, destroying the component. CARES/PC performs statistical analysis of data obtained from the fracture of simple, uniaxial tensile or flexural specimens and estimates the Weibull and Batdorf material parameters from this data. CARES/PC is a subset of the program CARES (COSMIC program number LEW-15168) which calculates the fast-fracture reliability or failure probability of ceramic components utilizing the Batdorf and Weibull models to describe the effects of multi-axial stress states on material strength. CARES additionally requires that the ceramic structure be modeled by a finite element program such as MSC/NASTRAN or ANSYS. The more limited CARES/PC does not perform fast-fracture reliability estimation of components. CARES/PC estimates ceramic material properties from uniaxial tensile or from three- and four-point bend bar data. In general, the parameters are obtained from the fracture stresses of many specimens (30 or more are recommended) whose geometry and loading configurations are held constant. Parameter estimation can be performed for single or multiple failure modes by using the least-squares analysis or the maximum likelihood method. Kolmogorov-Smirnov and Anderson-Darling goodness-of-fit tests measure the accuracy of the hypothesis that the fracture data comes from a population with a distribution specified by the estimated Weibull parameters. 
Ninety-percent confidence intervals on the Weibull parameters and the unbiased value of the shape parameter for complete samples are provided when the maximum likelihood technique is used. CARES/PC is written and compiled with the Microsoft FORTRAN v5.0 compiler using the VAX FORTRAN extensions and dynamic array allocation supported by this compiler for the IBM/MS-DOS or OS/2 operating systems. The dynamic array allocation routines allow the user to match the number of fracture sets and test specimens to the memory available. Machine requirements include IBM PC compatibles with optional math coprocessor. Program output is designed to fit 80-column format printers. Executables for both DOS and OS/2 are provided. CARES/PC is distributed on one 5.25 inch 360K MS-DOS format diskette in compressed format. The expansion tool PKUNZIP.EXE is supplied on the diskette. CARES/PC was developed in 1990. IBM PC and OS/2 are trademarks of International Business Machines. MS-DOS and MS OS/2 are trademarks of Microsoft Corporation. VAX is a trademark of Digital Equipment Corporation.
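    CARES/PC itself is a FORTRAN program, but the core maximum-likelihood step it performs for a single failure mode can be sketched compactly. The sketch below (simulated fracture stresses with illustrative shape m = 10 and characteristic strength 300 MPa, not real test data) solves the Weibull profile score equation for the shape parameter by bisection, then recovers the scale:

```python
import math
import random

random.seed(42)

# Simulated fracture stresses drawn from a Weibull distribution by inversion
m_true, sigma0, n = 10.0, 300.0, 500
data = [sigma0 * (-math.log(1.0 - random.random())) ** (1.0 / m_true)
        for _ in range(n)]

def weibull_mle(x, lo=0.1, hi=100.0, tol=1e-9):
    """Maximum-likelihood Weibull shape/scale via bisection on the profile
    score equation: sum(x^m ln x)/sum(x^m) - 1/m = mean(ln x)."""
    logs = [math.log(v) for v in x]
    mean_log = sum(logs) / len(x)

    def score(m):  # increasing in m; its root is the shape MLE
        xm = [v ** m for v in x]
        return sum(a * b for a, b in zip(xm, logs)) / sum(xm) - 1.0 / m - mean_log

    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if score(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    m_hat = 0.5 * (lo + hi)
    scale_hat = (sum(v ** m_hat for v in x) / len(x)) ** (1.0 / m_hat)
    return m_hat, scale_hat

m_hat, scale_hat = weibull_mle(data)
```

    The shape MLE is biased slightly upward for small samples, which is why CARES/PC reports an unbiased (corrected) value of the shape parameter alongside the raw estimate.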

  4. A Sparse Matrix Approach for Simultaneous Quantification of Nystagmus and Saccade

    NASA Technical Reports Server (NTRS)

    Kukreja, Sunil L.; Stone, Lee; Boyle, Richard D.

    2012-01-01

    The vestibulo-ocular reflex (VOR) consists of two intermingled non-linear subsystems; namely, nystagmus and saccade. Typically, nystagmus is analysed using a single sufficiently long signal or a concatenation of them. Saccade information is not analysed and discarded due to insufficient data length to provide consistent and minimum variance estimates. This paper presents a novel sparse matrix approach to system identification of the VOR. It allows for the simultaneous estimation of both nystagmus and saccade signals. We show via simulation of the VOR that our technique provides consistent and unbiased estimates in the presence of output additive noise.

  5. Evaluation of modal pushover-based scaling of one component of ground motion: Tall buildings

    USGS Publications Warehouse

    Kalkan, Erol; Chopra, Anil K.

    2012-01-01

    Nonlinear response history analysis (RHA) is now increasingly used for performance-based seismic design of tall buildings. Required for nonlinear RHAs is a set of ground motions selected and scaled appropriately so that analysis results would be accurate (unbiased) and efficient (having relatively small dispersion). This paper evaluates accuracy and efficiency of recently developed modal pushover–based scaling (MPS) method to scale ground motions for tall buildings. The procedure presented explicitly considers structural strength and is based on the standard intensity measure (IM) of spectral acceleration in a form convenient for evaluating existing structures or proposed designs for new structures. Based on results presented for two actual buildings (19 and 52 stories, respectively), it is demonstrated that the MPS procedure provided a highly accurate estimate of the engineering demand parameters (EDPs), accompanied by significantly reduced record-to-record variability of the responses. In addition, the MPS procedure is shown to be superior to the scaling procedure specified in the ASCE/SEI 7-05 document.

  6. Fast and accurate Monte Carlo sampling of first-passage times from Wiener diffusion models.

    PubMed

    Drugowitsch, Jan

    2016-02-11

    We present a new, fast approach for drawing boundary-crossing samples from Wiener diffusion models. Diffusion models are widely applied to model choices and reaction times in two-choice decisions. Samples from these models can be used to simulate the choices and reaction times they predict. These samples, in turn, can be utilized to adjust the models' parameters to match observed behavior from humans and other animals. Usually, such samples are drawn by simulating a stochastic differential equation in discrete time steps, which is slow and leads to biases in the reaction time estimates. Our method instead exploits known expressions for first-passage time densities, which results in unbiased, exact samples and a hundred- to thousand-fold speed increase in typical situations. In its most basic form it is restricted to diffusion models with symmetric boundaries and non-leaky accumulation, but our approach can be extended to also handle asymmetric boundaries or to approximate leaky accumulation.
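    For contrast, the conventional discrete-time scheme the abstract improves upon is easy to sketch (illustrative parameters; the authors' method instead samples directly from closed-form first-passage densities, avoiding the step-size bias):

```python
import math
import random

random.seed(3)

def simulate_trial(v=1.0, a=1.0, sigma=1.0, dt=2e-3):
    """Euler simulation of a symmetric two-boundary Wiener diffusion.
    The finite step dt is what introduces the reaction-time bias that
    exact first-passage sampling avoids."""
    x, t = 0.0, 0.0
    while abs(x) < a:
        x += v * dt + sigma * math.sqrt(dt) * random.gauss(0.0, 1.0)
        t += dt
    return t, x >= a  # (reaction time, hit upper boundary?)

trials = [simulate_trial() for _ in range(1000)]
p_upper = sum(upper for _, upper in trials) / len(trials)
mean_rt = sum(t for t, _ in trials) / len(trials)
```

    With these parameters the theoretical upper-boundary probability is 1/(1 + exp(-2va/sigma^2)) ≈ 0.88 and the mean first-passage time is (a/v)·tanh(va/sigma^2) ≈ 0.76, so the Monte Carlo estimates should land near those values while paying the full per-step simulation cost.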

  7. A Comparison of Composite Reliability Estimators: Coefficient Omega Confidence Intervals in the Current Literature

    ERIC Educational Resources Information Center

    Padilla, Miguel A.; Divers, Jasmin

    2016-01-01

    Coefficient omega and alpha are both measures of the composite reliability for a set of items. Unlike coefficient alpha, coefficient omega remains unbiased with congeneric items with uncorrelated errors. Despite this ability, coefficient omega is not as widely used and cited in the literature as coefficient alpha. Reasons for coefficient omega's…

  8. Personalized recommendation based on unbiased consistence

    NASA Astrophysics Data System (ADS)

    Zhu, Xuzhen; Tian, Hui; Zhang, Ping; Hu, Zheng; Zhou, Tao

    2015-08-01

    Recently, in physical dynamics, mass-diffusion-based recommendation algorithms on bipartite network provide an efficient solution by automatically pushing possible relevant items to users according to their past preferences. However, traditional mass-diffusion-based algorithms just focus on unidirectional mass diffusion from objects having been collected to those which should be recommended, resulting in a biased causal similarity estimation and not-so-good performance. In this letter, we argue that in many cases, a user's interests are stable, and thus bidirectional mass diffusion abilities, no matter originated from objects having been collected or from those which should be recommended, should be consistently powerful, showing unbiased consistence. We further propose a consistence-based mass diffusion algorithm via bidirectional diffusion against biased causality, outperforming the state-of-the-art recommendation algorithms in disparate real data sets, including Netflix, MovieLens, Amazon and Rate Your Music.
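    The unidirectional baseline the letter builds on (standard mass diffusion, also known as ProbS) can be sketched on a toy bipartite network. User and item names below are illustrative, not from the paper's data sets:

```python
# Toy user-object bipartite network (illustrative data)
users = {
    "u1": {"a", "b"},
    "u2": {"b", "c"},
    "u3": {"a", "c", "d"},
}
items = sorted({i for s in users.values() for i in s})

def mass_diffusion(target):
    """Unidirectional mass diffusion (ProbS): resource flows from the
    target user's collected items to users, then back to all items."""
    item_degree = {i: sum(i in s for s in users.values()) for i in items}
    # Step 1: each collected item spreads one unit equally to its users.
    user_resource = {u: 0.0 for u in users}
    for i in users[target]:
        for u, s in users.items():
            if i in s:
                user_resource[u] += 1.0 / item_degree[i]
    # Step 2: each user spreads its resource equally over its items.
    score = {i: 0.0 for i in items}
    for u, r in user_resource.items():
        for i in users[u]:
            score[i] += r / len(users[u])
    return score

scores = mass_diffusion("u1")
# Items already collected by the target would be filtered out before ranking.
```

    The letter's consistence-based variant additionally runs the diffusion in the reverse direction and requires both diffusion abilities to agree, which is the step not shown in this sketch.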

  9. Performance study for a set of BLUE based Filters applied to amplitude estimation using as a reference the single photoelectron signal of the ν-Angra Experiment

    NASA Astrophysics Data System (ADS)

    Souza, D. M.; Costa, I. A.; Nóbrega, R. A.

    2017-10-01

    This document presents a detailed study of the performance of a set of digital filters whose implementations are based on best linear unbiased estimator (BLUE) theory, interpreted as a constrained optimization problem that can be relaxed depending on the input signal characteristics. This approach has been employed by a number of recent particle physics experiments for measuring the energy of particle events interacting with their detectors. The considered filters have been designed to measure the peak amplitude of detector signals from their digitized versions. This study provides a clear understanding of the characteristics of those filters in the context of particle physics and, additionally, it proposes a phase-related constraint based on the second derivative of the Taylor expansion in order to make the estimator less sensitive to phase variation (the phase between the analog signal shaping and its sampled version), which is stronger in asynchronous experiments. The asynchronous detector developed by the ν-Angra Collaboration is used as the basis for this work. Nevertheless, the proposed analysis goes beyond it, considering a wide range of conditions related to signal parameters such as pedestal, phase, sampling rate, amplitude resolution, noise and pile-up, making it interesting and useful for other asynchronous and even synchronous experiments.
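    At the core of such filters is the classic BLUE weight construction: minimize the estimator variance subject to the unbiasedness constraint w·s = 1, where s is the normalized pulse shape. A minimal numerical sketch (hypothetical pulse shape and per-sample noise variances, unrelated to the ν-Angra electronics; a diagonal covariance keeps the sketch free of matrix algebra, and the general case uses C^-1·s in the same way):

```python
import math
import random

random.seed(7)

# Hypothetical normalized pulse shape sampled at 4 points (peak = 1)
s = [0.2, 1.0, 0.7, 0.3]
var = [0.01, 0.02, 0.01, 0.02]  # independent noise variance per sample

# BLUE weights: w proportional to C^-1 s, scaled so that w . s = 1
raw = [si / vi for si, vi in zip(s, var)]
norm = sum(r * si for r, si in zip(raw, s))
w = [r / norm for r in raw]

# Monte Carlo check that the estimator is unbiased for the pulse amplitude
A_true, n = 5.0, 20000
est_sum = 0.0
for _ in range(n):
    sample = [A_true * si + random.gauss(0.0, math.sqrt(vi))
              for si, vi in zip(s, var)]
    est_sum += sum(wi * xi for wi, xi in zip(w, sample))
A_bar = est_sum / n
```

    The paper's phase-related constraint would add a second linear condition on w, trading a little variance for robustness against sampling-phase variation.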

  10. GTARG - The TOPEX/Poseidon ground track maintenance maneuver targeting program

    NASA Technical Reports Server (NTRS)

    Shapiro, Bruce E.; Bhat, Ramachandra S.

    1993-01-01

    GTARG is a computer program used to design orbit maintenance maneuvers for the TOPEX/Poseidon satellite. These maneuvers ensure that the ground track is kept within +/-1 km of a 9.9-day exact repeat pattern. Maneuver parameters are determined using either of two targeting strategies: longitude targeting, which maximizes the time between maneuvers, and time targeting, in which maneuvers are targeted to occur at specific intervals. The GTARG algorithm propagates nonsingular mean elements, taking into account anticipated error sigmas in orbit determination, Delta v execution, drag prediction, and Delta v quantization. A satellite-unique drag model is used which incorporates an approximate mean orbital Jacchia-Roberts atmosphere and a variable mean area model. Maneuver Delta v magnitudes are targeted to precisely maintain either the unbiased ground track itself, or a comfortable (3 sigma) error envelope about the unbiased ground track.

  11. The Swift GRB Host Galaxy Legacy Survey

    NASA Astrophysics Data System (ADS)

    Perley, Daniel A.

    2015-01-01

    I introduce the Swift Host Galaxy Legacy Survey (SHOALS), a comprehensive multiwavelength program to characterize the demographics of the GRB host population across its entire redshift range. Using unbiased selection criteria we have designated a subset of 130 Swift gamma-ray bursts which are now being targeted with intensive observational follow-up. Deep Spitzer imaging of every field has already been obtained and analyzed, with major programs ongoing at Keck, GTC, and Gemini to obtain complementary optical/NIR photometry to enable full SED modeling and derivation of fundamental physical parameters such as mass, extinction, and star-formation rate. Using these data I will present an unbiased measurement of the GRB host-galaxy luminosity and mass functions and their evolution with redshift between z=0 and z=5, compare GRB hosts to other star-forming galaxy populations, and discuss implications for the nature of the GRB progenitor and the ability of GRBs to probe cosmic star-formation.

  12. A bias-corrected estimator in multiple imputation for missing data.

    PubMed

    Tomita, Hiroaki; Fujisawa, Hironori; Henmi, Masayuki

    2018-05-29

    Multiple imputation (MI) is one of the most popular methods to deal with missing data, and its use has been rapidly increasing in medical studies. Although MI is rather appealing in practice, since it is possible to use ordinary statistical methods for a complete data set once the missing values are fully imputed, the method of imputation is still problematic. If the missing values are imputed from some parametric model, the validity of imputation is not necessarily ensured, and the final estimate for a parameter of interest can be biased unless the parametric model is correctly specified. Nonparametric methods have also been proposed for MI, but it is not straightforward to produce imputation values from nonparametrically estimated distributions. In this paper, we propose a new method for MI to obtain a consistent (or asymptotically unbiased) final estimate even if the imputation model is misspecified. The key idea is to use an imputation model from which the imputation values are easily produced and to make a proper correction in the likelihood function after the imputation, using the density ratio between the imputation model and the true conditional density function for the missing variable as a weight. Although the conditional density must be nonparametrically estimated, it is not used for the imputation. The performance of our method is evaluated by both theory and simulation studies. A real data analysis is also conducted to illustrate our method, using the Duke Cardiac Catheterization Coronary Artery Disease Diagnostic Dataset. Copyright © 2018 John Wiley & Sons, Ltd.

  13. Systematic Testing of Belief-Propagation Estimates for Absolute Free Energies in Atomistic Peptides and Proteins.

    PubMed

    Donovan-Maiye, Rory M; Langmead, Christopher J; Zuckerman, Daniel M

    2018-01-09

    Motivated by the extremely high computing costs associated with estimates of free energies for biological systems using molecular simulations, we further the exploration of existing "belief propagation" (BP) algorithms for fixed-backbone peptide and protein systems. The precalculation of pairwise interactions among discretized libraries of side-chain conformations, along with representation of protein side chains as nodes in a graphical model, enables direct application of the BP approach, which requires only ∼1 s of single-processor run time after the precalculation stage. We use a "loopy BP" algorithm, which can be seen as an approximate generalization of the transfer-matrix approach to highly connected (i.e., loopy) graphs, and it has previously been applied to protein calculations. We examine the application of loopy BP to several peptides as well as the binding site of the T4 lysozyme L99A mutant. The present study reports on (i) the comparison of the approximate BP results with estimates from unbiased estimators based on the Amber99SB force field; (ii) investigation of the effects of varying library size on BP predictions; and (iii) a theoretical discussion of the discretization effects that can arise in BP calculations. The data suggest that, despite their approximate nature, BP free-energy estimates are highly accurate; indeed, they never fall outside confidence intervals from unbiased estimators for the systems where independent results could be obtained. Furthermore, we find that libraries of sufficiently fine discretization (which diminish library-size sensitivity) can be obtained with standard computing resources in most cases. Altogether, the extremely low computing times and accurate results suggest the BP approach warrants further study.

  14. A brief update on physical and optical disector applications and sectioning-staining methods in neuroscience.

    PubMed

    Yurt, Kıymet Kübra; Kivrak, Elfide Gizem; Altun, Gamze; Mohamed, Hamza; Ali, Fathelrahman; Gasmalla, Hosam Eldeen; Kaplan, Suleyman

    2018-02-26

    A quantitative description of a three-dimensional (3D) object based on two-dimensional images can be made using stereological methods. These methods involve unbiased approaches and provide reliable results with quantitative data. The quantitative morphology of the nervous system has been thoroughly researched in this context. In particular, various novel methods such as design-based stereological approaches have been applied in neuromorphological studies. The main foundations of these methods are systematic random sampling and a 3D approach to structures such as tissues and organs. One key point in these methods is that selected samples should represent the entire structure. Quantification of neurons, i.e. particles, is important for revealing degrees of neurodegeneration and regeneration in an organ or system. One of the most crucial morphometric parameters in biological studies is thus the "number". The disector counting method introduced by Sterio in 1984 is an efficient and reliable solution for particle number estimation. In order to obtain precise results by means of stereological analysis, the items to be counted must be clearly visible in the tissue; if an item cannot be seen, it cannot be analyzed, even using unbiased stereological techniques. Staining and sectioning processes therefore play a critical role in stereological analysis. The purpose of this review is to evaluate current neuroscientific studies using optical and physical disector counting methods and to discuss their definitions and methodological characteristics. Although the efficiency of the optical disector method in light microscopic studies has been revealed in recent years, the physical disector method is more easily performed in electron microscopic studies. In this review we also offer readers summaries of some common basic staining and sectioning methods that can be used with stereological techniques. Copyright © 2018 Elsevier B.V. All rights reserved.

  15. Responses of pond-breeding amphibians to wildfire: Short-term patterns in occupancy and colonization

    USGS Publications Warehouse

    Hossack, B.R.; Corn, P.S.

    2007-01-01

    Wildland fires are expected to become more frequent and severe in many ecosystems, potentially posing a threat to many sensitive species. We evaluated the effects of a large, stand-replacement wildfire on three species of pond-breeding amphibians by estimating changes in occupancy of breeding sites during the three years before and after the fire burned 42 of 83 previously surveyed wetlands. Annual occupancy and colonization for each species was estimated using recently developed models that incorporate detection probabilities to provide unbiased parameter estimates. We did not find negative effects of the fire on the occupancy or colonization rates of the long-toed salamander (Ambystoma macrodactylum). Instead, its occupancy was higher across the study area after the fire, possibly in response to a large snowpack that may have facilitated colonization of unoccupied wetlands. Naïve data (uncorrected for detection probability) for the Columbia spotted frog (Rana luteiventris) initially led to the conclusion of increased occupancy and colonization in wetlands that burned. After accounting for temporal and spatial variation in detection probabilities, however, it was evident that these parameters were relatively stable in both areas before and after the fire. We found a similar discrepancy between naïve and estimated occupancy of A. macrodactylum that resulted from different detection probabilities in burned and control wetlands. The boreal toad (Bufo boreas) was not found breeding in the area prior to the fire but colonized several wetlands the year after they burned. Occupancy by B. boreas then declined during years 2 and 3 following the fire. Our study suggests that the amphibian populations we studied are resistant to wildfire and that B. boreas may experience short-term benefits from wildfire. Our data also illustrate how naïve presence–non-detection data can provide misleading results.
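A minimal sketch of the kind of detection-corrected occupancy model referred to here (a MacKenzie-style single-season model with constant occupancy ψ and detection probability p, fit by maximum likelihood). All parameter values and sample sizes below are invented for illustration, not taken from the study; the point is that the naive estimate (fraction of sites with at least one detection) is biased low whenever p < 1, while the likelihood-based estimate corrects for imperfect detection.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)

# Simulate detection histories: n_sites sites, each visited n_visits times.
n_sites, n_visits = 200, 5
psi_true, p_true = 0.6, 0.4           # occupancy and per-visit detection (invented)
z = rng.random(n_sites) < psi_true    # latent occupancy state
y = (rng.random((n_sites, n_visits)) < p_true) & z[:, None]

def neg_log_lik(theta):
    # Logit-scale parameters keep the optimizer unconstrained.
    psi, p = 1 / (1 + np.exp(-np.asarray(theta)))
    d = y.sum(axis=1)                 # detections per site
    # Site likelihood: occupied-and-detected, or never detected
    # (occupied but missed every visit, or truly unoccupied).
    lik = np.where(
        d > 0,
        psi * p**d * (1 - p) ** (n_visits - d),
        psi * (1 - p) ** n_visits + (1 - psi),
    )
    return -np.log(lik).sum()

res = minimize(neg_log_lik, x0=[0.0, 0.0], method="Nelder-Mead")
psi_hat, p_hat = 1 / (1 + np.exp(-res.x))
naive = (y.sum(axis=1) > 0).mean()    # naive occupancy ignores detection
```

The naive estimate underestimates ψ by roughly the factor 1 − (1 − p)^K, which is why the abstract's naive and corrected conclusions can diverge.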

  16. Data Series Subtraction with Unknown and Unmodeled Background Noise

    NASA Technical Reports Server (NTRS)

    Vitale, Stefano; Congedo, Giuseppe; Dolesi, Rita; Ferroni, Valerio; Hueller, Mauro; Vetrugno, Daniele; Weber, William Joseph; Audley, Heather; Danzmann, Karsten; Diepholz, Ingo

    2014-01-01

    LISA Pathfinder (LPF), the precursor mission to a gravitational wave observatory of the European Space Agency, will measure the degree to which two test masses can be put into free fall, aiming to demonstrate a suppression of disturbance forces corresponding to a residual relative acceleration with a power spectral density (PSD) below (30 fm s⁻²/√Hz)² around 1 mHz. In LPF data analysis, the disturbance forces are obtained as the difference between the acceleration data and a linear combination of other measured data series. In many circumstances, the coefficients for this linear combination are obtained by fitting these data series to the acceleration, and the disturbance forces appear then as the data series of the residuals of the fit. Thus the background noise or, more precisely, its PSD, whose knowledge is needed to build up the likelihood function in ordinary maximum likelihood fitting, is here unknown, and its estimate constitutes instead one of the goals of the fit. In this paper we present a fitting method that does not require knowledge of the PSD of the background noise. The method is based on the analytical marginalization of the posterior parameter probability density with respect to the background noise PSD, and returns an estimate both for the fitting parameters and for the PSD. We show that both these estimates are unbiased, and that, when using averaged Welch's periodograms for the residuals, the estimate of the PSD is consistent, as its error tends to zero with the inverse square root of the number of averaged periodograms. Additionally, we find that the method is equivalent to some implementations of iteratively reweighted least-squares fitting. We have tested the method both on simulated data of known PSD and on data from several experiments performed with the LISA Pathfinder end-to-end mission simulator.
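The consistency claim — the PSD error shrinking like the inverse square root of the number of averaged periodograms — can be checked with a small numpy sketch. This is a deliberate simplification (white noise, rectangular windows, unit sample rate), not the LPF analysis pipeline:

```python
import numpy as np

rng = np.random.default_rng(0)

def welch_psd(x, nseg):
    """Average of per-segment periodograms (rectangular window, unit rate)."""
    L = len(x) // nseg
    segs = x[: nseg * L].reshape(nseg, L)
    return (np.abs(np.fft.rfft(segs, axis=1)) ** 2 / L).mean(axis=0)

x = rng.standard_normal(1 << 16)       # white noise: flat PSD, level ~1 per bin
scatter = {}
for nseg in (4, 64):
    psd = welch_psd(x, nseg)[1:-1]     # drop DC and Nyquist bins
    # Relative scatter of each bin estimate, roughly 1/sqrt(nseg)
    scatter[nseg] = psd.std() / psd.mean()
```

Each interior bin of the averaged periodogram is a mean of `nseg` (approximately) independent exponential variates, so its relative standard deviation is about 1/√nseg: roughly 0.5 for 4 segments and 0.125 for 64.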

  17. Sampling in ecology and evolution - bridging the gap between theory and practice

    USGS Publications Warehouse

    Albert, C.H.; Yoccoz, N.G.; Edwards, T.C.; Graham, C.H.; Zimmermann, N.E.; Thuiller, W.

    2010-01-01

    Sampling is a key issue for answering most ecological and evolutionary questions. The importance of developing a rigorous sampling design tailored to specific questions has already been discussed in the ecological and sampling literature and has provided useful tools and recommendations to sample and analyse ecological data. However, sampling issues are often difficult to overcome in ecological studies due to apparent inconsistencies between theory and practice, often leading to the implementation of simplified sampling designs that suffer from unknown biases. Moreover, we believe that classical sampling principles which are based on estimation of means and variances are insufficient to fully address many ecological questions that rely on estimating relationships between a response and a set of predictor variables over time and space. Our objective is thus to highlight the importance of selecting an appropriate sampling space and an appropriate sampling design. We also emphasize the importance of using prior knowledge of the study system to estimate models or complex parameters and thus better understand ecological patterns and processes generating these patterns. Using a semi-virtual simulation study as an illustration we reveal how the selection of the space (e.g. geographic, climatic), in which the sampling is designed, influences the patterns that can be ultimately detected. We also demonstrate the inefficiency of common sampling designs to reveal response curves between ecological variables and climatic gradients. Further, we show that response-surface methodology, which has rarely been used in ecology, is much more efficient than more traditional methods. Finally, we discuss the use of prior knowledge, simulation studies and model-based designs in defining appropriate sampling designs. 
We conclude with a call for the development of methods to unbiasedly estimate nonlinear ecologically relevant parameters, in order to make inferences while fulfilling requirements of both sampling theory and field work logistics. © 2010 The Authors.

  18. Empirical Benchmarks of Hidden Bias in Educational Research: Implication for Assessing How well Propensity Score Methods Approximate Experiments and Conducting Sensitivity Analysis

    ERIC Educational Resources Information Center

    Dong, Nianbo; Lipsey, Mark

    2014-01-01

    When randomized control trials (RCT) are not feasible, researchers seek other methods to make causal inference, e.g., propensity score methods. One of the underlying assumptions for the propensity score methods to obtain unbiased treatment effect estimates is the ignorability assumption, that is, conditional on the propensity score, treatment…

  19. The Performance of Multilevel Growth Curve Models under an Autoregressive Moving Average Process

    ERIC Educational Resources Information Center

    Murphy, Daniel L.; Pituch, Keenan A.

    2009-01-01

    The authors examined the robustness of multilevel linear growth curve modeling to misspecification of an autoregressive moving average process. As previous research has shown (J. Ferron, R. Dailey, & Q. Yi, 2002; O. Kwok, S. G. West, & S. B. Green, 2007; S. Sivo, X. Fan, & L. Witta, 2005), estimates of the fixed effects were unbiased, and Type I…

  20. Grade of Membership Response Time Model for Detecting Guessing Behaviors

    ERIC Educational Resources Information Center

    Pokropek, Artur

    2016-01-01

    A response model that is able to detect guessing behaviors and produce unbiased estimates in low-stake conditions using timing information is proposed. The model is a special case of the grade of membership model in which responses are modeled as partial members of a class that is affected by motivation and a class that responds only according to…

  1. The Long-Term Impacts of Teachers: Teacher Value-Added and Student Outcomes in Adulthood. NBER Working Paper No. 17699

    ERIC Educational Resources Information Center

    Chetty, Raj; Friedman, John N.; Rockoff, Jonah E.

    2011-01-01

    Are teachers' impacts on students' test scores ("value-added") a good measure of their quality? This question has sparked debate largely because of disagreement about (1) whether value-added (VA) provides unbiased estimates of teachers' impacts on student achievement and (2) whether high-VA teachers improve students' long-term outcomes.…

  2. Development of volume equations using data obtained by upper stem dendrometry with Monte Carlo integration: preliminary results for eastern redcedar

    Treesearch

    Thomas B. Lynch; Rodney E. Will; Rider Reynolds

    2013-01-01

    Preliminary results are given for development of an eastern redcedar (Juniperus virginiana) cubic-volume equation based on measurements of redcedar sample tree stem volume using dendrometry with Monte Carlo integration. Monte Carlo integration techniques can be used to provide unbiased estimates of stem cubic-foot volume based on upper stem diameter...

  3. Proof of Concept for an Approach to a Finer Resolution Inventory

    Treesearch

    Chris J. Cieszewski; Kim Iles; Roger C. Lowe; Michal Zasada

    2005-01-01

    This report presents a proof of concept for a statistical framework to develop a timely, accurate, and unbiased fiber supply assessment in the State of Georgia, U.S.A. The proposed approach is based on using various data sources and modeling techniques to calibrate satellite image-based statewide stand lists, which provide initial estimates for a State inventory on a...

  4. Should multiple imputation be the method of choice for handling missing data in randomized trials?

    PubMed Central

    Sullivan, Thomas R; White, Ian R; Salter, Amy B; Ryan, Philip; Lee, Katherine J

    2016-01-01

    The use of multiple imputation has increased markedly in recent years, and journal reviewers may expect to see multiple imputation used to handle missing data. However in randomized trials, where treatment group is always observed and independent of baseline covariates, other approaches may be preferable. Using data simulation we evaluated multiple imputation, performed both overall and separately by randomized group, across a range of commonly encountered scenarios. We considered both missing outcome and missing baseline data, with missing outcome data induced under missing at random mechanisms. Provided the analysis model was correctly specified, multiple imputation produced unbiased treatment effect estimates, but alternative unbiased approaches were often more efficient. When the analysis model overlooked an interaction effect involving randomized group, multiple imputation produced biased estimates of the average treatment effect when applied to missing outcome data, unless imputation was performed separately by randomized group. Based on these results, we conclude that multiple imputation should not be seen as the only acceptable way to handle missing data in randomized trials. In settings where multiple imputation is adopted, we recommend that imputation is carried out separately by randomized group. PMID:28034175
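The headline result — imputation that omits a treatment-by-covariate interaction biases the average treatment effect estimate unless imputation is done separately by randomized arm — can be reproduced in a toy numpy simulation. This uses a single conditional-mean imputation rather than proper multiple imputation (no between-imputation draws or pooling), and all coefficients, sample sizes, and the missingness mechanism are invented:

```python
import numpy as np

rng = np.random.default_rng(3)

def fit_predict(X, y, Xnew):
    """Least-squares fit with intercept; return predictions at Xnew."""
    A = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return np.column_stack([np.ones(len(Xnew)), Xnew]) @ beta

def one_trial(n=500, ate=1.0, interaction=2.0):
    x = rng.standard_normal(n)
    t = rng.integers(0, 2, n)
    y = x + ate * t + interaction * t * x + rng.standard_normal(n)
    miss = rng.random(n) < 1 / (1 + np.exp(-x))   # MAR: depends on x only

    # (a) impute overall, from a model omitting the treatment-by-x interaction
    Xo = np.column_stack([x, t])
    y_ov = y.copy()
    y_ov[miss] = fit_predict(Xo[~miss], y[~miss], Xo[miss])

    # (b) impute separately within each randomized group
    y_gr = y.copy()
    for g in (0, 1):
        fit_mask = (t == g) & ~miss
        imp_mask = (t == g) & miss
        y_gr[imp_mask] = fit_predict(x[fit_mask, None], y[fit_mask], x[imp_mask, None])

    est = lambda yy: yy[t == 1].mean() - yy[t == 0].mean()
    return est(y_ov), est(y_gr)

results = np.array([one_trial() for _ in range(200)])
bias_overall, bias_bygroup = results.mean(axis=0) - 1.0   # true ATE is 1.0
```

Within each arm the imputation model is correctly specified, so by-group imputation stays unbiased; the pooled model averages the two arm-specific slopes and systematically mis-imputes the cases where missingness is concentrated.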

  5. Should multiple imputation be the method of choice for handling missing data in randomized trials?

    PubMed

    Sullivan, Thomas R; White, Ian R; Salter, Amy B; Ryan, Philip; Lee, Katherine J

    2016-01-01

    The use of multiple imputation has increased markedly in recent years, and journal reviewers may expect to see multiple imputation used to handle missing data. However in randomized trials, where treatment group is always observed and independent of baseline covariates, other approaches may be preferable. Using data simulation we evaluated multiple imputation, performed both overall and separately by randomized group, across a range of commonly encountered scenarios. We considered both missing outcome and missing baseline data, with missing outcome data induced under missing at random mechanisms. Provided the analysis model was correctly specified, multiple imputation produced unbiased treatment effect estimates, but alternative unbiased approaches were often more efficient. When the analysis model overlooked an interaction effect involving randomized group, multiple imputation produced biased estimates of the average treatment effect when applied to missing outcome data, unless imputation was performed separately by randomized group. Based on these results, we conclude that multiple imputation should not be seen as the only acceptable way to handle missing data in randomized trials. In settings where multiple imputation is adopted, we recommend that imputation is carried out separately by randomized group.

  6. Estimating unbiased magnitudes for the announced DPRK nuclear tests, 2006-2016

    NASA Astrophysics Data System (ADS)

    Peacock, Sheila; Bowers, David

    2017-04-01

    The seismic disturbances generated from the five (2006-2016) announced nuclear test explosions by the Democratic People's Republic of Korea (DPRK) are of moderate magnitude (body-wave magnitude mb 4-5) by global earthquake standards. An upward bias of network mean mb of low- to moderate-magnitude events is long established, and is caused by the censoring of readings from stations where the signal was below noise level at the time of the predicted arrival. This sampling bias can be overcome by maximum-likelihood methods using station thresholds at detecting (and non-detecting) stations. Bias in the mean mb can also be introduced by differences in the network of stations recording each explosion; this bias can be reduced by using station corrections. We apply a joint maximum-likelihood (JML) inversion that jointly estimates station corrections and unbiased network mb for the five DPRK explosions recorded by the CTBTO International Monitoring System (IMS) network of seismic stations. The thresholds can either be directly measured from the noise preceding the observed signal or determined by statistical analysis of bulletin amplitudes. The network mb of the first and smallest explosion is reduced significantly relative to the mean mb (to < 4.0 mb) by removal of the censoring bias.
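The censoring correction can be sketched as a toy maximum-likelihood fit for a single event: detecting stations contribute the normal density of their magnitude reading, and non-detecting stations contribute the probability that their reading fell below the station threshold. Station values, thresholds, and the scatter are hypothetical, and the actual JML inversion additionally estimates station corrections jointly across events:

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import norm

rng = np.random.default_rng(4)

# Simulated single-station magnitudes for one event (invented values).
mb_true, sigma = 4.0, 0.35
n_sta = 60
m = mb_true + sigma * rng.standard_normal(n_sta)
thresh = rng.uniform(3.6, 4.4, n_sta)   # per-station detection thresholds
detected = m > thresh

naive_mb = m[detected].mean()           # mean over detections only: biased upward

def neg_log_lik(mb):
    # Detections contribute the normal density; non-detections contribute
    # the probability of the reading being censored below threshold.
    ll = norm.logpdf(m[detected], loc=mb, scale=sigma).sum()
    ll += norm.logcdf(thresh[~detected], loc=mb, scale=sigma).sum()
    return -ll

ml_mb = minimize_scalar(neg_log_lik, bounds=(3.0, 5.0), method="bounded").x
```

Because the detected-only mean discards exactly the low readings, it sits above the censored-likelihood estimate, mirroring the upward network-mean bias described in the abstract.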

  7. MRMC analysis of agreement studies

    NASA Astrophysics Data System (ADS)

    Gallas, Brandon D.; Anam, Amrita; Chen, Weijie; Wunderlich, Adam; Zhang, Zhiwei

    2016-03-01

    The purpose of this work is to present and evaluate methods based on U-statistics to compare intra- or inter-reader agreement across different imaging modalities. We apply these methods to multi-reader multi-case (MRMC) studies. We measure reader-averaged agreement and estimate its variance accounting for the variability from readers and cases (an MRMC analysis). In our application, pathologists (readers) evaluate patient tissue mounted on glass slides (cases) in two ways. They evaluate the slides on a microscope (reference modality) and they evaluate digital scans of the slides on a computer display (new modality). In the current work, we consider concordance as the agreement measure, but many of the concepts outlined here apply to other agreement measures. Concordance is the probability that two readers rank two cases in the same order. Concordance can be estimated with a U-statistic and thus it has some nice properties: it is unbiased, asymptotically normal, and its variance is given by an explicit formula. Another property of a U-statistic is that it is symmetric in its inputs; it doesn't matter which reader is listed first or which case is listed first, the result is the same. Using this property and a few tricks while building the U-statistic kernel for concordance, we get a mathematically tractable problem and efficient software. Simulations show that our variance and covariance estimates are unbiased.
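Concordance as defined here (the probability that two readings rank two cases in the same order) has a direct U-statistic estimator: average an order-agreement kernel over all case pairs. A small self-contained sketch, in which the tie handling (half credit) is one common convention and not necessarily the authors':

```python
import numpy as np
from itertools import combinations

def concordance(scores_a, scores_b):
    """U-statistic estimate of P(two readings order two cases the same way).

    scores_a, scores_b: scores from two readings of the same cases.
    Pairs tied in either reading count as half-concordant (one convention).
    """
    n_conc, n_pairs = 0.0, 0
    for i, j in combinations(range(len(scores_a)), 2):
        da = np.sign(scores_a[i] - scores_a[j])
        db = np.sign(scores_b[i] - scores_b[j])
        if da == db != 0:
            n_conc += 1          # same strict order
        elif da == 0 or db == 0:
            n_conc += 0.5        # a tie in at least one reading
        n_pairs += 1
    return n_conc / n_pairs

a = [1, 2, 3, 4, 5]
b = [1, 2, 3, 5, 4]              # one swapped pair out of 10 -> concordance 0.9
```

The kernel is symmetric in cases and in readings, which is the property the abstract exploits: swapping the two readers (or the two cases) leaves the estimate unchanged.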

  8. Phenology of Scramble Polygyny in a Wild Population of Chrysomelid Beetles: The Opportunity for and the Strength of Sexual Selection

    PubMed Central

    Baena, Martha Lucía; Macías-Ordóñez, Rogelio

    2012-01-01

    Recent debate has highlighted the importance of estimating both the strength of sexual selection on phenotypic traits, and the opportunity for sexual selection. We describe seasonal fluctuations in mating dynamics of Leptinotarsa undecimlineata (Coleoptera: Chrysomelidae). We compared several estimates of the opportunity for, and the strength of, sexual selection and male precopulatory competition over the reproductive season. First, using a null model, we suggest that the ratio between observed values of the opportunity for sexual selection and their expected value under random mating results in unbiased estimates of the actual nonrandom mating behavior of the population. Second, we found that estimates for the whole reproductive season often misrepresent the actual value at any given time period. Third, mating differentials on male size and mobility, frequency of male fighting and three estimates of the opportunity for sexual selection provide contrasting but complementary information. More intense sexual selection associated with male mobility, but not with male size, was observed in periods with high opportunity for sexual selection and high frequency of male fights. Fourth, based on parameters of spatial and temporal aggregation of female receptivity, we describe the mating system of L. undecimlineata as a scramble mating polygyny in which the opportunity for sexual selection varies widely throughout the season, but the strength of sexual selection on male size remains fairly weak, while male mobility inversely covaries with mating success. We suggest that different estimates for the opportunity for, and intensity of, sexual selection should be applied in order to discriminate how different behavioral and demographic factors shape the reproductive dynamics of populations. PMID:22761675

  9. Cost-sensitive AdaBoost algorithm for ordinal regression based on extreme learning machine.

    PubMed

    Riccardi, Annalisa; Fernández-Navarro, Francisco; Carloni, Sante

    2014-10-01

    In this paper, the well known stagewise additive modeling using a multiclass exponential (SAMME) boosting algorithm is extended to address problems where there exists a natural order in the targets using a cost-sensitive approach. The proposed ensemble model uses an extreme learning machine (ELM) model as a base classifier (with the Gaussian kernel and the additional regularization parameter). The closed form of the derived weighted least squares problem is provided, and it is employed to estimate analytically the parameters connecting the hidden layer to the output layer at each iteration of the boosting algorithm. Compared to the state-of-the-art boosting algorithms, in particular those using ELM as base classifier, the suggested technique does not require the generation of a new training dataset at each iteration. The adoption of the weighted least squares formulation of the problem has been presented as an unbiased and alternative approach to the already existing ELM boosting techniques. Moreover, the addition of a cost model for weighting the patterns, according to the order of the targets, enables the classifier to tackle ordinal regression problems further. The proposed method has been validated by an experimental study by comparing it with already existing ensemble methods and ELM techniques for ordinal regression, showing competitive results.

  10. Large Area Crop Inventory Experiment (LACIE). Development of procedure M for multicrop inventory, with tests of a spring-wheat configuration

    NASA Technical Reports Server (NTRS)

    Horvath, R. (Principal Investigator); Cicone, R.; Crist, E.; Kauth, R. J.; Lambeck, P.; Malila, W. A.; Richardson, W.

    1979-01-01

    The author has identified the following significant results. An outgrowth of research and development activities in support of LACIE was a multicrop area estimation procedure, Procedure M. This procedure was a flexible, modular system that could be operated within the LACIE framework. Its distinctive features were refined preprocessing (including spatially varying correction for atmospheric haze), definition of field-like spatial features for labeling, spectral stratification, unbiased selection of samples to label, and crop area estimation without conventional maximum-likelihood classification.

  11. Objective Analysis of Oceanic Data for Coast Guard Trajectory Models Phase II

    DTIC Science & Technology

    1997-12-01

    as outliers depends on the desired probability of false alarm (Pfa), i.e. the probability of marking a valid point as an outlier (Table 2-2). The estimator is constructed to minimize the mean-squared prediction error of the grid point estimate under the constraint that the estimate is unbiased. The prediction error is e = Z1(s) - sum_i [alpha_1i Z1(s_i) + alpha_2i Z2(s_i)] (2.44), subject to the unbiasedness constraints sum_i alpha_1i = 1 (2.45) and sum_i alpha_2i = 0 (2.46). Denoting …
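The excerpt describes the classic (co)kriging construction: choose weights that minimize the mean-squared prediction error subject to unbiasedness constraints, enforced with Lagrange multipliers. A minimal single-variable (ordinary kriging) numpy sketch, with an assumed exponential covariance model and invented 1-D data locations:

```python
import numpy as np

def cov(h, sill=1.0, corr_len=10.0):
    """Assumed exponential covariance model (for illustration only)."""
    return sill * np.exp(-np.asarray(h) / corr_len)

# Data locations and the prediction location (1-D for simplicity).
s = np.array([0.0, 3.0, 7.0, 12.0])
s0 = 5.0

# Ordinary kriging system: minimize prediction variance subject to the
# unbiasedness constraint sum(lambda) = 1, via a Lagrange multiplier mu.
n = len(s)
A = np.zeros((n + 1, n + 1))
A[:n, :n] = cov(np.abs(s[:, None] - s[None, :]))  # data-data covariances
A[:n, n] = 1.0                                    # constraint column
A[n, :n] = 1.0                                    # constraint row
b = np.append(cov(np.abs(s - s0)), 1.0)           # data-target covariances
sol = np.linalg.solve(A, b)
lam, mu = sol[:n], sol[n]                         # kriging weights, multiplier
```

The weights sum to one by construction (that is the unbiasedness constraint), and the points nearest the prediction location receive the largest weights; the cokriging version in the excerpt simply adds a second variable with its own weights constrained to sum to zero.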

  12. Comparing models of the periodic variations in spin-down and beamwidth for PSR B1828-11

    NASA Astrophysics Data System (ADS)

    Ashton, G.; Jones, D. I.; Prix, R.

    2016-05-01

    We build a framework using tools from Bayesian data analysis to evaluate models explaining the periodic variations in spin-down and beamwidth of PSR B1828-11. The available data consist of the time-averaged spin-down rate, which displays a distinctive double-peaked modulation, and measurements of the beamwidth. Two concepts exist in the literature that are capable of explaining these variations; we formulate predictive models from these and quantitatively compare them. The first concept is phenomenological and stipulates that the magnetosphere undergoes periodic switching between two metastable states as first suggested by Lyne et al. The second concept, precession, was first considered as a candidate for the modulation of B1828-11 by Stairs et al. We quantitatively compare models built from these concepts using a Bayesian odds ratio. Because the phenomenological switching model itself was informed by these data in the first place, it is difficult to specify appropriate parameter-space priors that can be trusted for an unbiased model comparison. Therefore, we first perform a parameter estimation using the spin-down data, and then use the resulting posterior distributions as priors for model comparison on the beamwidth data. We find that a precession model with a simple circular Gaussian beam geometry fails to appropriately describe the data, while allowing for a more general beam geometry provides a good fit to the data. The resulting odds between the precession model (with a general beam geometry) and the switching model are estimated as 10^(2.7±0.5) in favour of the precession model.

  13. Three statistical models for estimating length of stay.

    PubMed Central

    Selvin, S

    1977-01-01

    The probability density functions implied by three methods of collecting data on the length of stay in an institution are derived. The expected values associated with these density functions are used to calculate unbiased estimates of the expected length of stay. Two of the methods require an assumption about the form of the underlying distribution of length of stay; the third method does not. The three methods are illustrated with hypothetical data exhibiting the Poisson distribution, and the third (distribution-independent) method is used to estimate the length of stay in a skilled nursing facility and in an intermediate care facility for patients enrolled in California's MediCal program. PMID:914532

  14. Three statistical models for estimating length of stay.

    PubMed

    Selvin, S

    1977-01-01

    The probability density functions implied by three methods of collecting data on the length of stay in an institution are derived. The expected values associated with these density functions are used to calculate unbiased estimates of the expected length of stay. Two of the methods require an assumption about the form of the underlying distribution of length of stay; the third method does not. The three methods are illustrated with hypothetical data exhibiting the Poisson distribution, and the third (distribution-independent) method is used to estimate the length of stay in a skilled nursing facility and in an intermediate care facility for patients enrolled in California's MediCal program.

  15. Modeling longitudinal data, I: principles of multivariate analysis.

    PubMed

    Ravani, Pietro; Barrett, Brendan; Parfrey, Patrick

    2009-01-01

    Statistical models are used to study the relationship between exposure and disease while accounting for the potential impact of other factors on outcomes. This adjustment is useful to obtain unbiased estimates of true effects or to predict future outcomes. Statistical models include a systematic component and an error component. The systematic component explains the variability of the response variable as a function of the predictors and is summarized in the effect estimates (model coefficients). The error element of the model represents the variability in the data unexplained by the model and is used to build measures of precision around the point estimates (confidence intervals).

  16. Conformational free energy modeling of druglike molecules by metadynamics in the WHIM space.

    PubMed

    Spiwok, Vojtěch; Hlat-Glembová, Katarína; Tvaroška, Igor; Králová, Blanka

    2012-03-26

    Protein-ligand affinities can be significantly influenced not only by the interaction itself but also by the conformational equilibrium of both binding partners, free ligand and free protein. Identification of important conformational families of a ligand and prediction of their thermodynamics is important for efficient ligand design. Here we report conformational free energy modeling of nine small-molecule drugs in explicitly modeled water by metadynamics with a bias potential applied in the space of weighted holistic invariant molecular (WHIM) descriptors. Application of metadynamics enhances conformational sampling compared to unbiased molecular dynamics simulation and makes it possible to predict relative free energies of key conformations. Selected free energy minima and one example of a transition state were tested by a series of unbiased molecular dynamics simulations. Comparison of free energy surfaces of free and target-bound Imatinib provides an estimate of the free energy penalty of the conformational change induced by binding to the target. © 2012 American Chemical Society

  17. Reconciling biases and uncertainties of AIRS and MODIS ice cloud properties

    NASA Astrophysics Data System (ADS)

    Kahn, B. H.; Gettelman, A.

    2015-12-01

    We will discuss comparisons of collocated Atmospheric Infrared Sounder (AIRS) and Moderate Resolution Imaging Spectroradiometer (MODIS) ice cloud optical thickness (COT), effective radius (CER), and cloud thermodynamic phase retrievals. The ice cloud comparisons are stratified by retrieval uncertainty estimates, horizontal inhomogeneity at the pixel scale, vertical cloud structure, and other key parameters. Although an estimated 27% of all AIRS pixels globally contain ice cloud, only 7% of them are spatially uniform ice according to MODIS. We find that the correlations of COT and CER between the two instruments are strong functions of horizontal cloud heterogeneity and vertical cloud structure. The best correlations are found in single-layer, horizontally homogeneous clouds over the low-latitude tropical oceans, with biases and scatter that increase with scene complexity. While the COT comparisons are unbiased in homogeneous ice clouds, a bias of 5-10 microns remains in CER within the most homogeneous scenes identified. This behavior is entirely consistent with known sensitivity differences in the visible and infrared bands. We will use AIRS and MODIS ice cloud properties to evaluate ice hydrometeor output from climate models such as CAM5, with comparisons sorted into different dynamical regimes. The results of the regime-dependent comparisons will be described and implications for model evaluation and future satellite observational needs will be discussed.

  18. Mutually unbiased projectors and duality between lines and bases in finite quantum systems

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Shalaby, M.; Vourdas, A., E-mail: a.vourdas@bradford.ac.uk

    2013-10-15

    Quantum systems with variables in the ring Z(d) are considered, and the concepts of weak mutually unbiased bases and mutually unbiased projectors are discussed. The lines through the origin in the Z(d)×Z(d) phase space are classified into maximal lines (sets of d points), and sublines (sets of d_i points where d_i | d). The sublines are intersections of maximal lines. It is shown that there exists a duality between the properties of lines (resp., sublines), and the properties of weak mutually unbiased bases (resp., mutually unbiased projectors). -- Highlights: •Lines in discrete phase space. •Bases in finite quantum systems. •Duality between bases and lines. •Weak mutually unbiased bases.

  19. Demonstration of line transect methodologies to estimate urban gray squirrel density

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hein, E.W.

    1997-11-01

    Because studies estimating density of gray squirrels (Sciurus carolinensis) have been labor intensive and costly, I demonstrate the use of line transect surveys to estimate gray squirrel density and determine the costs of conducting surveys to achieve precise estimates. Density estimates are based on four transects that were surveyed five times from 30 June to 9 July 1994. Using the program DISTANCE, I estimated there were 4.7 (95% CI = 1.86-11.92) gray squirrels/ha on the Clemson University campus. Eleven additional surveys would have decreased the percent coefficient of variation from 30% to 20% and would have cost approximately $114. Estimating urban gray squirrel density using line transect surveys is cost effective and can provide unbiased estimates of density, provided that none of the assumptions of distance sampling theory are violated.
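The distance-sampling estimator behind programs like DISTANCE can be illustrated with the common half-normal detection function: fit its scale to the perpendicular detection distances, convert it to an effective strip half-width, and divide the number of detections by the effectively surveyed area. All numbers below are invented for the sketch, not taken from the study:

```python
import numpy as np

rng = np.random.default_rng(7)

# Invented survey: true density, transect length, half-normal detection
# g(x) = exp(-x^2 / (2 sigma^2)) with scale sigma in metres.
sigma_true = 8.0            # detection-function scale (m)
true_density = 5.0          # animals per hectare
L = 4000.0                  # total transect length (m)
m2_per_ha = 1e4

# Effective strip half-width mu = integral of g = sigma * sqrt(pi/2);
# expected detections = D * (2 * L * mu).
mu_true = sigma_true * np.sqrt(np.pi / 2)
n = rng.poisson(true_density / m2_per_ha * 2 * L * mu_true)
# Detected animals' perpendicular distances are half-normal with scale sigma.
x = np.abs(sigma_true * rng.standard_normal(n))

# Half-normal MLE: sigma_hat^2 = mean(x^2).
sigma_hat = np.sqrt(np.mean(x**2))
mu_hat = sigma_hat * np.sqrt(np.pi / 2)          # effective strip half-width
density_hat = n / (2 * L * mu_hat) * m2_per_ha   # animals per hectare
```

The key assumption being exploited is that animals on the line itself are detected with certainty (g(0) = 1); violating it, or the other distance-sampling assumptions the abstract mentions, breaks the unbiasedness.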

  20. Beyond total treatment effects in randomised controlled trials: Baseline measurement of intermediate outcomes needed to reduce confounding in mediation investigations.

    PubMed

    Landau, Sabine; Emsley, Richard; Dunn, Graham

    2018-06-01

    Random allocation avoids confounding bias when estimating the average treatment effect. For continuous outcomes measured at post-treatment as well as prior to randomisation (baseline), analyses based on (A) post-treatment outcome alone, (B) change scores over the treatment phase or (C) conditioning on baseline values (analysis of covariance) provide unbiased estimators of the average treatment effect. The decision to include baseline values of the clinical outcome in the analysis is based on precision arguments, with analysis of covariance known to be most precise. Investigators increasingly carry out explanatory analyses to decompose total treatment effects into components that are mediated by an intermediate continuous outcome and a non-mediated part. Traditional mediation analysis might be performed based on (A) post-treatment values of the intermediate and clinical outcomes alone, (B) respective change scores or (C) conditioning on baseline measures of both intermediate and clinical outcomes. Using causal diagrams and Monte Carlo simulation, we investigated the performance of the three competing mediation approaches. We considered a data generating model that included three possible confounding processes involving baseline variables: The first two processes modelled baseline measures of the clinical variable or the intermediate variable as common causes of post-treatment measures of these two variables. The third process allowed the two baseline variables themselves to be correlated due to past common causes. We compared the analysis models implied by the competing mediation approaches with this data generating model to hypothesise likely biases in estimators, and tested these in a simulation study. We applied the methods to a randomised trial of pragmatic rehabilitation in patients with chronic fatigue syndrome, which examined the role of limiting activities as a mediator. 
Estimates of causal mediation effects derived by approach (A) will be biased if any of the three processes involving baseline measures of intermediate or clinical outcomes is operating. Necessary assumptions for the change score approach (B) to provide unbiased estimates under either process include the independence of baseline measures and change scores of the intermediate variable. Finally, estimates provided by the analysis of covariance approach (C) were found to be unbiased under all three processes considered here. When applied to the example, there was evidence of mediation under all methods, but the estimate of the indirect effect depended on the approach used, with the proportion mediated varying from 57% to 86%. Trialists planning mediation analyses should measure baseline values of putative mediators as well as of continuous clinical outcomes. An analysis of covariance approach is recommended to avoid potential biases due to confounding processes involving baseline measures of intermediate or clinical outcomes, and not simply for increased precision.
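
    The total-treatment-effect part of the argument can be sketched in a small simulation. This is an illustrative sketch, not the authors' Monte Carlo design: parameter names (tau, rho) and the data-generating model are assumptions. Under randomisation, all three analyses (A) post-only, (B) change score, (C) ANCOVA recover the same treatment effect.

```python
import numpy as np

# Hedged sketch: randomised treatment, continuous outcome measured at
# baseline (y0) and post-treatment (y1). tau is the true treatment effect.
rng = np.random.default_rng(0)
n, tau, rho = 100_000, 1.0, 0.6

y0 = rng.normal(size=n)                        # baseline outcome
t = rng.integers(0, 2, size=n)                 # randomised allocation
y1 = rho*y0 + tau*t + rng.normal(size=n)       # post-treatment outcome

est_a = y1[t == 1].mean() - y1[t == 0].mean()          # (A) post only
change = y1 - y0
est_b = change[t == 1].mean() - change[t == 0].mean()  # (B) change score
X = np.column_stack([np.ones(n), t, y0])               # (C) ANCOVA
est_c = np.linalg.lstsq(X, y1, rcond=None)[0][1]
print(round(est_a, 2), round(est_b, 2), round(est_c, 2))
```

    All three estimates land near tau; the paper's point is that for *mediation* effects (not simulated here) the equivalence breaks down and only the ANCOVA-style approach remains unbiased under all three confounding processes.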

  1. Impact of rough potentials in rocked ratchet performance

    NASA Astrophysics Data System (ADS)

    Camargo, S.; Anteneodo, C.

    2018-04-01

    We consider thermal ratchets modeled by overdamped Brownian motion in a spatially periodic potential subject to a tilting process, both unbiased on average. We investigate the impact of introducing roughness into the potential profile on the flux and efficiency of the ratchet. Both the amplitude and the wavelength that characterize the roughness are varied. We show that, depending on the ratchet parameters, rugosity can either spoil or enhance the ratchet performance.
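
    A minimal sketch of the setup, with assumed (not the paper's) potential, rocking force, and parameter values: an overdamped particle in an asymmetric periodic potential, driven by a zero-mean square-wave tilt, with an optional short-wavelength "rough" perturbation.

```python
import numpy as np

# Hedged sketch: Euler-Maruyama integration of overdamped Langevin dynamics
# in a rocked ratchet. All parameters are illustrative assumptions.
rng = np.random.default_rng(1)

def force(x, rough_amp, rough_wl=0.05):
    # -V'(x) for V(x) = sin(2*pi*x) + 0.25*sin(4*pi*x)
    #                 + rough_amp*sin(2*pi*x/rough_wl)
    f = -(2*np.pi*np.cos(2*np.pi*x) + np.pi*np.cos(4*np.pi*x))
    f -= rough_amp*(2*np.pi/rough_wl)*np.cos(2*np.pi*x/rough_wl)
    return f

def mean_displacement(rough_amp, n_part=300, steps=50_000, dt=1e-4,
                      amp=4.0, period=1.0, diff=0.1):
    x = np.zeros(n_part)                           # ensemble of particles
    for i in range(steps):
        tilt = amp if (i*dt) % period < period/2 else -amp  # zero-mean rocking
        x += (force(x, rough_amp) + tilt)*dt \
             + np.sqrt(2*diff*dt)*rng.normal(size=n_part)
    return x.mean()

smooth = mean_displacement(0.0)   # smooth ratchet potential
rough = mean_displacement(0.3)    # rough ratchet potential
print(smooth, rough)
```

    Comparing the mean displacement with and without roughness mimics the paper's question; with these made-up parameters no particular sign or magnitude of the flux change is asserted.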

  2. Hydrodynamic interactions induce movement against an external load in a ratchet dimer Brownian motor.

    PubMed

    Fornés, José A

    2010-01-15

    We use Brownian dynamics simulation with hydrodynamic interactions to describe the movement of an elastically coupled dimer Brownian motor in a ratchet potential. The only external forces considered in our system were the load, the random thermal noise and an unbiased thermal fluctuation. For a given set of parameters, we observed directed movement against the load force when hydrodynamic interactions were included.

  3. Filter design for the detection of compact sources based on the Neyman-Pearson detector

    NASA Astrophysics Data System (ADS)

    López-Caniego, M.; Herranz, D.; Barreiro, R. B.; Sanz, J. L.

    2005-05-01

    This paper considers the problem of compact source detection on a Gaussian background. We present a one-dimensional treatment (though a generalization to two or more dimensions is possible). Two relevant aspects of this problem are considered: the design of the detector and the filtering of the data. Our detection scheme is based on local maxima and takes into account not only the amplitude but also the curvature of the maxima. A Neyman-Pearson test is used to define the region of acceptance, which is given by a sufficient linear detector that is independent of the amplitude distribution of the sources. We study how detection can be enhanced by means of linear filters with a scaling parameter, and compare some filters that have been proposed in the literature [the Mexican hat wavelet, the matched filter (MF) and the scale-adaptive filter (SAF)]. We also introduce a new filter, which depends on two free parameters (the biparametric scale-adaptive filter, BSAF). The values of these two parameters can be determined, given the a priori probability density function of the amplitudes of the sources, such that the filter optimizes the performance of the detector, in the sense that it gives the maximum number of real detections for a fixed number density of spurious sources. The new filter includes the standard MF and the SAF as particular cases. As a result of its design, the BSAF outperforms these filters. The combination of a detection scheme that includes information on the curvature and a flexible filter that incorporates two free parameters (one of them a scaling parameter) significantly improves the number of detections in some interesting cases. In particular, for the case of weak sources embedded in white noise, the improvement with respect to the standard MF is of the order of 40 per cent.
Finally, an estimation of the amplitude of the source (most probable value) is introduced and it is proven that such an estimator is unbiased and has maximum efficiency. We perform numerical simulations to test these theoretical ideas in a practical example and conclude that the results of the simulations agree with the analytical results.
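
    For white noise, the matched-filter amplitude estimate mentioned at the end reduces to a template fit, which is unbiased. A one-dimensional sketch with an assumed Gaussian profile and illustrative parameters (not the paper's BSAF):

```python
import numpy as np

# Hedged sketch: known-profile source in white noise. For white noise the
# matched filter is correlation with the template; evaluated at the (known)
# source position it is the least-squares, unbiased amplitude estimate.
rng = np.random.default_rng(2)
n, width, amp = 1024, 5.0, 4.0

x = np.arange(n)
profile = np.exp(-0.5*((x - n//2)/width)**2)   # source template, peak = 1
data = amp*profile + rng.normal(size=n)        # source plus unit white noise

amp_hat = (data @ profile)/(profile @ profile)
print(round(amp_hat, 1))
```

    With these settings the estimator's standard deviation is 1/sqrt(sum(profile**2)), about 0.34, so the recovered amplitude clusters tightly around the true value of 4.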

  4. Estimating Gravity Biases with Wavelets in Support of a 1-cm Accurate Geoid Model

    NASA Astrophysics Data System (ADS)

    Ahlgren, K.; Li, X.

    2017-12-01

    Systematic errors that reside in surface gravity datasets are one of the major hurdles in constructing a high-accuracy geoid model at high resolutions. The National Oceanic and Atmospheric Administration's (NOAA) National Geodetic Survey (NGS) has an extensive historical surface gravity dataset consisting of approximately 10 million gravity points that are known to have systematic biases at the mGal level (Saleh et al. 2013). As most relevant metadata is absent, estimating and removing these errors to be consistent with a global geopotential model and airborne data in the corresponding wavelength is quite a difficult endeavor. However, this is crucial to support a 1-cm accurate geoid model for the United States. With recently available independent gravity information from GRACE/GOCE and airborne gravity from the NGS Gravity for the Redefinition of the American Vertical Datum (GRAV-D) project, several different methods of bias estimation are investigated which utilize radial basis functions and wavelet decomposition. We estimate a surface gravity value by incorporating a satellite gravity model, airborne gravity data, and forward-modeled topography at wavelet levels according to each dataset's spatial wavelength. Considering the estimated gravity values over an entire gravity survey, an estimate of the bias and/or correction for the entire survey can be found and applied. In order to assess the accuracy of each bias estimation method, two techniques are used. First, each bias estimation method is used to predict the bias for two high-quality (unbiased and high accuracy) geoid slope validation surveys (GSVS) (Smith et al. 2013 & Wang et al. 2017). Since these surveys are unbiased, the various bias estimation methods should reflect that and provide an absolute accuracy metric for each of the bias estimation methods. Secondly, the corrected gravity datasets from each of the bias estimation methods are used to build a geoid model. 
The accuracy of each geoid model provides an additional metric to assess the performance of each bias estimation method. The geoid model accuracies are assessed using the two GSVS lines and GPS-leveling data across the United States.

  5. Methods for fitting a parametric probability distribution to most probable number data.

    PubMed

    Williams, Michael S; Ebel, Eric D

    2012-07-02

    Every year hundreds of thousands, if not millions, of samples are collected and analyzed to assess microbial contamination in food and water. The concentration of pathogenic organisms at the end of the production process is low for most commodities, so a highly sensitive screening test is used to determine whether the organism of interest is present in a sample. In some applications, samples that test positive are subjected to quantitation. The most probable number (MPN) technique is a common method to quantify the level of contamination in a sample because it is able to provide estimates at low concentrations. This technique uses a series of dilution count experiments to derive estimates of the concentration of the microorganism of interest. An application for these data is food-safety risk assessment, where the MPN concentration estimates can be fitted to a parametric distribution to summarize the range of potential exposures to the contaminant. Many different methods (e.g., substitution methods, maximum likelihood and regression on order statistics) have been proposed to fit microbial contamination data to a distribution, but the development of these methods rarely considers how the MPN technique influences the choice of distribution function and fitting method. An often overlooked aspect when applying these methods is whether the data represent actual measurements of the average concentration of microorganisms per milliliter or whether the data are real-valued estimates of the average concentration, as is the case with MPN data. In this study, we propose two methods for fitting MPN data to a probability distribution. The first method uses a maximum likelihood estimator that takes average concentration values as the data inputs. The second is a Bayesian latent variable method that uses the counts of the number of positive tubes at each dilution to estimate the parameters of the contamination distribution.
The performance of the two fitting methods is compared for two data sets that represent Salmonella and Campylobacter concentrations on chicken carcasses. The results demonstrate a bias in the maximum likelihood estimator that increases with reductions in average concentration. The Bayesian method provided unbiased estimates of the concentration distribution parameters for all data sets. We provide computer code for the Bayesian fitting method. Published by Elsevier B.V.
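
    The tube-count likelihood underlying the MPN technique can be sketched directly. This is the standard MPN model (a tube is positive with probability 1 - exp(-c*v)), not the paper's code; the dilution design and grid search below are illustrative assumptions.

```python
import math

# Hedged sketch of the MPN likelihood: concentration c (organisms/mL),
# inoculum volume v (mL); positives are binomial across replicate tubes.
def log_lik(conc, tubes):
    # tubes: list of (volume_ml, n_tubes, n_positive)
    ll = 0.0
    for vol, n, k in tubes:
        p = 1.0 - math.exp(-conc*vol)          # P(single tube is positive)
        ll += k*math.log(max(p, 1e-300)) + (n - k)*math.log(max(1.0 - p, 1e-300))
    return ll

def mpn_mle(tubes):
    # grid search over 0.001 .. 1000 organisms/mL (log-spaced)
    grid = [10**(e/200.0) for e in range(-600, 601)]
    return max(grid, key=lambda c: log_lik(c, tubes))

# classic 3-dilution design: 10, 1 and 0.1 mL inocula, 3 tubes each,
# with (3, 1, 0) positive tubes respectively
tubes = [(10.0, 3, 3), (1.0, 3, 1), (0.1, 3, 0)]
est = mpn_mle(tubes)
print(round(est, 2))
```

    The maximised likelihood reproduces the familiar tabulated MPN for this pattern (roughly 0.4 organisms/mL); the paper's Bayesian method treats these tube counts, rather than the derived MPN value, as the data.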

  6. Assessing performance of closed-loop insulin delivery systems by continuous glucose monitoring: drawbacks and way forward.

    PubMed

    Hovorka, Roman; Nodale, Marianna; Haidar, Ahmad; Wilinska, Malgorzata E

    2013-01-01

    We investigated whether continuous glucose monitoring (CGM) levels can accurately assess glycemic control while directing closed-loop insulin delivery. Data were analyzed retrospectively from 33 subjects with type 1 diabetes who underwent closed-loop and conventional pump therapy on two separate nights. Glycemic control was evaluated by reference plasma glucose and contrasted against three methods based on Navigator (Abbott Diabetes Care, Alameda, CA) CGM levels. Glucose mean and variability were estimated by unmodified CGM levels with acceptable clinical accuracy. Time when glucose was in target range was overestimated by CGM during closed-loop nights (CGM vs. plasma glucose median [interquartile range], 86% [65-97%] vs. 75% [59-91%]; P=0.04) but not during conventional pump therapy (57% [32-72%] vs. 51% [29-68%]; P=0.82), providing a comparable treatment effect (mean [SD], 28% [29%] vs. 23% [21%]; P=0.11). Using the CGM measurement error of 15% derived from plasma glucose-CGM pairs (n=4,254), stochastic interpretation of CGM gave an unbiased estimate of time in target during both closed-loop (79% [62-86%] vs. 75% [59-91%]; P=0.24) and conventional pump therapy (54% [33-66%] vs. 51% [29-68%]; P=0.44). Treatment effect (23% [24%] vs. 23% [21%]; P=0.96) and time below target were accurately estimated by stochastic CGM. Recalibrating CGM using reference plasma glucose values taken at the start and end of overnight closed-loop was not superior to stochastic CGM. CGM is acceptable for estimating glucose mean and variability, but without adjustment it may overestimate the benefit of closed-loop. Stochastic CGM provided an unbiased estimate of the time glucose spent in and below target and may be acceptable for assessment of closed-loop in the outpatient setting.

  7. Unbounding the mental number line—new evidence on children's spatial representation of numbers

    PubMed Central

    Link, Tanja; Huber, Stefan; Nuerk, Hans-Christoph; Moeller, Korbinian

    2014-01-01

    Number line estimation (i.e., indicating the position of a given number on a physical line) is a standard assessment of children's spatial representation of number magnitude. Importantly, there is an ongoing debate about the extent to which the bounded task version, with start and endpoint given (e.g., 0 and 100), might induce specific estimation strategies and thus may not allow for unbiased inferences on the underlying representation. Recently, a new unbounded version of the task was suggested with only the start point and a unit fixed (e.g., the distance from 0 to 1). In adults this task provided a less biased index of the spatial representation of number magnitude. Yet, so far no data from children are available for the unbounded number line estimation task. Therefore, we conducted a cross-sectional study on primary school children performing both the bounded and the unbounded versions of the task. We observed clear evidence for systematic strategic influences (i.e., the consideration of reference points) in the bounded number line estimation task for children older than grade two, whereas there were no such indications for the unbounded version in any of the age groups. In summary, the current data corroborate that the unbounded number line estimation task is a valuable tool for assessing children's spatial representation of number magnitude in a systematic and unbiased manner. Yet, similar results for the bounded and the unbounded versions of the task for first- and second-graders may indicate that both versions of the task might assess the same underlying representation for relatively younger children, at least in number ranges familiar to the children assessed. This is of particular importance for inferences about the nature and development of children's magnitude representation. PMID:24478734

  8. Shared Dosimetry Error in Epidemiological Dose-Response Analyses

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Stram, Daniel O.; Preston, Dale L.; Sokolnikov, Mikhail

    2015-03-23

    Radiation dose reconstruction systems for large-scale epidemiological studies are sophisticated both in providing estimates of dose and in representing dosimetry uncertainty. For example, a computer program was used by the Hanford Thyroid Disease Study to provide 100 realizations of possible dose to study participants. The variation in realizations reflected the range of possible dose for each cohort member consistent with the data on dose determinants in the cohort. Another example is the Mayak Worker Dosimetry System 2013, which estimates both external and internal exposures and provides multiple realizations of "possible" dose history to workers given dose determinants. This paper takes up the problem of dealing with complex dosimetry systems that provide multiple realizations of dose in an epidemiologic analysis. In this paper we derive expected scores and the information matrix for a model used widely in radiation epidemiology, namely the linear excess relative risk (ERR) model that allows for a linear dose response (risk in relation to radiation) and distinguishes between modifiers of background rates and of the excess risk due to exposure. We show that treating the mean dose for each individual (calculated by averaging over the realizations) as if it were the true dose (ignoring both shared and unshared dosimetry errors) gives asymptotically unbiased estimates (i.e. the score has expectation zero) and valid tests of the null hypothesis that the ERR slope β is zero. Although the score is unbiased, the information matrix (and hence the standard errors of the estimate of β) is biased for β≠0 when ignoring errors in dose estimates, and we show how to adjust the information matrix to remove this bias, using the multiple realizations of dose. Use of these methods for several studies, including the Mayak Worker Cohort and the U.S. Atomic Veterans Study, is discussed.
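
    Schematically, the linear ERR model and the mean-dose plug-in described above can be written (our notation, not necessarily the paper's exact parameterization) as:

```latex
\lambda(d, x, z) \;=\; \lambda_0(x;\alpha)\,\bigl[\,1 + \beta\, d\, \epsilon(z;\gamma)\,\bigr],
\qquad
\bar{d}_i \;=\; \frac{1}{M}\sum_{m=1}^{M} d_i^{(m)},
```

    where $\lambda_0$ carries the background-rate modifiers $x$, $\epsilon$ carries the modifiers $z$ of the excess risk, and $\bar{d}_i$ is the mean over the $M$ dose realizations that, per the abstract, yields an unbiased score but an information matrix requiring adjustment when $\beta \neq 0$.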

  9. Automated systematic random sampling and Cavalieri stereology of histologic sections demonstrating acute tubular necrosis after cardiac arrest and cardiopulmonary resuscitation in the mouse.

    PubMed

    Wakasaki, Rumie; Eiwaz, Mahaba; McClellan, Nicholas; Matsushita, Katsuyuki; Golgotiu, Kirsti; Hutchens, Michael P

    2018-06-14

    A technical challenge in translational models of kidney injury is determination of the extent of cell death. Histologic sections are commonly analyzed by area morphometry or unbiased stereology, but stereology requires specialized equipment. An unbiased stereology tool with reduced equipment dependence would therefore remove a barrier to rigorous quantification. We hypothesized that it would be feasible to build a novel software component to facilitate unbiased stereologic quantification on scanned slides, and that unbiased stereology would demonstrate greater precision and decreased bias compared with 2D morphometry. We developed a macro for the widely used image analysis program ImageJ, and performed cardiac arrest with cardiopulmonary resuscitation (CA/CPR, a model of acute cardiorenal syndrome) in mice. Fluorojade-B-stained kidney sections were analyzed using three methods to quantify cell death: gold-standard stereology using a controlled stage and commercially available software, unbiased stereology using the novel ImageJ macro, and quantitative 2D morphometry, also using the novel macro. There was strong agreement between the two methods of unbiased stereology (bias -0.004±0.006, with 95% limits of agreement -0.015 to 0.007). 2D morphometry demonstrated poor agreement and significant bias compared with either method of unbiased stereology. Unbiased stereology is facilitated by a novel macro for ImageJ, and results agree with those obtained using gold-standard methods. Automated 2D morphometry overestimated tubular epithelial cell death and correlated only modestly with values obtained from unbiased stereology. These results support widespread use of unbiased stereology for analysis of histologic outcomes of injury models.
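
    The point-counting idea behind such unbiased stereology can be sketched in a few lines. This is not the ImageJ macro itself; the image size, grid spacing, and labelled region are illustrative assumptions. A systematic grid with a uniform random offset gives an unbiased estimate of area fraction.

```python
import numpy as np

# Hedged sketch: systematic uniform random sampling (point counting) on a
# binary mask standing in for a stained section. Unbiasedness comes from the
# uniformly random grid offset, not from where the region happens to lie.
rng = np.random.default_rng(5)
h, w = 400, 400
mask = np.zeros((h, w), dtype=bool)
mask[100:300, 150:350] = True               # labelled region, 200 x 200 px

spacing = 20
oy, ox = rng.integers(0, spacing, size=2)   # uniform random grid offset
ys, xs = np.mgrid[oy:h:spacing, ox:w:spacing]
hits = mask[ys, xs]                         # grid points hitting the region

est_fraction = hits.mean()                  # unbiased area-fraction estimate
true_fraction = mask.mean()
print(est_fraction, true_fraction)
```

    The Cavalieri volume estimate multiplies such area estimates by the section thickness and the area represented per point.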

  10. Entropic uncertainty relations and locking: Tight bounds for mutually unbiased bases

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ballester, Manuel A.; Wehner, Stephanie

    We prove tight entropic uncertainty relations for a large number of mutually unbiased measurements. In particular, we show that a bound derived from the result by Maassen and Uffink [Phys. Rev. Lett. 60, 1103 (1988)] for two such measurements can in fact be tight for up to √d measurements in mutually unbiased bases. We then show that using more mutually unbiased bases does not always lead to a better locking effect. We prove that the optimal bound for the accessible information using up to √d specific mutually unbiased bases is log d/2, which is the same as can be achieved by using only two bases. Our result indicates that merely using mutually unbiased bases is not sufficient to achieve a strong locking effect and we need to look for additional properties.
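
    For reference, the Maassen-Uffink bound cited above reads (standard form, our notation):

```latex
H(A) + H(B) \;\ge\; -2\log_2 c, \qquad c \;=\; \max_{i,j}\,\bigl|\langle a_i | b_j \rangle\bigr|,
```

    For two mutually unbiased bases in dimension $d$ every overlap is $1/\sqrt{d}$, so the right-hand side equals $\log_2 d$; the abstract's contribution concerns how far this bound can remain tight as the number of bases grows toward $\sqrt{d}$.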

  11. Cubature/ Unscented/ Sigma Point Kalman Filtering with Angular Measurement Models

    DTIC Science & Technology

    2015-07-06

    Cubature/ Unscented/ Sigma Point Kalman Filtering with Angular Measurement Models David Frederic Crouse Naval Research Laboratory 4555 Overlook Ave...measurement and process nonlinearities, such as the cubature Kalman filter, can perform extremely poorly in many applications involving angular... Kalman filtering is a realization of the best linear unbiased estimator (BLUE) that evaluates certain integrals for expected values using different forms
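
    The "integrals for expected values" in the snippet above are what the cubature rule approximates. A sketch of the standard third-degree spherical-radial cubature rule (a textbook construction, not this report's code), which places 2n equally weighted points at mu ± sqrt(n)·S·e_i with S·Sᵀ = P and integrates polynomials up to degree 3 exactly under N(mu, P):

```python
import numpy as np

# Hedged sketch of third-degree cubature points; mu, P and f are illustrative.
def cubature_points(mu, P):
    n = mu.size
    S = np.linalg.cholesky(P)                 # any matrix square root of P works
    xi = np.sqrt(n)*np.hstack([np.eye(n), -np.eye(n)])   # 2n unit directions
    return mu[:, None] + S @ xi               # shape (n, 2n), weights 1/(2n)

mu = np.array([1.0, -2.0])
P = np.array([[2.0, 0.5],
              [0.5, 1.0]])
pts = cubature_points(mu, P)

f = lambda v: v[0]**2 + v[0]*v[1]             # quadratic test function
approx = np.mean([f(pts[:, k]) for k in range(pts.shape[1])])
exact = (P[0, 0] + mu[0]**2) + (P[0, 1] + mu[0]*mu[1])   # E[x0^2] + E[x0*x1]
print(approx, exact)
```

    For the quadratic test function the cubature average matches the exact Gaussian expectation to machine precision; angular (circular) measurements break the polynomial assumptions, which is the failure mode the report addresses.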

  12. Development of a national forest inventory for carbon accounting purposes in New Zealand's planted Kyoto forests

    Treesearch

    John Moore; Ian Payton; Larry Burrows; Chris Goulding; Peter Beets; Paul Lane; Peter Stephens

    2007-01-01

    This article discusses the development of a monitoring system to estimate carbon sequestration in New Zealand's planted Kyoto forests, those forests that have been planted since January 1, 1990, on land that previously did not contain forest. The system must meet the Intergovernmental Panel on Climate Change good practice guidance and must be seen to be unbiased,...

  13. Correcting Four Similar Correlational Measures for Attenuation Due to Errors of Measurement in the Dependent Variable: Eta, Epsilon, Omega, and Intraclass r.

    ERIC Educational Resources Information Center

    Stanley, Julian C.; Livingston, Samuel A.

    Besides the ubiquitous Pearson product-moment r, there are a number of other measures of relationship that are attenuated by errors of measurement and for which the relationship between true measures can be estimated. Among these are the correlation ratio (eta squared), Kelley's unbiased correlation ratio (epsilon squared), Hays' omega squared,…

  14. Estimation and application of a growth and yield model for uneven-aged mixed conifer stands in California.

    Treesearch

    Jingjing Liang; J. Buongiorno; R.A. Monserud

    2005-01-01

    A growth model for uneven-aged mixed-conifer stands in California was developed with data from 205 permanent plots. The model predicts the number of softwood and hardwood trees in nineteen diameter classes, based on equations for diameter growth rates, mortality and recruitment. The model gave unbiased predictions of the expected number of trees by diameter class and...

  15. Evaluating disease management programme effectiveness: an introduction to instrumental variables.

    PubMed

    Linden, Ariel; Adams, John L

    2006-04-01

    This paper introduces the concept of instrumental variables (IVs) as a means of providing an unbiased estimate of treatment effects in evaluating disease management (DM) programme effectiveness. Model development is described using zip codes as the IV. Three diabetes DM outcomes were evaluated: annual diabetes costs, emergency department (ED) visits and hospital days. Both ordinary least squares (OLS) and IV estimates showed a significant treatment effect for diabetes costs (P = 0.011) but neither model produced a significant treatment effect for ED visits. However, the IV estimate showed a significant treatment effect for hospital days (P = 0.006) whereas the OLS model did not. These results illustrate the utility of IV estimation when the OLS model is sensitive to the confounding effect of hidden bias.
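
    The logic of IV estimation can be sketched with simulated data. This is an illustrative simulation, not the paper's zip-code model: the confounder, instrument strength, and effect size are assumptions. When an unobserved confounder drives both programme participation and outcome, OLS is biased, but the IV (Wald) estimator cov(y,z)/cov(d,z) recovers the true effect, provided the instrument affects the outcome only through participation.

```python
import numpy as np

# Hedged sketch: hidden confounder u, instrument z (exogenous and relevant),
# treatment intensity d, outcome y with true treatment effect beta = 2.
rng = np.random.default_rng(3)
n, beta = 200_000, 2.0

u = rng.normal(size=n)                       # unobserved confounder
z = rng.normal(size=n)                       # instrument
d = 0.8*z + u + rng.normal(size=n)           # programme exposure
y = beta*d + 1.5*u + rng.normal(size=n)      # outcome

ols = np.cov(y, d)[0, 1]/np.var(d)           # confounded: biased upward
iv = np.cov(y, z)[0, 1]/np.cov(d, z)[0, 1]   # Wald/IV estimator
print(round(ols, 2), round(iv, 2))
```

    The OLS slope absorbs the confounder's contribution, while the IV ratio cancels it because z is independent of u; this is the "hidden bias" sensitivity the abstract describes.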

  16. The effects of sample size on population genomic analyses--implications for the tests of neutrality.

    PubMed

    Subramanian, Sankar

    2016-02-20

    One of the fundamental measures of molecular genetic variation is Watterson's estimator (θ), which is based on the number of segregating sites. The estimation of θ is unbiased only under neutrality and constant population size. It is well known that the estimation of θ is biased when these assumptions are violated. However, the effect of sample size in modulating the bias has not been well appreciated. We examined this issue in detail based on large-scale exome data and robust simulations. Our investigation revealed that sample size appreciably influences θ estimation and that this effect was much higher for constrained genomic regions than for neutral regions. For instance, θ estimated for synonymous sites using 512 human exomes was 1.9 times higher than that obtained using 16 exomes. However, this difference was 2.5 times for the nonsynonymous sites of the same data. We observed a positive correlation between the rate of increase in θ estimates (with respect to the sample size) and the magnitude of selection pressure. For example, θ estimated for the nonsynonymous sites of highly constrained genes (dN/dS < 0.1) using 512 exomes was 3.6 times higher than that estimated using 16 exomes. In contrast, this difference was only 2 times for the less constrained genes (dN/dS > 0.9). The results of this study reveal the extent of underestimation owing to small sample sizes and thus emphasize the importance of sample size in estimating a number of population genomic parameters. Our results have serious implications for neutrality tests such as Tajima's D, Fu and Li's D and those based on the McDonald-Kreitman test: the Neutrality Index and the fraction of adaptive substitutions. For instance, use of 16 exomes produced a 2.4 times higher proportion of adaptive substitutions compared with that obtained using 512 exomes (24% vs. 10%).
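
    Watterson's estimator itself is a one-liner: θ_W = S / a_n, where S is the number of segregating sites and a_n is the harmonic number over the sample size. A sketch with illustrative numbers (not the study's data):

```python
# Hedged sketch of Watterson's estimator of the population mutation rate.
def watterson_theta(n_sequences, segregating_sites):
    # theta_W = S / a_n, with a_n the (n-1)-th harmonic number
    a_n = sum(1.0/i for i in range(1, n_sequences))
    return segregating_sites/a_n

# the same number of segregating sites implies a smaller theta in a larger
# sample, since more sequences are expected to uncover more variation
theta_16 = watterson_theta(16, 100)
theta_512 = watterson_theta(512, 100)
print(round(theta_16, 1), round(theta_512, 1))
```

    Because a_n grows only logarithmically with n, S must grow roughly in proportion to a_n under neutrality; in constrained regions it grows more slowly, which is the sample-size-dependent bias the study quantifies.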

  17. Analysis of the Bayesian Cramér-Rao lower bound in astrometry. Studying the impact of prior information in the location of an object

    NASA Astrophysics Data System (ADS)

    Echeverria, Alex; Silva, Jorge F.; Mendez, Rene A.; Orchard, Marcos

    2016-10-01

    Context. The best precision that can be achieved to estimate the location of a stellar-like object is a topic of permanent interest in the astrometric community. Aims: We analyze bounds for the best position estimation of a stellar-like object on a CCD detector array in a Bayesian setting where the position is unknown, but where we have access to a prior distribution. In contrast to a parametric setting where we estimate a parameter from observations, the Bayesian approach estimates a random object (i.e., the position is a random variable) from observations that are statistically dependent on the position. Methods: We characterize the Bayesian Cramér-Rao (CR) bound on the minimum mean square error (MMSE) of the best estimator of the position of a point source on a linear CCD-like detector, as a function of the properties of the detector, the source, and the background. Results: We quantify and analyze the increase in astrometric performance from the use of a prior distribution of the object position, which is not available in the classical parametric setting. This gain is shown to be significant for various observational regimes, in particular in the case of faint objects or when the observations are taken under poor conditions. Furthermore, we present numerical evidence that the MMSE estimator of this problem tightly achieves the Bayesian CR bound. This is a remarkable result, demonstrating that all the performance gains presented in our analysis can be achieved with the MMSE estimator. Conclusions: The Bayesian CR bound can be used as a benchmark indicator of the expected maximum positional precision of a set of astrometric measurements in which prior information can be incorporated. This bound can be achieved through the conditional mean estimator, in contrast to the parametric case where no unbiased estimator precisely reaches the CR bound.
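
    In the scalar case, the Bayesian CR (Van Trees) bound discussed above takes the standard form (our notation):

```latex
\mathbb{E}\!\left[(\hat{x} - x)^2\right] \;\ge\;
\Bigl(\, \mathbb{E}_{x}\!\left[ I_F(x) \right] \;+\;
\mathbb{E}\!\left[ \bigl(\tfrac{d}{dx}\ln p(x)\bigr)^{2} \right] \Bigr)^{-1},
```

    where $I_F$ is the Fisher information of the observations and the second term is the information contributed by the prior $p(x)$; it is this additive prior term that produces the gain over the parametric CR bound $1/I_F$ described in the abstract.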

  18. Indices estimated using REML/BLUP and introduction of a super-trait for the selection of progenies in popcorn.

    PubMed

    Vittorazzi, C; Amaral Junior, A T; Guimarães, A G; Viana, A P; Silva, F H L; Pena, G F; Daher, R F; Gerhardt, I F S; Oliveira, G H F; Pereira, M G

    2017-09-27

    Selection indices commonly rely on economic weights, which can make the resulting genetic gains arbitrary. In popcorn, this is even more evident due to the negative correlation between the main characteristics of economic importance - grain yield and popping expansion. As an alternative to classical biometric selection indices, the optimal restricted maximum likelihood/best linear unbiased prediction (REML/BLUP) procedure allows the simultaneous estimation of genetic parameters and the prediction of genotypic values. Based on the mixed model methodology, the objective of this study was to investigate the comparative efficiency of eight selection indices estimated by REML/BLUP for the effective selection of superior popcorn families in the eighth intrapopulation recurrent selection cycle. We also investigated the efficiency of including the variable "expanded popcorn volume per hectare" in the most advantageous selection of superior progenies. In total, 200 full-sib families were evaluated in two different areas in the North and Northwest regions of the State of Rio de Janeiro, Brazil. The REML/BLUP procedure resulted in higher estimated gains than those obtained with classical biometric selection index methodologies and should be incorporated into the selection of progenies. The following indices resulted in higher gains in the characteristics of greatest economic importance: the classical selection index/values attributed by trial, via REML/BLUP, and the greatest genotypic values/expanded popcorn volume per hectare, via REML. The expanded popcorn volume per hectare characteristic enabled satisfactory gains in grain yield and popping expansion; this characteristic should be considered a super-trait in popcorn breeding programs.
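
    The REML/BLUP machinery referred to above rests on the standard linear mixed model and Henderson's mixed model equations (textbook form, not the study's specific model):

```latex
y = X\beta + Zu + e, \qquad u \sim N(0, G), \qquad e \sim N(0, R),
\qquad
\begin{pmatrix} X^{\top}R^{-1}X & X^{\top}R^{-1}Z \\ Z^{\top}R^{-1}X & Z^{\top}R^{-1}Z + G^{-1} \end{pmatrix}
\begin{pmatrix} \hat{\beta} \\ \hat{u} \end{pmatrix}
=
\begin{pmatrix} X^{\top}R^{-1}y \\ Z^{\top}R^{-1}y \end{pmatrix}.
```

    Solving these jointly yields the fixed-effect estimates $\hat{\beta}$ and the BLUP genotypic values $\hat{u}$, with the variance components in $G$ and $R$ estimated by REML; the predicted $\hat{u}$ are what the eight indices in the study rank.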

  19. Real-time determination of sarcomere length of a single cardiomyocyte during contraction

    PubMed Central

    Kalda, Mari; Vendelin, Marko

    2013-01-01

    Sarcomere length of a cardiomyocyte is an important control parameter for physiology studies on a single cell level; for instance, its accurate determination in real time is essential for performing single cardiomyocyte contraction experiments. The aim of this work is to develop an efficient and accurate method for estimating a mean sarcomere length of a contracting cardiomyocyte using microscopy images as an input. The novelty in developed method lies in 1) using unbiased measure of similarities to eliminate systematic errors from conventional autocorrelation function (ACF)-based methods when applied to region of interest of an image, 2) using a semianalytical, seminumerical approach for evaluating the similarity measure to take into account spatial dependence of neighboring image pixels, and 3) using a detrend algorithm to extract the sarcomere striation pattern content from the microscopy images. The developed sarcomere length estimation procedure has superior computational efficiency and estimation accuracy compared with the conventional ACF and spectral analysis-based methods using fast Fourier transform. As shown by analyzing synthetic images with the known periodicity, the estimates obtained by the developed method are more accurate at the subpixel level than ones obtained using ACF analysis. When applied in practice on rat cardiomyocytes, our method was found to be robust to the choice of the region of interest that may 1) include projections of carbon fibers and nucleus, 2) have uneven background, and 3) be slightly disoriented with respect to average direction of sarcomere striation pattern. The developed method is implemented in open-source software. PMID:23255581
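
    The conventional ACF baseline that the paper improves on can be sketched on a synthetic striation profile. This is the textbook ACF approach with an assumed sinusoidal pattern and made-up period and noise level, not the paper's similarity-measure method.

```python
import numpy as np

# Hedged sketch: estimate the dominant spatial period of a striation-like
# 1-D intensity profile from the first off-origin peak of its ACF, refined
# to sub-pixel resolution with a 3-point parabolic fit.
rng = np.random.default_rng(4)
period = 18.4                                # "true" sarcomere length, pixels
x = np.arange(512)
profile = np.sin(2*np.pi*x/period) + 0.2*rng.normal(size=x.size)

sig = profile - profile.mean()               # crude detrend (remove the mean)
acf = np.correlate(sig, sig, mode="full")[sig.size - 1:]   # lags 0..N-1

first_neg = int(np.argmax(acf < 0))          # ACF first dips negative near T/4
lag = first_neg + int(np.argmax(acf[first_neg:first_neg + 30]))
y0, y1, y2 = acf[lag - 1], acf[lag], acf[lag + 1]
est = lag + 0.5*(y0 - y2)/(y0 - 2*y1 + y2)   # parabolic sub-pixel refinement
print(round(est, 1))
```

    The integer-lag peak alone is accurate only to one pixel; the parabolic refinement recovers the fractional period, which is the sub-pixel regime where the paper reports its accuracy gains.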

  20. Hierarchical Bayes estimation of species richness and occupancy in spatially replicated surveys

    USGS Publications Warehouse

    Kery, M.; Royle, J. Andrew

    2008-01-01

    1. Species richness is the most widely used biodiversity metric, but cannot be observed directly as, typically, some species are overlooked. Imperfect detectability must therefore be accounted for to obtain unbiased species-richness estimates. When richness is assessed at multiple sites, two approaches can be used to estimate species richness: either estimating for each site separately, or pooling all samples. The first approach produces imprecise estimates, while the second loses site-specific information. 2. In contrast, a hierarchical Bayes (HB) multispecies site-occupancy model benefits from the combination of information across sites without losing site-specific information and also yields occupancy estimates for each species. The heart of the model is an estimate of the incompletely observed presence-absence matrix, a centrepiece of biogeography and monitoring studies. We illustrate the model using Swiss breeding bird survey data, and compare its estimates with the widely used jackknife species-richness estimator and raw species counts. 3. Two independent observers each conducted three surveys in 26 1-km² quadrats, and detected 27-56 (total 103) species. The average estimated proportion of species detected after three surveys was 0.87 under the HB model. Jackknife estimates were less precise (less repeatable between observers) than raw counts, but HB estimates were as repeatable as raw counts. The combination of information in the HB model thus resulted in species-richness estimates presumably at least as unbiased as previous approaches that correct for detectability, but without costs in precision relative to uncorrected, biased species counts. 4. Total species richness in the entire region sampled was estimated at 113.1 (CI 106-123); species detectability ranged from 0.08 to 0.99, illustrating very heterogeneous species detectability; and species occupancy was 0.06-0.96.
Even after six surveys, absolute bias in observed occupancy was estimated at up to 0.40. 5. Synthesis and applications. The HB model for species-richness estimation combines information across sites and enjoys more precise, and presumably less biased, estimates than previous approaches. It also yields estimates of several measures of community size and composition. Covariates for occupancy and detectability can be included. We believe it has considerable potential for monitoring programmes as well as in biogeography and community ecology.
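The core problem in this record, that raw species counts are biased low under imperfect detection, can be illustrated with a small simulation. This is not the paper's hierarchical Bayes model, only a sketch of the detectability problem and the first-order jackknife correction the abstract compares against; all community parameters below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical community: 100 species with heterogeneous per-survey detectability
n_species, n_surveys = 100, 3
p_detect = rng.beta(1.5, 3.0, size=n_species)

# Detection histories over repeated surveys at one site
detected = rng.random((n_species, n_surveys)) < p_detect[:, None]
raw_count = detected.any(axis=1).sum()   # naive species count, biased low

# First-order jackknife: S_hat = S_obs + f1 * (k - 1) / k, where f1 is the
# number of species detected in exactly one of the k surveys
f1 = (detected.sum(axis=1) == 1).sum()
jackknife = raw_count + f1 * (n_surveys - 1) / n_surveys

print(raw_count, round(jackknife, 1))    # raw count falls short of 100
```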

  1. Occupancy Modeling for Improved Accuracy and Understanding of Pathogen Prevalence and Dynamics

    PubMed Central

    Colvin, Michael E.; Peterson, James T.; Kent, Michael L.; Schreck, Carl B.

    2015-01-01

Most pathogen detection tests are imperfect, with a sensitivity < 100%, thereby resulting in the potential for a false negative, where a pathogen is present but not detected. False negatives in a sample inflate the number of non-detections, negatively biasing estimates of pathogen prevalence. Histological examination of tissues as a diagnostic test can be advantageous because multiple pathogens can be examined and important information obtained on associated pathological changes to the host. However, it is usually less sensitive than molecular or microbiological tests for specific pathogens. Our study objectives were to 1) develop a hierarchical occupancy model to examine pathogen prevalence in spring Chinook salmon Oncorhynchus tshawytscha and their distribution among host tissues, 2) use the model to estimate pathogen-specific test sensitivities and infection rates, and 3) illustrate the effect of using replicate within-host sampling on the sample sizes required to detect a pathogen. We examined histological sections of replicate tissue samples from spring Chinook salmon O. tshawytscha collected after spawning for common pathogens seen in this population: Apophallus/echinostome metacercariae, Parvicapsula minibicornis, Nanophyetus salmincola/metacercariae, and Renibacterium salmoninarum. A hierarchical occupancy model was developed to estimate pathogen- and tissue-specific test sensitivities and to obtain unbiased estimates of host- and organ-level infection rates. Model-estimated sensitivities and host- and organ-level infection rates varied among pathogens, and the model-estimated infection rate was higher than prevalence unadjusted for test sensitivity, confirming that unadjusted prevalence was negatively biased. 
The modeling approach provided an analytical framework for using hierarchically structured pathogen detection data from lower-sensitivity diagnostic tests, such as histology, to obtain unbiased pathogen prevalence estimates with associated uncertainties. Accounting for test sensitivity using within-host replicate samples also required fewer individual fish to be sampled. This approach is useful for evaluating pathogen or microbe community dynamics when test sensitivity is <100%. PMID:25738709
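The effect of low test sensitivity and within-host replicate sampling described above can be sketched in closed form, without the paper's hierarchical occupancy machinery. The prevalence, sensitivity, and sample sizes below are hypothetical, and specificity is assumed perfect.

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical numbers: true infection rate and a low-sensitivity test
true_prev, sensitivity = 0.40, 0.60
n_fish, n_replicates = 5000, 3

infected = rng.random(n_fish) < true_prev
# Replicate tissue sections read independently; specificity assumed perfect,
# so only infected fish can test positive
positives = rng.random((n_fish, n_replicates)) < sensitivity * infected[:, None]
host_positive = positives.any(axis=1)

naive_prev = host_positive.mean()                        # biased low
p_detect_host = 1 - (1 - sensitivity) ** n_replicates    # P(detect | infected)
adjusted_prev = naive_prev / p_detect_host               # sensitivity-corrected

print(round(naive_prev, 3), round(adjusted_prev, 3))     # adjusted ~0.40
```

Each added replicate raises the per-host detection probability toward 1, which is why replicate sampling reduces the number of fish needed.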

  2. Comparing least-squares and quantile regression approaches to analyzing median hospital charges.

    PubMed

    Olsen, Cody S; Clark, Amy E; Thomas, Andrea M; Cook, Lawrence J

    2012-07-01

Emergency department (ED) and hospital charges obtained from administrative data sets are useful descriptors of injury severity and the burden to EDs and the health care system. However, charges are typically positively skewed due to costly procedures, long hospital stays, and complicated or prolonged treatment for a few patients. The median is not affected by extreme observations and is useful in describing and comparing distributions of hospital charges. A least-squares analysis employing a log transformation is one approach for estimating median hospital charges, corresponding confidence intervals (CIs), and differences between groups; however, this method requires certain distributional properties. An alternate method is quantile regression, which allows estimation and inference related to the median without making distributional assumptions. The objective was to compare the log-transformation least-squares method to the quantile regression approach for estimating median hospital charges, differences in median charges between groups, and associated CIs. The authors performed simulations using repeated sampling of observed statewide ED and hospital charges and charges randomly generated from a hypothetical lognormal distribution. The median and its 95% CI, and the multiplicative difference between the median charges of two groups, were estimated using both least-squares and quantile regression methods. Performance of the two methods was evaluated. In contrast to least squares, quantile regression produced estimates that were unbiased and had smaller mean square errors in simulations of observed ED and hospital charges. Both methods performed well in simulations of hypothetical charges that met least-squares method assumptions. When the data did not follow the assumed distribution, least-squares estimates were often biased, and the associated CIs had lower than expected coverage as sample size increased. 
Quantile regression analyses of hospital charges provide unbiased estimates even when lognormal and equal variance assumptions are violated. These methods may be particularly useful in describing and analyzing hospital charges from administrative data sets. © 2012 by the Society for Academic Emergency Medicine.
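The contrast above can be seen in the simplest intercept-only case: the log-transformation least-squares estimate of the median is the geometric mean, while median quantile regression reduces to the sample median. The distributions below are illustrative, not the paper's statewide data.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

# In the intercept-only case, the log-transformation least-squares estimate of
# the median is exp(mean(log y)) (the geometric mean), while median (tau = 0.5)
# quantile regression reduces to the sample median.
def log_ls_median(y):
    return np.exp(np.mean(np.log(y)))

# Lognormal charges: the two agree (geometric mean = median for a lognormal)
y_ln = rng.lognormal(mean=8.0, sigma=1.2, size=n)
# Exponential charges violate the lognormal assumption: log-LS is biased low
y_ex = rng.exponential(scale=1000.0, size=n)

print(log_ls_median(y_ln), np.median(y_ln))
print(log_ls_median(y_ex), np.median(y_ex))  # ~561 vs true median ~693
```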

  3. Occupancy modeling for improved accuracy and understanding of pathogen prevalence and dynamics

    USGS Publications Warehouse

    Colvin, Michael E.; Peterson, James T.; Kent, Michael L.; Schreck, Carl B.

    2015-01-01

    Most pathogen detection tests are imperfect, with a sensitivity < 100%, thereby resulting in the potential for a false negative, where a pathogen is present but not detected. False negatives in a sample inflate the number of non-detections, negatively biasing estimates of pathogen prevalence. Histological examination of tissues as a diagnostic test can be advantageous as multiple pathogens can be examined and providing important information on associated pathological changes to the host. However, it is usually less sensitive than molecular or microbiological tests for specific pathogens. Our study objectives were to 1) develop a hierarchical occupancy model to examine pathogen prevalence in spring Chinook salmonOncorhynchus tshawytscha and their distribution among host tissues 2) use the model to estimate pathogen-specific test sensitivities and infection rates, and 3) illustrate the effect of using replicate within host sampling on sample sizes required to detect a pathogen. We examined histological sections of replicate tissue samples from spring Chinook salmon O. tshawytscha collected after spawning for common pathogens seen in this population:Apophallus/echinostome metacercariae, Parvicapsula minibicornis, Nanophyetus salmincola/metacercariae, and Renibacterium salmoninarum. A hierarchical occupancy model was developed to estimate pathogen and tissue-specific test sensitivities and unbiased estimation of host- and organ-level infection rates. Model estimated sensitivities and host- and organ-level infections rates varied among pathogens and model estimated infection rate was higher than prevalence unadjusted for test sensitivity, confirming that prevalence unadjusted for test sensitivity was negatively biased. 
The modeling approach provided an analytical approach for using hierarchically structured pathogen detection data from lower sensitivity diagnostic tests, such as histology, to obtain unbiased pathogen prevalence estimates with associated uncertainties. Accounting for test sensitivity using within host replicate samples also required fewer individual fish to be sampled. This approach is useful for evaluating pathogen or microbe community dynamics when test sensitivity is <100%.

  4. The Use of Propensity Scores and Observational Data to Estimate Randomized Controlled Trial Generalizability Bias

    PubMed Central

    Pressler, Taylor R.; Kaizar, Eloise E.

    2014-01-01

While randomized controlled trials (RCT) are considered the “gold standard” for clinical studies, the use of exclusion criteria may impact the external validity of the results. It is unknown whether estimators of effect size are biased by excluding a portion of the target population from enrollment. We propose to use observational data to estimate the bias due to enrollment restrictions, which we term generalizability bias. In this paper we introduce a class of estimators for the generalizability bias and use simulation to study its properties in the presence of non-constant treatment effects. We find the surprising result that our estimators can be unbiased for the true generalizability bias even when not all potentially confounding variables are measured. In addition, our proposed doubly robust estimator performs well even for mis-specified models. PMID:23553373
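The notion of generalizability bias under a non-constant treatment effect can be sketched as follows. This is not the paper's estimator class or its doubly robust estimator; it only simulates the bias and a simple inverse-probability-of-enrollment correction with a selection model that is (unrealistically) known exactly. All numbers are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 400_000

# Hypothetical target population with a non-constant treatment effect
age = rng.normal(0.0, 1.0, n)
effect = 1.0 + 0.5 * age                       # individual-level treatment effect
pop_ate = effect.mean()                        # ~1.0, the estimand of interest

# Probabilistic trial enrollment that under-samples older patients
p_enroll = 1.0 / (1.0 + np.exp(1.5 * age))     # known selection propensity here
enrolled = rng.random(n) < p_enroll

trial_ate = effect[enrolled].mean()            # biased low: generalizability bias
gen_bias = trial_ate - pop_ate

# Inverse-probability-of-enrollment weighting recovers the population effect
w = 1.0 / p_enroll[enrolled]
ipw_ate = np.average(effect[enrolled], weights=w)

print(round(pop_ate, 3), round(trial_ate, 3), round(ipw_ate, 3))
```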

  5. Camera traps and mark-resight models: The value of ancillary data for evaluating assumptions

    USGS Publications Warehouse

    Parsons, Arielle W.; Simons, Theodore R.; Pollock, Kenneth H.; Stoskopf, Michael K.; Stocking, Jessica J.; O'Connell, Allan F.

    2015-01-01

Unbiased estimators of abundance and density are fundamental to the study of animal ecology and critical for making sound management decisions. Capture–recapture models are generally considered the most robust approach for estimating these parameters but rely on a number of assumptions that are often violated but rarely validated. Mark-resight models, a form of capture–recapture, are well suited for use with noninvasive sampling methods and allow for a number of assumptions to be relaxed. We used ancillary data from continuous video and radio telemetry to evaluate the assumptions of mark-resight models for abundance estimation on a barrier island raccoon (Procyon lotor) population using camera traps. Our island study site was geographically closed, allowing us to estimate real survival and in situ recruitment in addition to population size. We found several sources of bias due to heterogeneity of capture probabilities in our study, including camera placement, animal movement, island physiography, and animal behavior. Almost all sources of heterogeneity could be accounted for using the sophisticated mark-resight models developed by McClintock et al. (2009b), and these models generated estimates similar to those from a spatially explicit mark-resight model previously developed for this population during our study. Spatially explicit capture–recapture models have become an important tool in ecology and confer a number of advantages; however, non-spatial models that account for inherent individual heterogeneity may perform nearly as well, especially where immigration and emigration are limited. Non-spatial models are computationally less demanding, do not make implicit assumptions related to the isotropy of home ranges, and can provide insights with respect to the biological traits of the local population.
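The basic mark-resight logic underlying this record can be sketched with a simple Lincoln-Petersen-style estimator; the paper's models (McClintock et al.'s logit-normal mark-resight) are far richer, handling the heterogeneity sources listed above. All population and sighting parameters here are hypothetical, and sighting probability is assumed homogeneous.

```python
import numpy as np

rng = np.random.default_rng(3)

N_true, n_marked, n_occasions = 250, 40, 8
p_sight = 0.3   # per-occasion sighting probability, homogeneous by assumption

# Sightings of marked and unmarked animals accumulated over camera occasions
marked_seen = rng.binomial(n_marked, p_sight, n_occasions).sum()
unmarked_seen = rng.binomial(N_true - n_marked, p_sight, n_occasions).sum()

# The marked fraction among all sightings estimates n_marked / N, giving the
# classic Lincoln-Petersen-style abundance estimator
total_seen = marked_seen + unmarked_seen
N_hat = n_marked * total_seen / marked_seen
print(round(N_hat, 1))
```

Heterogeneous sighting probabilities (e.g., camera placement effects) break the homogeneity assumption baked into this simple estimator, which is exactly the bias the abstract investigates.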

  6. Estimating rates of local extinction and colonization in colonial species and an extension to the metapopulation and community levels

    USGS Publications Warehouse

    Barbraud, C.; Nichols, J.D.; Hines, J.E.; Hafner, H.

    2003-01-01

    Coloniality has mainly been studied from an evolutionary perspective, but relatively few studies have developed methods for modelling colony dynamics. Changes in number of colonies over time provide a useful tool for predicting and evaluating the responses of colonial species to management and to environmental disturbance. Probabilistic Markov process models have been recently used to estimate colony site dynamics using presence-absence data when all colonies are detected in sampling efforts. Here, we define and develop two general approaches for the modelling and analysis of colony dynamics for sampling situations in which all colonies are, and are not, detected. For both approaches, we develop a general probabilistic model for the data and then constrain model parameters based on various hypotheses about colony dynamics. We use Akaike's Information Criterion (AIC) to assess the adequacy of the constrained models. The models are parameterised with conditional probabilities of local colony site extinction and colonization. Presence-absence data arising from Pollock's robust capture-recapture design provide the basis for obtaining unbiased estimates of extinction, colonization, and detection probabilities when not all colonies are detected. This second approach should be particularly useful in situations where detection probabilities are heterogeneous among colony sites. The general methodology is illustrated using presence-absence data on two species of herons (Purple Heron, Ardea purpurea and Grey Heron, Ardea cinerea). Estimates of the extinction and colonization rates showed interspecific differences and strong temporal and spatial variations. We were also able to test specific predictions about colony dynamics based on ideas about habitat change and metapopulation dynamics. We recommend estimators based on probabilistic modelling for future work on colony dynamics. 
We also believe that this methodological framework has wide application to problems in animal ecology concerning metapopulation and community dynamics.
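The colony-dynamics model described above is, at its core, a two-state Markov chain governed by local extinction and colonization probabilities. The sketch below simulates that chain with hypothetical rates and checks the equilibrium occupancy; it omits the detection layer that Pollock's robust design adds when not all colonies are detected.

```python
import numpy as np

rng = np.random.default_rng(11)

n_sites, n_years = 2000, 50
eps, gamma = 0.2, 0.1    # hypothetical local extinction / colonization probabilities

occupied = rng.random(n_sites) < 0.5
for _ in range(n_years):
    survive = occupied & (rng.random(n_sites) >= eps)     # occupied sites persist
    colonize = ~occupied & (rng.random(n_sites) < gamma)  # empty sites get colonized
    occupied = survive | colonize

# Markov-chain equilibrium occupancy: psi* = gamma / (gamma + eps)
print(occupied.mean(), gamma / (gamma + eps))   # both ~0.33
```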

  7. Estimating cosmic velocity fields from density fields and tidal tensors

    NASA Astrophysics Data System (ADS)

    Kitaura, Francisco-Shu; Angulo, Raul E.; Hoffman, Yehuda; Gottlöber, Stefan

    2012-10-01

In this work we investigate the non-linear and non-local relation between cosmological density and peculiar velocity fields. Our goal is to provide an algorithm for the reconstruction of the non-linear velocity field from the fully non-linear density. We find that including the gravitational tidal field tensor using second-order Lagrangian perturbation theory based upon an estimate of the linear component of the non-linear density field significantly improves the estimate of the cosmic flow in comparison to linear theory, not only in the low-density but also, and more dramatically, in the high-density regions. In particular we test two estimates of the linear component: the lognormal model and the iterative Lagrangian linearization. The present approach relies on a rigorous higher-order Lagrangian perturbation theory analysis which incorporates a non-local relation. It does not require additional fitting from simulations, being in this sense parameter-free; it is independent of statistical-geometrical optimization; and it is straightforward and efficient to compute. The method is demonstrated to yield an unbiased estimator of the velocity field on scales ≳5 h-1 Mpc with closely Gaussian distributed errors. Moreover, the statistics of the divergence of the peculiar velocity field are extremely well recovered, showing good agreement with the true statistics from N-body simulations. The typical errors of about 10 km s-1 (1σ confidence intervals) are reduced by more than 80 per cent with respect to linear theory in the scale range between 5 and 10 h-1 Mpc in high-density regions (δ > 2). We also find that iterative Lagrangian linearization is significantly superior in the low-density regime with respect to the lognormal model.
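The linear-theory baseline that the paper improves upon can be sketched in one dimension. This is only a toy of the standard linear density-velocity relation, with the prefactor aHf set to 1 in illustrative units; the paper's second-order Lagrangian (2LPT) and tidal-tensor corrections are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(5)

# 1-D toy version of the linear-theory density-velocity relation the paper
# improves upon: div v = -aHf * delta, i.e. v_k = i * aHf * delta_k / k.
n, L = 256, 100.0
delta = rng.normal(0.0, 1.0, n)
delta -= delta.mean()                      # remove the k = 0 mode

k = 2 * np.pi * np.fft.fftfreq(n, d=L / n)
delta_k = np.fft.fft(delta)
v_k = np.zeros_like(delta_k)
v_k[1:] = 1j * delta_k[1:] / k[1:]         # leave the k = 0 mode at zero
v = np.real(np.fft.ifft(v_k))

# Consistency check: the spectral divergence of v recovers -delta
div_v = np.real(np.fft.ifft(1j * k * v_k))
print(np.max(np.abs(div_v + delta)))       # ~0 up to floating-point error
```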

  8. Collinearity and Causal Diagrams: A Lesson on the Importance of Model Specification.

    PubMed

    Schisterman, Enrique F; Perkins, Neil J; Mumford, Sunni L; Ahrens, Katherine A; Mitchell, Emily M

    2017-01-01

    Correlated data are ubiquitous in epidemiologic research, particularly in nutritional and environmental epidemiology where mixtures of factors are often studied. Our objectives are to demonstrate how highly correlated data arise in epidemiologic research and provide guidance, using a directed acyclic graph approach, on how to proceed analytically when faced with highly correlated data. We identified three fundamental structural scenarios in which high correlation between a given variable and the exposure can arise: intermediates, confounders, and colliders. For each of these scenarios, we evaluated the consequences of increasing correlation between the given variable and the exposure on the bias and variance for the total effect of the exposure on the outcome using unadjusted and adjusted models. We derived closed-form solutions for continuous outcomes using linear regression and empirically present our findings for binary outcomes using logistic regression. For models properly specified, total effect estimates remained unbiased even when there was almost perfect correlation between the exposure and a given intermediate, confounder, or collider. In general, as the correlation increased, the variance of the parameter estimate for the exposure in the adjusted models increased, while in the unadjusted models, the variance increased to a lesser extent or decreased. Our findings highlight the importance of considering the causal framework under study when specifying regression models. Strategies that do not take into consideration the causal structure may lead to biased effect estimation for the original question of interest, even under high correlation.
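The confounder scenario from this record can be reproduced with a small simulation: adjusting for a highly correlated confounder leaves the exposure effect unbiased but inflates its variance, while omitting it biases the estimate. The correlation, effect sizes, and sample sizes below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)

def fit_ols(X, y):
    # Ordinary least squares (with intercept) via numpy's least-squares solver
    X1 = np.column_stack([np.ones(len(y)), X])
    return np.linalg.lstsq(X1, y, rcond=None)[0]

n_sims, n, rho = 500, 200, 0.95   # hypothetical: 95% exposure-confounder correlation
betas_adj, betas_un = [], []
for _ in range(n_sims):
    z = rng.normal(0, 1, n)                                   # confounder
    x = rho * z + np.sqrt(1 - rho**2) * rng.normal(0, 1, n)   # exposure
    y = 1.0 * x + 1.0 * z + rng.normal(0, 1, n)               # true exposure effect = 1
    betas_adj.append(fit_ols(np.column_stack([x, z]), y)[1])  # adjusted for z
    betas_un.append(fit_ols(x[:, None], y)[1])                # unadjusted

betas_adj, betas_un = np.array(betas_adj), np.array(betas_un)
# Adjusted: unbiased (mean ~1) but variance inflated by collinearity.
# Unadjusted: badly biased, though less variable.
print(betas_adj.mean(), betas_adj.std())
print(betas_un.mean(), betas_un.std())
```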

  9. The Swift GRB Host Galaxy Legacy Survey

    NASA Astrophysics Data System (ADS)

    Perley, Daniel

    2015-08-01

    I will describe the Swift Host Galaxy Legacy Survey (SHOALS), a comprehensive multiwavelength program to characterize the demographics of the GRB host population and its redshift evolution from z=0 to z=7. Using unbiased selection criteria we have designated a subset of 119 Swift gamma-ray bursts which are now being targeted with intensive observational follow-up. Deep Spitzer imaging of every field has already been obtained and analyzed, with major programs ongoing at Keck, GTC, Gemini, VLT, and Magellan to obtain complementary optical/NIR photometry and spectroscopy to enable full SED modeling and derivation of fundamental physical parameters such as mass, extinction, and star-formation rate. Using these data I will present an unbiased measurement of the GRB host-galaxy luminosity and mass distributions and their evolution with redshift, compare GRB hosts to other star-forming galaxy populations, and discuss implications for the nature of the GRB progenitor and the ability of GRBs to serve as tools for measuring and studying cosmic star-formation in the distant universe.

  10. Estimation of group means when adjusting for covariates in generalized linear models.

    PubMed

    Qu, Yongming; Luo, Junxiang

    2015-01-01

Generalized linear models are commonly used to analyze categorical data such as binary, count, and ordinal outcomes. Adjusting for important prognostic factors or baseline covariates in generalized linear models may improve the estimation efficiency. The model-based mean for a treatment group produced by most software packages estimates the response at the mean covariate, not the mean response for this treatment group for the studied population. Although this is not an issue for linear models, the model-based group mean estimates in generalized linear models could be seriously biased for the true group means. We propose a new method to estimate the group mean consistently with the corresponding variance estimation. Simulation showed the proposed method produced an unbiased estimator for the group means and provided the correct coverage probability. The proposed method was applied to analyze hypoglycemia data from clinical trials in diabetes. Copyright © 2014 John Wiley & Sons, Ltd.
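The gap between "response at the mean covariate" and the mean response is a direct consequence of the nonlinear link function, and can be shown with a few lines. The logistic coefficients and covariate distribution below are hypothetical; no model fitting is involved, only the two summaries being contrasted.

```python
import numpy as np

def expit(t):
    return 1.0 / (1.0 + np.exp(-t))

rng = np.random.default_rng(4)

# Hypothetical fitted logistic model for one treatment group:
# logit P(y = 1 | x) = b0 + b1 * x, with a dispersed baseline covariate x
b0, b1 = -2.0, 1.5
x = rng.normal(0.0, 2.0, 100_000)

resp_at_mean_cov = expit(b0 + b1 * x.mean())   # "response at the mean covariate"
true_group_mean = expit(b0 + b1 * x).mean()    # population-averaged response

print(round(resp_at_mean_cov, 3), round(true_group_mean, 3))
# the two differ because expit() is nonlinear (Jensen's inequality);
# in a linear model they would coincide
```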

  11. An analysis of the uncertainty and bias in DCE-MRI measurements using the spoiled gradient-recalled echo pulse sequence.

    PubMed

    Subashi, Ergys; Choudhury, Kingshuk R; Johnson, G Allan

    2014-03-01

The pharmacokinetic parameters derived from dynamic contrast-enhanced (DCE) MRI have been used in more than 100 phase I trials and investigator led studies. A comparison of the absolute values of these quantities requires an estimation of their respective probability distribution function (PDF). The statistical variation of the DCE-MRI measurement is analyzed by considering the fundamental sources of error in the MR signal intensity acquired with the spoiled gradient-echo (SPGR) pulse sequence. The variance in the SPGR signal intensity arises from quadrature detection and excitation flip angle inconsistency. The noise power was measured in 11 phantoms of contrast agent concentration in the range [0-1] mM (in steps of 0.1 mM) and in one in vivo acquisition of a tumor-bearing mouse. The distribution of the flip angle was determined in a uniform 10 mM CuSO4 phantom using the spin echo double angle method. The PDF of a wide range of T1 values measured with the varying flip angle (VFA) technique was estimated through numerical simulations of the SPGR equation. The resultant uncertainty in contrast agent concentration was incorporated in the most common model of tracer exchange kinetics and the PDF of the derived pharmacokinetic parameters was studied numerically. The VFA method is an unbiased technique for measuring T1 only in the absence of bias in excitation flip angle. The time-dependent concentration of the contrast agent measured in vivo is within the theoretically predicted uncertainty. The uncertainty in measuring K(trans) with SPGR pulse sequences is of the same order as, but always higher than, the uncertainty in measuring the pre-injection longitudinal relaxation time (T10). The lowest achievable bias/uncertainty in estimating this parameter is approximately 20%-70% higher than the bias/uncertainty in the measurement of the pre-injection T1 map. 
The fractional volume parameters derived from the extended Tofts model were found to be extremely sensitive to the variance in signal intensity. The SNR of the pre-injection T1 map indicates the limiting precision with which K(trans) can be calculated. Current small-animal imaging systems and pulse sequences robust to motion artifacts have the capacity for reproducible quantitative acquisitions with DCE-MRI. In these circumstances, it is feasible to achieve a level of precision limited only by physiologic variability.
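The SPGR signal equation and the VFA linearization it relies on are standard, so the flip-angle sensitivity noted in this record is easy to reproduce. The TR, T1, and flip angles below are hypothetical, and noise is omitted so that only the flip-angle bias mechanism is visible.

```python
import numpy as np

def spgr(M0, T1, TR, alpha):
    # Spoiled gradient-recalled echo (SPGR) signal equation
    E1 = np.exp(-TR / T1)
    return M0 * np.sin(alpha) * (1 - E1) / (1 - E1 * np.cos(alpha))

def vfa_t1(signals, alphas, TR):
    # VFA linearization: S/sin(a) = E1 * S/tan(a) + M0 * (1 - E1),
    # so the slope of S/sin vs S/tan is E1 and T1 = -TR / ln(E1)
    y, x = signals / np.sin(alphas), signals / np.tan(alphas)
    slope = np.polyfit(x, y, 1)[0]
    return -TR / np.log(slope)

TR, T1_true = 5.0, 1000.0             # ms; hypothetical tissue T1
alphas = np.deg2rad([2.0, 10.0])      # nominal flip angles

s = spgr(1.0, T1_true, TR, alphas)
print(vfa_t1(s, alphas, TR))          # recovers 1000 ms with ideal angles

# A 10% transmit (B1) miscalibration biases the estimate, as the abstract notes
s_biased = spgr(1.0, T1_true, TR, 1.1 * alphas)   # actual angles 10% high
print(vfa_t1(s_biased, alphas, TR))               # substantially off from 1000 ms
```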

  12. Short Course Introduction to Quantitative Mineral Resource Assessments

    USGS Publications Warehouse

    Singer, Donald A.

    2007-01-01

This is an abbreviated text supplementing the content of three sets of slides used in a short course that has been presented by the author at several workshops. The slides should be viewed in the order of (1) Introduction and models, (2) Delineation and estimation, and (3) Combining estimates and summary. References cited in the slides are listed at the end of this text. The purpose of the three-part form of mineral resource assessments discussed in the accompanying slides is to make unbiased quantitative assessments in a format needed in decision-support systems so that consequences of alternative courses of action can be examined. The three-part form of mineral resource assessments was developed to assist policy makers in evaluating the consequences of alternative courses of action with respect to land use and mineral-resource development. The audience for three-part assessments is a governmental or industrial policy maker, a manager of exploration, a planner of regional development, or similar decision-maker. Some of the tools and models presented here will be useful for selection of exploration sites, but that is a side benefit, not the goal. To provide unbiased information, we recommend the three-part form of mineral resource assessments, in which general locations of undiscovered deposits are delineated from a deposit type's geologic setting, frequency distributions of tonnages and grades of well-explored deposits serve as models of grades and tonnages of undiscovered deposits, and numbers of undiscovered deposits are estimated probabilistically by type. The internally consistent descriptive, grade and tonnage, deposit density, and economic models used in the design of the three-part form of assessments reduce the chances of biased estimates of the undiscovered resources. 
What and why quantitative resource assessments: The kind of assessment recommended here is founded in decision analysis in order to provide a framework for making decisions concerning mineral resources under conditions of uncertainty. What this means is that we start with the question of what kinds of questions the decision maker is trying to resolve and what forms of information would aid in resolving these questions. Some applications of mineral resource assessments: To plan and guide exploration programs, to assist in land use planning, to plan the location of infrastructure, to estimate mineral endowment, and to identify deposits that present special environmental challenges. Why not just rank prospects / areas? Need for financial analysis, need for comparison with other land uses, need for comparison with distant tracts of land, need to know how uncertain the estimates are, need for consideration of economic and environmental consequences of possible development. Our goal is to provide unbiased information useful to decision-makers.
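The "combining estimates" step of a three-part assessment is, in essence, a Monte Carlo aggregation of the probabilistic deposit-count estimate with the tonnage model. The sketch below uses entirely hypothetical elicited probabilities and tonnage-model parameters, not USGS values.

```python
import numpy as np

rng = np.random.default_rng(9)
n_sims = 20_000

# Hypothetical inputs: an elicited probability distribution for the number of
# undiscovered deposits in a tract, and a lognormal tonnage model built from
# well-explored deposits of the same type
n_dep_values = np.array([0, 1, 2, 5])
n_dep_probs = np.array([0.3, 0.4, 0.2, 0.1])
median_tonnes, sigma_log = 1e6, 1.0

# Combine by Monte Carlo simulation into a distribution of total tonnage
totals = np.empty(n_sims)
for i in range(n_sims):
    n_dep = rng.choice(n_dep_values, p=n_dep_probs)
    totals[i] = rng.lognormal(np.log(median_tonnes), sigma_log, n_dep).sum()

# Decision-support summaries: chance of no resource, and tonnage percentiles
print((totals == 0).mean())                 # ~0.3 by construction
print(np.percentile(totals, [10, 50, 90]))
```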

  13. Use of the landmark method to address immortal person-time bias in comparative effectiveness research: a simulation study.

    PubMed

    Mi, Xiaojuan; Hammill, Bradley G; Curtis, Lesley H; Lai, Edward Chia-Cheng; Setoguchi, Soko

    2016-11-20

Observational comparative effectiveness and safety studies are often subject to immortal person-time, a period of follow-up during which outcomes cannot occur because of the treatment definition. Common approaches, like excluding immortal time from the analysis or naïvely including immortal time in the analysis, are known to result in biased estimates of treatment effect. Other approaches, such as the Mantel-Byar and landmark methods, have been proposed to handle immortal time. Little is known about the performance of the landmark method in different scenarios. We conducted extensive Monte Carlo simulations to assess the performance of the landmark method compared with other methods in settings that reflect realistic scenarios. We considered four landmark times for the landmark method. We found that the Mantel-Byar method provided unbiased estimates in all scenarios, whereas the exclusion and naïve methods resulted in substantial bias when the hazard of the event was constant or decreased over time. The landmark method performed well in correcting immortal person-time bias in all scenarios when the treatment effect was small, and provided unbiased estimates when there was no treatment effect. The bias associated with the landmark method tended to be small when the treatment rate was higher in the early follow-up period than it was later. These findings were confirmed in a case study of chronic obstructive pulmonary disease. Copyright © 2016 John Wiley & Sons, Ltd.
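Immortal person-time and the landmark fix can be demonstrated in a minimal null simulation, far simpler than the paper's scenarios: with no true treatment effect, naive "ever treated" classification makes treatment look strongly protective, while landmark classification does not. All rates are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(8)
n = 200_000

# Null scenario: "treatment" (e.g., filling a prescription) has NO effect
death = rng.exponential(10.0, n)   # time to death
rx = rng.exponential(5.0, n)       # time at which treatment would start

# Naive "ever treated" classification: a patient counts as treated only if
# still alive when treatment starts, so the treated group accrues immortal
# person-time and looks spuriously protected
ever_treated = rx < death
print(death[ever_treated].mean(), death[~ever_treated].mean())  # treated "live longer"

# Landmark method: fix a landmark time L, classify by treatment status at L
# among those still alive at L, and compare only subsequent survival
L = 5.0
alive = death > L
treated_at_L = rx <= L
resid_treated = death[alive & treated_at_L] - L
resid_untreated = death[alive & ~treated_at_L] - L
print(resid_treated.mean(), resid_untreated.mean())  # ~equal under the null
```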

  14. Annual survival estimation of migratory songbirds confounded by incomplete breeding site-fidelity: Study designs that may help

    USGS Publications Warehouse

    Marshall, M.R.; Diefenbach, D.R.; Wood, L.A.; Cooper, R.J.

    2004-01-01

Many species of bird exhibit varying degrees of site-fidelity to the previous year's territory or breeding area, a phenomenon we refer to as incomplete breeding site-fidelity. If the territory they occupy is located beyond the bounds of the study area or search area (i.e., they have emigrated from the study area), the bird will go undetected and is therefore indistinguishable from dead individuals in capture-mark-recapture studies. Differential emigration rates confound inferences regarding differences in survival between sexes and among species if apparent survival rates are used as estimates of true survival. Moreover, the bias introduced by using apparent survival rates for true survival rates can have profound effects on the predictions of population persistence through time, source/sink dynamics, and other aspects of life-history theory. We investigated four study design and analysis approaches that result in apparent survival estimates that are closer to true survival estimates. Our motivation for this research stemmed from a multi-year capture-recapture study of Prothonotary Warblers (Protonotaria citrea) on multiple study plots within a larger landscape of suitable breeding habitat, where substantial inter-annual movements of marked individuals among neighboring study plots were documented. We wished to quantify the effects of this type of movement on annual survival estimation. The first two study designs we investigated involved marking birds in a core area and resighting them in the core as well as an area surrounding the core. For the first of these two designs, we demonstrated that as the resighting area surrounding the core gets progressively larger, and more "emigrants" are resighted, apparent survival estimates begin to approximate true survival rates (bias < 0.01). However, given observed inter-annual movements of birds, it is likely to be logistically impractical to resight birds on sufficiently large surrounding areas to minimize bias. 
Therefore, as an alternative protocol, we analyzed the data with subsets of three progressively larger areas surrounding the core. The data subsets provided four estimates of apparent survival that asymptotically approached true survival. This study design and analytical approach is likely to be logistically feasible in field settings and yields estimates of true survival unbiased (bias < 0.03) by incomplete breeding site-fidelity over a range of inter-annual territory movement patterns. The third approach we investigated used a robust design data collection and analysis approach. This approach resulted in estimates of survival that were unbiased (bias < 0.02), but were very imprecise and likely would not yield reliable estimates in field situations. The fourth approach utilized a fixed study area size, but modeled detection probability as a function of bird proximity to the study plot boundary (e.g., those birds closest to the edge are more likely to emigrate). This approach also resulted in estimates of survival that were unbiased (bias < 0.02), but because the individual covariates were normalized, the average capture probability was 0.50, and thus did not provide an accurate estimate of the true capture probability. Our results show that the core area with surrounding resight-only design can provide estimates of survival that are not biased by the effects of incomplete breeding site-fidelity. © 2004 Museu de Ciències Naturals.
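Why apparent survival underestimates true survival can be shown with a one-interval sketch: a marked bird is re-encountered only if it survives, stays on the plot, and is detected, so apparent survival confounds survival with fidelity. The rates below are hypothetical, and detection is divided out using its (here, known) value purely for clarity; real capture-recapture models estimate it jointly.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 100_000

S = 0.6    # hypothetical true annual survival
F = 0.85   # breeding-site fidelity (1 - emigration probability)
p = 0.7    # detection probability, assumed known here for clarity

# One interval: a marked bird is re-encountered on the study plot only if it
# survives, remains on the plot, AND is detected
alive = rng.random(n) < S
stayed = rng.random(n) < F
seen = rng.random(n) < p
reencountered = alive & stayed & seen

# With detection divided out, what remains is "apparent survival" phi = S * F,
# which underestimates true survival whenever fidelity F < 1
phi_apparent = reencountered.mean() / p
print(round(phi_apparent, 3), S * F, S)
```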

  15. The two-dimensional Monte Carlo: a new methodologic paradigm for dose reconstruction for epidemiological studies.

    PubMed

    Simon, Steven L; Hoffman, F Owen; Hofer, Eduard

    2015-01-01

    Retrospective dose estimation, particularly dose reconstruction that supports epidemiological investigations of health risk, relies on various strategies that include models of physical processes and exposure conditions with detail ranging from simple to complex. Quantification of dose uncertainty is an essential component of assessments for health risk studies since, as is well understood, it is impossible to retrospectively determine the true dose for each person. To address uncertainty in dose estimation, numerical simulation tools have become commonplace and there is now an increased understanding about the needs and what is required for models used to estimate cohort doses (in the absence of direct measurement) to evaluate dose response. It now appears that for dose-response algorithms to derive the best, unbiased estimate of health risk, we need to understand the type, magnitude and interrelationships of the uncertainties of model assumptions, parameters and input data used in the associated dose estimation models. Heretofore, uncertainty analysis of dose estimates did not always properly distinguish between categories of errors, e.g., uncertainty that is specific to each subject (i.e., unshared error), and uncertainty of doses from a lack of understanding and knowledge about parameter values that are shared to varying degrees by numbers of subsets of the cohort. While mathematical propagation of errors by Monte Carlo simulation methods has been used for years to estimate the uncertainty of an individual subject's dose, it was almost always conducted without consideration of dependencies between subjects. In retrospect, these types of simple analyses are not suitable for studies with complex dose models, particularly when important input data are missing or otherwise not available. 
The dose estimation strategy presented here is a simulation method that corrects the previous deficiencies of analytical or simple Monte Carlo error propagation methods and is termed, due to its capability to maintain separation between shared and unshared errors, the two-dimensional Monte Carlo (2DMC) procedure. Simply put, the 2DMC method simulates alternative, possibly true, sets (or vectors) of doses for an entire cohort rather than a single set that emerges when each individual's dose is estimated independently from other subjects. Moreover, estimated doses within each simulated vector maintain proper inter-relationships such that the estimated doses for members of a cohort subgroup that share common lifestyle attributes and sources of uncertainty are properly correlated. The 2DMC procedure simulates inter-individual variability of possibly true doses within each dose vector and captures the influence of uncertainty in the values of dosimetric parameters across multiple realizations of possibly true vectors of cohort doses. The primary characteristic of the 2DMC approach, as well as its strength, is defined by the proper separation between uncertainties shared by members of the entire cohort or members of defined cohort subsets, and uncertainties that are individual-specific and therefore unshared.
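
    The shared/unshared bookkeeping at the heart of the 2DMC procedure can be illustrated with a small numerical sketch. All distributions and magnitudes below are invented for illustration; the point is only that the shared error is drawn once per realization for the whole cohort, while unshared errors are drawn per subject:

```python
import numpy as np

rng = np.random.default_rng(42)
n_subjects, n_vectors = 100, 500

# Hypothetical per-subject point-estimate doses (arbitrary units)
base_dose = rng.lognormal(mean=0.0, sigma=0.3, size=n_subjects)

dose_vectors = np.empty((n_vectors, n_subjects))
for k in range(n_vectors):
    shared = rng.lognormal(0.0, 0.2)                 # one draw for the whole cohort
    unshared = rng.lognormal(0.0, 0.1, n_subjects)   # one draw per subject
    dose_vectors[k] = base_dose * shared * unshared  # one "possibly true" dose vector

# Subjects' doses are correlated across realizations through the shared factor;
# an independent per-subject (1D) Monte Carlo would miss this dependence.
r = np.corrcoef(dose_vectors[:, 0], dose_vectors[:, 1])[0, 1]
```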

  16. Fast Geostatistical Inversion using Randomized Matrix Decompositions and Sketchings for Heterogeneous Aquifer Characterization

    NASA Astrophysics Data System (ADS)

    O'Malley, D.; Le, E. B.; Vesselinov, V. V.

    2015-12-01

    We present a fast, scalable, and highly-implementable stochastic inverse method for characterization of aquifer heterogeneity. The method utilizes recent advances in randomized matrix algebra and exploits the structure of the Quasi-Linear Geostatistical Approach (QLGA), without requiring a structured grid like Fast-Fourier Transform (FFT) methods. The QLGA framework is a more stable version of Gauss-Newton iterates for a large number of unknown model parameters, and still provides unbiased estimates. The methods are matrix-free and do not require derivatives or adjoints, and are thus ideal for complex models and black-box implementation. We also incorporate randomized least-square solvers and data-reduction methods, which speed up computation and simulate missing data points. The new inverse methodology is coded in Julia and implemented in the MADS computational framework (http://mads.lanl.gov). Julia is an advanced high-level scientific programming language that allows for efficient memory management and utilization of high-performance computational resources. Inversion results based on a series of synthetic problems with steady-state and transient calibration data are presented.
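
    As a generic illustration of the randomized matrix decompositions mentioned above (a Halko-style randomized range finder, not the actual QLGA/MADS implementation), a low-rank matrix can be factorized from a small random sketch:

```python
import numpy as np

rng = np.random.default_rng(0)

def randomized_svd(A, rank, oversample=10):
    """Truncated SVD from a random sketch of A's column space."""
    Omega = rng.standard_normal((A.shape[1], rank + oversample))
    Y = A @ Omega                      # sketch: sample the range of A
    Q, _ = np.linalg.qr(Y)             # orthonormal basis for the sketch
    B = Q.T @ A                        # small projected matrix
    Ub, s, Vt = np.linalg.svd(B, full_matrices=False)
    return Q @ Ub[:, :rank], s[:rank], Vt[:rank]

# Exactly rank-5 test matrix: the sketch recovers it to machine precision
A = rng.standard_normal((200, 5)) @ rng.standard_normal((5, 150))
U, s, Vt = randomized_svd(A, rank=5)
rel_err = np.linalg.norm(A - (U * s) @ Vt) / np.linalg.norm(A)
```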

  17. Inferring animal densities from tracking data using Markov chains.

    PubMed

    Whitehead, Hal; Jonsen, Ian D

    2013-01-01

    The distributions and relative densities of species are key to ecology. Large amounts of tracking data are being collected on a wide variety of animal species using several methods, especially electronic tags that record location. These tracking data are effectively used for many purposes, but generally provide biased measures of distribution, because the starts of the tracks are not randomly distributed among the locations used by the animals. We introduce a simple Markov-chain method that produces unbiased measures of relative density from tracking data. The density estimates can be over a geographical grid, and/or relative to environmental measures. The method assumes that the tracked animals are a random subset of the population with respect to how they move through the habitat cells, and that the movements of the animals among the habitat cells form a time-homogeneous Markov chain. We illustrate the method using simulated data as well as real data on the movements of sperm whales. The simulations illustrate the bias introduced when the initial tracking locations are not randomly distributed, as well as the lack of bias when the Markov method is used. We believe that this method will be important in giving unbiased estimates of density from the growing corpus of animal tracking data.
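
    A minimal sketch of the approach, assuming tracks already discretized into habitat cells (the helper below is written for this illustration, not taken from the paper): fit a transition matrix to the observed moves and take its stationary distribution as the relative-density estimate. The tracks below all start in cell 0, yet the estimate reflects relative use rather than the starting bias:

```python
import numpy as np

def stationary_density(tracks, n_cells):
    """Relative cell densities as the stationary distribution of a
    Markov chain fitted to movement tracks (illustrative helper)."""
    counts = np.zeros((n_cells, n_cells))
    for track in tracks:
        for a, b in zip(track[:-1], track[1:]):
            counts[a, b] += 1
    P = counts / counts.sum(axis=1, keepdims=True)   # transition matrix
    vals, vecs = np.linalg.eig(P.T)                  # left eigenvectors of P
    pi = np.real(vecs[:, np.argmax(np.real(vals))])  # eigenvalue-1 eigenvector
    return pi / pi.sum()

# Every track starts in cell 0 (a biased start), yet the stationary
# distribution recovers the relative use of the three cells.
tracks = [[0, 1, 2, 1, 2, 2, 1, 0, 1, 2]] * 5 + [[0, 2, 2, 2, 1, 2, 2, 1, 2, 2]] * 5
pi = stationary_density(tracks, 3)
```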

  18. In search of a corrected prescription drug elasticity estimate: a meta-regression approach.

    PubMed

    Gemmill, Marin C; Costa-Font, Joan; McGuire, Alistair

    2007-06-01

    An understanding of the relationship between cost sharing and drug consumption depends on consistent and unbiased price elasticity estimates. However, there is wide heterogeneity among studies, which constrains the applicability of elasticity estimates for empirical purposes and policy simulation. This paper attempts to provide a corrected measure of the drug price elasticity by employing meta-regression analysis (MRA). The results indicate that the elasticity estimates are significantly different from zero, and the corrected elasticity is -0.209 when the results are made robust to heteroskedasticity and clustering of observations. Elasticity values are higher when the study was published in an economic journal, when the study employed a greater number of observations, and when the study used aggregate data. Elasticity estimates are lower when the institutional setting was a tax-based health insurance system.
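
    Mechanically, an MRA of this kind is a precision-weighted regression of reported elasticities on study-level covariates. A sketch with invented numbers (not the data behind the -0.209 estimate):

```python
import numpy as np

# Reported price elasticities, a study-level covariate, and standard errors
# (all invented for illustration)
elasticities = np.array([-0.10, -0.25, -0.30, -0.15, -0.22])
econ_journal = np.array([0.0, 1.0, 1.0, 0.0, 1.0])   # published in an economics journal?
se = np.array([0.05, 0.04, 0.06, 0.05, 0.03])

X = np.column_stack([np.ones_like(econ_journal), econ_journal])
w = 1.0 / se**2                                      # precision (inverse-variance) weights
# Weighted least squares via the normal equations
beta = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * elasticities))
```

    With a single dummy regressor, the WLS intercept is simply the inverse-variance-weighted mean elasticity of the baseline group of studies.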

  19. Kullback-Leibler information in resolving natural resource conflicts when definitive data exist

    USGS Publications Warehouse

    Anderson, D.R.; Burnham, K.P.; White, Gary C.

    2001-01-01

    Conflicts often arise in the management of natural resources. Often they result from differing perceptions, varying interpretations of the law, and self-interests among stakeholder groups (for example, the values and perceptions about spotted owls and forest management differ markedly among environmental groups, government regulatory agencies, and timber industries). We extend the conceptual approach to conflict resolution of Anderson et al. (1999) by using information-theoretic methods to provide quantitative evidence for differing stakeholder positions. Importantly, we assume that relevant empirical data exist that are central to the potential resolution of the conflict. We present a hypothetical example involving an experiment to assess potential effects of a chemical on monthly survival probabilities of the hen clam (Spisula solidissima). The conflict centers on 3 stakeholder positions: 1) no effect, 2) an acute effect, and 3) an acute and chronic effect of the chemical treatment. Such data were given to 18 analytical teams to make independent analyses and provide the relative evidence for each of 3 stakeholder positions in the conflict. The empirical evidence strongly supports only one of the 3 positions in the conflict: the application of the chemical causes acute and chronic effects on monthly survival following treatment. Formal inference from all the stakeholder positions is provided for the 2 key parameters underlying the hen clam controversy. The estimates of these parameters were essentially unbiased (the relative bias for the control and treatment group's survival probability was -0.857% and 1.400%, respectively) and precise (coefficients of variation were 0.576% and 2.761%, respectively). The advantages of making formal inference from all the models, rather than drawing conclusions from only the estimated best model, are illustrated.
Finally, we contrast information-theoretic and Bayesian approaches in terms of how positions in the controversy enter the formal analysis.
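
    The flavor of evidence measure involved can be illustrated with Akaike weights, which convert information-theoretic criterion values into relative support for each model. The AIC values below are invented, not the hen clam results:

```python
import numpy as np

# Hypothetical AIC values for the three stakeholder models:
# 1) no effect, 2) acute effect, 3) acute + chronic effect
aic = np.array([310.2, 295.7, 284.1])
delta = aic - aic.min()              # AIC differences from the best model
weights = np.exp(-0.5 * delta)
weights /= weights.sum()             # Akaike weights: relative evidence for each position
```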

  20. On the validity of measuring change over time in routine clinical assessment: a close examination of item-level response shifts in psychosomatic inpatients.

    PubMed

    Nolte, S; Mierke, A; Fischer, H F; Rose, M

    2016-06-01

    Significant life events such as severe health status changes or intensive medical treatment often trigger response shifts in individuals that may hamper the comparison of measurements over time. Drawing from the Oort model, this study aims at detecting response shift at the item level in psychosomatic inpatients and evaluating its impact on the validity of comparing repeated measurements. Complete pretest and posttest data were available from 1188 patients who had filled out the ICD-10 Symptom Rating (ISR) scale at admission and discharge, on average 24 days after intake. Reconceptualization, reprioritization, and recalibration response shifts were explored applying tests of measurement invariance. In the item-level approach, all model parameters were constrained to be equal between pretest and posttest; where non-invariance was detected, it was linked to the different types of response shift. When constraining across-occasion model parameters, model fit worsened as indicated by a significant Satorra-Bentler Chi-square difference test suggesting potential presence of response shifts. A close examination revealed the presence of two types of response shift, i.e., (non)uniform recalibration and both higher- and lower-level reconceptualization response shifts, leading to four model adjustments. Our analyses suggest that psychosomatic inpatients experienced some response shifts during their hospital stay. According to the hierarchy of measurement invariance, however, only one of the detected non-invariances is critical for unbiased mean comparisons over time, which did not have a substantial impact on estimating change. Hence, the use of the ISR can be recommended for outcomes assessment in clinical routine, as change score estimates do not seem hampered by response shift effects.

  1. Quantitative Assessment of In-solution Digestion Efficiency Identifies Optimal Protocols for Unbiased Protein Analysis*

    PubMed Central

    León, Ileana R.; Schwämmle, Veit; Jensen, Ole N.; Sprenger, Richard R.

    2013-01-01

    The majority of mass spectrometry-based protein quantification studies uses peptide-centric analytical methods and thus strongly relies on efficient and unbiased protein digestion protocols for sample preparation. We present a novel objective approach to assess protein digestion efficiency using a combination of qualitative and quantitative liquid chromatography-tandem MS methods and statistical data analysis. In contrast to previous studies we employed both standard qualitative as well as data-independent quantitative workflows to systematically assess trypsin digestion efficiency and bias using mitochondrial protein fractions. We evaluated nine trypsin-based digestion protocols, based on standard in-solution or on spin filter-aided digestion, including new optimized protocols. We investigated various reagents for protein solubilization and denaturation (dodecyl sulfate, deoxycholate, urea), several trypsin digestion conditions (buffer, RapiGest, deoxycholate, urea), and two methods for removal of detergents before analysis of peptides (acid precipitation or phase separation with ethyl acetate). Our data-independent quantitative liquid chromatography-tandem MS workflow quantified over 3700 distinct peptides with 96% completeness between all protocols and replicates, with an average 40% protein sequence coverage and an average of 11 peptides identified per protein. Systematic quantitative and statistical analysis of physicochemical parameters demonstrated that deoxycholate-assisted in-solution digestion combined with phase transfer allows for efficient, unbiased generation and recovery of peptides from all protein classes, including membrane proteins. This deoxycholate-assisted protocol was also optimal for spin filter-aided digestions as compared with existing methods. PMID:23792921

  2. SMAP Level 4 Surface and Root Zone Soil Moisture

    NASA Technical Reports Server (NTRS)

    Reichle, R.; De Lannoy, G.; Liu, Q.; Ardizzone, J.; Kimball, J.; Koster, R.

    2017-01-01

    The SMAP Level 4 soil moisture (L4_SM) product provides global estimates of surface and root zone soil moisture, along with other land surface variables and their error estimates. These estimates are obtained through assimilation of SMAP brightness temperature observations into the Goddard Earth Observing System (GEOS-5) land surface model. The L4_SM product is provided at 9 km spatial and 3-hourly temporal resolution and with about 2.5 day latency. The soil moisture and temperature estimates in the L4_SM product are validated against in situ observations. The L4_SM product meets the required target uncertainty of 0.04 m³ m⁻³, measured in terms of unbiased root-mean-square-error, for both surface and root zone soil moisture.
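
    Unbiased root-mean-square error (ubRMSE) is the RMSE computed after removing the mean bias between estimates and observations. A minimal sketch with invented numbers:

```python
import numpy as np

def ubrmse(est, obs):
    """RMSE computed after removing the mean bias (unbiased RMSE)."""
    est, obs = np.asarray(est, float), np.asarray(obs, float)
    bias = np.mean(est - obs)
    return np.sqrt(np.mean((est - obs - bias) ** 2))

obs = np.array([0.10, 0.20, 0.30, 0.25])   # e.g., in situ soil moisture (m³ m⁻³)
est = obs + 0.02                           # retrievals with a constant bias
err = ubrmse(est, obs)                     # the constant bias is removed, so err ~ 0
```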

  3. The Maximum Likelihood Solution for Inclination-only Data

    NASA Astrophysics Data System (ADS)

    Arason, P.; Levi, S.

    2006-12-01

    The arithmetic means of inclination-only data are known to introduce a shallowing bias. Several methods have been proposed to estimate unbiased means of the inclination along with measures of the precision. Most of the inclination-only methods were designed to maximize the likelihood function of the marginal Fisher distribution. However, the exact analytical form of the maximum likelihood function is fairly complicated, and all these methods require various assumptions and approximations that are inappropriate for many data sets. For some steep and dispersed data sets, the estimates provided by these methods are significantly displaced from the peak of the likelihood function to systematically shallower inclinations. The problem in locating the maximum of the likelihood function is partly due to difficulties in accurately evaluating the function for all values of interest. This is because some elements of the log-likelihood function increase exponentially as precision parameters increase, leading to numerical instabilities. In this study we succeeded in analytically cancelling exponential elements from the likelihood function, and we are now able to calculate its value for any location in the parameter space and for any inclination-only data set, with full accuracy. Furthermore, we can now calculate the partial derivatives of the likelihood function with the desired accuracy. Locating the maximum likelihood without the assumptions required by previous methods is now straightforward. The information to separate the mean inclination from the precision parameter will be lost for very steep and dispersed data sets. It is worth noting that the likelihood function always has a maximum value. However, for some dispersed and steep data sets with few samples, the likelihood function takes its highest value on the boundary of the parameter space, i.e. at inclinations of +/- 90 degrees, but with relatively well defined dispersion. 
Our simulations indicate that this occurs quite frequently for certain data sets, and relatively small perturbations in the data will drive the maxima to the boundary. We interpret this to indicate that, for such data sets, the information needed to separate the mean inclination and the precision parameter is permanently lost. To assess the reliability and accuracy of our method we generated a large number of random Fisher-distributed data sets and used seven methods to estimate the mean inclination and precision parameter. These comparisons are described by Levi and Arason at the 2006 AGU Fall meeting. The results of the various methods are very favourable to our new robust maximum likelihood method, which, on average, is the most reliable, and the mean inclination estimates are the least biased toward shallow values. Further information on our inclination-only analysis can be obtained from: http://www.vedur.is/~arason/paleomag
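
    The numerical instability described above, where exponential terms overflow before they can cancel, is the standard motivation for factoring the largest exponent out of a log-likelihood analytically. A generic illustration of the idea (not the authors' marginal Fisher likelihood):

```python
import numpy as np

def naive_log_sum_exp(x):
    # Overflows: np.exp(1000.0) is inf in double precision
    return np.log(np.sum(np.exp(x)))

def stable_log_sum_exp(x):
    m = np.max(x)                             # factor the largest exponent out...
    return m + np.log(np.sum(np.exp(x - m)))  # ...so every remaining term is <= 1

# Terms like these arise when a likelihood contains exp(kappa * ...) factors
# with a large precision parameter kappa
x = np.array([1000.0, 1000.5, 999.0])
value = stable_log_sum_exp(x)                 # finite and accurate
```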

  4. Development of a Multicomponent Prediction Model for Acute Esophagitis in Lung Cancer Patients Receiving Chemoradiotherapy

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    De Ruyck, Kim, E-mail: kim.deruyck@UGent.be; Sabbe, Nick; Oberije, Cary

    2011-10-01

    Purpose: To construct a model for the prediction of acute esophagitis in lung cancer patients receiving chemoradiotherapy by combining clinical data, treatment parameters, and genotyping profile. Patients and Methods: Data were available for 273 lung cancer patients treated with curative chemoradiotherapy. Clinical data included gender, age, World Health Organization performance score, nicotine use, diabetes, chronic disease, tumor type, tumor stage, lymph node stage, tumor location, and medical center. Treatment parameters included chemotherapy, surgery, radiotherapy technique, tumor dose, mean fractionation size, mean and maximal esophageal dose, and overall treatment time. A total of 332 genetic polymorphisms were considered in 112 candidate genes. The predicting model was achieved by lasso logistic regression for predictor selection, followed by classic logistic regression for unbiased estimation of the coefficients. Performance of the model was expressed as the area under the curve of the receiver operating characteristic and as the false-negative rate in the optimal point on the receiver operating characteristic curve. Results: A total of 110 patients (40%) developed acute esophagitis Grade ≥2 (Common Terminology Criteria for Adverse Events v3.0). The final model contained chemotherapy treatment, lymph node stage, mean esophageal dose, gender, overall treatment time, radiotherapy technique, rs2302535 (EGFR), rs16930129 (ENG), rs1131877 (TRAF3), and rs2230528 (ITGB2). The area under the curve was 0.87, and the false-negative rate was 16%. Conclusion: Prediction of acute esophagitis can be improved by combining clinical, treatment, and genetic factors. A multicomponent prediction model for acute esophagitis with a sensitivity of 84% was constructed with two clinical parameters, four treatment parameters, and four genetic polymorphisms.
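
    The two-stage scheme, L1-penalized selection followed by a (nearly) unpenalized refit to undo shrinkage, can be sketched with scikit-learn on synthetic data. Dimensions and coefficients below are invented; the study's actual clinical covariates and 332 polymorphisms are not reproduced:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n, p = 273, 30                          # 273 patients, 30 hypothetical predictors
X = rng.standard_normal((n, p))
logit = 1.2 * X[:, 0] - 0.8 * X[:, 1]   # only two predictors truly matter
y = rng.random(n) < 1.0 / (1.0 + np.exp(-logit))

# Stage 1: lasso (L1-penalized) logistic regression for predictor selection
lasso = LogisticRegression(penalty="l1", solver="liblinear", C=0.5).fit(X, y)
selected = np.flatnonzero(lasso.coef_[0])

# Stage 2: refit an effectively unpenalized logistic regression on the
# selected predictors (huge C) to reduce the lasso's shrinkage bias
refit = LogisticRegression(C=1e6, max_iter=2000).fit(X[:, selected], y)
```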

  5. Variation in cassava germplasm for tolerance to post-harvest physiological deterioration.

    PubMed

    Venturini, M T; Santos, L R; Vildoso, C I A; Santos, V S; Oliveira, E J

    2016-05-06

    Tolerant varieties can effectively control post-harvest physiological deterioration (PPD) of cassava, although knowledge on the genetic variability and inheritance of this trait is needed. The objective of this study was to estimate genetic parameters and identify sources of tolerance to PPD and their stability in cassava accessions. Roots from 418 cassava accessions, grown in four independent experiments, were evaluated for PPD tolerance 0, 2, 5, and 10 days post-harvest. Data were transformed into area under the PPD-progress curve (AUP-PPD) to quantify tolerance. Genetic parameters, stability (Si), adaptability (Ai), and the joint analysis of stability and adaptability (Zi) were obtained via residual maximum likelihood (REML) and best linear unbiased prediction (BLUP) methods. Variance in the genotype (G) x environment (E) interaction and genotypic variance were important for PPD tolerance. Individual broad-sense heritability (hg(2) = 0.38 ± 0.04) and average heritability in accessions (hmg(2) = 0.52) showed high genetic control of PPD tolerance. Genotypic correlation of AUP-PPD in different experiments was of medium magnitude (ȓgA = 0.42), indicating significant G x E interaction. The predicted genotypic values free of G x E interaction (û + ĝi) showed high variation. Of the 30 accessions with high Zi, 19 were common to û + ĝi, Si, and Ai parameters. The genetic gain with selection of these 19 cassava accessions was -55.94, -466.86, -397.72, and -444.03% for û + ĝi, Si, Ai, and Zi, respectively, compared with the overall mean for each parameter. These results demonstrate the variability and potential of cassava germplasm to introduce PPD tolerance in commercial varieties.

  6. Fast State-Space Methods for Inferring Dendritic Synaptic Connectivity

    DTIC Science & Technology

    2013-08-08

    the results of 100 simulations with the same parameters as in Figures 4 and 5. As expected, the LARS/LARS+ results are (downward) biased and have low...with a strength slightly biased toward lower values. To measure the variability of the results across the 20 simulations , we computed for each...are downward biased and have low variance, and the OLS results are unbiased but have high variance. Note that for LARS+ the values above the median are

  7. Efficacy of calf:cow ratios for estimating calf production of arctic caribou

    USGS Publications Warehouse

    Cameron, R.D.; Griffith, B.; Parrett, L.S.; White, R.G.

    2013-01-01

    Caribou (Rangifer tarandus granti) calf:cow ratios (CCR) computed from composition counts obtained on arctic calving grounds are biased estimators of net calf production (NCP, the product of parturition rate and early calf survival) for sexually-mature females. Sexually-immature 2-year-old females, which are indistinguishable from sexually-mature females without calves, are included in the denominator, thereby biasing the calculated ratio low. This underestimate increases with the proportion of 2-year-old females in the population. We estimated the magnitude of this error with deterministic simulations under three scenarios of calf and yearling annual survival (respectively: low, 60 and 70%; medium, 70 and 80%; high, 80 and 90%) for five levels of unbiased NCP: 20, 40, 60, 80, and 100%. We assumed a survival rate of 90% for both 2-year-old and mature females. For each NCP, we computed numbers of 2-year-old females surviving annually and increased the denominator of CCR accordingly. We then calculated a series of hypothetical “observed” CCRs, which stabilized during the last 6 years of the simulations, and documented the degree to which each 6-year mean CCR differed from the corresponding NCP. For the three calf and yearling survival scenarios, proportional underestimates of NCP by CCR ranged 0.046–0.156, 0.058–0.187, and 0.071–0.216, respectively. Unfortunately, because parturition and survival rates are typically variable (i.e., age distribution is unstable), the magnitude of the error is not predictable without substantial supporting information. We recommend maintaining a sufficient sample of known-age radiocollared females in each herd and implementing a regular relocation schedule during the calving period to obtain unbiased estimates of both parturition rate and NCP.
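
    The direction and size of this bias are easy to see deterministically: counting sexually-immature 2-year-old females in the denominator scales the ratio down by 1/(1 + f), where f is their abundance relative to mature females. A sketch with invented numbers:

```python
def observed_ccr(ncp, f_two_year_olds):
    """Calf:cow ratio when sexually-immature 2-year-old females, which carry
    no calves, are counted in the denominator (deterministic sketch)."""
    mature = 1.0                          # mature females, normalized to 1
    calves = ncp * mature                 # true net calf production
    return calves / (mature + f_two_year_olds)

ncp = 0.60                                # true NCP of 60%
ccr = observed_ccr(ncp, f_two_year_olds=0.10)
prop_underestimate = (ncp - ccr) / ncp    # equals f / (1 + f), independent of NCP
```

    With f = 0.10 the proportional underestimate is 0.10/1.10 ≈ 0.091, independent of the true NCP, consistent in magnitude with the ranges quoted above.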

  8. The cosmic transparency measured with Type Ia supernovae: implications for intergalactic dust

    NASA Astrophysics Data System (ADS)

    Goobar, Ariel; Dhawan, Suhail; Scolnic, Daniel

    2018-04-01

    Observations of high-redshift Type Ia supernovae (SNe Ia) are used to study the cosmic transparency at optical wavelengths. Assuming a flat ΛCDM cosmological model based on BAO and CMB results, redshift dependent deviations of SN Ia distances are used to constrain mechanisms that would dim light. The analysis is based on the most recent Pantheon SN compilation, for which there is a 0.03 ± 0.01 (stat) mag discrepancy in the distant supernova distance moduli relative to the ΛCDM model anchored by supernovae at z < 0.05. While there are known systematic uncertainties that combined could explain the observed offset, here we entertain the possibility that the discrepancy may instead be explained by scattering of supernova light in the intergalactic medium (IGM). We focus on two effects: Compton scattering by free electrons and extinction by dust in the IGM. We find that if the discrepancy is due entirely to dimming by dust, the measurements can be modeled with a cosmic dust density Ω_IGM^dust = 8 × 10^{-5} (1+z)^{-1}, corresponding to an average attenuation of 2 × 10-5 mag Mpc-1 in V band. Forthcoming SN Ia studies may provide a definitive measurement of the IGM dust properties, while still providing an unbiased estimate of cosmological parameters by introducing additional parameters in the global fits to the observations.

  9. Spectroscopic analysis of Cepheid variables with 2D radiation-hydrodynamic simulations

    NASA Astrophysics Data System (ADS)

    Vasilyev, Valeriy

    2018-06-01

    The analysis of the chemical enrichment history of dwarf galaxies allows one to derive constraints on their formation and evolution. In this context, Cepheids play a very important role, as these periodically variable stars provide a means to obtain accurate distances. Besides, the chemical composition of Cepheids can provide a strong constraint on the chemical evolution of the system. Standard spectroscopic analysis of Cepheids is based on using one-dimensional (1D) hydrostatic model atmospheres, with convection parametrised using the mixing-length theory. However, this quasi-static approach has not been theoretically validated. In my talk, I will discuss the validity of the quasi-static approximation in spectroscopy of short-period Cepheids. I will show the results obtained using a 2D time-dependent envelope model of a pulsating star computed with the radiation-hydrodynamics code CO5BOLD. I will then describe the impact of new models on the spectroscopic diagnostic of the effective temperature, surface gravity, microturbulent velocity, and metallicity. One of the interesting findings of my work is that 1D model atmospheres provide unbiased estimates of stellar parameters and abundances of Cepheid variables for certain phases of their pulsations. Convective inhomogeneities, however, also introduce biases. I will then discuss how these results can be used in a wider parameter space of pulsating stars and present an outlook for the future studies.

  10. The cosmic transparency measured with Type Ia supernovae: implications for intergalactic dust

    NASA Astrophysics Data System (ADS)

    Goobar, Ariel; Dhawan, Suhail; Scolnic, Daniel

    2018-06-01

    Observations of high-redshift Type Ia supernovae (SNe Ia) are used to study the cosmic transparency at optical wavelengths. Assuming a flat Λ cold dark matter (ΛCDM) cosmological model based on baryon acoustic oscillations and cosmic microwave background measurements, redshift dependent deviations of SN Ia distances are used to constrain mechanisms that would dim light. The analysis is based on the most recent Pantheon SN compilation, for which there is a 0.03 ± 0.01 (stat) mag discrepancy in the distant supernova distance moduli relative to the ΛCDM model anchored by supernovae at z < 0.05. While there are known systematic uncertainties that combined could explain the observed offset, here we entertain the possibility that the discrepancy may instead be explained by scattering of supernova light in the intergalactic medium (IGM). We focus on two effects: Compton scattering by free electrons and extinction by dust in the IGM. We find that if the discrepancy is entirely due to dimming by dust, the measurements can be modelled with a cosmic dust density Ω_IGM^dust = 8 × 10^{-5} (1+z)^{-1}, corresponding to an average attenuation of 2 × 10-5 mag Mpc-1 in V band. Forthcoming SN Ia studies may provide a definitive measurement of the IGM dust properties, while still providing an unbiased estimate of cosmological parameters by introducing additional parameters in the global fits to the observations.

  11. Depth-Duration Frequency of Precipitation for Oklahoma

    USGS Publications Warehouse

    Tortorelli, Robert L.; Rea, Alan; Asquith, William H.

    1999-01-01

    A regional frequency analysis was conducted to estimate the depth-duration frequency of precipitation for 12 durations in Oklahoma (15, 30, and 60 minutes; 1, 2, 3, 6, 12, and 24 hours; and 1, 3, and 7 days). Seven selected frequencies, expressed as recurrence intervals, were investigated (2, 5, 10, 25, 50, 100, and 500 years). L-moment statistics were used to summarize depth-duration data and to determine the appropriate statistical distributions. Three different rain-gage networks provided the data (15-minute, 1-hour, and 1-day). The 60-minute and 1-hour durations, and the 24-hour and 1-day durations, were analyzed separately. Data were used from rain-gage stations with at least 10 years of record and within Oklahoma or about 50 kilometers into bordering states. Precipitation annual maxima (depths) were determined from the data for 110 15-minute, 141 hourly, and 413 daily stations. The L-moment statistics for depths for all durations were calculated for each station using unbiased L-moment estimators for the mean, L-scale, L-coefficient of variation, L-skew, and L-kurtosis. The relation between L-skew and L-kurtosis (L-moment ratio diagram) and goodness-of-fit measures were used to select the frequency distributions. The three-parameter generalized logistic distribution was selected to model the frequencies of 15-, 30-, and 60-minute annual maxima; and the three-parameter generalized extreme-value distribution was selected to model the frequencies of 1-hour to 7-day annual maxima. The mean for each station and duration was corrected for the bias associated with fixed interval recording of precipitation amounts. The L-scale and spatially averaged L-skew statistics were used to compute the location, scale, and shape parameters of the selected distribution for each station and duration. The three parameters were used to calculate the depth-duration-frequency relations for each station. 
The precipitation depths for selected frequencies were contoured from weighted depth surfaces to produce maps from which the precipitation depth-duration-frequency curve for selected storm durations can be determined for any site in Oklahoma.
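
    The unbiased sample L-moment statistics computed for each station can be written in terms of probability-weighted moments. A compact sketch of the standard estimators (not the survey's production code):

```python
import numpy as np
from math import comb

def sample_l_moments(data):
    """Unbiased sample L-moments via probability-weighted moments:
    returns (mean, L-scale, L-skew, L-kurtosis)."""
    x = np.sort(np.asarray(data, dtype=float))
    n = len(x)
    b = [x.mean()]                              # b_0
    for r in (1, 2, 3):                         # unbiased b_1, b_2, b_3
        w = np.array([comb(i, r) / comb(n - 1, r) for i in range(n)])
        b.append(np.sum(w * x) / n)
    l1 = b[0]
    l2 = 2 * b[1] - b[0]
    l3 = 6 * b[2] - 6 * b[1] + b[0]
    l4 = 20 * b[3] - 30 * b[2] + 12 * b[1] - b[0]
    return l1, l2, l3 / l2, l4 / l2

l1, l2, t3, t4 = sample_l_moments([1, 2, 3, 4, 5])  # symmetric sample: t3, t4 ~ 0
```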

  12. Comparing population size estimators for plethodontid salamanders

    USGS Publications Warehouse

    Bailey, L.L.; Simons, T.R.; Pollock, K.H.

    2004-01-01

    Despite concern over amphibian declines, few studies estimate absolute abundances because of logistic and economic constraints and previously poor estimator performance. Two estimation approaches recommended for amphibian studies are mark-recapture and depletion (or removal) sampling. We compared abundance estimation via various mark-recapture and depletion methods, using data from a three-year study of terrestrial salamanders in Great Smoky Mountains National Park. Our results indicate that short-term closed-population, robust design, and depletion methods estimate the surface population of salamanders (i.e., those near the surface and available for capture during a given sampling occasion). In longer duration studies, temporary emigration violates assumptions of both open- and closed-population mark-recapture estimation models. However, if the temporary emigration is completely random, these models should yield unbiased estimates of the total population (superpopulation) of salamanders in the sampled area. We recommend using Pollock's robust design in mark-recapture studies because of its flexibility to incorporate variation in capture probabilities and to estimate temporary emigration probabilities.
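
    As a simpler relative of the closed-population estimators compared here, Chapman's modification of the Lincoln-Petersen estimator shows how a small correction yields a nearly unbiased mark-recapture abundance estimate. The counts are invented; the study itself relied on likelihood-based models such as the robust design:

```python
def chapman_estimate(n1, n2, m2):
    """Chapman's nearly unbiased modification of the Lincoln-Petersen
    two-occasion abundance estimator (illustrative helper)."""
    return (n1 + 1) * (n2 + 1) / (m2 + 1) - 1

# 50 salamanders marked on occasion 1, 60 captured on occasion 2, 20 recaptures
N_hat = chapman_estimate(50, 60, 20)
```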

  13. Unbiased Rare Event Sampling in Spatial Stochastic Systems Biology Models Using a Weighted Ensemble of Trajectories

    PubMed Central

    Donovan, Rory M.; Tapia, Jose-Juan; Sullivan, Devin P.; Faeder, James R.; Murphy, Robert F.; Dittrich, Markus; Zuckerman, Daniel M.

    2016-01-01

    The long-term goal of connecting scales in biological simulation can be facilitated by scale-agnostic methods. We demonstrate that the weighted ensemble (WE) strategy, initially developed for molecular simulations, applies effectively to spatially resolved cell-scale simulations. The WE approach runs an ensemble of parallel trajectories with assigned weights and uses a statistical resampling strategy of replicating and pruning trajectories to focus computational effort on difficult-to-sample regions. The method can also generate unbiased estimates of non-equilibrium and equilibrium observables, sometimes with significantly less aggregate computing time than would be possible using standard parallelization. Here, we use WE to orchestrate particle-based kinetic Monte Carlo simulations, which include spatial geometry (e.g., of organelles, plasma membrane) and biochemical interactions among mobile molecular species. We study a series of models exhibiting spatial, temporal and biochemical complexity and show that although WE has important limitations, it can achieve performance significantly exceeding standard parallel simulation—by orders of magnitude for some observables. PMID:26845334
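
    The replicate-and-prune bookkeeping at the core of WE can be sketched for a single bin. Weights are conserved exactly, and a merge keeps one trajectory of the lightest pair with probability proportional to its weight, which preserves unbiasedness. This is a simplified sketch, not the implementation used in the paper:

```python
import random

def resample_bin(walkers, target=4):
    """One weighted-ensemble split/merge pass within a single bin.
    `walkers` is a list of (weight, state) pairs; total weight is conserved."""
    walkers = sorted(walkers)                       # lightest first
    # Merge: combine the two lightest walkers until at most `target` remain;
    # keep one state of the pair with probability proportional to its weight.
    while len(walkers) > target:
        (w1, s1), (w2, s2) = walkers[0], walkers[1]
        keep = s1 if random.random() < w1 / (w1 + w2) else s2
        walkers = sorted([(w1 + w2, keep)] + walkers[2:])
    # Split: replicate the heaviest walker, halving its weight, until `target`.
    while len(walkers) < target:
        w, s = walkers.pop()                        # heaviest
        walkers = sorted(walkers + [(w / 2, s), (w / 2, s)])
    return walkers

new = resample_bin([(0.5, "a"), (0.3, "b"), (0.1, "c"), (0.05, "d"), (0.05, "e")])
```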

  14. Benchmarking methods and data sets for ligand enrichment assessment in virtual screening.

    PubMed

    Xia, Jie; Tilahun, Ermias Lemma; Reid, Terry-Elinor; Zhang, Liangren; Wang, Xiang Simon

    2015-01-01

    Retrospective small-scale virtual screening (VS) based on benchmarking data sets has been widely used to estimate ligand enrichments of VS approaches in prospective (i.e., real-world) efforts. However, intrinsic differences between benchmarking sets and real screening chemical libraries can cause biased assessment. Herein, we summarize the history of benchmarking methods and data sets and highlight three main types of biases found in benchmarking sets, i.e. "analogue bias", "artificial enrichment" and "false negative". In addition, we introduce our recent algorithm to build maximum-unbiased benchmarking sets applicable to both ligand-based and structure-based VS approaches, and its implementations for three important human histone deacetylase (HDAC) isoforms, i.e. HDAC1, HDAC6 and HDAC8. The leave-one-out cross-validation (LOO CV) demonstrates that the benchmarking sets built by our algorithm are maximum-unbiased as measured by property matching, ROC curves and AUCs. Copyright © 2014 Elsevier Inc. All rights reserved.
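
    The AUC metric used in such benchmarking can be computed directly from the scores that a VS method assigns to actives and decoys, via the Mann-Whitney interpretation of the ROC area. A minimal sketch (the scoring function itself is whatever the VS method provides):

```python
def roc_auc(active_scores, decoy_scores):
    """ROC AUC via the Mann-Whitney U statistic: the probability that a
    randomly chosen active outscores a randomly chosen decoy, with ties
    counting one half.  O(n*m), which is fine for small benchmark sets."""
    wins = 0.0
    for a in active_scores:
        for d in decoy_scores:
            if a > d:
                wins += 1.0
            elif a == d:
                wins += 0.5
    return wins / (len(active_scores) * len(decoy_scores))
```

    Perfect separation of actives from decoys yields 1.0; identical score distributions yield 0.5 (no enrichment).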

  15. Benchmarking Methods and Data Sets for Ligand Enrichment Assessment in Virtual Screening

    PubMed Central

    Xia, Jie; Tilahun, Ermias Lemma; Reid, Terry-Elinor; Zhang, Liangren; Wang, Xiang Simon

    2014-01-01

    Retrospective small-scale virtual screening (VS) based on benchmarking data sets has been widely used to estimate ligand enrichments of VS approaches in prospective (i.e., real-world) efforts. However, intrinsic differences between benchmarking sets and real screening chemical libraries can cause biased assessment. Herein, we summarize the history of benchmarking methods and data sets and highlight three main types of biases found in benchmarking sets, i.e. “analogue bias”, “artificial enrichment” and “false negative”. In addition, we introduce our recent algorithm to build maximum-unbiased benchmarking sets applicable to both ligand-based and structure-based VS approaches, and its implementations for three important human histone deacetylase (HDAC) isoforms, i.e. HDAC1, HDAC6 and HDAC8. The Leave-One-Out Cross-Validation (LOO CV) demonstrates that the benchmarking sets built by our algorithm are maximum-unbiased in terms of property matching, ROC curves and AUCs. PMID:25481478

  16. A Data Analytics Approach to Discovering Unique Microstructural Configurations Susceptible to Fatigue

    NASA Astrophysics Data System (ADS)

    Jha, S. K.; Brockman, R. A.; Hoffman, R. M.; Sinha, V.; Pilchak, A. L.; Porter, W. J.; Buchanan, D. J.; Larsen, J. M.; John, R.

    2018-05-01

    Principal component analysis and fuzzy c-means clustering algorithms were applied to slip-induced strain and geometric metric data in an attempt to discover unique microstructural configurations and their frequencies of occurrence in statistically representative instantiations of a titanium alloy microstructure. Grain-averaged fatigue indicator parameters were calculated for the same instantiation. The fatigue indicator parameters strongly correlated with the spatial location of the microstructural configurations in the principal components space. The fuzzy c-means clustering method identified clusters of data that varied in terms of their average fatigue indicator parameters. Furthermore, the number of points in each cluster was inversely correlated to the average fatigue indicator parameter. This analysis demonstrates that data-driven methods have significant potential for providing unbiased determination of unique microstructural configurations and their frequencies of occurrence in a given volume from the point of view of strain localization and fatigue crack initiation.

  17. Methods for the quantification of coarse woody debris and an examination of its spatial patterning: A study from the Tenderfoot Creek Experimental Forest, MT

    Treesearch

    Paul B. Alaback; Duncan C. Lutes

    1997-01-01

    Methods for the quantification of coarse woody debris volume and the description of spatial patterning were studied in the Tenderfoot Creek Experimental Forest, Montana. The line transect method was found to be an accurate, unbiased estimator of down debris volume (>10 cm diameter) on 1/4 hectare fixed-area plots, when perpendicular lines were used. The Fischer...
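
    A minimal sketch of the line-intersect calculation, using Van Wagner's standard formula; the study's exact field protocol (e.g., perpendicular line placement, plot size) is not reproduced here.

```python
import math

def line_intersect_volume(diameters_cm, transect_length_m):
    """Van Wagner's line-intersect estimator of down woody debris volume.

    diameters_cm: diameters (cm) of pieces where they cross the transect
    line; transect_length_m: total transect length (m).  Returns volume
    in m^3 per hectare (the factor pi^2/8 is approximately 1.234)."""
    return math.pi ** 2 / (8.0 * transect_length_m) * sum(d * d for d in diameters_cm)
```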

  18. Equilibrium Free Energies from Nonequilibrium Metadynamics

    NASA Astrophysics Data System (ADS)

    Bussi, Giovanni; Laio, Alessandro; Parrinello, Michele

    2006-03-01

    In this Letter we propose a new formalism to map history-dependent metadynamics onto a Markovian process. We apply this formalism to model Langevin dynamics and determine the equilibrium distribution of a collection of simulations. We demonstrate that the reconstructed free energy is an unbiased estimate of the underlying free energy and analytically derive an expression for the error. The present results can be applied to other history-dependent stochastic processes, such as Wang-Landau sampling.

  19. Online sequential Monte Carlo smoother for partially observed diffusion processes

    NASA Astrophysics Data System (ADS)

    Gloaguen, Pierre; Étienne, Marie-Pierre; Le Corff, Sylvain

    2018-12-01

    This paper introduces a new algorithm to approximate smoothed additive functionals of partially observed diffusion processes. This method relies on a new sequential Monte Carlo method which allows such approximations to be computed online, i.e., as the observations are received, and with a computational complexity growing linearly with the number of Monte Carlo samples. The original algorithm cannot be used in the case of partially observed stochastic differential equations since the transition density of the latent data is usually unknown. We prove that it may be extended to partially observed continuous processes by replacing this unknown quantity with an unbiased estimator obtained for instance using general Poisson estimators. This estimator is proved to be consistent and its performance is illustrated using data from two models.

  20. Combating Unmeasured Confounding in Cross-Sectional Studies: Evaluating Instrumental-Variable and Heckman Selection Models

    PubMed Central

    DeMaris, Alfred

    2014-01-01

    Unmeasured confounding is the principal threat to unbiased estimation of treatment “effects” (i.e., regression parameters for binary regressors) in nonexperimental research. It refers to unmeasured characteristics of individuals that lead them both to be in a particular “treatment” category and to register higher or lower values than others on a response variable. In this article, I introduce readers to 2 econometric techniques designed to control the problem, with a particular emphasis on the Heckman selection model (HSM). Both techniques can be used with only cross-sectional data. Using a Monte Carlo experiment, I compare the performance of instrumental-variable regression (IVR) and HSM to that of ordinary least squares (OLS) under conditions with treatment and unmeasured confounding both present and absent. I find HSM generally to outperform IVR with respect to mean-square-error of treatment estimates, as well as power for detecting either a treatment effect or unobserved confounding. However, both HSM and IVR require a large sample to be fully effective. The use of HSM and IVR in tandem with OLS to untangle unobserved confounding bias in cross-sectional data is further demonstrated with an empirical application. Using data from the 2006–2010 General Social Survey (National Opinion Research Center, 2014), I examine the association between being married and subjective well-being. PMID:25110904
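
    A toy Monte Carlo run shows the bias that OLS suffers under unmeasured confounding. All names and parameter values here are illustrative, and the article's HSM and IVR corrections are not implemented; the point is only that a confounder raising both treatment and outcome pulls the naive slope away from the true effect.

```python
import random

def ols_slope(x, y):
    """Simple-regression OLS slope: cov(x, y) / var(x)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def simulate_confounded(n=20000, seed=1):
    """Draw (treatment, outcome) pairs sharing an unmeasured confounder u.

    The true treatment effect is 1.0, but because u raises both the
    treatment and the outcome, naive OLS is biased upward, toward
    1.0 + cov(t, u)/var(t) = 1.5 in this setup."""
    rng = random.Random(seed)
    t, y = [], []
    for _ in range(n):
        u = rng.gauss(0.0, 1.0)                  # unmeasured confounder
        ti = u + rng.gauss(0.0, 1.0)             # treatment depends on u
        yi = 1.0 * ti + u + rng.gauss(0.0, 1.0)  # outcome depends on t and u
        t.append(ti)
        y.append(yi)
    return t, y
```

    With the seed fixed, the fitted slope lands near 1.5 rather than the true 1.0.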

  1. Elasticity Imaging of Polymeric Media

    PubMed Central

    Sridhar, Mallika; Liu, Jie; Insana, Michael F.

    2009-01-01

    Viscoelastic properties of soft tissues and hydropolymers depend on the strength of molecular bonding forces connecting the polymer matrix and surrounding fluids. The basis for diagnostic imaging is that disease processes alter molecular-scale bonding in ways that vary the measurable stiffness and viscosity of the tissues. This paper reviews linear viscoelastic theory as applied to gelatin hydrogels for the purpose of formulating approaches to molecular-scale interpretation of elasticity imaging in soft biological tissues. Comparing measurements acquired under different geometries, we investigate the limitations of viscoelastic parameters acquired under various imaging conditions. Quasistatic (step-and-hold and low-frequency harmonic) stimuli applied to gels during creep and stress relaxation experiments in confined and unconfined geometries reveal continuous, bimodal distributions of respondance times. Within the linear range of responses, gelatin will behave more like a solid or fluid depending on the stimulus magnitude. Gelatin can be described statistically from a few parameters of low-order rheological models that form the basis of viscoelastic imaging. Unbiased estimates of imaging parameters are obtained only if creep data are acquired for greater than twice the highest retardance time constant and any steady-state viscous response has been eliminated. Elastic strain and retardance time images are found to provide the best combination of contrast and signal strength in gelatin. Retardance times indicate average behavior of fast (1–10 s) fluid flows and slow (50–400 s) matrix restructuring in response to the mechanical stimulus. Insofar as gelatin mimics other polymers, such as soft biological tissues, elasticity imaging can provide unique insights into complex structural and biochemical features of connective tissues affected by disease. PMID:17408331

  2. Modelling the maximum voluntary joint torque/angular velocity relationship in human movement.

    PubMed

    Yeadon, Maurice R; King, Mark A; Wilson, Cassie

    2006-01-01

    The force exerted by a muscle is a function of the activation level and the maximum (tetanic) muscle force. In "maximum" voluntary knee extensions muscle activation is lower for eccentric muscle velocities than for concentric velocities. The aim of this study was to model this "differential activation" in order to calculate the maximum voluntary knee extensor torque as a function of knee angular velocity. Torque data were collected on two subjects during maximal eccentric-concentric knee extensions using an isovelocity dynamometer with crank angular velocities ranging from 50 to 450° s⁻¹. The theoretical tetanic torque/angular velocity relationship was modelled using a four-parameter function comprising two rectangular hyperbolas, while the activation/angular velocity relationship was modelled using a three-parameter function that rose from submaximal activation for eccentric velocities to full activation for high concentric velocities. The product of these two functions gave a seven-parameter function which was fitted to the joint torque/angular velocity data, giving unbiased root mean square differences of 1.9% and 3.3% of the maximum torques achieved. Differential activation accounts for the non-hyperbolic behaviour of the torque/angular velocity data for low concentric velocities. The maximum voluntary knee extensor torque that can be exerted may be modelled accurately as the product of functions defining the maximum torque and the maximum voluntary activation level. Failure to include differential activation considerations when modelling maximal movements will lead to errors in the estimation of joint torque in the eccentric phase and low velocity concentric phase.

  3. Estimation of sex-specific survival from capture-recapture data when sex is not always known

    USGS Publications Warehouse

    Nichols, J.D.; Kendall, W.L.; Hines, J.E.; Spendelow, J.A.

    2004-01-01

    Many animals lack obvious sexual dimorphism, making assignment of sex difficult even for observed or captured animals. For many such species it is possible to assign sex with certainty only at some occasions; for example, when they exhibit certain types of behavior. A common approach to handling this situation in capture-recapture studies has been to group capture histories into those of animals eventually identified as male and female and those for which sex was never known. Because group membership is dependent on the number of occasions at which an animal was caught or observed (known sex animals, on average, will have been observed at more occasions than unknown-sex animals), survival estimates for known-sex animals will be positively biased, and those for unknown animals will be negatively biased. In this paper, we develop capture-recapture models that incorporate sex ratio and sex assignment parameters that permit unbiased estimation in the face of this sampling problem. We demonstrate the magnitude of bias in the traditional capture-recapture approach to this sampling problem, and we explore properties of estimators from other ad hoc approaches. The model is then applied to capture-recapture data for adult Roseate Terns (Sterna dougallii) at Falkner Island, Connecticut, 1993-2002. Sex ratio among adults in this population favors females, and we tested the hypothesis that this population showed sex-specific differences in adult survival. Evidence was provided for higher survival of adult females than males, as predicted. We recommend use of this modeling approach for future capture-recapture studies in which sex cannot always be assigned to captured or observed animals. We also place this problem in the more general context of uncertainty in state classification in multistate capture-recapture models.

  4. Genomic Model with Correlation Between Additive and Dominance Effects.

    PubMed

    Xiang, Tao; Christensen, Ole Fredslund; Vitezica, Zulma Gladis; Legarra, Andres

    2018-05-09

    Dominance genetic effects are rarely included in pedigree-based genetic evaluation. With the availability of single nucleotide polymorphism markers and the development of genomic evaluation, estimates of dominance genetic effects have become feasible using genomic best linear unbiased prediction (GBLUP). Usually, studies involving additive and dominance genetic effects ignore possible relationships between them. It has often been suggested that the magnitudes of functional additive and dominance effects at the quantitative trait loci are related, but there is no existing GBLUP-like approach accounting for such correlation. Wellmann and Bennewitz showed two ways of considering directional relationships between additive and dominance effects, which they estimated in a Bayesian framework. However, these relationships cannot be fitted at the level of individuals instead of loci in a mixed model and are not compatible with standard animal or plant breeding software. This comes from a fundamental ambiguity in assigning the reference allele at a given locus. We show that, if there has been selection, assigning the most frequent allele as the reference orients the correlation between functional additive and dominance effects. As a consequence, the most frequent reference allele is expected to have a positive value. We also demonstrate that selection creates negative covariance between genotypic additive and dominance genetic values. For parameter estimation, it is possible to use a combined additive and dominance relationship matrix computed from marker genotypes, and to use standard restricted maximum likelihood (REML) algorithms based on an equivalent model. Through a simulation study, we show that such correlations can easily be estimated by mixed model software and that accuracy of prediction for genetic values is slightly improved if such correlations are used in GBLUP. However, a model assuming uncorrelated effects and fitting orthogonal breeding values and dominance deviations performed similarly for prediction. Copyright © 2018, Genetics.

  5. Estimating surface soil moisture from SMAP observations using a Neural Network technique.

    PubMed

    Kolassa, J; Reichle, R H; Liu, Q; Alemohammad, S H; Gentine, P; Aida, K; Asanuma, J; Bircher, S; Caldwell, T; Colliander, A; Cosh, M; Collins, C Holifield; Jackson, T J; Martínez-Fernández, J; McNairn, H; Pacheco, A; Thibeault, M; Walker, J P

    2018-01-01

    A Neural Network (NN) algorithm was developed to estimate global surface soil moisture for April 2015 to March 2017 with a 2-3 day repeat frequency using passive microwave observations from the Soil Moisture Active Passive (SMAP) satellite, surface soil temperatures from the NASA Goddard Earth Observing System Model version 5 (GEOS-5) land modeling system, and Moderate Resolution Imaging Spectroradiometer-based vegetation water content. The NN was trained on GEOS-5 soil moisture target data, making the NN estimates consistent with the GEOS-5 climatology, such that they may ultimately be assimilated into this model without further bias correction. Evaluated against in situ soil moisture measurements, the average unbiased root mean square error (ubRMSE), correlation and anomaly correlation of the NN retrievals were 0.037 m³ m⁻³, 0.70 and 0.66, respectively, against SMAP core validation site measurements and 0.026 m³ m⁻³, 0.58 and 0.48, respectively, against International Soil Moisture Network (ISMN) measurements. At the core validation sites, the NN retrievals have a significantly higher skill than the GEOS-5 model estimates and a slightly lower correlation skill than the SMAP Level-2 Passive (L2P) product. The feasibility of the NN method was reflected by a lower ubRMSE compared to the L2P retrievals as well as a higher skill when ancillary parameters in physically-based retrievals were uncertain. Against ISMN measurements, the skill of the two retrieval products was more comparable. A triple collocation analysis against Advanced Microwave Scanning Radiometer 2 (AMSR2) and Advanced Scatterometer (ASCAT) soil moisture retrievals showed that the NN and L2P retrieval errors have a similar spatial distribution, but the NN retrieval errors are generally lower in densely vegetated regions and transition zones.
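
    The ubRMSE skill metric reported above removes the mean bias between retrieval and reference before computing the RMSE, so a constant offset scores zero. A minimal sketch:

```python
import math

def ubrmse(estimates, reference):
    """Unbiased RMSE: subtract the mean bias between the two series
    before computing the root-mean-square error."""
    n = len(estimates)
    bias = sum(e - r for e, r in zip(estimates, reference)) / n
    return math.sqrt(sum((e - r - bias) ** 2
                         for e, r in zip(estimates, reference)) / n)
```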

  6. Statistical properties of the anomalous scaling exponent estimator based on time-averaged mean-square displacement

    NASA Astrophysics Data System (ADS)

    Sikora, Grzegorz; Teuerle, Marek; Wyłomańska, Agnieszka; Grebenkov, Denis

    2017-08-01

    The most common way of estimating the anomalous scaling exponent from single-particle trajectories consists of a linear fit of the dependence of the time-averaged mean-square displacement on the lag time at the log-log scale. We investigate the statistical properties of this estimator in the case of fractional Brownian motion (FBM). We determine the mean value, the variance, and the distribution of the estimator. Our theoretical results are confirmed by Monte Carlo simulations. In the limit of long trajectories, the estimator is shown to be asymptotically unbiased, consistent, and with vanishing variance. These properties ensure an accurate estimation of the scaling exponent even from a single (long enough) trajectory. As a consequence, we prove that the usual way to estimate the diffusion exponent of FBM is correct from the statistical point of view. Moreover, the knowledge of the estimator distribution is the first step toward new statistical tests of FBM and toward a more reliable interpretation of the experimental histograms of scaling exponents in microbiology.
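
    A minimal sketch of the estimator being analyzed: compute the time-averaged MSD at several lag times and fit a straight line at the log-log scale. A ballistic trajectory, x(t) = t, recovers the exponent 2 exactly; the abstract's statistical analysis of this estimator under FBM is of course not reproduced here.

```python
import math

def tamsd(traj, lag):
    """Time-averaged mean-square displacement of a 1-D trajectory at one lag."""
    n = len(traj)
    return sum((traj[i + lag] - traj[i]) ** 2 for i in range(n - lag)) / (n - lag)

def scaling_exponent(traj, lags):
    """Anomalous scaling exponent: least-squares slope of log(TAMSD)
    versus log(lag), i.e., the log-log linear fit from the abstract."""
    xs = [math.log(l) for l in lags]
    ys = [math.log(tamsd(traj, l)) for l in lags]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    num = sum((x - mx) * (v - my) for x, v in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den
```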

  7. A statistical test of unbiased evolution of body size in birds.

    PubMed

    Bokma, Folmer

    2002-12-01

    Of the approximately 9500 bird species, the vast majority is small-bodied. That is a general feature of evolutionary lineages, also observed for instance in mammals and plants. The avian interspecific body size distribution is right-skewed even on a logarithmic scale. That has previously been interpreted as evidence that body size evolution has been biased. However, a procedure to test for unbiased evolution from the shape of body size distributions was lacking. In the present paper unbiased body size evolution is defined precisely, and a statistical test is developed based on Monte Carlo simulation of unbiased evolution. Application of the test to birds suggests that it is highly unlikely that avian body size evolution has been unbiased as defined. Several possible explanations for this result are discussed. A plausible explanation is that the general model of unbiased evolution assumes that population size and generation time do not affect the evolutionary variability of body size; that is, that micro- and macroevolution are decoupled, which theory suggests is not likely to be the case.

  8. Neighborhood Effects in a Behavioral Randomized Controlled Trial

    PubMed Central

    Pruitt, Sandi L.; Leonard, Tammy; Murdoch, James; Hughes, Amy; McQueen, Amy; Gupta, Samir

    2015-01-01

    Randomized controlled trials (RCTs) of interventions intended to modify health behaviors may be influenced by neighborhood effects which can impede unbiased estimation of intervention effects. Examining a RCT designed to increase colorectal cancer (CRC) screening (N=5,628), we found statistically significant neighborhood effects: average CRC test use among neighboring study participants was significantly and positively associated with individual patient’s CRC test use. This potentially important spatially-varying covariate has not previously been considered in a RCT. Our results suggest that future RCTs of health behavior interventions should assess potential social interactions between participants, which may cause intervention arm contamination and may bias effect size estimation. PMID:25456014

  9. Cramer-Rao Bound for Gaussian Random Processes and Applications to Radar Processing of Atmospheric Signals

    NASA Technical Reports Server (NTRS)

    Frehlich, Rod

    1993-01-01

    Calculations of the exact Cramer-Rao Bound (CRB) for unbiased estimates of the mean frequency, signal power, and spectral width of Doppler radar/lidar signals (a Gaussian random process) are presented. Approximate CRB's are derived using the Discrete Fourier Transform (DFT). These approximate results are equal to the exact CRB when the DFT coefficients are mutually uncorrelated. Previous high SNR limits for CRB's are shown to be inaccurate because the discrete summations cannot be approximated with integration. The performance of an approximate maximum likelihood estimator for mean frequency approaches the exact CRB for moderate signal to noise ratio and moderate spectral width.

  10. Can real time location system technology (RTLS) provide useful estimates of time use by nursing personnel?

    PubMed

    Jones, Terry L; Schlegel, Cara

    2014-02-01

    Accurate, precise, unbiased, reliable, and cost-effective estimates of nursing time use are needed to ensure safe staffing levels. Direct observation of nurses is costly, and conventional surrogate measures have limitations. To test the potential of electronic capture of time and motion through real time location systems (RTLS), a pilot study was conducted to assess efficacy (method agreement) of RTLS time use; inter-rater reliability of RTLS time-use estimates; and associated costs. Method agreement was high (mean absolute difference = 28 seconds); inter-rater reliability was high (ICC = 0.81-0.95; mean absolute difference = 2 seconds); and costs for obtaining RTLS time-use estimates on a single nursing unit exceeded $25,000. Continued experimentation with RTLS to obtain time-use estimates for nursing staff is warranted. © 2013 Wiley Periodicals, Inc.

  11. A robust pseudo-inverse spectral filter applied to the Earth Radiation Budget Experiment (ERBE) scanning channels

    NASA Technical Reports Server (NTRS)

    Avis, L. M.; Green, R. N.; Suttles, J. T.; Gupta, S. K.

    1984-01-01

    Computer simulations of a least squares estimator operating on the ERBE scanning channels are discussed. The estimator is designed to minimize the errors produced by nonideal spectral response to spectrally varying and uncertain radiant input. The three ERBE scanning channels cover a shortwave band, a longwave band, and a "total" band, from which the pseudo-inverse spectral filter estimates the radiance components in the shortwave and longwave bands. The radiance estimator draws on instantaneous field of view (IFOV) scene type information supplied by another algorithm of the ERBE software, and on a priori probabilistic models of the responses of the scanning channels to the IFOV scene types for given Sun-scene-spacecraft geometry. It is found that the pseudo-inverse spectral filter is stable, tolerant of errors in scene identification and in channel response modeling, and, in the absence of such errors, yields minimum-variance and essentially unbiased radiance estimates.

  12. Chasing the peak: optimal statistics for weak shear analyses

    NASA Astrophysics Data System (ADS)

    Smit, Merijn; Kuijken, Konrad

    2018-01-01

    Context. Weak gravitational lensing analyses are fundamentally limited by the intrinsic distribution of galaxy shapes. It is well known that this distribution of galaxy ellipticity is non-Gaussian, and the traditional estimation methods, explicitly or implicitly assuming Gaussianity, are not necessarily optimal. Aims: We aim to explore alternative statistics for samples of ellipticity measurements. An optimal estimator needs to be asymptotically unbiased, efficient, and robust in retaining these properties for various possible sample distributions. We take the non-linear mapping of gravitational shear and the effect of noise into account. We then discuss how the distribution of individual galaxy shapes in the observed field of view can be modeled by fitting Fourier modes to the shear pattern directly. This allows scientific analyses using statistical information of the whole field of view, instead of locally sparse and poorly constrained estimates. Methods: We simulated samples of galaxy ellipticities, using both theoretical distributions and data for ellipticities and noise. We determined the possible bias Δe, the efficiency η and the robustness of the least absolute deviations, the biweight, and the convex hull peeling (CHP) estimators, compared to the canonical weighted mean. Using these statistics for regression, we have shown the applicability of direct Fourier mode fitting. Results: We find an improved performance of all estimators when iteratively reducing the residuals after de-shearing the ellipticity samples by the estimated shear, which removes the asymmetry in the ellipticity distributions. We show that these estimators are then unbiased in the absence of noise, and decrease noise bias by more than 30%. Our results show that the CHP estimator distribution is skewed, but still centered around the underlying shear, and its bias is the least affected by noise. We find the least absolute deviations estimator to be the most efficient estimator in almost all cases, except in the Gaussian case, where it remains competitive (0.83 < η < 5.1) and therefore robust. These results hold when fitting Fourier modes, where amplitudes of variation in ellipticity are determined to the order of 10⁻³. Conclusions: The peak of the ellipticity distribution is a direct tracer of the underlying shear and unaffected by noise, and we have shown that estimators sensitive to a central cusp perform more efficiently, potentially reducing uncertainties and significantly decreasing noise bias. These results become increasingly important as survey sizes increase and systematic issues in shape measurements decrease.
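
    A toy example of the robustness at stake. The article's biweight and CHP estimators are not reproduced; the median stands in as the least-absolute-deviations estimate of location, which a single catastrophic outlier barely moves, unlike the mean.

```python
def mean(xs):
    """Arithmetic mean: the least-squares estimate of location."""
    return sum(xs) / len(xs)

def median(xs):
    """The median minimizes the sum of absolute deviations, so it is the
    least-absolute-deviations (LAD) estimate of location."""
    s = sorted(xs)
    n = len(s)
    mid = n // 2
    return s[mid] if n % 2 else 0.5 * (s[mid - 1] + s[mid])
```

    On a small sample with one outlier at 9.0, the mean is dragged to 1.5 while the median stays near zero.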

  13. TRIPPy: Python-based Trailed Source Photometry

    NASA Astrophysics Data System (ADS)

    Fraser, Wesley C.; Alexandersen, Mike; Schwamb, Megan E.; Marsset, Michael E.; Pike, Rosemary E.; Kavelaars, JJ; Bannister, Michele T.; Benecchi, Susan; Delsanti, Audrey

    2016-05-01

    TRIPPy (TRailed Image Photometry in Python) uses a pill-shaped aperture, a rectangle described by three parameters (trail length, angle, and radius) to improve photometry of moving sources over that done with circular apertures. It can generate accurate model and trailed point-spread functions from stationary background sources in sidereally tracked images. Appropriate aperture correction provides accurate, unbiased flux measurement. TRIPPy requires numpy, scipy, matplotlib, Astropy (ascl:1304.002), and stsci.numdisplay; emcee (ascl:1303.002) and SExtractor (ascl:1010.064) are optional.
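
    The pill geometry itself is simple; a minimal sketch of its area. The angle parameter only orients the aperture on the image, so it does not enter the area; TRIPPy's PSF modeling and aperture corrections are far more involved than this.

```python
import math

def pill_area(trail_length, radius):
    """Area of a pill aperture: a trail_length-by-2*radius rectangle
    capped by two half-circles of the same radius."""
    return trail_length * 2.0 * radius + math.pi * radius ** 2
```

    At zero trail length the pill reduces to a circular aperture.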

  14. Genome-based prediction of test cross performance in two subsequent breeding cycles.

    PubMed

    Hofheinz, Nina; Borchardt, Dietrich; Weissleder, Knuth; Frisch, Matthias

    2012-12-01

    Genome-based prediction of genetic values is expected to overcome shortcomings that limit the application of QTL mapping and marker-assisted selection in plant breeding. Our goal was to study the genome-based prediction of test cross performance with genetic effects that were estimated using genotypes from the preceding breeding cycle. In particular, our objectives were to employ a ridge regression approach that approximates best linear unbiased prediction of genetic effects, compare cross validation with validation using genetic material of the subsequent breeding cycle, and investigate the prospects of genome-based prediction in sugar beet breeding. We focused on the traits sugar content and standard molasses loss (ML) and used a set of 310 sugar beet lines to estimate genetic effects at 384 SNP markers. In cross validation, correlations >0.8 between observed and predicted test cross performance were observed for both traits. However, in validation with 56 lines from the next breeding cycle, a correlation of 0.8 could only be observed for sugar content; for standard ML the correlation dropped to 0.4. We found that ridge regression based on preliminary estimates of the heritability provided a very good approximation of best linear unbiased prediction and was not accompanied by a loss in prediction accuracy. We conclude that prediction accuracy assessed with cross validation within one cycle of a breeding program cannot be used as an indicator for the accuracy of predicting lines of the next cycle. Prediction of lines of the next cycle seems promising for traits with high heritabilities.
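
    The link between ridge regression and best linear unbiased prediction can be sketched in the single-predictor case (centered data, no intercept; everything here is illustrative, not the article's multi-marker model). Choosing the penalty as the error-to-genetic variance ratio reproduces the shrinkage BLUP applies to a random effect.

```python
def ridge_slope(x, y, lam):
    """Single-predictor ridge estimate on centered data (no intercept):
    b = x'y / (x'x + lam).  With lam = sigma_e^2 / sigma_g^2 this is the
    same shrinkage BLUP applies to a random marker effect; lam = 0
    recovers ordinary least squares."""
    xty = sum(a * b for a, b in zip(x, y))
    xtx = sum(a * a for a in x)
    return xty / (xtx + lam)
```

    Increasing the penalty shrinks the estimate smoothly toward zero.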

  15. Estimating hazard ratios in cohort data with missing disease information due to death.

    PubMed

    Binder, Nadine; Herrnböck, Anne-Sophie; Schumacher, Martin

    2017-03-01

    In clinical and epidemiological studies information on the primary outcome of interest, that is, the disease status, is usually collected at a limited number of follow-up visits. The disease status can often only be retrieved retrospectively in individuals who are alive at follow-up, but will be missing for those who died before. Right-censoring the death cases at the last visit (ad-hoc analysis) yields biased hazard ratio estimates of a potential risk factor, and the bias can be substantial and occur in either direction. In this work, we investigate three different approaches that use the same likelihood contributions derived from an illness-death multistate model in order to more adequately estimate the hazard ratio by including the death cases into the analysis: a parametric approach, a penalized likelihood approach, and an imputation-based approach. We investigate to which extent these approaches allow for an unbiased regression analysis by evaluating their performance in simulation studies and on a real data example. In doing so, we use the full cohort with complete illness-death data as reference and artificially induce missing information due to death by setting discrete follow-up visits. Compared to an ad-hoc analysis, all considered approaches provide less biased or even unbiased results, depending on the situation studied. In the real data example, the parametric approach is seen to be too restrictive, whereas the imputation-based approach could almost reconstruct the original event history information. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  16. How to count cells: the advantages and disadvantages of the isotropic fractionator compared with stereology.

    PubMed

    Herculano-Houzel, Suzana; von Bartheld, Christopher S; Miller, Daniel J; Kaas, Jon H

    2015-04-01

    The number of cells comprising biological structures represents fundamental information in basic anatomy, development, aging, drug tests, pathology and genetic manipulations. Obtaining unbiased estimates of cell numbers, however, was until recently possible only through stereological techniques, which require specific training, equipment, histological processing and appropriate sampling strategies applied to structures with a homogeneous distribution of cell bodies. An alternative, the isotropic fractionator (IF), became available in 2005 as a fast and inexpensive method that requires little training, no specific software and only a few materials before it can be used to quantify total numbers of neuronal and non-neuronal cells in a whole organ such as the brain or any dissectible regions thereof. This method entails transforming a highly anisotropic tissue into a homogeneous suspension of free-floating nuclei that can then be counted under the microscope or by flow cytometry and identified morphologically and immunocytochemically as neuronal or non-neuronal. We compare the advantages and disadvantages of each method and provide researchers with guidelines for choosing the best method for their particular needs. IF is as accurate as unbiased stereology and faster than stereological techniques, as it requires no elaborate histological processing or sampling paradigms, providing reliable estimates in a few days rather than many weeks. Tissue shrinkage is also not an issue, since the estimates provided are independent of tissue volume. The main disadvantage of IF, however, is that it necessarily destroys the tissue analyzed and thus provides no spatial information on the cellular composition of biological regions of interest.
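    The counting arithmetic behind the isotropic fractionator is simple once the tissue is a homogeneous nuclei suspension: total cells = nuclei density in the suspension times suspension volume. The numbers below are invented for illustration, not taken from the paper:

```python
# Back-of-the-envelope sketch of the isotropic fractionator arithmetic.
# All volumes and counts are hypothetical.
suspension_ml = 40.0        # total volume of the homogeneous nuclei suspension
aliquot_ul = 10.0           # aliquot volume counted under the microscope
nuclei_counted = 150        # nuclei counted in that aliquot

# Density in the (homogeneous) suspension, then scale to the full volume.
density_per_ml = nuclei_counted / (aliquot_ul / 1000.0)   # nuclei per ml
total_nuclei = density_per_ml * suspension_ml             # ~600,000 nuclei
```

    Because the estimate scales a density by a measured volume of suspension, it is independent of the original tissue volume, which is why shrinkage during processing does not bias it.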

  17. Characterization of background concentrations of contaminants using a mixture of normal distributions.

    PubMed

    Qian, Song S; Lyons, Regan E

    2006-10-01

    We present a Bayesian approach for characterizing background contaminant concentration distributions using data from sites that may have been contaminated. Our method, focused on estimation, resolves several technical problems of the existing methods sanctioned by the U.S. Environmental Protection Agency (USEPA) (a hypothesis testing based method), resulting in a simple and quick procedure for estimating background contaminant concentrations. The proposed Bayesian method is applied to two data sets from a federal facility regulated under the Resource Conservation and Recovery Act. The results are compared to background distributions identified using existing methods recommended by the USEPA. The two data sets represent low and moderate levels of censoring in the data. Although an unbiased estimator is elusive, we show that the proposed Bayesian estimation method will have a smaller bias than the EPA recommended method.
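    The mixture-of-normals idea can be illustrated with a simple maximum-likelihood (EM) fit rather than the authors' Bayesian procedure: a two-component normal mixture on log-concentrations, with the lower-mean component read as the "background" distribution. Everything here, including the simulated data, is an assumption for illustration:

```python
# Illustrative sketch (EM, not the paper's Bayesian method): fit a
# two-component normal mixture and take the lower-mean component as
# "background". Simulated log-concentration data.
import numpy as np

rng = np.random.default_rng(1)
data = np.concatenate([rng.normal(0.0, 1.0, 300),    # background
                       rng.normal(3.0, 1.0, 100)])   # contamination

# EM for a 2-component Gaussian mixture, initialized at the data extremes.
mu = np.array([data.min(), data.max()])
sigma = np.array([1.0, 1.0])
w = np.array([0.5, 0.5])
for _ in range(200):
    # E-step: responsibility of each component for each observation.
    dens = (w / (sigma * np.sqrt(2 * np.pi))
            * np.exp(-0.5 * ((data[:, None] - mu) / sigma) ** 2))
    resp = dens / dens.sum(axis=1, keepdims=True)
    # M-step: update weights, means, standard deviations.
    nk = resp.sum(axis=0)
    w = nk / len(data)
    mu = (resp * data[:, None]).sum(axis=0) / nk
    sigma = np.sqrt((resp * (data[:, None] - mu) ** 2).sum(axis=0) / nk)

background_mean = mu.min()     # lower-mean component taken as background
```

    A fully Bayesian treatment would place priors on (w, mu, sigma) and account for censored observations, which is where the paper's approach goes beyond this sketch.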

  18. Sworn testimony of the model evidence: Gaussian Mixture Importance (GAME) sampling

    NASA Astrophysics Data System (ADS)

    Volpi, Elena; Schoups, Gerrit; Firmani, Giovanni; Vrugt, Jasper A.

    2017-07-01

    What is the "best" model? The answer to this question lies in part in the eyes of the beholder; nevertheless, a good model must blend rigorous theory with redeeming qualities such as parsimony and quality of fit. Model selection is used to make inferences, via weighted averaging, from a set of K candidate models, M_k, k = 1, …, K, and to help identify which model is most supported by the observed data, Ỹ = (ỹ_1, …, ỹ_n). Here, we introduce a new and robust estimator of the model evidence, p(Ỹ|M_k), which acts as the normalizing constant in the denominator of Bayes' theorem and provides a single quantitative measure of relative support for each hypothesis that integrates model accuracy, uncertainty, and complexity. However, p(Ỹ|M_k) is analytically intractable for most practical modeling problems. Our method, coined GAussian Mixture importancE (GAME) sampling, uses bridge sampling of a mixture distribution fitted to samples of the posterior model parameter distribution derived from MCMC simulation. We benchmark the accuracy and reliability of GAME sampling by application to a diverse set of multivariate target distributions (up to 100 dimensions) with known values of p(Ỹ|M_k) and to hypothesis testing using numerical modeling of the rainfall-runoff transformation of the Leaf River watershed in Mississippi, USA. These case studies demonstrate that GAME sampling provides robust and unbiased estimates of the evidence at a relatively small computational cost, outperforming commonly used estimators. The GAME sampler is implemented in the MATLAB package of DREAM and considerably simplifies scientific inquiry through hypothesis testing and model selection.
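    The core idea of estimating the evidence p(Ỹ|M_k) with a proposal fitted to posterior samples can be shown in miniature. The sketch below uses plain importance sampling with a single Gaussian proposal (a simplified stand-in for GAME's Gaussian-mixture bridge sampling) on a toy conjugate model where the evidence is known analytically; the model and all numbers are invented:

```python
# Toy sketch: evidence estimation by importance sampling with a Gaussian
# proposal fitted to posterior draws. Model: y_i ~ N(theta, 1), prior
# theta ~ N(0, 1), so both the posterior and the evidence are analytic.
import numpy as np

rng = np.random.default_rng(2)
y = rng.normal(0.5, 1.0, size=20)
n = len(y)

# Conjugate posterior N(m, v); its draws stand in for MCMC output.
v = 1.0 / (n + 1.0)
m = y.sum() / (n + 1.0)
post_draws = rng.normal(m, np.sqrt(v), 5000)

# Fit the Gaussian proposal q to the posterior draws.
q_mu, q_sd = post_draws.mean(), post_draws.std()

def log_norm(x, mu, sd):
    return -0.5 * np.log(2 * np.pi * sd ** 2) - 0.5 * ((x - mu) / sd) ** 2

# Importance sampling: Z = E_q[ prior * likelihood / q ].
theta = rng.normal(q_mu, q_sd, 20000)
log_w = (log_norm(theta, 0.0, 1.0)                              # prior
         + log_norm(y[None, :], theta[:, None], 1.0).sum(axis=1)  # likelihood
         - log_norm(theta, q_mu, q_sd))                         # proposal
log_Z_hat = np.log(np.mean(np.exp(log_w - log_w.max()))) + log_w.max()

# Analytic check: marginally y ~ N(0, I + 11'), so log Z is available.
S = np.eye(n) + np.ones((n, n))
_, logdet = np.linalg.slogdet(S)
log_Z = (-0.5 * n * np.log(2 * np.pi) - 0.5 * logdet
         - 0.5 * y @ np.linalg.solve(S, y))
```

    Because the proposal nearly matches the posterior, the importance weights are almost constant and the estimator has very low variance; GAME's mixture proposal and bridge sampling extend this to multimodal, high-dimensional posteriors where a single Gaussian would fail.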

  19. Automated daily processing of more than 1000 ground-based GPS receivers for studying intense ionospheric storms

    NASA Astrophysics Data System (ADS)

    Komjathy, Attila; Sparks, Lawrence; Wilson, Brian D.; Mannucci, Anthony J.

    2005-12-01

    As the number of ground-based and space-based receivers tracking the Global Positioning System (GPS) satellites steadily increases, it is becoming possible to monitor changes in the ionosphere continuously and on a global scale with unprecedented accuracy and reliability. As of August 2005, there are more than 1000 globally distributed dual-frequency GPS receivers available using publicly accessible networks including, for example, the International GPS Service and the continuously operating reference stations. To take advantage of the vast amount of GPS data, researchers use a number of techniques to estimate satellite and receiver interfrequency biases and the total electron content (TEC) of the ionosphere. Most techniques estimate vertical ionospheric structure and, simultaneously, hardware-related biases treated as nuisance parameters. These methods are often limited to 200 GPS receivers and use a sequential least squares or Kalman filter approach. The biases are later removed from the measurements to obtain unbiased TEC. In our approach to calibrating GPS receiver and transmitter interfrequency biases, we take advantage of all available GPS receivers using a new processing algorithm based on the Global Ionospheric Mapping (GIM) software developed at the Jet Propulsion Laboratory. This new capability is designed to estimate receiver biases for all stations. We solve for the instrumental biases by modeling the ionospheric delay and removing it from the observation equation using precomputed GIM maps. The precomputed GIM maps rely on 200 globally distributed GPS receivers to establish the "background" used to model the ionosphere at the remaining 800 GPS sites.
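    The calibration idea, remove the modeled ionospheric delay from each observation and fit what remains as a per-receiver bias, reduces to a small least-squares problem. The sketch below is a hypothetical illustration with simulated numbers standing in for the GIM maps, not JPL's actual algorithm:

```python
# Hedged sketch of the bias-calibration idea: subtract a precomputed
# ionospheric model (standing in for GIM maps) from slant TEC observations,
# then estimate one interfrequency bias per receiver from the residuals.
# Receiver counts, noise levels, and TEC ranges are all invented.
import numpy as np

rng = np.random.default_rng(3)
n_recv, n_obs_per = 5, 50
true_bias = rng.normal(0.0, 3.0, n_recv)            # TECU, per receiver

residuals = []
for r in range(n_recv):
    modeled_tec = rng.uniform(5.0, 50.0, n_obs_per)  # from the "GIM map"
    observed = modeled_tec + true_bias[r] + rng.normal(0.0, 0.5, n_obs_per)
    residuals.append(observed - modeled_tec)         # remove modeled delay

# With the ionospheric delay removed, each residual is bias + noise, so the
# per-receiver least-squares estimate is simply the mean residual.
est_bias = np.array([res.mean() for res in residuals])
```

    In the real system the modeled delay comes from GIM maps built on a 200-station core network, which is what lets the biases of the remaining ~800 receivers be estimated without jointly re-solving the vertical ionospheric structure.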

  20. Multilocus patterns of polymorphism and selection across the X chromosome of Caenorhabditis remanei.

    PubMed

    Cutter, Asher D

    2008-03-01

    Natural selection and neutral processes such as demography, mutation, and gene conversion all contribute to patterns of polymorphism within genomes. Identifying the relative importance of these varied components in evolution provides the principal challenge for population genetics. To address this issue in the nematode Caenorhabditis remanei, I sampled nucleotide polymorphism at 40 loci across the X chromosome. The site-frequency spectrum for these loci provides no evidence for population size change, and one locus presents a candidate for linkage to a target of balancing selection. Selection for codon usage bias leads to the non-neutrality of synonymous sites, and despite its weak magnitude of effect (N_e s ≈ 0.1), is responsible for profound patterns of diversity and divergence in the C. remanei genome. Although gene conversion is evident for many loci, biased gene conversion is not identified as a significant evolutionary process in this sample. No consistent association is observed between synonymous-site diversity and linkage-disequilibrium-based estimators of the population recombination parameter, despite theoretical predictions about background selection or widespread genetic hitchhiking, but genetic map-based estimates of recombination are needed to rigorously test for a diversity-recombination relationship. Coalescent simulations also illustrate how a spurious correlation between diversity and linkage-disequilibrium-based estimators of recombination can occur, due in part to the presence of unbiased gene conversion. These results illustrate the influence that subtle natural selection can exert on polymorphism and divergence, in the form of codon usage bias, and demonstrate the potential of C. remanei for detecting natural selection from genomic scans of polymorphism.
