Bootstrap Standard Errors for Maximum Likelihood Ability Estimates When Item Parameters Are Unknown
ERIC Educational Resources Information Center
Patton, Jeffrey M.; Cheng, Ying; Yuan, Ke-Hai; Diao, Qi
2014-01-01
When item parameter estimates are used to estimate the ability parameter in item response models, the standard error (SE) of the ability estimate must be corrected to reflect the error carried over from item calibration. For maximum likelihood (ML) ability estimates, a corrected asymptotic SE is available, but it requires a long test and the…
Liu, Xiaoming; Fu, Yun-Xin; Maxwell, Taylor J.; Boerwinkle, Eric
2010-01-01
It is known that sequencing error can bias estimation of evolutionary or population genetic parameters. This problem is more prominent in deep resequencing studies because of their large sample size n, and a higher probability of error at each nucleotide site. We propose a new method based on the composite likelihood of the observed SNP configurations to infer population mutation rate θ = 4Neμ, population exponential growth rate R, and error rate ɛ, simultaneously. Using simulation, we show the combined effects of the parameters, θ, n, ɛ, and R on the accuracy of parameter estimation. We compared our maximum composite likelihood estimator (MCLE) of θ with other θ estimators that take into account the error. The results show the MCLE performs well when the sample size is large or the error rate is high. Using parametric bootstrap, composite likelihood can also be used as a statistic for testing the model goodness-of-fit of the observed DNA sequences. The MCLE method is applied to sequence data on the ANGPTL4 gene in 1832 African American and 1045 European American individuals. PMID:19952140
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pražnikar, Jure, University of Primorska; Turk, Dušan, E-mail: dusan.turk@ijs.si
2014-12-01
The maximum-likelihood free-kick target, which calculates model error estimates from the work set and a randomly displaced model, proved superior in the accuracy and consistency of refinement of crystal structures compared with the maximum-likelihood cross-validation target, which calculates error estimates from the test set and the unperturbed model. The refinement of a molecular model is a computational procedure by which the atomic model is fitted to the diffraction data. The commonly used target in the refinement of macromolecular structures is the maximum-likelihood (ML) function, which relies on the assessment of model errors. The current ML functions rely on cross-validation. They utilize phase-error estimates that are calculated from a small fraction of diffraction data, called the test set, that are not used to fit the model. An approach has been developed that uses the work set to calculate the phase-error estimates in the ML refinement from simulating the model errors via the random displacement of atomic coordinates. It is called ML free-kick refinement as it uses the ML formulation of the target function and is based on the idea of freeing the model from the model bias imposed by the chemical energy restraints used in refinement. This approach for the calculation of error estimates is superior to the cross-validation approach: it reduces the phase error and increases the accuracy of molecular models, is more robust, provides clearer maps and may use a smaller portion of data for the test set for the calculation of R_free or may leave it out completely.
Tests for detecting overdispersion in models with measurement error in covariates.
Yang, Yingsi; Wong, Man Yu
2015-11-30
Measurement error in covariates can affect the accuracy in count data modeling and analysis. In overdispersion identification, the true mean-variance relationship can be obscured under the influence of measurement error in covariates. In this paper, we propose three tests for detecting overdispersion when covariates are measured with error: a modified score test and two score tests based on the proposed approximate likelihood and quasi-likelihood, respectively. The proposed approximate likelihood is derived under the classical measurement error model, and the resulting approximate maximum likelihood estimator is shown to have superior efficiency. Simulation results also show that the score test based on approximate likelihood outperforms the test based on quasi-likelihood and other alternatives in terms of empirical power. By analyzing a real dataset containing the health-related quality-of-life measurements of a particular group of patients, we demonstrate the importance of the proposed methods by showing that the analyses with and without measurement error correction yield significantly different results. Copyright © 2015 John Wiley & Sons, Ltd.
Investigating the Impact of Uncertainty about Item Parameters on Ability Estimation
ERIC Educational Resources Information Center
Zhang, Jinming; Xie, Minge; Song, Xiaolan; Lu, Ting
2011-01-01
Asymptotic expansions of the maximum likelihood estimator (MLE) and weighted likelihood estimator (WLE) of an examinee's ability are derived while item parameter estimators are treated as covariates measured with error. The asymptotic formulae present the amount of bias of the ability estimators due to the uncertainty of item parameter estimators.…
Gaussian copula as a likelihood function for environmental models
NASA Astrophysics Data System (ADS)
Wani, O.; Espadas, G.; Cecinati, F.; Rieckermann, J.
2017-12-01
Parameter estimation of environmental models always comes with uncertainty. To formally quantify this parametric uncertainty, a likelihood function needs to be formulated, which is defined as the probability of observations given fixed values of the parameter set. A likelihood function allows us to infer parameter values from observations using Bayes' theorem. The challenge is to formulate a likelihood function that reliably describes the error generating processes which lead to the observed monitoring data, such as rainfall and runoff. If the likelihood function is not representative of the error statistics, the parameter inference will give biased parameter values. Several uncertainty estimation methods that are currently being used employ Gaussian processes as a likelihood function, because of their favourable analytical properties. Box-Cox transformation is suggested to deal with non-symmetric and heteroscedastic errors, e.g. for flow data which are typically more uncertain in high flows than in periods with low flows. A problem with transformations is that the results are conditional on hyper-parameters, for which it is difficult to formulate the analyst's belief a priori. In an attempt to address this problem, in this research work we suggest learning the nature of the error distribution from the errors made by the model in the "past" forecasts. We use a Gaussian copula to generate semiparametric error distributions. (1) We show that this copula can then be used as a likelihood function to infer parameters, breaking away from the practice of using multivariate normal distributions. (2) Based on the results from a didactical example of predicting rainfall runoff, we demonstrate that the copula captures the predictive uncertainty of the model. (3) Finally, we find that the properties of autocorrelation and heteroscedasticity of errors are captured well by the copula, eliminating the need to use transforms. In summary, our findings suggest that copulas are an interesting departure from the usage of fully parametric distributions as likelihood functions - and they could help us to better capture the statistical properties of errors and make more reliable predictions.
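As a rough illustration of the copula idea above (not the authors' implementation), the following Python sketch builds a semiparametric error model from "past" model errors, an empirical marginal paired with a Gaussian copula, and evaluates it as a log-likelihood for a new error sequence. The AR(1) correlation structure, the KDE marginal and the value of rho are illustrative assumptions.

import numpy as np
from scipy import stats

def empirical_cdf_and_pdf(past_errors):
    """Semiparametric marginal from past errors: an empirical CDF for the
    copula argument and a Gaussian KDE for the marginal density."""
    kde = stats.gaussian_kde(past_errors)
    sorted_e = np.sort(past_errors)
    n = len(sorted_e)

    def cdf(x):
        # rank-based empirical CDF, kept away from 0 and 1
        r = np.searchsorted(sorted_e, x, side="right")
        return np.clip(r / (n + 1.0), 1e-6, 1 - 1e-6)

    return cdf, kde

def gaussian_copula_loglik(errors, cdf, kde, rho):
    """Log-density of an error sequence under a Gaussian copula with an
    assumed AR(1) correlation structure plus semiparametric marginals."""
    u = cdf(errors)                        # probability integral transform
    z = stats.norm.ppf(u)                  # normal scores
    T = len(z)
    R = rho ** np.abs(np.subtract.outer(np.arange(T), np.arange(T)))
    sign, logdet = np.linalg.slogdet(R)
    quad = z @ np.linalg.solve(R, z) - z @ z
    log_copula = -0.5 * (logdet + quad)    # |R|^(-1/2) exp(-z'(R^-1 - I)z / 2)
    return log_copula + np.sum(np.log(kde(errors)))

# toy usage: past errors inform the error model applied to a new window
rng = np.random.default_rng(0)
past = 0.3 * rng.standard_t(df=4, size=500)
new = 0.3 * rng.standard_t(df=4, size=20)
cdf, kde = empirical_cdf_and_pdf(past)
print(gaussian_copula_loglik(new, cdf, kde, rho=0.5))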
Expected versus Observed Information in SEM with Incomplete Normal and Nonnormal Data
ERIC Educational Resources Information Center
Savalei, Victoria
2010-01-01
Maximum likelihood is the most common estimation method in structural equation modeling. Standard errors for maximum likelihood estimates are obtained from the associated information matrix, which can be estimated from the sample using either expected or observed information. It is known that, with complete data, estimates based on observed or…
An EM Algorithm for Maximum Likelihood Estimation of Process Factor Analysis Models
ERIC Educational Resources Information Center
Lee, Taehun
2010-01-01
In this dissertation, an Expectation-Maximization (EM) algorithm is developed and implemented to obtain maximum likelihood estimates of the parameters and the associated standard error estimates characterizing temporal flows for the latent variable time series following stationary vector ARMA processes, as well as the parameters defining the…
Ye, Xin; Garikapati, Venu M.; You, Daehyun; ...
2017-11-08
Most multinomial choice models (e.g., the multinomial logit model) adopted in practice assume an extreme-value Gumbel distribution for the random components (error terms) of utility functions. This distributional assumption offers a closed-form likelihood expression when the utility maximization principle is applied to model choice behaviors. As a result, model coefficients can be easily estimated using the standard maximum likelihood estimation method. However, maximum likelihood estimators are consistent and efficient only if distributional assumptions on the random error terms are valid. It is therefore critical to test the validity of underlying distributional assumptions on the error terms that form the basis of parameter estimation and policy evaluation. In this paper, a practical yet statistically rigorous method is proposed to test the validity of the distributional assumption on the random components of utility functions in both the multinomial logit (MNL) model and multiple discrete-continuous extreme value (MDCEV) model. Based on a semi-nonparametric approach, a closed-form likelihood function that nests the MNL or MDCEV model being tested is derived. The proposed method allows traditional likelihood ratio tests to be used to test violations of the standard Gumbel distribution assumption. Simulation experiments are conducted to demonstrate that the proposed test yields acceptable Type-I and Type-II error probabilities at commonly available sample sizes. The test is then applied to three real-world discrete and discrete-continuous choice models. For all three models, the proposed test rejects the validity of the standard Gumbel distribution in most utility functions, calling for the development of robust choice models that overcome adverse effects of violations of distributional assumptions on the error terms in random utility functions.
Nonparametric probability density estimation by optimization theoretic techniques
NASA Technical Reports Server (NTRS)
Scott, D. W.
1976-01-01
Two nonparametric probability density estimators are considered. The first is the kernel estimator. The problem of choosing the kernel scaling factor based solely on a random sample is addressed. An interactive mode is discussed and an algorithm proposed to choose the scaling factor automatically. The second nonparametric probability estimate uses penalty function techniques with the maximum likelihood criterion. A discrete maximum penalized likelihood estimator is proposed and is shown to be consistent in the mean square error. A numerical implementation technique for the discrete solution is discussed and examples displayed. An extensive simulation study compares the integrated mean square error of the discrete and kernel estimators. The robustness of the discrete estimator is demonstrated graphically.
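The abstract above does not spell out the authors' automatic selection algorithm, but the idea of choosing the kernel scaling factor from the random sample alone can be illustrated with a leave-one-out likelihood criterion. The Python sketch below is such an illustration; the bandwidth grid and the two-component sample are assumptions made only for the example.

import numpy as np

def loo_log_likelihood(x, h):
    """Leave-one-out log-likelihood of a Gaussian kernel density estimate
    with bandwidth h (the kernel scaling factor)."""
    n = len(x)
    d = (x[:, None] - x[None, :]) / h           # pairwise scaled differences
    K = np.exp(-0.5 * d**2) / np.sqrt(2 * np.pi)
    np.fill_diagonal(K, 0.0)                    # drop the self-contribution
    f_loo = K.sum(axis=1) / ((n - 1) * h)
    return np.sum(np.log(f_loo + 1e-300))

# choose the scaling factor automatically by a grid search on the criterion
rng = np.random.default_rng(1)
sample = np.concatenate([rng.normal(-2, 1, 150), rng.normal(3, 0.5, 150)])
grid = np.linspace(0.05, 2.0, 80)
scores = [loo_log_likelihood(sample, h) for h in grid]
print(f"selected bandwidth: {grid[int(np.argmax(scores))]:.3f}")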
NASA Technical Reports Server (NTRS)
Chittineni, C. B.
1979-01-01
The problem of estimating label imperfections and the use of the estimation in identifying mislabeled patterns is presented. Expressions for the maximum likelihood estimates of classification errors and a priori probabilities are derived from the classification of a set of labeled patterns. Expressions also are given for the asymptotic variances of probability of correct classification and proportions. Simple models are developed for imperfections in the labels and for classification errors and are used in the formulation of a maximum likelihood estimation scheme. Schemes are presented for the identification of mislabeled patterns in terms of threshold on the discriminant functions for both two-class and multiclass cases. Expressions are derived for the probability that the imperfect label identification scheme will result in a wrong decision and are used in computing thresholds. The results of practical applications of these techniques in the processing of remotely sensed multispectral data are presented.
A comparison of abundance estimates from extended batch-marking and Jolly–Seber-type experiments
Cowen, Laura L E; Besbeas, Panagiotis; Morgan, Byron J T; Schwarz, Carl J
2014-01-01
Little attention has been paid to the use of multi-sample batch-marking studies, as it is generally assumed that an individual's capture history is necessary for fully efficient estimates. However, recently, Huggins et al. (2010) present a pseudo-likelihood for a multi-sample batch-marking study where they used estimating equations to solve for survival and capture probabilities and then derived abundance estimates using a Horvitz–Thompson-type estimator. We have developed and maximized the likelihood for batch-marking studies. We use data simulated from a Jolly–Seber-type study and convert this to what would have been obtained from an extended batch-marking study. We compare our abundance estimates obtained from the Crosbie–Manly–Arnason–Schwarz (CMAS) model with those of the extended batch-marking model to determine the efficiency of collecting and analyzing batch-marking data. We found that estimates of abundance were similar for all three estimators: CMAS, Huggins, and our likelihood. Gains are made when using unique identifiers and employing the CMAS model in terms of precision; however, the likelihood typically had lower mean square error than the pseudo-likelihood method of Huggins et al. (2010). When faced with designing a batch-marking study, researchers can be confident in obtaining unbiased abundance estimators. Furthermore, they can design studies in order to reduce mean square error by manipulating capture probabilities and sample size. PMID:24558576
An Investigation of the Standard Errors of Expected A Posteriori Ability Estimates.
ERIC Educational Resources Information Center
De Ayala, R. J.; And Others
Expected a posteriori (EAP) estimation has a number of advantages over maximum likelihood estimation or maximum a posteriori (MAP) estimation methods. These include ability estimates (thetas) for all response patterns, less regression towards the mean than MAP ability estimates, and a lower average squared error. R. D. Bock and R. J. Mislevy (1982) state that the…
Development of advanced techniques for rotorcraft state estimation and parameter identification
NASA Technical Reports Server (NTRS)
Hall, W. E., Jr.; Bohn, J. G.; Vincent, J. H.
1980-01-01
An integrated methodology for rotorcraft system identification consists of rotorcraft mathematical modeling, three distinct data processing steps, and a technique for designing inputs to improve the identifiability of the data. These elements are as follows: (1) a Kalman filter smoother algorithm which estimates states and sensor errors from error corrupted data. Gust time histories and statistics may also be estimated; (2) a model structure estimation algorithm for isolating a model which adequately explains the data; (3) a maximum likelihood algorithm for estimating the parameters and estimates for the variance of these estimates; and (4) an input design algorithm, based on a maximum likelihood approach, which provides inputs to improve the accuracy of parameter estimates. Each step is discussed with examples to both flight and simulated data cases.
NASA Astrophysics Data System (ADS)
Cheng, Qin-Bo; Chen, Xi; Xu, Chong-Yu; Reinhardt-Imjela, Christian; Schulte, Achim
2014-11-01
In this study, the likelihood functions for uncertainty analysis of hydrological models are compared and improved through the following steps: (1) the equivalent relationship between the Nash-Sutcliffe Efficiency coefficient (NSE) and the likelihood function with Gaussian independent and identically distributed residuals is proved; (2) a new estimation method of the Box-Cox transformation (BC) parameter is developed to improve the effective elimination of the heteroscedasticity of model residuals; and (3) three likelihood functions, NSE, Generalized Error Distribution with BC (BC-GED) and Skew Generalized Error Distribution with BC (BC-SGED), are applied for SWAT-WB-VSA (Soil and Water Assessment Tool - Water Balance - Variable Source Area) model calibration in the Baocun watershed, Eastern China. Performances of calibrated models are compared using the observed river discharges and groundwater levels. The result shows that the minimum variance constraint can effectively estimate the BC parameter. The form of the likelihood function significantly impacts the calibrated parameters and the simulated results of high and low flow components. SWAT-WB-VSA with the NSE approach simulates floods well but baseflow poorly, owing to the assumption of a Gaussian error distribution, in which the probability of large errors is low but small errors around zero are roughly equiprobable. By contrast, SWAT-WB-VSA with the BC-GED or BC-SGED approach mimics baseflow well, which is confirmed by the groundwater level simulation. The assumption of skewness of the error distribution may be unnecessary, because all the results of the BC-SGED approach are nearly the same as those of the BC-GED approach.
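The equivalence claimed in step (1), that under independent, identically distributed Gaussian residuals the likelihood is a monotone function of NSE once the error variance is profiled out, can be illustrated numerically. The Python sketch below also shows a Box-Cox transform of the flows before computing residuals; the data, the lambda value and the two competing "models" are illustrative assumptions, and the BC-GED/BC-SGED likelihoods are not reproduced.

import numpy as np

def nse(sim, obs):
    """Nash-Sutcliffe efficiency."""
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)

def gaussian_profile_loglik(sim, obs):
    """Gaussian iid residual log-likelihood with sigma^2 profiled out;
    maximizing this is equivalent to maximizing NSE, since both are
    monotone decreasing in the residual sum of squares."""
    n = len(obs)
    sse = np.sum((obs - sim) ** 2)
    return -0.5 * n * (np.log(2 * np.pi * sse / n) + 1.0)

def boxcox(q, lam):
    """Box-Cox transform used to stabilise heteroscedastic flow residuals."""
    return np.log(q) if lam == 0 else (q ** lam - 1.0) / lam

# both criteria rank the two competing simulations the same way
rng = np.random.default_rng(2)
obs = rng.gamma(2.0, 5.0, 200)
sim_a = obs + rng.normal(0, 2.0, 200)      # better model
sim_b = obs + rng.normal(0, 5.0, 200)      # worse model
for name, sim in [("A", sim_a), ("B", sim_b)]:
    print(name, round(nse(sim, obs), 3), round(gaussian_profile_loglik(sim, obs), 1))

# residuals of Box-Cox-transformed flows (lambda chosen only for illustration)
lam = 0.3
resid_bc = boxcox(obs, lam) - boxcox(np.clip(sim_a, 1e-6, None), lam)
print("BC residual variance:", round(resid_bc.var(), 4))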
ERIC Educational Resources Information Center
Molenaar, Peter C. M.; Nesselroade, John R.
1998-01-01
Pseudo-Maximum Likelihood (p-ML) and Asymptotically Distribution Free (ADF) estimation methods for estimating dynamic factor model parameters within a covariance structure framework were compared through a Monte Carlo simulation. Both methods appear to give consistent model parameter estimates, but only ADF gives standard errors and chi-square…
Wald Sequential Probability Ratio Test for Analysis of Orbital Conjunction Data
NASA Technical Reports Server (NTRS)
Carpenter, J. Russell; Markley, F. Landis; Gold, Dara
2013-01-01
We propose a Wald Sequential Probability Ratio Test for analysis of commonly available predictions associated with spacecraft conjunctions. Such predictions generally consist of a relative state and relative state error covariance at the time of closest approach, under the assumption that prediction errors are Gaussian. We show that under these circumstances, the likelihood ratio of the Wald test reduces to an especially simple form, involving the current best estimate of collision probability, and a similar estimate of collision probability that is based on prior assumptions about the likelihood of collision.
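For readers unfamiliar with the test itself, a generic Wald sequential probability ratio test is sketched below in Python with a simple Gaussian-mean hypothesis pair and the classical decision thresholds log((1-beta)/alpha) and log(beta/(1-alpha)); it does not use the collision-probability form of the likelihood ratio derived in the paper, and all names and numbers are illustrative.

import numpy as np
from scipy import stats

def wald_sprt(samples, mu0, mu1, sigma, alpha=0.01, beta=0.01):
    """Wald SPRT between H0: mean = mu0 and H1: mean = mu1 for Gaussian
    data; returns the decision and the number of samples consumed."""
    upper = np.log((1 - beta) / alpha)      # cross above: accept H1
    lower = np.log(beta / (1 - alpha))      # cross below: accept H0
    llr = 0.0
    for k, x in enumerate(samples, start=1):
        llr += (stats.norm.logpdf(x, mu1, sigma)
                - stats.norm.logpdf(x, mu0, sigma))
        if llr >= upper:
            return "accept H1", k
        if llr <= lower:
            return "accept H0", k
    return "undecided", len(samples)

rng = np.random.default_rng(3)
data = rng.normal(0.8, 1.0, size=200)        # truth closer to H1
print(wald_sprt(data, mu0=0.0, mu1=1.0, sigma=1.0))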
Robust Methods for Moderation Analysis with a Two-Level Regression Model.
Yang, Miao; Yuan, Ke-Hai
2016-01-01
Moderation analysis has many applications in social sciences. Most widely used estimation methods for moderation analysis assume that errors are normally distributed and homoscedastic. When these assumptions are not met, the results from a classical moderation analysis can be misleading. For more reliable moderation analysis, this article proposes two robust methods with a two-level regression model when the predictors do not contain measurement error. One method is based on maximum likelihood with Student's t distribution and the other is based on M-estimators with Huber-type weights. An algorithm for obtaining the robust estimators is developed. Consistent estimates of standard errors of the robust estimators are provided. The robust approaches are compared against normal-distribution-based maximum likelihood (NML) with respect to power and accuracy of parameter estimates through a simulation study. Results show that the robust approaches outperform NML under various distributional conditions. Application of the robust methods is illustrated through a real data example. An R program is developed and documented to facilitate the application of the robust methods.
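The M-estimation half of the approach can be sketched for a single-level moderation regression: iteratively reweighted least squares with Huber-type weights and a MAD scale estimate. The Python sketch below is not the authors' two-level estimator or their standard-error formulas; the tuning constant c = 1.345 and the simulated data are conventional, illustrative choices.

import numpy as np

def huber_irls(X, y, c=1.345, tol=1e-8, max_iter=100):
    """M-estimation of regression coefficients with Huber-type weights,
    fitted by iteratively reweighted least squares (single-level sketch)."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]               # OLS start
    for _ in range(max_iter):
        r = y - X @ beta
        scale = 1.4826 * np.median(np.abs(r - np.median(r)))  # MAD scale
        u = r / max(scale, 1e-12)
        w = np.where(np.abs(u) <= c, 1.0, c / np.abs(u))      # Huber weights
        beta_new = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * y))
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    return beta

# toy moderation model: y depends on x, the moderator m, and their product
rng = np.random.default_rng(4)
n = 300
x, m = rng.normal(size=n), rng.normal(size=n)
X = np.column_stack([np.ones(n), x, m, x * m])
y = 1 + 0.5 * x + 0.3 * m + 0.4 * x * m + rng.standard_t(df=3, size=n)
print(np.round(huber_irls(X, y), 3))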
Collinear Latent Variables in Multilevel Confirmatory Factor Analysis
van de Schoot, Rens; Hox, Joop
2014-01-01
Because variables may be correlated in the social and behavioral sciences, multicollinearity might be problematic. This study investigates the effect of collinearity manipulated in within and between levels of a two-level confirmatory factor analysis by Monte Carlo simulation. Furthermore, the influence of the size of the intraclass correlation coefficient (ICC) and estimation method; maximum likelihood estimation with robust chi-squares and standard errors and Bayesian estimation, on the convergence rate are investigated. The other variables of interest were rate of inadmissible solutions and the relative parameter and standard error bias on the between level. The results showed that inadmissible solutions were obtained when there was between level collinearity and the estimation method was maximum likelihood. In the within level multicollinearity condition, all of the solutions were admissible but the bias values were higher compared with the between level collinearity condition. Bayesian estimation appeared to be robust in obtaining admissible parameters but the relative bias was higher than for maximum likelihood estimation. Finally, as expected, high ICC produced less biased results compared to medium ICC conditions. PMID:29795827
Bayesian logistic regression approaches to predict incorrect DRG assignment.
Suleiman, Mani; Demirhan, Haydar; Boyd, Leanne; Girosi, Federico; Aksakalli, Vural
2018-05-07
Episodes of care involving similar diagnoses and treatments and requiring similar levels of resource utilisation are grouped to the same Diagnosis-Related Group (DRG). In jurisdictions which implement DRG-based payment systems, DRGs are a major determinant of funding for inpatient care. Hence, service providers often dedicate auditing staff to the task of checking that episodes have been coded to the correct DRG. The use of statistical models to estimate an episode's probability of DRG error can significantly improve the efficiency of clinical coding audits. This study implements Bayesian logistic regression models with weakly informative prior distributions to estimate the likelihood that episodes require a DRG revision, comparing these models with each other and to classical maximum likelihood estimates. All Bayesian approaches had more stable model parameters than maximum likelihood. The best performing Bayesian model improved overall classification performance by 6% compared to maximum likelihood, a 34% gain compared to random classification. We found that the original DRG, the coder and the day of coding all have a significant effect on the likelihood of DRG error. Use of Bayesian approaches has improved model parameter stability and classification accuracy. This method has already led to improved audit efficiency in an operational capacity.
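A minimal stand-in for the modeling idea, penalizing a logistic regression with a weakly informative Gaussian prior and comparing it with plain maximum likelihood, is sketched below as a MAP optimization rather than the full Bayesian fit used in the study; the feature names and the prior standard deviation are assumptions made for the example.

import numpy as np
from scipy.optimize import minimize
from scipy.special import expit

def neg_log_posterior(beta, X, y, prior_sd):
    """Negative Bernoulli log-likelihood plus a weakly informative Gaussian
    prior on the coefficients (MAP sketch, not full posterior sampling)."""
    eta = X @ beta
    nll = np.sum(np.logaddexp(0.0, eta) - y * eta)   # stable -log p(y|beta)
    if prior_sd is None:                             # plain maximum likelihood
        return nll
    return nll + 0.5 * np.sum((beta / prior_sd) ** 2)

def fit(X, y, prior_sd=None):
    beta0 = np.zeros(X.shape[1])
    res = minimize(neg_log_posterior, beta0, args=(X, y, prior_sd),
                   method="L-BFGS-B")
    return res.x

# toy data standing in for episode features (e.g. original DRG, coder, day)
rng = np.random.default_rng(5)
n = 400
X = np.column_stack([np.ones(n), rng.normal(size=(n, 3))])
true_beta = np.array([-1.0, 0.8, 0.0, 1.5])
y = rng.binomial(1, expit(X @ true_beta))

print("MLE:", np.round(fit(X, y), 2))
print("MAP (weakly informative, sd=2.5):", np.round(fit(X, y, prior_sd=2.5), 2))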
NASA Technical Reports Server (NTRS)
Battin, R. H.; Croopnick, S. R.; Edwards, J. A.
1977-01-01
The formulation of a recursive maximum likelihood navigation system employing reference position and velocity vectors as state variables is presented. Convenient forms of the required variational equations of motion are developed together with an explicit form of the associated state transition matrix needed to refer measurement data from the measurement time to the epoch time. Computational advantages accrue from this design in that the usual forward extrapolation of the covariance matrix of estimation errors can be avoided without incurring unacceptable system errors. Simulation data for earth orbiting satellites are provided to substantiate this assertion.
New estimates of the CMB angular power spectra from the WMAP 5 year low-resolution data
NASA Astrophysics Data System (ADS)
Gruppuso, A.; de Rosa, A.; Cabella, P.; Paci, F.; Finelli, F.; Natoli, P.; de Gasperis, G.; Mandolesi, N.
2009-11-01
A quadratic maximum likelihood (QML) estimator is applied to the Wilkinson Microwave Anisotropy Probe (WMAP) 5 year low-resolution maps to compute the cosmic microwave background angular power spectra (APS) at large scales for both temperature and polarization. Estimates and error bars for the six APS are provided up to l = 32 and compared, when possible, to those obtained by the WMAP team, without finding any inconsistency. The conditional likelihood slices are also computed for the C_l of all six power spectra from l = 2 to 10 through a pixel-based likelihood code. Both the codes treat the covariance for (T, Q, U) in a single matrix without employing any approximation. The inputs of both the codes (foreground-reduced maps, related covariances and masks) are provided by the WMAP team. The peaks of the likelihood slices are always consistent with the QML estimates within the error bars; however, an excellent agreement occurs when the QML estimates are used as a fiducial power spectrum instead of the best-fitting theoretical power spectrum. By the full computation of the conditional likelihood on the estimated spectra, the value of the temperature quadrupole C_l^TT at l = 2 is found to be less than 2σ away from the WMAP 5 year Λ cold dark matter best-fitting value. The BB spectrum is found to be well consistent with zero, and upper limits on the B modes are provided. The parity odd signals TB and EB are found to be consistent with zero.
Maximum Likelihood Time-of-Arrival Estimation of Optical Pulses via Photon-Counting Photodetectors
NASA Technical Reports Server (NTRS)
Erkmen, Baris I.; Moision, Bruce E.
2010-01-01
Many optical imaging, ranging, and communications systems rely on the estimation of the arrival time of an optical pulse. Recently, such systems have been increasingly employing photon-counting photodetector technology, which changes the statistics of the observed photocurrent. This requires time-of-arrival estimators to be developed and their performances characterized. The statistics of the output of an ideal photodetector, which are well modeled as a Poisson point process, were considered. An analytical model was developed for the mean-square error of the maximum likelihood (ML) estimator, demonstrating two phenomena that cause deviations from the minimum achievable error at low signal power. An approximation was derived to the threshold at which the ML estimator essentially fails to provide better than a random guess of the pulse arrival time. Comparing the analytic model performance predictions to those obtained via simulations, it was verified that the model accurately predicts the ML performance over all regimes considered. There is little prior art that attempts to understand the fundamental limitations to time-of-arrival estimation from Poisson statistics. This work establishes both a simple mathematical description of the error behavior, and the associated physical processes that yield this behavior. Previous work on mean-square error characterization for ML estimators has predominantly focused on additive Gaussian noise. This work demonstrates that the discrete nature of the Poisson noise process leads to a distinctly different error behavior.
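The estimator analyzed above can be sketched directly from the Poisson point-process log-likelihood, the sum of log-rates at the photon arrival times minus the integrated rate, maximized over the pulse delay. In the Python sketch below the Gaussian pulse shape, background rate and grid search are illustrative choices, not the paper's signal model.

import numpy as np

def intensity(t, tau, peak=500.0, width=2e-9, background=5e6):
    """Photon arrival rate (photons/s): a Gaussian pulse delayed by tau on
    top of a constant background; shape and rates are illustrative only."""
    return background + (peak / width) * np.exp(-0.5 * ((t - tau) / width) ** 2)

def log_likelihood(arrivals, tau, T_obs, n_grid=4000):
    """Poisson point-process log-likelihood: sum of log-rates at the observed
    photon times minus the integrated rate over the observation window."""
    t_grid = np.linspace(0.0, T_obs, n_grid)
    integral = np.sum(intensity(t_grid, tau)) * (t_grid[1] - t_grid[0])
    return np.sum(np.log(intensity(arrivals, tau))) - integral

# simulate photon arrivals by thinning, then recover tau by a grid search
rng = np.random.default_rng(6)
T_obs, tau_true = 50e-9, 23e-9
lam_max = intensity(tau_true, tau_true)
cand = rng.uniform(0.0, T_obs, rng.poisson(lam_max * T_obs))
keep = rng.uniform(0.0, lam_max, len(cand)) < intensity(cand, tau_true)
arrivals = np.sort(cand[keep])

taus = np.linspace(0.0, T_obs, 500)
ll = [log_likelihood(arrivals, tau, T_obs) for tau in taus]
print(f"true tau = {tau_true:.2e} s, ML estimate = {taus[int(np.argmax(ll))]:.2e} s")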
NASA Astrophysics Data System (ADS)
Kojima, Yohei; Takeda, Kazuaki; Adachi, Fumiyuki
Frequency-domain equalization (FDE) based on the minimum mean square error (MMSE) criterion can provide better downlink bit error rate (BER) performance of direct sequence code division multiple access (DS-CDMA) than the conventional rake combining in a frequency-selective fading channel. FDE requires accurate channel estimation. In this paper, we propose a new 2-step maximum likelihood channel estimation (MLCE) for DS-CDMA with FDE in a very slow frequency-selective fading environment. The 1st step uses the conventional pilot-assisted MMSE-CE and the 2nd step carries out the MLCE using decision feedback from the 1st step. The BER performance improvement achieved by 2-step MLCE over pilot assisted MMSE-CE is confirmed by computer simulation.
Load estimator (LOADEST): a FORTRAN program for estimating constituent loads in streams and rivers
Runkel, Robert L.; Crawford, Charles G.; Cohn, Timothy A.
2004-01-01
LOAD ESTimator (LOADEST) is a FORTRAN program for estimating constituent loads in streams and rivers. Given a time series of streamflow, additional data variables, and constituent concentration, LOADEST assists the user in developing a regression model for the estimation of constituent load (calibration). Explanatory variables within the regression model include various functions of streamflow, decimal time, and additional user-specified data variables. The formulated regression model then is used to estimate loads over a user-specified time interval (estimation). Mean load estimates, standard errors, and 95 percent confidence intervals are developed on a monthly and(or) seasonal basis. The calibration and estimation procedures within LOADEST are based on three statistical estimation methods. The first two methods, Adjusted Maximum Likelihood Estimation (AMLE) and Maximum Likelihood Estimation (MLE), are appropriate when the calibration model errors (residuals) are normally distributed. Of the two, AMLE is the method of choice when the calibration data set (time series of streamflow, additional data variables, and concentration) contains censored data. The third method, Least Absolute Deviation (LAD), is an alternative to maximum likelihood estimation when the residuals are not normally distributed. LOADEST output includes diagnostic tests and warnings to assist the user in determining the appropriate estimation method and in interpreting the estimated loads. This report describes the development and application of LOADEST. Sections of the report describe estimation theory, input/output specifications, sample applications, and installation instructions.
NASA Astrophysics Data System (ADS)
Aminah, Agustin Siti; Pawitan, Gandhi; Tantular, Bertho
2017-03-01
So far, most of the data published by Statistics Indonesia (BPS), the provider of national statistics, are still limited to the district level. Sample sizes at smaller area levels are often insufficient, so direct estimation of poverty indicators produces high standard errors, and analyses based on such estimates are unreliable. To solve this problem, an estimation method that provides better accuracy by combining survey data with other auxiliary data is required. One method often used for this purpose is Small Area Estimation (SAE). Among the many SAE methods, one is Empirical Best Linear Unbiased Prediction (EBLUP). The EBLUP method based on the maximum likelihood (ML) procedure does not account for the loss of degrees of freedom due to estimating β with β̂. This drawback motivates the use of the restricted maximum likelihood (REML) procedure. This paper proposes EBLUP with the REML procedure for estimating poverty indicators by modeling average household expenditure per capita, and implements a bootstrap procedure to calculate the MSE (mean square error) in order to compare the accuracy of the EBLUP method with that of direct estimation. Results show that the EBLUP method reduced the MSE in small area estimation.
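A compact sketch of the ingredients named above, an area-level EBLUP with the between-area variance estimated by REML and a parametric-bootstrap MSE, is given below in Python; the Fay-Herriot-type model form, the covariate and the sampling variances are assumed for illustration and do not reproduce the paper's exact specification.

import numpy as np
from scipy.optimize import minimize_scalar

def reml_neg_loglik(A, y, X, D):
    """Negative REML log-likelihood of the area-level model
    y_i = x_i'beta + u_i + e_i, u_i ~ N(0, A), e_i ~ N(0, D_i known)."""
    V = A + D                                   # diagonal marginal variances
    XtVi = X.T / V
    XtViX = XtVi @ X
    beta = np.linalg.solve(XtViX, XtVi @ y)
    r = y - X @ beta
    _, logdet_XtViX = np.linalg.slogdet(XtViX)
    return 0.5 * (np.sum(np.log(V)) + logdet_XtViX + np.sum(r**2 / V))

def eblup(y, X, D):
    """REML estimate of A, GLS beta, and the EBLUP of each small-area mean."""
    A_hat = minimize_scalar(reml_neg_loglik, bounds=(1e-8, 10 * y.var()),
                            args=(y, X, D), method="bounded").x
    V = A_hat + D
    beta = np.linalg.solve((X.T / V) @ X, (X.T / V) @ y)
    gamma = A_hat / V                           # shrinkage weights
    return gamma * y + (1 - gamma) * (X @ beta), A_hat, beta

def bootstrap_mse(y, X, D, B=200, seed=0):
    """Parametric-bootstrap MSE of the EBLUPs (sketch of the idea only)."""
    rng = np.random.default_rng(seed)
    _, A_hat, beta = eblup(y, X, D)
    se2 = np.zeros(len(y))
    for _ in range(B):
        theta_star = X @ beta + rng.normal(0, np.sqrt(A_hat), len(y))
        y_star = theta_star + rng.normal(0, np.sqrt(D))
        est_star, _, _ = eblup(y_star, X, D)
        se2 += (est_star - theta_star) ** 2
    return se2 / B

# toy example: direct estimates of per-capita expenditure for 30 small areas
rng = np.random.default_rng(7)
m = 30
X = np.column_stack([np.ones(m), rng.normal(size=m)])     # auxiliary covariate
D = rng.uniform(0.5, 2.0, m)                               # known sampling variances
theta = X @ np.array([5.0, 1.0]) + rng.normal(0, 1.0, m)
y = theta + rng.normal(0, np.sqrt(D))

theta_hat, A_hat, _ = eblup(y, X, D)
mse = bootstrap_mse(y, X, D)
print("REML A:", round(A_hat, 3), "| mean bootstrap MSE:", round(mse.mean(), 3))
print("mean direct-estimate variance:", round(D.mean(), 3))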
NASA Astrophysics Data System (ADS)
Nourali, Mahrouz; Ghahraman, Bijan; Pourreza-Bilondi, Mohsen; Davary, Kamran
2016-09-01
In the present study, DREAM(ZS), Differential Evolution Adaptive Metropolis combined with both formal and informal likelihood functions, is used to investigate uncertainty of parameters of the HEC-HMS model in the Tamar watershed, Golestan province, Iran. In order to assess the uncertainty of 24 parameters used in HMS, three flood events were used to calibrate and one flood event was used to validate the posterior distributions. Moreover, the performance of seven different likelihood functions (L1-L7) was assessed by means of the DREAM(ZS) approach. Four likelihood functions (L1-L4), the Nash-Sutcliffe (NS) efficiency, normalized absolute error (NAE), index of agreement (IOA), and Chiew-McMahon efficiency (CM), are considered informal, whereas the remaining three (L5-L7) are formal. L5 focuses on the relationship between traditional least-squares fitting and Bayesian inference, and L6 is a heteroscedastic maximum likelihood error (HMLE) estimator. Finally, in likelihood function L7, serial dependence of residual errors is accounted for using a first-order autoregressive (AR) model of the residuals. According to the results, the sensitivities of the parameters strongly depend on the likelihood function and vary across likelihood functions. Most of the parameters were better defined by the formal likelihood functions L5 and L7 and showed a high sensitivity to model performance. Posterior cumulative distributions corresponding to the informal likelihood functions L1, L2, L3, L4 and the formal likelihood function L6 are approximately the same for most of the sub-basins, and these likelihood functions have almost the same effect on the sensitivity of parameters. The 95% total prediction uncertainty bounds bracketed most of the observed data. Considering all the statistical indicators and criteria of uncertainty assessment, including RMSE, KGE, NS, P-factor and R-factor, results showed that the DREAM(ZS) algorithm performed better under the formal likelihood functions L5 and L7, but likelihood function L5 may result in biased and unreliable parameter estimates due to violation of the residual-error assumptions. Thus, likelihood function L7 provides the posterior distribution of model parameters credibly and can therefore be employed for further applications.
Fast maximum likelihood estimation of mutation rates using a birth-death process.
Wu, Xiaowei; Zhu, Hongxiao
2015-02-07
Since fluctuation analysis was first introduced by Luria and Delbrück in 1943, it has been widely used to make inference about spontaneous mutation rates in cultured cells. Under certain model assumptions, the probability distribution of the number of mutants that appear in a fluctuation experiment can be derived explicitly, which provides the basis of mutation rate estimation. It has been shown that, among various existing estimators, the maximum likelihood estimator usually demonstrates some desirable properties such as consistency and lower mean squared error. However, its application in real experimental data is often hindered by slow computation of likelihood due to the recursive form of the mutant-count distribution. We propose a fast maximum likelihood estimator of mutation rates, MLE-BD, based on a birth-death process model with non-differential growth assumption. Simulation studies demonstrate that, compared with the conventional maximum likelihood estimator derived from the Luria-Delbrück distribution, MLE-BD achieves substantial improvement on computational speed and is applicable to arbitrarily large number of mutants. In addition, it still retains good accuracy on point estimation. Published by Elsevier Ltd.
Profile-likelihood Confidence Intervals in Item Response Theory Models.
Chalmers, R Philip; Pek, Jolynn; Liu, Yang
2017-01-01
Confidence intervals (CIs) are fundamental inferential devices which quantify the sampling variability of parameter estimates. In item response theory, CIs have been primarily obtained from large-sample Wald-type approaches based on standard error estimates, derived from the observed or expected information matrix, after parameters have been estimated via maximum likelihood. An alternative approach to constructing CIs is to quantify sampling variability directly from the likelihood function with a technique known as profile-likelihood confidence intervals (PL CIs). In this article, we introduce PL CIs for item response theory models, compare PL CIs to classical large-sample Wald-type CIs, and demonstrate important distinctions among these CIs. CIs are then constructed for parameters directly estimated in the specified model and for transformed parameters which are often obtained post-estimation. Monte Carlo simulation results suggest that PL CIs perform consistently better than Wald-type CIs for both non-transformed and transformed parameters.
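The mechanics of a profile-likelihood CI can be sketched with a simple two-parameter logistic model standing in for an item response curve: fix the parameter of interest, re-maximize over the remaining parameter, and keep the values whose log-likelihood drop is at most chi-square(1, 0.95)/2. The Python sketch below uses illustrative data and is not an IRT implementation.

import numpy as np
from scipy.optimize import minimize, minimize_scalar
from scipy.stats import chi2

def nll(params, x, y):
    """Negative Bernoulli log-likelihood of a two-parameter logistic curve
    (intercept and slope), standing in for an item response model."""
    eta = params[0] + params[1] * x
    return np.sum(np.logaddexp(0.0, eta) - y * eta)

def profile_nll(slope, x, y):
    """Profile out the intercept for a fixed value of the slope."""
    return minimize_scalar(lambda b0: nll([b0, slope], x, y)).fun

rng = np.random.default_rng(8)
x = rng.normal(size=300)
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-(-0.5 + 1.2 * x))))

fit = minimize(nll, [0.0, 1.0], args=(x, y))
b1_hat = fit.x[1]
cutoff = chi2.ppf(0.95, df=1) / 2.0      # about 1.92 on the log-likelihood scale

# keep slope values whose profile log-likelihood drop is at most the cutoff
grid = np.linspace(b1_hat - 1.5, b1_hat + 1.5, 400)
inside = [s for s in grid if profile_nll(s, x, y) - fit.fun <= cutoff]
print(f"slope MLE = {b1_hat:.3f}, 95% PL CI = ({min(inside):.3f}, {max(inside):.3f})")

# Wald CI from the (approximate) inverse Hessian, for comparison
se = np.sqrt(fit.hess_inv[1, 1])
print(f"95% Wald CI = ({b1_hat - 1.96 * se:.3f}, {b1_hat + 1.96 * se:.3f})")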
Maximum likelihood phase-retrieval algorithm: applications.
Nahrstedt, D A; Southwell, W H
1984-12-01
The maximum likelihood estimator approach is shown to be effective in determining the wave front aberration in systems involving laser and flow field diagnostics and optical testing. The robustness of the algorithm enables convergence even in cases of severe wave front error and real, nonsymmetrical, obscured amplitude distributions.
SEPARABLE FACTOR ANALYSIS WITH APPLICATIONS TO MORTALITY DATA
Fosdick, Bailey K.; Hoff, Peter D.
2014-01-01
Human mortality data sets can be expressed as multiway data arrays, the dimensions of which correspond to categories by which mortality rates are reported, such as age, sex, country and year. Regression models for such data typically assume an independent error distribution or an error model that allows for dependence along at most one or two dimensions of the data array. However, failing to account for other dependencies can lead to inefficient estimates of regression parameters, inaccurate standard errors and poor predictions. An alternative to assuming independent errors is to allow for dependence along each dimension of the array using a separable covariance model. However, the number of parameters in this model increases rapidly with the dimensions of the array and, for many arrays, maximum likelihood estimates of the covariance parameters do not exist. In this paper, we propose a submodel of the separable covariance model that estimates the covariance matrix for each dimension as having factor analytic structure. This model can be viewed as an extension of factor analysis to array-valued data, as it uses a factor model to estimate the covariance along each dimension of the array. We discuss properties of this model as they relate to ordinary factor analysis, describe maximum likelihood and Bayesian estimation methods, and provide a likelihood ratio testing procedure for selecting the factor model ranks. We apply this methodology to the analysis of data from the Human Mortality Database, and show in a cross-validation experiment how it outperforms simpler methods. Additionally, we use this model to impute mortality rates for countries that have no mortality data for several years. Unlike other approaches, our methodology is able to estimate similarities between the mortality rates of countries, time periods and sexes, and use this information to assist with the imputations. PMID:25489353
Closed-loop carrier phase synchronization techniques motivated by likelihood functions
NASA Technical Reports Server (NTRS)
Tsou, H.; Hinedi, S.; Simon, M.
1994-01-01
This article reexamines the notion of closed-loop carrier phase synchronization motivated by the theory of maximum a posteriori phase estimation with emphasis on the development of new structures based on both maximum-likelihood and average-likelihood functions. The criterion of performance used for comparison of all the closed-loop structures discussed is the mean-squared phase error for a fixed-loop bandwidth.
Software for Quantifying and Simulating Microsatellite Genotyping Error
Johnson, Paul C.D.; Haydon, Daniel T.
2007-01-01
Microsatellite genetic marker data are exploited in a variety of fields, including forensics, gene mapping, kinship inference and population genetics. In all of these fields, inference can be thwarted by failure to quantify and account for data errors, and kinship inference in particular can benefit from separating errors into two distinct classes: allelic dropout and false alleles. Pedant is MS Windows software for estimating locus-specific maximum likelihood rates of these two classes of error. Estimation is based on comparison of duplicate error-prone genotypes: neither reference genotypes nor pedigree data are required. Other functions include: plotting of error rate estimates and confidence intervals; simulations for performing power analysis and for testing the robustness of error rate estimates to violation of the underlying assumptions; and estimation of expected heterozygosity, which is a required input. The program, documentation and source code are available from http://www.stats.gla.ac.uk/~paulj/pedant.html. PMID:20066126
NASA Technical Reports Server (NTRS)
Klein, V.
1979-01-01
Two identification methods, the equation error method and the output error method, are used to estimate stability and control parameter values from flight data for a low-wing, single-engine, general aviation airplane. The estimated parameters from both methods are in very good agreement, primarily because of the sufficient accuracy of the measured data. The estimated static parameters also agree with the results from steady flights. The effects of power and of different input forms are demonstrated. Examination of all available results gives the best values of the estimated parameters and specifies their accuracies.
Application of parameter estimation to highly unstable aircraft
NASA Technical Reports Server (NTRS)
Maine, R. E.; Murray, J. E.
1986-01-01
This paper discusses the application of parameter estimation to highly unstable aircraft. It includes a discussion of the problems in applying the output error method to such aircraft and demonstrates that the filter error method eliminates these problems. The paper shows that the maximum likelihood estimator with no process noise does not reduce to the output error method when the system is unstable. It also proposes and demonstrates an ad hoc method that is similar in form to the filter error method, but applicable to nonlinear problems. Flight data from the X-29 forward-swept-wing demonstrator is used to illustrate the problems and methods discussed.
NASA Technical Reports Server (NTRS)
Pierson, W. J.
1982-01-01
The scatterometer on the National Oceanic Satellite System (NOSS) is studied by means of Monte Carlo techniques so as to determine the effect of two additional antennas for alias (or ambiguity) removal by means of an objective criteria technique and a normalized maximum likelihood estimator. Cells nominally 10 km by 10 km, 10 km by 50 km, and 50 km by 50 km are simulated for winds of 4, 8, 12 and 24 m/s and incidence angles of 29, 39, 47, and 53.5 deg for 15 deg changes in direction. The normalized maximum likelihood estimate (MLE) is correct a large part of the time, but the objective criterion technique is recommended as a reserve, and more quickly computed, procedure. Both methods for alias removal depend on the differences in the present model function at upwind and downwind. For 10 km by 10 km cells, it is found that the MLE method introduces a correlation between wind speed errors and aspect angle (wind direction) errors that can be as high as 0.8 or 0.9 and that the wind direction errors are unacceptably large, compared to those obtained for the SASS for similar assumptions.
Estimation Methods for Non-Homogeneous Regression - Minimum CRPS vs Maximum Likelihood
NASA Astrophysics Data System (ADS)
Gebetsberger, Manuel; Messner, Jakob W.; Mayr, Georg J.; Zeileis, Achim
2017-04-01
Non-homogeneous regression models are widely used to statistically post-process numerical weather prediction models. Such regression models correct for errors in mean and variance and can forecast a full probability distribution. To estimate the corresponding regression coefficients, CRPS minimization has been performed in many meteorological post-processing studies over the last decade. In contrast to maximum likelihood estimation, CRPS minimization is claimed to yield better-calibrated forecasts. Theoretically, both scoring rules used as an optimization criterion should be able to locate the same unknown optimum. Discrepancies might result from a wrong distributional assumption about the observed quantity. To address this theoretical concept, this study compares maximum likelihood and minimum CRPS estimation for different distributional assumptions. First, a synthetic case study shows that, for an appropriate distributional assumption, both estimation methods yield similar regression coefficients. The log-likelihood estimator is slightly more efficient. A real-world case study for surface temperature forecasts at different sites in Europe confirms these results but shows that surface temperature does not always follow the classical assumption of a Gaussian distribution. KEYWORDS: ensemble post-processing, maximum likelihood estimation, CRPS minimization, probabilistic temperature forecasting, distributional regression models
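Both estimators are easy to compare for a Gaussian non-homogeneous regression, since the Gaussian CRPS has a closed form. The Python sketch below fits mean and log-spread coefficients by minimizing average CRPS and by maximizing the likelihood on the same synthetic data; all coefficient names and the data-generating values are illustrative assumptions.

import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

def predict(params, x):
    """Non-homogeneous regression: mean and spread both depend on the
    predictor x (coefficient names are illustrative)."""
    a, b, c, d = params
    mu = a + b * x
    sigma = np.exp(c + d * x)        # log link keeps the spread positive
    return mu, sigma

def mean_crps(params, x, y):
    """Closed-form CRPS of a Gaussian predictive distribution, averaged."""
    mu, sigma = predict(params, x)
    z = (y - mu) / sigma
    crps = sigma * (z * (2 * norm.cdf(z) - 1) + 2 * norm.pdf(z) - 1 / np.sqrt(np.pi))
    return np.mean(crps)

def neg_loglik(params, x, y):
    mu, sigma = predict(params, x)
    return -np.sum(norm.logpdf(y, mu, sigma))

# synthetic forecast/observation pairs with a correct Gaussian assumption
rng = np.random.default_rng(9)
x = rng.normal(size=1000)
y = 0.5 + 0.9 * x + np.exp(-0.2 + 0.3 * x) * rng.normal(size=1000)

start = np.array([0.0, 1.0, 0.0, 0.0])
fit_ml = minimize(neg_loglik, start, args=(x, y))
fit_crps = minimize(mean_crps, start, args=(x, y))
print("ML coefficients:  ", np.round(fit_ml.x, 3))
print("CRPS coefficients:", np.round(fit_crps.x, 3))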
Fast maximum likelihood estimation using continuous-time neural point process models.
Lepage, Kyle Q; MacDonald, Christopher J
2015-06-01
A recent report estimates that the number of simultaneously recorded neurons is growing exponentially. A commonly employed statistical paradigm using discrete-time point process models of neural activity involves the computation of a maximum-likelihood estimate. The time to compute this estimate, per neuron, is proportional to the number of bins in a finely spaced discretization of time. By using continuous-time models of neural activity and the optimally efficient Gaussian quadrature, memory requirements and computation times are dramatically decreased in the commonly encountered situation where the number of parameters p is much less than the number of time-bins n. In this regime, with q equal to the quadrature order, memory requirements are decreased from O(np) to O(qp), and the number of floating-point operations is decreased from O(np^2) to O(qp^2). Accuracy of the proposed estimates is assessed based upon physiological consideration, error bounds, and mathematical results describing the relation between numerical integration error and numerical error affecting both parameter estimates and the observed Fisher information. A check is provided which is used to adapt the order of numerical integration. The procedure is verified in simulation and for hippocampal recordings. It is found that in 95% of hippocampal recordings a q of 60 yields numerical error negligible with respect to parameter estimate standard error. Statistical inference using the proposed methodology is a fast and convenient alternative to statistical inference performed using a discrete-time point process model of neural activity. It enables the employment of the statistical methodology available with discrete-time inference, but is faster, uses less memory, and avoids any error due to discretization.
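The computational point, replacing the fine time discretization by a low-order Gaussian quadrature of the integral term in the continuous-time point-process log-likelihood, can be sketched as follows in Python; the log-linear intensity, the covariate, and q = 60 (the order the abstract reports as adequate) are illustrative assumptions, not the authors' model.

import numpy as np

def rate(t, beta):
    """Continuous-time conditional intensity, log-linear in a slow covariate."""
    covariate = np.sin(2 * np.pi * t / 50.0)      # illustrative modulation
    return np.exp(beta[0] + beta[1] * covariate)

def loglik_quadrature(spike_times, beta, T, q=60):
    """Point-process log-likelihood with the integral term evaluated by
    q-point Gauss-Legendre quadrature instead of a fine time discretization."""
    nodes, weights = np.polynomial.legendre.leggauss(q)
    t_nodes = 0.5 * T * (nodes + 1.0)             # map [-1, 1] onto [0, T]
    integral = 0.5 * T * np.sum(weights * rate(t_nodes, beta))
    return np.sum(np.log(rate(spike_times, beta))) - integral

def loglik_binned(spike_times, beta, T, dt=1e-3):
    """Same log-likelihood approximated on a fine discrete-time grid,
    the O(n)-per-neuron computation the continuous-time approach avoids."""
    edges = np.arange(0.0, T + dt, dt)
    lam = rate(edges[:-1], beta)
    counts, _ = np.histogram(spike_times, bins=edges)
    return np.sum(counts * np.log(lam)) - np.sum(lam * dt)

# simulate spikes by thinning and compare the two likelihood evaluations
rng = np.random.default_rng(10)
T, beta_true = 100.0, np.array([1.0, 0.8])
lam_max = np.exp(beta_true[0] + abs(beta_true[1]))
cand = np.sort(rng.uniform(0.0, T, rng.poisson(lam_max * T)))
spikes = cand[rng.uniform(0.0, lam_max, len(cand)) < rate(cand, beta_true)]

print("Gauss-Legendre (q=60):", round(loglik_quadrature(spikes, beta_true, T), 2))
print("fine time bins:       ", round(loglik_binned(spikes, beta_true, T), 2))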
Shariat, Mohammad Hassan; Gazor, Saeed; Redfearn, Damian
2016-08-01
In this paper, we study the problem of the cardiac conduction velocity (CCV) estimation for the sequential intracardiac mapping. We assume that the intracardiac electrograms of several cardiac sites are sequentially recorded, their activation times (ATs) are extracted, and the corresponding wavefronts are specified. The locations of the mapping catheter's electrodes and the ATs of the wavefronts are used here for the CCV estimation. We assume that the extracted ATs include some estimation errors, which we model with zero-mean white Gaussian noise values with known variances. Assuming stable planar wavefront propagation, we derive the maximum likelihood CCV estimator, when the synchronization times between various recording sites are unknown. We analytically evaluate the performance of the CCV estimator and provide its mean square estimation error. Our simulation results confirm the accuracy of the proposed method and the error analysis of the proposed CCV estimator.
Maximum likelihood estimation for Cox's regression model under nested case-control sampling.
Scheike, Thomas H; Juul, Anders
2004-04-01
Nested case-control sampling is designed to reduce the costs of large cohort studies. It is important to estimate the parameters of interest as efficiently as possible. We present a new maximum likelihood estimator (MLE) for nested case-control sampling in the context of Cox's proportional hazards model. The MLE is computed by the EM-algorithm, which is easy to implement in the proportional hazards setting. Standard errors are estimated by a numerical profile likelihood approach based on EM aided differentiation. The work was motivated by a nested case-control study that hypothesized that insulin-like growth factor I was associated with ischemic heart disease. The study was based on a population of 3784 Danes and 231 cases of ischemic heart disease where controls were matched on age and gender. We illustrate the use of the MLE for these data and show how the maximum likelihood framework can be used to obtain information additional to the relative risk estimates of covariates.
Measurement Model Specification Error in LISREL Structural Equation Models.
ERIC Educational Resources Information Center
Baldwin, Beatrice; Lomax, Richard
This LISREL study examines the robustness of the maximum likelihood estimates under varying degrees of measurement model misspecification. A true model containing five latent variables (two endogenous and three exogenous) and two indicator variables per latent variable was used. Measurement model misspecification considered included errors of…
Identification of dynamic systems, theory and formulation
NASA Technical Reports Server (NTRS)
Maine, R. E.; Iliff, K. W.
1985-01-01
The problem of estimating parameters of dynamic systems is addressed in order to present the theoretical basis of system identification and parameter estimation in a manner that is complete and rigorous, yet understandable with minimal prerequisites. Maximum likelihood and related estimators are highlighted. The approach used requires familiarity with calculus, linear algebra, and probability, but does not require knowledge of stochastic processes or functional analysis. The treatment emphasizes unification of the various areas of estimation; estimation in dynamic systems is treated as a direct outgrowth of static-system theory. Topics covered include basic concepts and definitions; numerical optimization methods; probability; statistical estimators; estimation in static systems; stochastic processes; state estimation in dynamic systems; output error, filter error, and equation error methods of parameter estimation in dynamic systems; and the accuracy of the estimates.
Modeling gene expression measurement error: a quasi-likelihood approach
Strimmer, Korbinian
2003-01-01
Background: Using suitable error models for gene expression measurements is essential in the statistical analysis of microarray data. However, the true probabilistic model underlying gene expression intensity readings is generally not known. Instead, in currently used approaches some simple parametric model is assumed (usually a transformed normal distribution) or the empirical distribution is estimated. However, both these strategies may not be optimal for gene expression data, as the non-parametric approach ignores known structural information whereas the fully parametric models run the risk of misspecification. A further related problem is the choice of a suitable scale for the model (e.g. observed vs. log-scale). Results: Here a simple semi-parametric model for gene expression measurement error is presented. In this approach inference is based on an approximate likelihood function (the extended quasi-likelihood). Only partial knowledge about the unknown true distribution is required to construct this function. In the case of gene expression this information is available in the form of the postulated (e.g. quadratic) variance structure of the data. As the quasi-likelihood behaves (almost) like a proper likelihood, it allows for the estimation of calibration and variance parameters, and it is also straightforward to obtain corresponding approximate confidence intervals. Unlike most other frameworks, it also allows analysis on any preferred scale, i.e. both on the original linear scale as well as on a transformed scale. It can also be employed in regression approaches to model systematic (e.g. array or dye) effects. Conclusions: The quasi-likelihood framework provides a simple and versatile approach to analyze gene expression data that does not make any strong distributional assumptions about the underlying error model. For several simulated as well as real data sets it provides a better fit to the data than competing models. In an example it also improved the power of tests to identify differential expression. PMID:12659637
Maximum-likelihood methods in wavefront sensing: stochastic models and likelihood functions
Barrett, Harrison H.; Dainty, Christopher; Lara, David
2008-01-01
Maximum-likelihood (ML) estimation in wavefront sensing requires careful attention to all noise sources and all factors that influence the sensor data. We present detailed probability density functions for the output of the image detector in a wavefront sensor, conditional not only on wavefront parameters but also on various nuisance parameters. Practical ways of dealing with nuisance parameters are described, and final expressions for likelihoods and Fisher information matrices are derived. The theory is illustrated by discussing Shack–Hartmann sensors, and computational requirements are discussed. Simulation results show that ML estimation can significantly increase the dynamic range of a Shack–Hartmann sensor with four detectors and that it can reduce the residual wavefront error when compared with traditional methods. PMID:17206255
Maximum likelihood techniques applied to quasi-elastic light scattering
NASA Technical Reports Server (NTRS)
Edwards, Robert V.
1992-01-01
An automatic procedure is needed for reliably assessing the quality of particle size measurements obtained from QELS (Quasi-Elastic Light Scattering). Obtaining the measurement itself, before any error estimates can be made, is a problem because the particle size is inferred very indirectly from a signal derived from the motion of particles in the system and requires the solution of an inverse problem. The eigenvalue structure of the transform that generates the signal is such that an arbitrarily small amount of noise can obliterate parts of any practical inversion spectrum. This project uses Maximum Likelihood Estimation (MLE) as a framework to generate a theory and a functioning set of software to oversee the measurement process and extract the particle size information, while at the same time providing error estimates for those measurements. The theory involved verifying a correct form of the covariance matrix for the noise on the measurement and then estimating particle size parameters using a modified histogram approach.
A Bayesian approach to parameter and reliability estimation in the Poisson distribution.
NASA Technical Reports Server (NTRS)
Canavos, G. C.
1972-01-01
For life testing procedures, a Bayesian analysis is developed with respect to a random intensity parameter in the Poisson distribution. Bayes estimators are derived for the Poisson parameter and the reliability function based on uniform and gamma prior distributions of that parameter. A Monte Carlo procedure is implemented to make possible an empirical mean-squared error comparison between Bayes and existing minimum variance unbiased, as well as maximum likelihood, estimators. As expected, the Bayes estimators have mean-squared errors that are appreciably smaller than those of the other two.
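As a sketch of the conjugate calculation underlying such Bayes estimators (standard results for a gamma prior; the notation is illustrative and not taken from the paper): for observed counts x_1, …, x_n from a Poisson(λ) distribution and a Gamma(α, β) prior on λ,

\lambda \mid x \sim \mathrm{Gamma}\!\left(\alpha + \textstyle\sum_i x_i,\; \beta + n\right), \qquad \hat{\lambda}_{\mathrm{Bayes}} = \frac{\alpha + \sum_i x_i}{\beta + n},

and the reliability R(t) = e^{-\lambda t} has Bayes estimate E\!\left[e^{-\lambda t} \mid x\right] = \left(\frac{\beta + n}{\beta + n + t}\right)^{\alpha + \sum_i x_i}.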
Determinants of Standard Errors of MLEs in Confirmatory Factor Analysis
ERIC Educational Resources Information Center
Yuan, Ke-Hai; Cheng, Ying; Zhang, Wei
2010-01-01
This paper studies changes of standard errors (SE) of the normal-distribution-based maximum likelihood estimates (MLE) for confirmatory factor models as model parameters vary. Using logical analysis, simplified formulas and numerical verification, monotonic relationships between SEs and factor loadings as well as unique variances are found.…
NASA Technical Reports Server (NTRS)
Rodriguez, G.; Scheid, R. E., Jr.
1986-01-01
This paper outlines methods for modeling, identification, and estimation for the static shape determination of flexible structures. The shape estimation schemes are based on structural models specified by (possibly interconnected) elliptic partial differential equations. The identification techniques provide approximate knowledge of parameters in elliptic systems. The techniques are based on the method of maximum likelihood, which finds parameter values such that the likelihood functional associated with the system model is maximized. The estimation methods are obtained by means of a function-space approach that seeks to obtain the conditional mean of the state given the data and a white noise characterization of model errors. The solutions are obtained in a batch-processing mode in which all the data are processed simultaneously. After methods for computing the optimal estimates are developed, an analysis of the second-order statistics of the estimates and of the related estimation error is conducted. In addition to outlining the above theoretical results, the paper presents typical flexible structure simulations illustrating performance of the shape determination methods.
Parameter Estimation for Thurstone Choice Models
DOE Office of Scientific and Technical Information (OSTI.GOV)
Vojnovic, Milan; Yun, Seyoung
We consider the estimation accuracy of individual strength parameters of a Thurstone choice model when each input observation consists of a choice of one item from a set of two or more items (so-called top-1 lists). This model accommodates well-known choice models such as the Luce choice model for comparison sets of two or more items and the Bradley-Terry model for pair comparisons. We provide a tight characterization of the mean squared error of the maximum likelihood parameter estimator. We also provide similar characterizations for parameter estimators defined by a rank-breaking method, which amounts to deducing one or more pair comparisons from a comparison of two or more items, assuming independence of these pair comparisons, and maximizing a likelihood function derived under these assumptions. We also consider a related binary classification problem where each individual parameter takes a value from a set of two possible values and the goal is to correctly classify all items within a prescribed classification error. The results of this paper shed light on how the parameter estimation accuracy depends on the given Thurstone choice model and the structure of comparison sets. In particular, we found that for unbiased input comparison sets of a given cardinality, when in expectation each comparison set of given cardinality occurs the same number of times, for a broad class of Thurstone choice models, the mean squared error decreases with the cardinality of comparison sets, but only marginally according to a diminishing returns relation. On the other hand, we found that there exist Thurstone choice models for which the mean squared error of the maximum likelihood parameter estimator can decrease much faster with the cardinality of comparison sets. We report empirical evaluation of some claims and key parameters revealed by theory using both synthetic and real-world input data from some popular sport competitions and online labor platforms.
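As a minimal illustration of the maximum likelihood estimation analyzed here, restricted to the Bradley-Terry special case (pair comparisons), the following sketch fits item strengths by direct minimization of the negative log-likelihood; the data and the identifiability convention (zero-mean strengths) are illustrative assumptions, not taken from the paper.

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical pair-comparison data: each row records (winner index, loser index).
comparisons = np.array([[0, 1], [0, 2], [1, 2], [2, 1], [0, 1], [1, 0]])
n_items = 3

def neg_log_likelihood(theta):
    # Bradley-Terry model: P(i beats j) = exp(theta_i) / (exp(theta_i) + exp(theta_j)).
    w, l = comparisons[:, 0], comparisons[:, 1]
    return -np.sum(theta[w] - np.logaddexp(theta[w], theta[l]))

res = minimize(neg_log_likelihood, np.zeros(n_items), method="BFGS")
theta_hat = res.x - res.x.mean()  # strengths are identifiable only up to an additive constant
print("ML strength estimates:", theta_hat)
```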
An empirical Bayes approach for the Poisson life distribution.
NASA Technical Reports Server (NTRS)
Canavos, G. C.
1973-01-01
A smooth empirical Bayes estimator is derived for the intensity parameter (hazard rate) in the Poisson distribution as used in life testing. The reliability function is also estimated either by using the empirical Bayes estimate of the parameter, or by obtaining the expectation of the reliability function. The behavior of the empirical Bayes procedure is studied through Monte Carlo simulation in which estimates of mean-squared errors of the empirical Bayes estimators are compared with those of conventional estimators such as minimum variance unbiased or maximum likelihood. Results indicate a significant reduction in mean-squared error of the empirical Bayes estimators over the conventional variety.
Learn-as-you-go acceleration of cosmological parameter estimates
NASA Astrophysics Data System (ADS)
Aslanyan, Grigor; Easther, Richard; Price, Layne C.
2015-09-01
Cosmological analyses can be accelerated by approximating slow calculations using a training set, which is either precomputed or generated dynamically. However, this approach is only safe if the approximations are well understood and controlled. This paper surveys issues associated with the use of machine-learning based emulation strategies for accelerating cosmological parameter estimation. We describe a learn-as-you-go algorithm that is implemented in the Cosmo++ code and (1) trains the emulator while simultaneously estimating posterior probabilities; (2) identifies unreliable estimates, computing the exact numerical likelihoods if necessary; and (3) progressively learns and updates the error model as the calculation progresses. We explicitly describe and model the emulation error and show how this can be propagated into the posterior probabilities. We apply these techniques to the Planck likelihood and the calculation of ΛCDM posterior probabilities. The computation is significantly accelerated without a pre-defined training set and uncertainties in the posterior probabilities are subdominant to statistical fluctuations. We have obtained a speedup factor of 6.5 for Metropolis-Hastings and 3.5 for nested sampling. Finally, we discuss the general requirements for a credible error model and show how to update them on-the-fly.
Learn-as-you-go acceleration of cosmological parameter estimates
DOE Office of Scientific and Technical Information (OSTI.GOV)
Aslanyan, Grigor; Easther, Richard; Price, Layne C., E-mail: g.aslanyan@auckland.ac.nz, E-mail: r.easther@auckland.ac.nz, E-mail: lpri691@aucklanduni.ac.nz
2015-09-01
Cosmological analyses can be accelerated by approximating slow calculations using a training set, which is either precomputed or generated dynamically. However, this approach is only safe if the approximations are well understood and controlled. This paper surveys issues associated with the use of machine-learning based emulation strategies for accelerating cosmological parameter estimation. We describe a learn-as-you-go algorithm that is implemented in the Cosmo++ code and (1) trains the emulator while simultaneously estimating posterior probabilities; (2) identifies unreliable estimates, computing the exact numerical likelihoods if necessary; and (3) progressively learns and updates the error model as the calculation progresses. We explicitly describe and model the emulation error and show how this can be propagated into the posterior probabilities. We apply these techniques to the Planck likelihood and the calculation of ΛCDM posterior probabilities. The computation is significantly accelerated without a pre-defined training set and uncertainties in the posterior probabilities are subdominant to statistical fluctuations. We have obtained a speedup factor of 6.5 for Metropolis-Hastings and 3.5 for nested sampling. Finally, we discuss the general requirements for a credible error model and show how to update them on-the-fly.
Correcting for sequencing error in maximum likelihood phylogeny inference.
Kuhner, Mary K; McGill, James
2014-11-04
Accurate phylogenies are critical to taxonomy as well as studies of speciation processes and other evolutionary patterns. Accurate branch lengths in phylogenies are critical for dating and rate measurements. Such accuracy may be jeopardized by unacknowledged sequencing error. We use simulated data to test a correction for DNA sequencing error in maximum likelihood phylogeny inference. Over a wide range of data polymorphism and true error rate, we found that correcting for sequencing error improves recovery of the branch lengths, even if the assumed error rate is up to twice the true error rate. Low error rates have little effect on recovery of the topology. When error is high, correction improves topological inference; however, when error is extremely high, using an assumed error rate greater than the true error rate leads to poor recovery of both topology and branch lengths. The error correction approach tested here was proposed in 2004 but has not been widely used, perhaps because researchers do not want to commit to an estimate of the error rate. This study shows that correction with an approximate error rate is generally preferable to ignoring the issue. Copyright © 2014 Kuhner and McGill.
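A sketch of how a per-base error rate is typically folded into the tip likelihoods of a maximum likelihood phylogeny (this is the generic construction; the exact parameterization used by the correction tested here may differ): if the true base at a tip is b and the sequencing error rate is ε, the probability of the observed base y is

P(y \mid b, \varepsilon) = \begin{cases} 1 - \varepsilon, & y = b,\\ \varepsilon/3, & y \neq b, \end{cases}

and the tip partial likelihood for each candidate base b becomes P(y \mid b, \varepsilon) instead of the usual indicator 1\{y = b\}, after which Felsenstein's pruning algorithm proceeds unchanged.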
Estimation After a Group Sequential Trial.
Milanzi, Elasma; Molenberghs, Geert; Alonso, Ariel; Kenward, Michael G; Tsiatis, Anastasios A; Davidian, Marie; Verbeke, Geert
2015-10-01
Group sequential trials are one important instance of studies for which the sample size is not fixed a priori but rather takes one of a finite set of pre-specified values, dependent on the observed data. Much work has been devoted to the inferential consequences of this design feature. Molenberghs et al (2012) and Milanzi et al (2012) reviewed and extended the existing literature, focusing on a collection of seemingly disparate, but related, settings, namely completely random sample sizes, group sequential studies with deterministic and random stopping rules, incomplete data, and random cluster sizes. They showed that the ordinary sample average is a viable option for estimation following a group sequential trial, for a wide class of stopping rules and for random outcomes with a distribution in the exponential family. Their results are somewhat surprising in the sense that the sample average is not optimal, and further, there does not exist an optimal, or even, unbiased linear estimator. However, the sample average is asymptotically unbiased, both conditionally upon the observed sample size as well as marginalized over it. By exploiting ignorability they showed that the sample average is the conventional maximum likelihood estimator. They also showed that a conditional maximum likelihood estimator is finite sample unbiased, but is less efficient than the sample average and has the larger mean squared error. Asymptotically, the sample average and the conditional maximum likelihood estimator are equivalent. This previous work is restricted, however, to the situation in which the random sample size can take only two values, N = n or N = 2n. In this paper, we consider the more practically useful setting of sample sizes in the finite set {n_1, n_2, …, n_L}. It is shown that the sample average is then a justifiable estimator, in the sense that it follows from joint likelihood estimation, and it is consistent and asymptotically unbiased. We also show why simulations can give the false impression of bias in the sample average when considered conditional upon the sample size. The consequence is that no corrections need to be made to estimators following sequential trials. When small-sample bias is of concern, the conditional likelihood estimator provides a relatively straightforward modification to the sample average. Finally, it is shown that classical likelihood-based standard errors and confidence intervals can be applied, obviating the need for technical corrections.
The Sensitivity of Adverse Event Cost Estimates to Diagnostic Coding Error
Wardle, Gavin; Wodchis, Walter P; Laporte, Audrey; Anderson, Geoffrey M; Baker, Ross G
2012-01-01
Objective To examine the impact of diagnostic coding error on estimates of hospital costs attributable to adverse events. Data Sources Original and reabstracted medical records of 9,670 complex medical and surgical admissions at 11 hospital corporations in Ontario from 2002 to 2004. Patient specific costs, not including physician payments, were retrieved from the Ontario Case Costing Initiative database. Study Design Adverse events were identified among the original and reabstracted records using ICD10-CA (Canadian adaptation of ICD10) codes flagged as postadmission complications. Propensity score matching and multivariate regression analysis were used to estimate the cost of the adverse events and to determine the sensitivity of cost estimates to diagnostic coding error. Principal Findings Estimates of the cost of the adverse events ranged from $16,008 (metabolic derangement) to $30,176 (upper gastrointestinal bleeding). Coding errors caused the total cost attributable to the adverse events to be underestimated by 16 percent. The impact of coding error on adverse event cost estimates was highly variable at the organizational level. Conclusions Estimates of adverse event costs are highly sensitive to coding error. Adverse event costs may be significantly underestimated if the likelihood of error is ignored. PMID:22091908
Patrick L. Zimmerman; Greg C. Liknes
2010-01-01
Dot grids are often used to estimate the proportion of land cover belonging to some class in an aerial photograph. Interpreter misclassification is an often-ignored source of error in dot-grid sampling that has the potential to significantly bias proportion estimates. For the case when the true class of items is unknown, we present a maximum-likelihood estimator of...
NASA Technical Reports Server (NTRS)
Suit, W. T.; Cannaday, R. L.
1979-01-01
The longitudinal and lateral stability and control parameters for a high-wing general aviation airplane are examined. Parameter estimates obtained from flight data at various flight conditions within the normal operating range of the aircraft are presented. The estimation techniques, an output error technique (maximum likelihood) and an equation error technique (linear regression), are described. The longitudinal static parameters are estimated from climbing, descending, and quasi-steady-state flight data. The lateral excitations involve a combination of rudder and ailerons. The sensitivity of the aircraft modes of motion to variations in the parameter estimates is discussed.
NASA Astrophysics Data System (ADS)
Dilla, Shintia Ulfa; Andriyana, Yudhie; Sudartianto
2017-03-01
Acid rain causes many harmful effects. It is formed from two strong acids, sulfuric acid (H2SO4) and nitric acid (HNO3), where sulfuric acid is derived from SO2 and nitric acid from NOx (x = 1, 2). The purpose of the research is to determine the influence of the SO4 and NO3 levels contained in rain on the acidity (pH) of rainwater. The data are incomplete panel data with a two-way error component model. Panel data are a collection of observations made on the same individuals repeatedly over time; the panel is said to be incomplete if individuals have different numbers of observations. The model used in this research is a random effects model (REM). Minimum variance quadratic unbiased estimation (MIVQUE) is used to estimate the variance components of the errors, while maximum likelihood estimation is used to estimate the parameters. As a result, we obtain the following model: Ŷ* = 0.41276446 - 0.00107302X1 + 0.00215470X2.
Huang, Chiung-Yu; Qin, Jing
2013-01-01
The Canadian Study of Health and Aging (CSHA) employed a prevalent cohort design to study survival after onset of dementia, where patients with dementia were sampled and the onset time of dementia was determined retrospectively. The prevalent cohort sampling scheme favors individuals who survive longer. Thus, the observed survival times are subject to length bias. In recent years, there has been a rising interest in developing estimation procedures for prevalent cohort survival data that not only account for length bias but also actually exploit the incidence distribution of the disease to improve efficiency. This article considers semiparametric estimation of the Cox model for the time from dementia onset to death under a stationarity assumption with respect to the disease incidence. Under the stationarity condition, the semiparametric maximum likelihood estimation is expected to be fully efficient yet difficult to perform for statistical practitioners, as the likelihood depends on the baseline hazard function in a complicated way. Moreover, the asymptotic properties of the semiparametric maximum likelihood estimator are not well-studied. Motivated by the composite likelihood method (Besag 1974), we develop a composite partial likelihood method that retains the simplicity of the popular partial likelihood estimator and can be easily performed using standard statistical software. When applied to the CSHA data, the proposed method estimates a significant difference in survival between the vascular dementia group and the possible Alzheimer’s disease group, while the partial likelihood method for left-truncated and right-censored data yields a greater standard error and a 95% confidence interval covering 0, thus highlighting the practical value of employing a more efficient methodology. To check the assumption of stable disease for the CSHA data, we also present new graphical and numerical tests in the article. The R code used to obtain the maximum composite partial likelihood estimator for the CSHA data is available in the online Supplementary Material, posted on the journal web site. PMID:24000265
Effects of Employing Ridge Regression in Structural Equation Models.
ERIC Educational Resources Information Center
McQuitty, Shaun
1997-01-01
LISREL 8 invokes a ridge option when maximum likelihood or generalized least squares are used to estimate a structural equation model with a nonpositive definite covariance or correlation matrix. Implications of the ridge option for model fit, parameter estimates, and standard errors are explored through two examples. (SLD)
On modeling animal movements using Brownian motion with measurement error.
Pozdnyakov, Vladimir; Meyer, Thomas; Wang, Yu-Bo; Yan, Jun
2014-02-01
Modeling animal movements with Brownian motion (or more generally by a Gaussian process) has a long tradition in ecological studies. The recent Brownian bridge movement model (BBMM), which incorporates measurement errors, has been quickly adopted by ecologists because of its simplicity and tractability. We discuss some nontrivial properties of the discrete-time stochastic process that results from observing a Brownian motion with added normal noise at discrete times. In particular, we demonstrate that the observed sequence of random variables is not Markov. Consequently the expected occupation time between two successively observed locations does not depend on just those two observations; the whole path must be taken into account. Nonetheless, the exact likelihood function of the observed time series remains tractable; it requires only sparse matrix computations. The likelihood-based estimation procedure is described in detail and compared to the BBMM estimation.
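A minimal sketch of the kind of exact likelihood discussed here, for one coordinate of a Brownian motion observed with additive normal measurement noise at discrete times (the dense-matrix form below illustrates the model structure only; the paper exploits sparse matrix computations, and the parameter values and data are purely illustrative):

```python
import numpy as np
from scipy.stats import multivariate_normal

def bm_noise_loglik(times, obs, sigma2, tau2):
    # Brownian motion started at 0, observed with i.i.d. N(0, tau2) measurement error:
    # cov[i, j] = sigma2 * min(t_i, t_j) + tau2 * 1{i == j}.
    t = np.asarray(times, dtype=float)
    cov = sigma2 * np.minimum.outer(t, t) + tau2 * np.eye(len(t))
    return multivariate_normal(mean=np.zeros(len(t)), cov=cov).logpdf(obs)

# Illustrative observation times and one coordinate of the recorded locations.
times = [0.5, 1.0, 2.5, 4.0]
obs = [0.3, 0.1, -0.4, 0.2]
print(bm_noise_loglik(times, obs, sigma2=1.0, tau2=0.25))
```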
NASA Astrophysics Data System (ADS)
Ariffin, Syaiba Balqish; Midi, Habshah
2014-06-01
This article is concerned with the performance of the logistic ridge regression estimation technique in the presence of multicollinearity and high leverage points. In logistic regression, multicollinearity may exist among predictors and in the information matrix. The maximum likelihood estimator suffers a huge setback in the presence of multicollinearity, which causes regression estimates to have unduly large standard errors. To remedy this problem, a logistic ridge regression estimator is put forward. It is evident that the logistic ridge regression estimator outperforms the maximum likelihood approach for handling multicollinearity. The effect of high leverage points on the performance of the logistic ridge regression estimator is then investigated through a real data set and a simulation study. The findings signify that the logistic ridge regression estimator fails to provide better parameter estimates in the presence of both high leverage points and multicollinearity.
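For illustration, a generic ridge-penalized logistic regression fit under near-collinearity (a sketch using scikit-learn's L2-penalized logistic regression; this is the standard ridge penalty rather than the specific estimator studied in the article, and the simulated data and penalty strength are assumptions):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.05, size=n)            # nearly collinear with x1
X = np.column_stack([x1, x2])
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-(0.5 + x1))))

ml_fit = LogisticRegression(C=1e6, max_iter=5000).fit(X, y)       # essentially unpenalized ML
ridge_fit = LogisticRegression(C=0.5, max_iter=5000).fit(X, y)    # ridge (L2) penalty shrinks coefficients
print("ML coefficients:   ", ml_fit.coef_)
print("Ridge coefficients:", ridge_fit.coef_)
```

With collinear predictors the unpenalized coefficients are typically large and unstable, while the ridge fit shrinks them toward zero and reduces their standard errors, which is the behavior the article examines.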
NASA Astrophysics Data System (ADS)
Fenicia, Fabrizio; Reichert, Peter; Kavetski, Dmitri; Albert, Carlo
2016-04-01
The calibration of hydrological models based on signatures (e.g. Flow Duration Curves - FDCs) is often advocated as an alternative to model calibration based on the full time series of system responses (e.g. hydrographs). Signature based calibration is motivated by various arguments. From a conceptual perspective, calibration on signatures is a way to filter out errors that are difficult to represent when calibrating on the full time series. Such errors may for example occur when observed and simulated hydrographs are shifted, either on the "time" axis (i.e. left or right), or on the "streamflow" axis (i.e. above or below). These shifts may be due to errors in the precipitation input (time or amount), and if not properly accounted for in the likelihood function, may cause biased parameter estimates (e.g. estimated model parameters that do not reproduce the recession characteristics of a hydrograph). From a practical perspective, signature based calibration is seen as a possible solution for making predictions in ungauged basins. Where streamflow data are not available, it may in fact be possible to reliably estimate streamflow signatures. Previous research has for example shown how FDCs can be reliably estimated at ungauged locations based on climatic and physiographic influence factors. Typically, the goal of signature based calibration is not the prediction of the signatures themselves, but the prediction of the system responses. Ideally, the prediction of system responses should be accompanied by a reliable quantification of the associated uncertainties. Previous approaches for signature based calibration, however, do not allow reliable estimates of streamflow predictive distributions. Here, we illustrate how the Bayesian approach can be employed to obtain reliable streamflow predictive distributions based on signatures. A case study is presented, where a hydrological model is calibrated on FDCs and additional signatures. We propose an approach where the likelihood function for the signatures is derived from the likelihood for streamflow (rather than using an "ad-hoc" likelihood for the signatures as done in previous approaches). This likelihood is not easily tractable analytically and we therefore cannot apply "simple" MCMC methods. This numerical problem is solved using Approximate Bayesian Computation (ABC). Our results indicate that the proposed approach is suitable for producing reliable streamflow predictive distributions based on calibration to signature data. Moreover, our results provide indications on which signatures are more appropriate to represent the information content of the hydrograph.
NASA Technical Reports Server (NTRS)
Pierson, W. J., Jr.
1984-01-01
Backscatter measurements at upwind and crosswind are simulated for five incidence angles by means of the SASS-1 model function. The effects of communication noise and attitude errors are simulated by Monte Carlo methods, and the winds are recovered by both the Sum of Squares (SOS) algorithm and a Maximum Likelihood Estimator (MLE). The SOS algorithm is shown to fail for light enough winds at all incidence angles and to fail to show areas of calm, because backscatter estimates that were negative or that produced incorrect values of K_p greater than one were discarded. The MLE performs well for all input backscatter estimates and returns calm when both are negative. The use of the SOS algorithm is shown to have introduced errors in the SASS-1 model function that, in part, cancel out the errors that result from using it, but that also cause disagreement with other data sources such as the AAFE circle flight data at light winds. Implications for future scatterometer systems are given.
Fast and accurate estimation of the covariance between pairwise maximum likelihood distances.
Gil, Manuel
2014-01-01
Pairwise evolutionary distances are a model-based summary statistic for a set of molecular sequences. They represent the leaf-to-leaf path lengths of the underlying phylogenetic tree. Estimates of pairwise distances with overlapping paths covary because of shared mutation events. It is desirable to take this covariance structure into account to increase precision in any process that compares or combines distances. This paper introduces a fast estimator for the covariance of two pairwise maximum likelihood distances, estimated under general Markov models. The estimator is based on a conjecture (going back to Nei & Jin, 1989) which links the covariance to path lengths. It is proven here under a simple symmetric substitution model. A simulation shows that the estimator outperforms previously published ones in terms of the mean squared error.
Fast and accurate estimation of the covariance between pairwise maximum likelihood distances
2014-01-01
Pairwise evolutionary distances are a model-based summary statistic for a set of molecular sequences. They represent the leaf-to-leaf path lengths of the underlying phylogenetic tree. Estimates of pairwise distances with overlapping paths covary because of shared mutation events. It is desirable to take this covariance structure into account to increase precision in any process that compares or combines distances. This paper introduces a fast estimator for the covariance of two pairwise maximum likelihood distances, estimated under general Markov models. The estimator is based on a conjecture (going back to Nei & Jin, 1989) which links the covariance to path lengths. It is proven here under a simple symmetric substitution model. A simulation shows that the estimator outperforms previously published ones in terms of the mean squared error. PMID:25279263
Motion-induced phase error estimation and correction in 3D diffusion tensor imaging.
Van, Anh T; Hernando, Diego; Sutton, Bradley P
2011-11-01
A multishot data acquisition strategy is one way to mitigate B0 distortion and T2∗ blurring for high-resolution diffusion-weighted magnetic resonance imaging experiments. However, different object motions that take place during different shots cause phase inconsistencies in the data, leading to significant image artifacts. This work proposes a maximum likelihood estimation and k-space correction of motion-induced phase errors in 3D multishot diffusion tensor imaging. The proposed error estimation is robust, unbiased, and approaches the Cramer-Rao lower bound. For rigid body motion, the proposed correction effectively removes motion-induced phase errors regardless of the k-space trajectory used and gives comparable performance to the more computationally expensive 3D iterative nonlinear phase error correction method. The method has been extended to handle multichannel data collected using phased-array coils. Simulation and in vivo data are shown to demonstrate the performance of the method.
Generalized Ordinary Differential Equation Models
Miao, Hongyu; Wu, Hulin; Xue, Hongqi
2014-01-01
Existing estimation methods for ordinary differential equation (ODE) models are not applicable to discrete data. The generalized ODE (GODE) model is therefore proposed and investigated for the first time. We develop the likelihood-based parameter estimation and inference methods for GODE models. We propose robust computing algorithms and rigorously investigate the asymptotic properties of the proposed estimator by considering both measurement errors and numerical errors in solving ODEs. The simulation study and application of our methods to an influenza viral dynamics study suggest that the proposed methods have a superior performance in terms of accuracy over the existing ODE model estimation approach and the extended smoothing-based (ESB) method. PMID:25544787
Generalized Ordinary Differential Equation Models.
Miao, Hongyu; Wu, Hulin; Xue, Hongqi
2014-10-01
Existing estimation methods for ordinary differential equation (ODE) models are not applicable to discrete data. The generalized ODE (GODE) model is therefore proposed and investigated for the first time. We develop the likelihood-based parameter estimation and inference methods for GODE models. We propose robust computing algorithms and rigorously investigate the asymptotic properties of the proposed estimator by considering both measurement errors and numerical errors in solving ODEs. The simulation study and application of our methods to an influenza viral dynamics study suggest that the proposed methods have a superior performance in terms of accuracy over the existing ODE model estimation approach and the extended smoothing-based (ESB) method.
The Role of Parametric Assumptions in Adaptive Bayesian Estimation
ERIC Educational Resources Information Center
Alcala-Quintana, Rocio; Garcia-Perez, Miguel A.
2004-01-01
Variants of adaptive Bayesian procedures for estimating the 5% point on a psychometric function were studied by simulation. Bias and standard error were the criteria to evaluate performance. The results indicated a superiority of (a) uniform priors, (b) model likelihood functions that are odd symmetric about threshold and that have parameter…
An introduction of component fusion extend Kalman filtering method
NASA Astrophysics Data System (ADS)
Geng, Yue; Lei, Xusheng
2018-05-01
In this paper, the Component Fusion Extended Kalman Filtering (CFEKF) algorithm is proposed. Each component of the propagated error is assumed to be independent and Gaussian distributed. The CFEKF is obtained through maximum likelihood estimation of the propagation error, which allows the state transition matrix and the measurement matrix to be adjusted adaptively. By minimizing the linearization error, CFEKF can effectively improve the estimation accuracy of the nonlinear system state. The computational cost of CFEKF is similar to that of the EKF, which makes it easy to apply.
Efficient Robust Regression via Two-Stage Generalized Empirical Likelihood
Bondell, Howard D.; Stefanski, Leonard A.
2013-01-01
Large- and finite-sample efficiency and resistance to outliers are the key goals of robust statistics. Although often not simultaneously attainable, we develop and study a linear regression estimator that comes close. Efficiency obtains from the estimator’s close connection to generalized empirical likelihood, and its favorable robustness properties are obtained by constraining the associated sum of (weighted) squared residuals. We prove maximum attainable finite-sample replacement breakdown point, and full asymptotic efficiency for normal errors. Simulation evidence shows that compared to existing robust regression estimators, the new estimator has relatively high efficiency for small sample sizes, and comparable outlier resistance. The estimator is further illustrated and compared to existing methods via application to a real data set with purported outliers. PMID:23976805
Khairuzzaman, Md; Zhang, Chao; Igarashi, Koji; Katoh, Kazuhiro; Kikuchi, Kazuro
2010-03-01
We describe a successful introduction of maximum-likelihood-sequence estimation (MLSE) into digital coherent receivers together with finite-impulse response (FIR) filters in order to equalize both linear and nonlinear fiber impairments. The MLSE equalizer based on the Viterbi algorithm is implemented in the offline digital signal processing (DSP) core. We transmit 20-Gbit/s quadrature phase-shift keying (QPSK) signals through a 200-km-long standard single-mode fiber. The bit-error rate performance shows that the MLSE equalizer outperforms the conventional adaptive FIR filter, especially when nonlinear impairments are predominant.
Competition between learned reward and error outcome predictions in anterior cingulate cortex.
Alexander, William H; Brown, Joshua W
2010-02-15
The anterior cingulate cortex (ACC) is implicated in performance monitoring and cognitive control. Non-human primate studies of ACC show prominent reward signals, but these are elusive in human studies, which instead show mainly conflict and error effects. Here we demonstrate distinct appetitive and aversive activity in human ACC. The error likelihood hypothesis suggests that ACC activity increases in proportion to the likelihood of an error, and ACC is also sensitive to the consequence magnitude of the predicted error. Previous work further showed that error likelihood effects reach a ceiling as the potential consequences of an error increase, possibly due to reductions in the average reward. We explored this issue by independently manipulating reward magnitude of task responses and error likelihood while controlling for potential error consequences in an Incentive Change Signal Task. The fMRI results ruled out a modulatory effect of expected reward on error likelihood effects in favor of a competition effect between expected reward and error likelihood. Dynamic causal modeling showed that error likelihood and expected reward signals are intrinsic to the ACC rather than received from elsewhere. These findings agree with interpretations of ACC activity as signaling both perceptions of risk and predicted reward. Copyright 2009 Elsevier Inc. All rights reserved.
Risk prediction and aversion by anterior cingulate cortex.
Brown, Joshua W; Braver, Todd S
2007-12-01
The recently proposed error-likelihood hypothesis suggests that anterior cingulate cortex (ACC) and surrounding areas will become active in proportion to the perceived likelihood of an error. The hypothesis was originally derived from a computational model prediction. The same computational model now makes a further prediction that ACC will be sensitive not only to predicted error likelihood, but also to the predicted magnitude of the consequences, should an error occur. The product of error likelihood and predicted error consequence magnitude collectively defines the general "expected risk" of a given behavior in a manner analogous but orthogonal to subjective expected utility theory. New fMRI results from an incentive change signal task now replicate the error-likelihood effect, validate the further predictions of the computational model, and suggest why some segments of the population may fail to show an error-likelihood effect. In particular, error-likelihood effects and expected risk effects in general indicate greater sensitivity to earlier predictors of errors and are seen in risk-averse but not risk-tolerant individuals. Taken together, the results are consistent with an expected risk model of ACC and suggest that ACC may generally contribute to cognitive control by recruiting brain activity to avoid risk.
Chan, Siew Foong; Deeks, Jonathan J; Macaskill, Petra; Irwig, Les
2008-01-01
To compare three predictive models based on logistic regression to estimate adjusted likelihood ratios allowing for interdependency between diagnostic variables (tests). This study was a review of the theoretical basis, assumptions, and limitations of published models; and a statistical extension of methods and application to a case study of the diagnosis of obstructive airways disease based on history and clinical examination. Albert's method includes an offset term to estimate an adjusted likelihood ratio for combinations of tests. The Spiegelhalter and Knill-Jones method uses the unadjusted likelihood ratio for each test as a predictor and computes shrinkage factors to allow for interdependence. Knottnerus' method differs from the other methods because it requires sequencing of tests, which limits its application to situations where there are few tests and substantial data. Although parameter estimates differed between the models, predicted "posttest" probabilities were generally similar. Construction of predictive models using logistic regression is preferred to the independence Bayes' approach when it is important to adjust for dependency of test errors. Methods to estimate adjusted likelihood ratios from predictive models should be considered in preference to a standard logistic regression model to facilitate ease of interpretation and application. Albert's method provides the most straightforward approach.
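For context, the independence Bayes' approach that these models adjust combines likelihood ratios additively on the log-odds scale (a standard relationship; the logistic regression models discussed here generalize it by re-estimating or shrinking the coefficients to allow for test dependence):

\operatorname{logit} P(D \mid T_1,\dots,T_k) = \operatorname{logit} P(D) + \sum_{j=1}^{k} \ln \mathrm{LR}(T_j),

which is exact only when the tests are conditionally independent given disease status; adjusted likelihood ratios replace the naive \ln \mathrm{LR} terms when they are not.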
Generalized site occupancy models allowing for false positive and false negative errors
Royle, J. Andrew; Link, W.A.
2006-01-01
Site occupancy models have been developed that allow for imperfect species detection or "false negative" observations. Such models have become widely adopted in surveys of many taxa. The most fundamental assumption underlying these models is that "false positive" errors are not possible. That is, one cannot detect a species where it does not occur. However, such errors are possible in many sampling situations for a number of reasons, and even low false positive error rates can induce extreme bias in estimates of site occupancy when they are not accounted for. In this paper, we develop a model for site occupancy that allows for both false negative and false positive error rates. This model can be represented as a two-component finite mixture model and can be easily fitted using freely available software. We provide an analysis of avian survey data using the proposed model and present results of a brief simulation study evaluating the performance of the maximum-likelihood estimator and the naive estimator in the presence of false positive errors.
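A minimal sketch of the two-component mixture likelihood described, fitted by maximum likelihood with constant probabilities across sites and visits (the detection histories and parameter names below are illustrative assumptions, not the avian data analyzed in the paper):

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit

# Hypothetical detection histories: rows are sites, columns are repeat visits (1 = detection).
Y = np.array([[1, 1, 0], [0, 0, 0], [1, 0, 1], [0, 1, 0], [0, 0, 1], [1, 1, 1]])

def neg_log_lik(params):
    # psi = occupancy probability, p11 = detection probability at occupied sites,
    # p10 = false-positive detection probability at unoccupied sites.
    psi, p11, p10 = expit(params)
    det, J = Y.sum(axis=1), Y.shape[1]
    lik_occ = p11 ** det * (1 - p11) ** (J - det)      # detection model if the site is occupied
    lik_unocc = p10 ** det * (1 - p10) ** (J - det)    # false positives if the site is unoccupied
    return -np.sum(np.log(psi * lik_occ + (1 - psi) * lik_unocc))

res = minimize(neg_log_lik, np.zeros(3), method="Nelder-Mead")
print("psi, p11, p10 =", expit(res.x))
```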
Kamneva, Olga K; Rosenberg, Noah A
2017-01-01
Hybridization events generate reticulate species relationships, giving rise to species networks rather than species trees. We report a comparative study of consensus, maximum parsimony, and maximum likelihood methods of species network reconstruction using gene trees simulated assuming a known species history. We evaluate the role of the divergence time between species involved in a hybridization event, the relative contributions of the hybridizing species, and the error in gene tree estimation. When gene tree discordance is mostly due to hybridization and not due to incomplete lineage sorting (ILS), most of the methods can detect even highly skewed hybridization events between highly divergent species. For recent divergences between hybridizing species, when the influence of ILS is sufficiently high, likelihood methods outperform parsimony and consensus methods, which erroneously identify extra hybridizations. The more sophisticated likelihood methods, however, are affected by gene tree errors to a greater extent than are consensus and parsimony. PMID:28469378
A Review of System Identification Methods Applied to Aircraft
NASA Technical Reports Server (NTRS)
Klein, V.
1983-01-01
Airplane identification, equation error method, maximum likelihood method, parameter estimation in frequency domain, extended Kalman filter, aircraft equations of motion, aerodynamic model equations, criteria for the selection of a parsimonious model, and online aircraft identification are addressed.
Fisher, Sir Ronald Aylmer (1890-1962)
NASA Astrophysics Data System (ADS)
Murdin, P.
2000-11-01
Statistician, born in London, England. After studying astronomy using AIRY's manual on the Theory of Errors he became interested in statistics, and laid the foundation of randomization in experimental design, the analysis of variance and the use of data in estimating the properties of the parent population from which it was drawn. Invented the maximum likelihood method for estimating from random ...
NASA Technical Reports Server (NTRS)
Tranter, W. H.; Turner, M. D.
1977-01-01
Techniques are developed to estimate power gain, delay, signal-to-noise ratio, and mean square error in digital computer simulations of lowpass and bandpass systems. The techniques are applied to analog and digital communications. The signal-to-noise ratio estimates are shown to be maximum likelihood estimates in additive white Gaussian noise. The methods are seen to be especially useful for digital communication systems where the mapping from the signal-to-noise ratio to the error probability can be obtained. Simulation results show the techniques developed to be accurate and quite versatile in evaluating the performance of many systems through digital computer simulation.
Estimation of the sea surface's two-scale backscatter parameters
NASA Technical Reports Server (NTRS)
Wentz, F. J.
1978-01-01
The relationship between the sea-surface normalized radar cross section and the friction velocity vector is determined using a parametric two-scale scattering model. The model parameters are found from a nonlinear maximum likelihood estimation. The estimation is based on aircraft scatterometer measurements and the sea-surface anemometer measurements collected during the JONSWAP '75 experiment. The estimates of the ten model parameters converge to realistic values that are in good agreement with the available oceanographic data. The rms discrepancy between the model and the cross section measurements is 0.7 db, which is the rms sum of a 0.3 db average measurement error and a 0.6 db modeling error.
Statistical methods for biodosimetry in the presence of both Berkson and classical measurement error
NASA Astrophysics Data System (ADS)
Miller, Austin
In radiation epidemiology, the true dose received by those exposed cannot be assessed directly. Physical dosimetry uses a deterministic function of the source term, distance and shielding to estimate dose. For the atomic bomb survivors, the physical dosimetry system is well established. The classical measurement errors plaguing the location and shielding inputs to the physical dosimetry system are well known. Adjusting for the associated biases requires an estimate for the classical measurement error variance, for which no data-driven estimate exists. In this case, an instrumental variable solution is the most viable option to overcome the classical measurement error indeterminacy. Biological indicators of dose may serve as instrumental variables. Specification of the biodosimeter dose-response model requires identification of the radiosensitivity variables, for which we develop statistical definitions and variables. More recently, researchers have recognized Berkson error in the dose estimates, introduced by averaging assumptions for many components in the physical dosimetry system. We show that Berkson error induces a bias in the instrumental variable estimate of the dose-response coefficient, and then address the estimation problem. This model is specified by developing an instrumental variable mixed measurement error likelihood function, which is then maximized using a Monte Carlo EM Algorithm. These methods produce dose estimates that incorporate information from both physical and biological indicators of dose, as well as the first instrumental variable based data-driven estimate for the classical measurement error variance.
Estimating residual fault hitting rates by recapture sampling
NASA Technical Reports Server (NTRS)
Lee, Larry; Gupta, Rajan
1988-01-01
For the recapture debugging design introduced by Nayak (1988) the problem of estimating the hitting rates of the faults remaining in the system is considered. In the context of a conditional likelihood, moment estimators are derived and are shown to be asymptotically normal and fully efficient. Fixed sample properties of the moment estimators are compared, through simulation, with those of the conditional maximum likelihood estimators. Properties of the conditional model are investigated such as the asymptotic distribution of linear functions of the fault hitting frequencies and a representation of the full data vector in terms of a sequence of independent random vectors. It is assumed that the residual hitting rates follow a log linear rate model and that the testing process is truncated when the gaps between the detection of new errors exceed a fixed amount of time.
Improved efficiency of maximum likelihood analysis of time series with temporally correlated errors
Langbein, John O.
2017-01-01
Most time series of geophysical phenomena have temporally correlated errors. From these measurements, various parameters are estimated. For instance, from geodetic measurements of positions, the rates and changes in rates are often estimated and are used to model tectonic processes. Along with the estimates of the size of the parameters, the error in these parameters needs to be assessed. If temporal correlations are not taken into account, or each observation is assumed to be independent, it is likely that any estimate of the error of these parameters will be too low and the estimated value of the parameter will be biased. Inclusion of better estimates of uncertainties is limited by several factors, including selection of the correct model for the background noise and the computational requirements to estimate the parameters of the selected noise model for cases where there are numerous observations. Here, I address the second problem of computational efficiency using maximum likelihood estimates (MLE). Most geophysical time series have background noise processes that can be represented as a combination of white and power-law noise, 1/f^α, with frequency f. With missing data, standard spectral techniques involving FFTs are not appropriate. Instead, time domain techniques involving construction and inversion of large data covariance matrices are employed. Bos et al. (J Geod, 2013. doi:10.1007/s00190-012-0605-0) demonstrate one technique that substantially increases the efficiency of the MLE methods, yet is only an approximate solution for power-law indices >1.0 since they require the data covariance matrix to be Toeplitz. That restriction can be removed by simply forming a data filter that adds noise processes rather than combining them in quadrature. Consequently, the inversion of the data covariance matrix is simplified yet provides robust results for a wider range of power-law indices.
Improved efficiency of maximum likelihood analysis of time series with temporally correlated errors
NASA Astrophysics Data System (ADS)
Langbein, John
2017-08-01
Most time series of geophysical phenomena have temporally correlated errors. From these measurements, various parameters are estimated. For instance, from geodetic measurements of positions, the rates and changes in rates are often estimated and are used to model tectonic processes. Along with the estimates of the size of the parameters, the error in these parameters needs to be assessed. If temporal correlations are not taken into account, or each observation is assumed to be independent, it is likely that any estimate of the error of these parameters will be too low and the estimated value of the parameter will be biased. Inclusion of better estimates of uncertainties is limited by several factors, including selection of the correct model for the background noise and the computational requirements to estimate the parameters of the selected noise model for cases where there are numerous observations. Here, I address the second problem of computational efficiency using maximum likelihood estimates (MLE). Most geophysical time series have background noise processes that can be represented as a combination of white and power-law noise, 1/f^{α } with frequency, f. With missing data, standard spectral techniques involving FFTs are not appropriate. Instead, time domain techniques involving construction and inversion of large data covariance matrices are employed. Bos et al. (J Geod, 2013. doi: 10.1007/s00190-012-0605-0) demonstrate one technique that substantially increases the efficiency of the MLE methods, yet is only an approximate solution for power-law indices >1.0 since they require the data covariance matrix to be Toeplitz. That restriction can be removed by simply forming a data filter that adds noise processes rather than combining them in quadrature. Consequently, the inversion of the data covariance matrix is simplified yet provides robust results for a wider range of power-law indices.
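As a rough illustration of the time-domain approach both records describe (building the data covariance matrix and maximizing the Gaussian likelihood), the sketch below estimates a linear rate from a series with correlated plus white noise; for simplicity it uses an AR(1) process as a stand-in for the power-law noise, so the noise model, parameterization, and data are illustrative assumptions rather than the method of the paper.

```python
import numpy as np
from scipy.optimize import minimize

def neg_log_lik(params, t, y):
    # Linear trend plus correlated (AR(1)) and white noise; Gaussian log-likelihood
    # evaluated with an explicit data covariance matrix, as in time-domain MLE.
    rate, intercept, log_sw, log_sc, z = params
    sw2, sc2, phi = np.exp(2 * log_sw), np.exp(2 * log_sc), np.tanh(z)
    r = y - (intercept + rate * t)
    lags = np.abs(np.subtract.outer(t, t))
    C = sc2 * phi ** lags + sw2 * np.eye(len(t))     # correlated + white components
    _, logdet = np.linalg.slogdet(C)
    return 0.5 * (logdet + r @ np.linalg.solve(C, r) + len(t) * np.log(2 * np.pi))

rng = np.random.default_rng(1)
t = np.arange(100.0)
y = 0.02 * t + 0.1 * rng.normal(size=100).cumsum() + rng.normal(scale=0.5, size=100)
res = minimize(neg_log_lik, x0=[0.0, 0.0, 0.0, 0.0, 0.5], args=(t, y),
               method="Nelder-Mead", options={"maxiter": 5000})
print("estimated rate:", res.x[0])
```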
Estimation in a discrete tail rate family of recapture sampling models
NASA Technical Reports Server (NTRS)
Gupta, Rajan; Lee, Larry D.
1990-01-01
In the context of recapture sampling design for debugging experiments the problem of estimating the error or hitting rate of the faults remaining in a system is considered. Moment estimators are derived for a family of models in which the rate parameters are assumed proportional to the tail probabilities of a discrete distribution on the positive integers. The estimators are shown to be asymptotically normal and fully efficient. Their fixed sample properties are compared, through simulation, with those of the conditional maximum likelihood estimators.
Dahabreh, Issa J; Trikalinos, Thomas A; Lau, Joseph; Schmid, Christopher H
2017-03-01
To compare statistical methods for meta-analysis of sensitivity and specificity of medical tests (e.g., diagnostic or screening tests). We constructed a database of PubMed-indexed meta-analyses of test performance from which 2 × 2 tables for each included study could be extracted. We reanalyzed the data using univariate and bivariate random effects models fit with inverse variance and maximum likelihood methods. Analyses were performed using both normal and binomial likelihoods to describe within-study variability. The bivariate model using the binomial likelihood was also fit using a fully Bayesian approach. We use two worked examples-thoracic computerized tomography to detect aortic injury and rapid prescreening of Papanicolaou smears to detect cytological abnormalities-to highlight that different meta-analysis approaches can produce different results. We also present results from reanalysis of 308 meta-analyses of sensitivity and specificity. Models using the normal approximation produced sensitivity and specificity estimates closer to 50% and smaller standard errors compared to models using the binomial likelihood; absolute differences of 5% or greater were observed in 12% and 5% of meta-analyses for sensitivity and specificity, respectively. Results from univariate and bivariate random effects models were similar, regardless of estimation method. Maximum likelihood and Bayesian methods produced almost identical summary estimates under the bivariate model; however, Bayesian analyses indicated greater uncertainty around those estimates. Bivariate models produced imprecise estimates of the between-study correlation of sensitivity and specificity. Differences between methods were larger with increasing proportion of studies that were small or required a continuity correction. The binomial likelihood should be used to model within-study variability. Univariate and bivariate models give similar estimates of the marginal distributions for sensitivity and specificity. Bayesian methods fully quantify uncertainty and their ability to incorporate external evidence may be useful for imprecisely estimated parameters. Copyright © 2017 Elsevier Inc. All rights reserved.
ERIC Educational Resources Information Center
Olsson, Ulf Henning; Foss, Tron; Troye, Sigurd V.; Howell, Roy D.
2000-01-01
Used simulation to demonstrate how the choice of estimation method affects indexes of fit and parameter bias for different sample sizes when nested models vary in terms of specification error and the data demonstrate different levels of kurtosis. Discusses results for maximum likelihood (ML), generalized least squares (GLS), and weighted least…
Zee, Jarcy; Xie, Sharon X.
2015-01-01
Summary When a true survival endpoint cannot be assessed for some subjects, an alternative endpoint that measures the true endpoint with error may be collected, which often occurs when obtaining the true endpoint is too invasive or costly. We develop an estimated likelihood function for the situation where we have both uncertain endpoints for all participants and true endpoints for only a subset of participants. We propose a nonparametric maximum estimated likelihood estimator of the discrete survival function of time to the true endpoint. We show that the proposed estimator is consistent and asymptotically normal. We demonstrate through extensive simulations that the proposed estimator has little bias compared to the naïve Kaplan-Meier survival function estimator, which uses only uncertain endpoints, and is more efficient under moderate missingness than the complete-case Kaplan-Meier survival function estimator, which uses only the available true endpoints. Finally, we apply the proposed method to a dataset for estimating the risk of developing Alzheimer's disease from the Alzheimer's Disease Neuroimaging Initiative. PMID:25916510
Mixed effects versus fixed effects modelling of binary data with inter-subject variability.
Murphy, Valda; Dunne, Adrian
2005-04-01
The question of whether or not a mixed effects model is required when modelling binary data with inter-subject variability and within subject correlation was reported in this journal by Yano et al. (J. Pharmacokin. Pharmacodyn. 28:389-412 [2001]). That report used simulation experiments to demonstrate that, under certain circumstances, the use of a fixed effects model produced more accurate estimates of the fixed effect parameters than those produced by a mixed effects model. The Laplace approximation to the likelihood was used when fitting the mixed effects model. This paper repeats one of those simulation experiments, with two binary observations recorded for every subject, and uses both the Laplace and the adaptive Gaussian quadrature approximations to the likelihood when fitting the mixed effects model. The results show that the estimates produced using the Laplace approximation include a small number of extreme outliers. This was not the case when using the adaptive Gaussian quadrature approximation. Further examination of these outliers shows that they arise in situations in which the Laplace approximation seriously overestimates the likelihood in an extreme region of the parameter space. It is also demonstrated that when the number of observations per subject is increased from two to three, the estimates based on the Laplace approximation no longer include any extreme outliers. The root mean squared error is a combination of the bias and the variability of the estimates. Increasing the sample size is known to reduce the variability of an estimator with a consequent reduction in its root mean squared error. The estimates based on the fixed effects model are inherently biased and this bias acts as a lower bound for the root mean squared error of these estimates. Consequently, it might be expected that for data sets with a greater number of subjects the estimates based on the mixed effects model would be more accurate than those based on the fixed effects model. This is borne out by the results of a further simulation experiment with an increased number of subjects in each set of data. The difference in the interpretation of the parameters of the fixed and mixed effects models is discussed. It is demonstrated that the mixed effects model and parameter estimates can be used to estimate the parameters of the fixed effects model but not vice versa.
Robust and efficient estimation with weighted composite quantile regression
NASA Astrophysics Data System (ADS)
Jiang, Xuejun; Li, Jingzhi; Xia, Tian; Yan, Wanfeng
2016-09-01
In this paper we introduce a weighted composite quantile regression (CQR) estimation approach and study its application in nonlinear models such as exponential models and ARCH-type models. The weighted CQR is augmented by using a data-driven weighting scheme. With the error distribution unspecified, the proposed estimators share robustness from quantile regression and achieve nearly the same efficiency as the oracle maximum likelihood estimator (MLE) for a variety of error distributions including the normal, mixed-normal, Student's t, Cauchy distributions, etc. We also suggest an algorithm for the fast implementation of the proposed methodology. Simulations are carried out to compare the performance of different estimators, and the proposed approach is used to analyze the daily S&P 500 Composite index, which verifies the effectiveness and efficiency of our theoretical results.
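As a rough illustration of the composite part of the estimator (without the data-driven weights, and in a linear rather than ARCH-type model), the sketch below minimizes a sum of check losses over several quantile levels with a common slope. The quantile grid, data-generating model, and optimizer choice are assumptions for illustration only.

```python
import numpy as np
from scipy.optimize import minimize

def check_loss(u, tau):
    # quantile (pinball) loss rho_tau(u) = u * (tau - 1{u < 0})
    return np.sum(u * (tau - (u < 0)))

def cqr_fit(X, y, taus):
    n, p = X.shape
    K = len(taus)

    def objective(theta):
        intercepts, beta = theta[:K], theta[K:]
        resid = y - X @ beta
        return sum(check_loss(resid - b_k, tau)
                   for b_k, tau in zip(intercepts, taus))

    res = minimize(objective, np.zeros(K + p), method="Powell",
                   options={"maxiter": 50000, "xtol": 1e-6, "ftol": 1e-6})
    return res.x[K:]   # common slope coefficients shared across quantiles

rng = np.random.default_rng(0)
n = 300
X = rng.normal(size=(n, 2))
beta_true = np.array([1.0, -2.0])
y = X @ beta_true + rng.standard_t(df=3, size=n)      # heavy-tailed errors
taus = (0.25, 0.50, 0.75)
print("CQR slope estimate:", cqr_fit(X, y, taus))
print("OLS slope estimate:", np.linalg.lstsq(X, y, rcond=None)[0])
```

With heavy-tailed errors the composite quantile fit is typically less variable than least squares, which is the robustness property the abstract refers to.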
Phase History Decomposition for efficient Scatterer Classification in SAR Imagery
2011-09-15
Anticipating cognitive effort: roles of perceived error-likelihood and time demands.
Dunn, Timothy L; Inzlicht, Michael; Risko, Evan F
2017-11-13
Why are some actions evaluated as effortful? In the present set of experiments we address this question by examining individuals' perception of effort when faced with a trade-off between two putative cognitive costs: how much time a task takes vs. how error-prone it is. Specifically, we were interested in whether individuals anticipate engaging in a small amount of hard work (i.e., low time requirement, but high error-likelihood) vs. a large amount of easy work (i.e., high time requirement, but low error-likelihood) as being more effortful. In between-subject designs, Experiments 1 through 3 demonstrated that individuals anticipate options that are high in perceived error-likelihood (yet less time consuming) as more effortful than options that are perceived to be more time consuming (yet low in error-likelihood). Further, when asked to evaluate which of the two tasks was (a) more effortful, (b) more error-prone, and (c) more time consuming, effort-based and error-based choices closely tracked one another, but this was not the case for time-based choices. Utilizing a within-subject design, Experiment 4 demonstrated an overall pattern of judgments similar to that of Experiments 1 through 3. However, both judgments of error-likelihood and time demand similarly predicted effort judgments. Results are discussed within the context of extant accounts of cognitive control, with considerations of how error-likelihood and time demands may independently and conjunctively factor into judgments of cognitive effort.
Zou, W; Ouyang, H
2016-02-01
We propose a multiple estimation adjustment (MEA) method to correct effect overestimation due to selection bias from a hypothesis-generating study (HGS) in pharmacogenetics. MEA uses a hierarchical Bayesian approach to model individual effect estimates from maximum likelihood estimation (MLE) in a region jointly and shrinks them toward the regional effect. Unlike many methods that model a fixed selection scheme, MEA capitalizes on local multiplicity independent of selection. We compared mean square errors (MSEs) in simulated HGSs from naive MLE, MEA and a conditional likelihood adjustment (CLA) method that models threshold selection bias. We observed that MEA effectively reduced MSE from MLE on null effects with or without selection, and had a clear advantage over CLA on extreme MLE estimates from null effects under lenient threshold selection in small samples, which are common among 'top' associations from a pharmacogenetics HGS.
Shimansky, Y P
2011-05-01
It is well known from numerous studies that perception can be significantly affected by intended action in many everyday situations, indicating that perception and related decision-making is not a simple, one-way sequence, but a complex iterative cognitive process. However, the underlying functional mechanisms are yet unclear. Based on an optimality approach, a quantitative computational model of one such mechanism has been developed in this study. It is assumed in the model that significant uncertainty about task-related parameters of the environment results in parameter estimation errors and an optimal control system should minimize the cost of such errors in terms of the optimality criterion. It is demonstrated that, if the cost of a parameter estimation error is significantly asymmetrical with respect to error direction, the tendency to minimize error cost creates a systematic deviation of the optimal parameter estimate from its maximum likelihood value. Consequently, optimization of parameter estimate and optimization of control action cannot be performed separately from each other under parameter uncertainty combined with asymmetry of estimation error cost, thus making the certainty equivalence principle non-applicable under those conditions. A hypothesis that not only the action, but also perception itself is biased by the above deviation of parameter estimate is supported by ample experimental evidence. The results provide important insights into the cognitive mechanisms of interaction between sensory perception and planning an action under realistic conditions. Implications for understanding related functional mechanisms of optimal control in the CNS are discussed.
Levin, Gregory P; Emerson, Sarah C; Emerson, Scott S
2014-09-01
Many papers have introduced adaptive clinical trial methods that allow modifications to the sample size based on interim estimates of treatment effect. There has been extensive commentary on type I error control and efficiency considerations, but little research on estimation after an adaptive hypothesis test. We evaluate the reliability and precision of different inferential procedures in the presence of an adaptive design with pre-specified rules for modifying the sampling plan. We extend group sequential orderings of the outcome space based on the stage at stopping, likelihood ratio statistic, and sample mean to the adaptive setting in order to compute median-unbiased point estimates, exact confidence intervals, and P-values uniformly distributed under the null hypothesis. The likelihood ratio ordering is found to produce shorter confidence intervals on average and higher probabilities of P-values below important thresholds than alternative approaches. The bias adjusted mean demonstrates the lowest mean squared error among candidate point estimates. A conditional error-based approach in the literature has the benefit of being the only method that accommodates unplanned adaptations. We compare the performance of this and other methods in order to quantify the cost of failing to plan ahead in settings where adaptations could realistically be pre-specified at the design stage. We find the cost to be meaningful for all designs and treatment effects considered, and to be substantial for designs frequently proposed in the literature. © 2014, The International Biometric Society.
Comparing interval estimates for small sample ordinal CFA models
Natesan, Prathiba
2015-01-01
Robust maximum likelihood (RML) and asymptotically generalized least squares (AGLS) methods have been recommended for fitting ordinal structural equation models. Studies show that some of these methods underestimate standard errors. However, these studies have not investigated the coverage and bias of interval estimates. An estimate with a reasonable standard error could still be severely biased. This can only be known by systematically investigating the interval estimates. The present study compares Bayesian, RML, and AGLS interval estimates of factor correlations in ordinal confirmatory factor analysis models (CFA) for small sample data. Six sample sizes, 3 factor correlations, and 2 factor score distributions (multivariate normal and multivariate mildly skewed) were studied. Two Bayesian prior specifications, informative and relatively less informative, were studied. Undercoverage of confidence intervals and underestimation of standard errors was common in non-Bayesian methods. Underestimated standard errors may lead to inflated Type-I error rates. Non-Bayesian intervals were more positively biased than negatively biased, that is, most intervals that did not contain the true value were greater than the true value. Some non-Bayesian methods had non-converging and inadmissible solutions for small samples and non-normal data. Bayesian empirical standard error estimates for informative and relatively less informative priors were closer to the average standard errors of the estimates. The coverage of Bayesian credibility intervals was closer to what was expected with overcoverage in a few cases. Although some Bayesian credibility intervals were wider, they reflected the nature of statistical uncertainty that comes with the data (e.g., small sample). Bayesian point estimates were also more accurate than non-Bayesian estimates. The results illustrate the importance of analyzing coverage and bias of interval estimates, and how ignoring interval estimates can be misleading. Therefore, editors and policymakers should continue to emphasize the inclusion of interval estimates in research. PMID:26579002
Data Analysis & Statistical Methods for Command File Errors
NASA Technical Reports Server (NTRS)
Meshkat, Leila; Waggoner, Bruce; Bryant, Larry
2014-01-01
This paper explains current work on modeling for managing the risk of command file errors. It is focused on analyzing actual data from a JPL spaceflight mission to build models for evaluating and predicting error rates as a function of several key variables. We constructed a rich dataset by considering the number of errors and the number of files radiated, including the number of commands and blocks in each file, as well as subjective estimates of workload and operational novelty. We have assessed these data using different curve fitting and distribution fitting techniques, such as multiple regression analysis and maximum likelihood estimation, to see how much of the variability in the error rates can be explained by these. We have also used goodness of fit testing strategies and principal component analysis to further assess our data. Finally, we constructed a model of expected error rates based on what these statistics bore out as critical drivers of the error rate. This model allows project management to evaluate the error rate against a theoretically expected rate as well as anticipate future error rates.
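One plausible way to carry out the regression and maximum-likelihood step described above is a Poisson regression of error counts with the number of files radiated as an exposure term. The sketch below uses synthetic data and invented variable names (files_radiated, workload), not the JPL dataset.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(42)
n_weeks = 104
files_radiated = rng.integers(20, 200, size=n_weeks)       # exposure per period
workload = rng.normal(0.0, 1.0, size=n_weeks)              # subjective workload score
errors = rng.poisson(np.exp(-4.0 + 0.5 * workload) * files_radiated)

X = sm.add_constant(workload)
fit = sm.GLM(errors, X, family=sm.families.Poisson(),
             exposure=files_radiated).fit()                 # maximum likelihood (IRLS)
print(fit.summary())
print("errors per file at average workload:", np.exp(fit.params[0]))
```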
Kukush, Alexander; Shklyar, Sergiy; Masiuk, Sergii; Likhtarov, Illya; Kovgan, Lina; Carroll, Raymond J; Bouville, Andre
2011-02-16
With a binary response Y, the dose-response model under consideration is logistic in flavor, with pr(Y=1 | D) = R/(1+R), R = λ0 + EAR·D, where λ0 is the baseline incidence rate and EAR is the excess absolute risk per gray. The calculated thyroid dose of a person i is expressed as D_i^mes = f_i Q_i^mes / M_i^mes. Here, Q_i^mes is the measured content of radioiodine in the thyroid gland of person i at time t_mes, M_i^mes is the estimate of the thyroid mass, and f_i is the normalizing multiplier. The Q_i and M_i are measured with multiplicative errors V_i^Q and V_i^M, so that Q_i^mes = Q_i^tr · V_i^Q (a classical measurement error model) and M_i^tr = M_i^mes · V_i^M (a Berkson measurement error model). Here, Q_i^tr is the true content of radioactivity in the thyroid gland, and M_i^tr is the true value of the thyroid mass. The error in f_i is much smaller than the errors in (Q_i^mes, M_i^mes) and is ignored in the analysis. By means of parametric full maximum likelihood and regression calibration (under the assumption that the data set of true doses has a lognormal distribution), nonparametric full maximum likelihood, nonparametric regression calibration, and a properly tuned SIMEX method, we study the influence of measurement errors in thyroid dose on the estimates of λ0 and EAR. A simulation study is presented based on a real sample from the epidemiological studies. The doses were reconstructed in the framework of the Ukrainian-American project on the investigation of post-Chernobyl thyroid cancers in Ukraine, and the underlying subpopulation was artificially enlarged in order to increase the statistical power. The true risk parameters were given the values from earlier epidemiological studies, and the binary response was then simulated according to the dose-response model.
Baxter, E. J.; Keisler, R.; Dodelson, S.; ...
2015-06-22
Clusters of galaxies are expected to gravitationally lens the cosmic microwave background (CMB) and thereby generate a distinct signal in the CMB on arcminute scales. Measurements of this effect can be used to constrain the masses of galaxy clusters with CMB data alone. Here we present a measurement of lensing of the CMB by galaxy clusters using data from the South Pole Telescope (SPT). We also develop a maximum likelihood approach to extract the CMB cluster lensing signal and validate the method on mock data. We quantify the effects on our analysis of several potential sources of systematic error and find that they generally act to reduce the best-fit cluster mass. It is estimated that this bias to lower cluster mass is roughly 0.85σ in units of the statistical error bar, although this estimate should be viewed as an upper limit. Furthermore, we apply our maximum likelihood technique to 513 clusters selected via their Sunyaev–Zeldovich (SZ) signatures in SPT data, and rule out the null hypothesis of no lensing at 3.1σ. The lensing-derived mass estimate for the full cluster sample is consistent with that inferred from the SZ flux: M_200,lens = 0.83 (+0.38/-0.37) M_200,SZ (68% C.L., statistical error only).
Towards a novel look on low-frequency climate reconstructions
NASA Astrophysics Data System (ADS)
Kamenik, Christian; Goslar, Tomasz; Hicks, Sheila; Barnekow, Lena; Huusko, Antti
2010-05-01
Information on low-frequency (millennial to sub-centennial) climate change is often derived from sedimentary archives, such as peat profiles or lake sediments. Usually, these archives have non-annual and varying time resolution. Their dating is mainly based on radionuclides, which provide probabilistic age-depth relationships with complex error structures. Dating uncertainties impede the interpretation of sediment-based climate reconstructions. They complicate the calculation of time-dependent rates. In most cases, they make any calibration in time impossible. Sediment-based climate proxies are therefore often presented as a single, best-guess time series without proper calibration and error estimation. Errors along time and dating errors that propagate into the calculation of time-dependent rates are neglected. Our objective is to overcome the aforementioned limitations by using a 'swarm' or 'ensemble' of reconstructions instead of a single best-guess. The novelty of our approach is to take into account age-depth uncertainties by permuting through a large number of potential age-depth relationships of the archive of interest. For each individual permutation we can then calculate rates, calibrate proxies in time, and reconstruct the climate-state variable of interest. From the resulting swarm of reconstructions, we can derive realistic estimates of even complex error structures. The likelihood of reconstructions is visualized by a grid of two-dimensional kernels that take into account probabilities along time and the climate-state variable of interest simultaneously. For comparison and regional synthesis, likelihoods can be scored against other independent climate time series.
Bayes Error Rate Estimation Using Classifier Ensembles
NASA Technical Reports Server (NTRS)
Tumer, Kagan; Ghosh, Joydeep
2003-01-01
The Bayes error rate gives a statistical lower bound on the error achievable for a given classification problem and the associated choice of features. By reliably estimating this rate, one can assess the usefulness of the feature set that is being used for classification. Moreover, by comparing the accuracy achieved by a given classifier with the Bayes rate, one can quantify how effective that classifier is. Classical approaches for estimating or finding bounds for the Bayes error, in general, yield rather weak results for small sample sizes, unless the problem has some simple characteristics, such as Gaussian class-conditional likelihoods. This article shows how the outputs of a classifier ensemble can be used to provide reliable and easily obtainable estimates of the Bayes error with negligible extra computation. Three methods of varying sophistication are described. First, we present a framework that estimates the Bayes error when multiple classifiers, each providing an estimate of the a posteriori class probabilities, are combined through averaging. Second, we bolster this approach by adding an information theoretic measure of output correlation to the estimate. Finally, we discuss a more general method that just looks at the class labels indicated by ensemble members and provides error estimates based on the disagreements among classifiers. The methods are illustrated for artificial data, a difficult four-class problem involving underwater acoustic data, and two benchmark problems. For data sets with known Bayes error, the combiner-based methods introduced in this article outperform existing methods. The estimates obtained by the proposed methods also seem quite reliable for the real-life data sets for which the true Bayes rates are unknown.
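The first, averaging-based idea lends itself to a compact illustration. The sketch below (with an assumed two-Gaussian problem whose true Bayes error is known, and an arbitrary choice of ensemble members) averages the posterior estimates of several classifiers and uses the plug-in rule, the mean of 1 minus the top averaged posterior, as the Bayes error estimate.

```python
import numpy as np
from scipy.stats import norm
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(1)

def sample(n):
    # two equal-prior classes, unit-variance Gaussians with means 0 and 2
    y = rng.integers(0, 2, size=n)
    x = rng.normal(loc=2.0 * y, scale=1.0)[:, None]
    return x, y

X_train, y_train = sample(4000)
X_test, _ = sample(4000)
bayes_true = norm.cdf(-1.0)        # analytic Bayes error for this problem

ensemble = [LogisticRegression(),
            RandomForestClassifier(n_estimators=200, random_state=0),
            KNeighborsClassifier(n_neighbors=25)]
avg_posterior = np.mean([clf.fit(X_train, y_train).predict_proba(X_test)
                         for clf in ensemble], axis=0)

# plug-in estimate: expected probability that the top class of the averaged
# posterior is wrong
bayes_estimate = np.mean(1.0 - avg_posterior.max(axis=1))
print(f"true Bayes error ~ {bayes_true:.3f}, ensemble estimate ~ {bayes_estimate:.3f}")
```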
Multiple Cognitive Control Effects of Error Likelihood and Conflict
Brown, Joshua W.
2010-01-01
Recent work on cognitive control has suggested a variety of performance monitoring functions of the anterior cingulate cortex, such as errors, conflict, error likelihood, and others. Given the variety of monitoring effects, a corresponding variety of control effects on behavior might be expected. This paper explores whether conflict and error likelihood produce distinct cognitive control effects on behavior, as measured by response time. A change signal task (Brown & Braver, 2005) was modified to include conditions of likely errors due to tardy as well as premature responses, in conditions with and without conflict. The results discriminate between competing hypotheses of independent vs. interacting conflict and error likelihood control effects. Specifically, the results suggest that the likelihood of premature vs. tardy response errors can lead to multiple distinct control effects, which are independent of cognitive control effects driven by response conflict. As a whole, the results point to the existence of multiple distinct cognitive control mechanisms and challenge existing models of cognitive control that incorporate only a single control signal. PMID:19030873
Stochastic control system parameter identifiability
NASA Technical Reports Server (NTRS)
Lee, C. H.; Herget, C. J.
1975-01-01
The parameter identification problem of general discrete time, nonlinear, multiple input/multiple output dynamic systems with Gaussian white distributed measurement errors is considered. The system parameterization is assumed to be known. Concepts of local parameter identifiability and local constrained maximum likelihood parameter identifiability were established. A set of sufficient conditions for the existence of a region of parameter identifiability was derived. A computation procedure employing interval arithmetic was provided for finding the regions of parameter identifiability. If the vector of the true parameters is locally constrained maximum likelihood (CML) identifiable, then with probability one, the vector of true parameters is a unique maximal point of the maximum likelihood function in the region of parameter identifiability and the constrained maximum likelihood estimation sequence will converge to the vector of true parameters.
Kesselmeier, Miriam; Lorenzo Bermejo, Justo
2017-11-01
Logistic regression is the most common technique used for genetic case-control association studies. A disadvantage of standard maximum likelihood estimators of the genotype relative risk (GRR) is their strong dependence on outlier subjects, for example, patients diagnosed at unusually young age. Robust methods are available to constrain outlier influence, but they are scarcely used in genetic studies. This article provides a non-intimidating introduction to robust logistic regression, and investigates its benefits and limitations in genetic association studies. We applied the bounded Huber and extended the R package 'robustbase' with the re-descending Hampel functions to down-weight outlier influence. Computer simulations were carried out to assess the type I error rate, mean squared error (MSE) and statistical power according to major characteristics of the genetic study and investigated markers. Simulations were complemented with the analysis of real data. Both standard and robust estimation controlled type I error rates. Standard logistic regression showed the highest power but standard GRR estimates also showed the largest bias and MSE, in particular for associated rare and recessive variants. For illustration, a recessive variant with a true GRR=6.32 and a minor allele frequency=0.05 investigated in a 1000 case/1000 control study by standard logistic regression resulted in power=0.60 and MSE=16.5. The corresponding figures for Huber-based estimation were power=0.51 and MSE=0.53. Overall, Hampel- and Huber-based GRR estimates did not differ much. Robust logistic regression may represent a valuable alternative to standard maximum likelihood estimation when the focus lies on risk prediction rather than identification of susceptibility variants. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
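A heavily simplified sketch of the down-weighting idea follows: logistic regression is refitted with observation weights obtained from a Huber function of the Pearson residuals, so outlying subjects receive less influence. This is not the estimator implemented in 'robustbase' or in the paper; the tuning constant, contamination scheme, and data are assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def huber_weight(r, c=1.345):
    a = np.abs(r)
    return np.where(a <= c, 1.0, c / a)

def downweighted_logistic(X, y, n_iter=10, c=1.345):
    # iteratively refit with Huber weights computed from Pearson residuals
    w = np.ones(len(y))
    clf = LogisticRegression(C=1e6)          # large C: essentially unpenalized ML
    for _ in range(n_iter):
        clf.fit(X, y, sample_weight=w)
        p = clf.predict_proba(X)[:, 1]
        pearson = (y - p) / np.sqrt(p * (1 - p))
        w = huber_weight(pearson, c)
    return clf

rng = np.random.default_rng(0)
n = 500
X = rng.normal(size=(n, 1))
p = 1.0 / (1.0 + np.exp(-(0.5 + 1.5 * X[:, 0])))
y = rng.binomial(1, p)
outliers = np.argsort(X[:, 0])[-15:]         # high-leverage subjects ...
y[outliers] = 0                              # ... with implausible outcomes
print("standard ML fit   :", LogisticRegression(C=1e6).fit(X, y).coef_)
print("down-weighted fit :", downweighted_logistic(X, y).coef_)
```

The contaminated points pull the standard fit toward zero, while the down-weighted fit stays closer to the generating coefficients, which mirrors the bias and MSE contrast reported above.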
Multiple indicators, multiple causes measurement error models
Tekwe, Carmen D.; Carter, Randy L.; Cullings, Harry M.; ...
2014-06-25
Multiple indicators, multiple causes (MIMIC) models are often employed by researchers studying the effects of an unobservable latent variable on a set of outcomes, when causes of the latent variable are observed. There are times, however, when the causes of the latent variable are not observed because measurements of the causal variable are contaminated by measurement error. The objectives of this study are as follows: (i) to develop a novel model by extending the classical linear MIMIC model to allow both Berkson and classical measurement errors, defining the MIMIC measurement error (MIMIC ME) model; (ii) to develop likelihood-based estimation methods for the MIMIC ME model; and (iii) to apply the newly defined MIMIC ME model to atomic bomb survivor data to study the impact of dyslipidemia and radiation dose on the physical manifestations of dyslipidemia. Finally, as a by-product of our work, we also obtain a data-driven estimate of the variance of the classical measurement error associated with an estimate of the amount of radiation dose received by atomic bomb survivors at the time of their exposure.
Multiple Indicators, Multiple Causes Measurement Error Models
Tekwe, Carmen D.; Carter, Randy L.; Cullings, Harry M.; Carroll, Raymond J.
2014-01-01
Multiple Indicators, Multiple Causes Models (MIMIC) are often employed by researchers studying the effects of an unobservable latent variable on a set of outcomes, when causes of the latent variable are observed. There are times however when the causes of the latent variable are not observed because measurements of the causal variable are contaminated by measurement error. The objectives of this paper are: (1) to develop a novel model by extending the classical linear MIMIC model to allow both Berkson and classical measurement errors, defining the MIMIC measurement error (MIMIC ME) model, (2) to develop likelihood based estimation methods for the MIMIC ME model, (3) to apply the newly defined MIMIC ME model to atomic bomb survivor data to study the impact of dyslipidemia and radiation dose on the physical manifestations of dyslipidemia. As a by-product of our work, we also obtain a data-driven estimate of the variance of the classical measurement error associated with an estimate of the amount of radiation dose received by atomic bomb survivors at the time of their exposure. PMID:24962535
A Statistical Approach to Passive Target Tracking.
1981-04-01
a fixed heading of 90 degrees (F. A. Graybill, An Introduction to Linear Statistical Models, Vol. 1, New York: John Wiley & Sons, 1961). … likelihood estimators. The adjustment for a changing error variance is easy using the linear model approach; i.e., use weighted …
Davidov, Ori; Rosen, Sophia
2011-04-01
In medical studies, endpoints are often measured for each patient longitudinally. The mixed-effects model has been a useful tool for the analysis of such data. There are situations in which the parameters of the model are subject to some restrictions or constraints. For example, in hearing loss studies, we expect hearing to deteriorate with time. This means that hearing thresholds which reflect hearing acuity will, on average, increase over time. Therefore, the regression coefficients associated with the mean effect of time on hearing ability will be constrained. Such constraints should be accounted for in the analysis. We propose maximum likelihood estimation procedures, based on the expectation-conditional maximization either algorithm, to estimate the parameters of the model while accounting for the constraints on them. The proposed methods improve, in terms of mean square error, on the unconstrained estimators. In some settings, the improvement may be substantial. Hypotheses testing procedures that incorporate the constraints are developed. Specifically, likelihood ratio, Wald, and score tests are proposed and investigated. Their empirical significance levels and power are studied using simulations. It is shown that incorporating the constraints improves the mean squared error of the estimates and the power of the tests. These improvements may be substantial. The methodology is used to analyze a hearing loss study.
NASA Astrophysics Data System (ADS)
Raghunathan, Srinivasan; Patil, Sanjaykumar; Baxter, Eric J.; Bianchini, Federico; Bleem, Lindsey E.; Crawford, Thomas M.; Holder, Gilbert P.; Manzotti, Alessandro; Reichardt, Christian L.
2017-08-01
We develop a Maximum Likelihood estimator (MLE) to measure the masses of galaxy clusters through the impact of gravitational lensing on the temperature and polarization anisotropies of the cosmic microwave background (CMB). We show that, at low noise levels in temperature, this optimal estimator outperforms the standard quadratic estimator by a factor of two. For polarization, we show that the Stokes Q/U maps can be used instead of the traditional E- and B-mode maps without losing information. We test and quantify the bias in the recovered lensing mass for a comprehensive list of potential systematic errors. Using realistic simulations, we examine the cluster mass uncertainties from CMB-cluster lensing as a function of an experiment's beam size and noise level. We predict the cluster mass uncertainties will be 3 - 6% for SPT-3G, AdvACT, and Simons Array experiments with 10,000 clusters and less than 1% for the CMB-S4 experiment with a sample containing 100,000 clusters. The mass constraints from CMB polarization are very sensitive to the experimental beam size and map noise level: for a factor of three reduction in either the beam size or noise level, the lensing signal-to-noise improves by roughly a factor of two.
Dissociating response conflict and error likelihood in anterior cingulate cortex.
Yeung, Nick; Nieuwenhuis, Sander
2009-11-18
Neuroimaging studies consistently report activity in anterior cingulate cortex (ACC) in conditions of high cognitive demand, leading to the view that ACC plays a crucial role in the control of cognitive processes. According to one prominent theory, the sensitivity of ACC to task difficulty reflects its role in monitoring for the occurrence of competition, or "conflict," between responses to signal the need for increased cognitive control. However, a contrasting theory proposes that ACC is the recipient rather than source of monitoring signals, and that ACC activity observed in relation to task demand reflects the role of this region in learning about the likelihood of errors. Response conflict and error likelihood are typically confounded, making the theories difficult to distinguish empirically. The present research therefore used detailed computational simulations to derive contrasting predictions regarding ACC activity and error rate as a function of response speed. The simulations demonstrated a clear dissociation between conflict and error likelihood: fast response trials are associated with low conflict but high error likelihood, whereas slow response trials show the opposite pattern. Using the N2 component as an index of ACC activity, an EEG study demonstrated that when conflict and error likelihood are dissociated in this way, ACC activity tracks conflict and is negatively correlated with error likelihood. These findings support the conflict-monitoring theory and suggest that, in speeded decision tasks, ACC activity reflects current task demands rather than the retrospective coding of past performance.
The Theory and Practice of Estimating the Accuracy of Dynamic Flight-Determined Coefficients
NASA Technical Reports Server (NTRS)
Maine, R. E.; Iliff, K. W.
1981-01-01
Means of assessing the accuracy of maximum likelihood parameter estimates obtained from dynamic flight data are discussed. The most commonly used analytical predictors of accuracy are derived and compared from both statistical and simplified geometric standpoints. The accuracy predictions are evaluated with real and simulated data, with an emphasis on practical considerations, such as modeling error. Improved computations of the Cramer-Rao bound to correct large discrepancies due to colored noise and modeling error are presented. The corrected Cramer-Rao bound is shown to be the best available analytical predictor of accuracy, and several practical examples of the use of the Cramer-Rao bound are given. Engineering judgement, aided by such analytical tools, is the final arbiter of accuracy estimation.
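A generic, non-flight illustration of the role the Cramer-Rao bound plays as an accuracy predictor (ignoring the colored-noise corrections discussed above): for i.i.d. exponential data the bound on the rate estimate is theta^2/n, which can be compared with the Monte Carlo variance of the maximum likelihood estimate. All numbers below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(7)
theta_true, n, n_rep = 2.0, 200, 5000        # true rate, sample size, replicates

# MLE of an exponential rate parameter is 1 / sample mean
mles = np.array([1.0 / rng.exponential(scale=1.0 / theta_true, size=n).mean()
                 for _ in range(n_rep)])

crb = theta_true**2 / n                      # inverse Fisher information
print("Cramer-Rao bound      :", crb)
print("empirical MLE variance:", mles.var())
```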
On-line estimation of error covariance parameters for atmospheric data assimilation
NASA Technical Reports Server (NTRS)
Dee, Dick P.
1995-01-01
A simple scheme is presented for on-line estimation of covariance parameters in statistical data assimilation systems. The scheme is based on a maximum-likelihood approach in which estimates are produced on the basis of a single batch of simultaneous observations. Single-sample covariance estimation is reasonable as long as the number of available observations exceeds the number of tunable parameters by two or three orders of magnitude. Not much is known at present about model error associated with actual forecast systems. Our scheme can be used to estimate some important statistical model error parameters such as regionally averaged variances or characteristic correlation length scales. The advantage of the single-sample approach is that it does not rely on any assumptions about the temporal behavior of the covariance parameters: time-dependent parameter estimates can be continuously adjusted on the basis of current observations. This is of practical importance since it is likely to be the case that both model error and observation error strongly depend on the actual state of the atmosphere. The single-sample estimation scheme can be incorporated into any four-dimensional statistical data assimilation system that involves explicit calculation of forecast error covariances, including optimal interpolation (OI) and the simplified Kalman filter (SKF). The computational cost of the scheme is high but not prohibitive; on-line estimation of one or two covariance parameters in each analysis box of an operational boxed-OI system is currently feasible. A number of numerical experiments performed with an adaptive SKF and an adaptive version of OI, using a linear two-dimensional shallow-water model and artificially generated model error, are described. The performance of the nonadaptive versions of these methods turns out to depend rather strongly on correct specification of model error parameters. These parameters are estimated under a variety of conditions, including uniformly distributed model error and time-dependent model error statistics.
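A toy version of the single-batch maximum-likelihood idea is sketched below: innovations (observation minus forecast) are modelled as independent N(0, sigma_f^2 + sigma_o^2), the forecast-error variance is assumed known, and the observation-error variance is tuned by maximizing the likelihood of one batch of innovations. The numerical values are illustrative assumptions, and the sketch omits the spatial covariance structure of a real assimilation system.

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import norm

rng = np.random.default_rng(3)
sigma_f, sigma_o_true, n_obs = 1.0, 0.7, 500
innovations = rng.normal(0.0, np.sqrt(sigma_f**2 + sigma_o_true**2), size=n_obs)

def neg_loglik(sigma_o):
    # innovations ~ N(0, sigma_f^2 + sigma_o^2), with sigma_f assumed known
    return -np.sum(norm.logpdf(innovations,
                               scale=np.sqrt(sigma_f**2 + sigma_o**2)))

fit = minimize_scalar(neg_loglik, bounds=(1e-3, 5.0), method="bounded")
print("ML estimate of observation-error std dev:", fit.x)
```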
Foot placement during error and pedal applications in naturalistic driving.
Wu, Yuqing; Boyle, Linda Ng; McGehee, Daniel; Roe, Cheryl A; Ebe, Kazutoshi; Foley, James
2017-02-01
Data from a naturalistic driving study was used to examine foot placement during routine foot pedal movements and possible pedal misapplications. The study included four weeks of observations from 30 drivers, where pedal responses were recorded and categorized. The foot movements associated with pedal misapplications and errors were the focus of the analyses. A random forest algorithm was used to predict the pedal application types based on the video observations, foot placements, drivers' characteristics, drivers' cognitive function levels and anthropometric measurements. A repeated multinomial logit model was then used to estimate the likelihood of the foot placement given various driver characteristics and driving scenarios. The findings showed that prior foot location, the drivers' seat position, and the drive sequence were all associated with incorrect foot placement during an event. The study showed that there is a potential to develop a driver assistance system that can reduce the likelihood of a pedal error. Copyright © 2016 Elsevier Ltd. All rights reserved.
NASA Technical Reports Server (NTRS)
Klein, V.
1980-01-01
A frequency domain maximum likelihood method is developed for the estimation of airplane stability and control parameters from measured data. The model of an airplane is represented by a discrete-type steady state Kalman filter with time variables replaced by their Fourier series expansions. The likelihood function of innovations is formulated, and by its maximization with respect to unknown parameters the estimation algorithm is obtained. This algorithm is then simplified to the output error estimation method with the data in the form of transformed time histories, frequency response curves, or spectral and cross-spectral densities. The development is followed by a discussion on the equivalence of the cost function in the time and frequency domains, and on advantages and disadvantages of the frequency domain approach. The algorithm developed is applied in four examples to the estimation of longitudinal parameters of a general aviation airplane using computer generated and measured data in turbulent and still air. The cost functions in the time and frequency domains are shown to be equivalent; therefore, both approaches are complementary and not contradictory. Despite some computational advantages of parameter estimation in the frequency domain, this approach is limited to linear equations of motion with constant coefficients.
Transmuted of Rayleigh Distribution with Estimation and Application on Noise Signal
NASA Astrophysics Data System (ADS)
Ahmed, Suhad; Qasim, Zainab
2018-05-01
This paper deals with transforming the one-parameter Rayleigh distribution into a transmuted probability distribution by introducing a new parameter (λ), since the resulting distribution is useful for representing signal data and failure data. The transmuted parameter satisfies |λ| ≤ 1 and is estimated, together with the original parameter (θ), by the methods of moments and maximum likelihood using different sample sizes (n = 25, 50, 75, 100); the estimation results are compared by a statistical measure (mean square error, MSE).
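For the ordinary (untransmuted) Rayleigh scale parameter the two estimation methods compared above have closed forms, which makes the MSE comparison easy to sketch; the transmuted parameter λ is omitted here for brevity, and the replication count is an assumption.

```python
import numpy as np

rng = np.random.default_rng(2018)
theta_true, n_rep = 1.5, 20000

for n in (25, 50, 75, 100):
    x = rng.rayleigh(scale=theta_true, size=(n_rep, n))
    mle = np.sqrt(np.sum(x**2, axis=1) / (2 * n))     # maximum likelihood estimator
    mom = x.mean(axis=1) * np.sqrt(2.0 / np.pi)       # method-of-moments estimator
    print(n, "MSE(ML) =", np.mean((mle - theta_true)**2),
             "MSE(MoM) =", np.mean((mom - theta_true)**2))
```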
Robust geostatistical analysis of spatial data
NASA Astrophysics Data System (ADS)
Papritz, Andreas; Künsch, Hans Rudolf; Schwierz, Cornelia; Stahel, Werner A.
2013-04-01
Most of the geostatistical software tools rely on non-robust algorithms. This is unfortunate, because outlying observations are rather the rule than the exception, in particular in environmental data sets. Outliers affect the modelling of the large-scale spatial trend, the estimation of the spatial dependence of the residual variation and the predictions by kriging. Identifying outliers manually is cumbersome and requires expertise because one needs parameter estimates to decide which observation is a potential outlier. Moreover, inference after the rejection of some observations is problematic. A better approach is to use robust algorithms that automatically prevent outlying observations from having undue influence. Former studies on robust geostatistics focused on robust estimation of the sample variogram and ordinary kriging without external drift. Furthermore, Richardson and Welsh (1995) proposed a robustified version of (restricted) maximum likelihood ([RE]ML) estimation for the variance components of a linear mixed model, which was later used by Marchant and Lark (2007) for robust REML estimation of the variogram. We propose here a novel method for robust REML estimation of the variogram of a Gaussian random field that is possibly contaminated by independent errors from a long-tailed distribution. It is based on robustification of estimating equations for the Gaussian REML estimation (Welsh and Richardson, 1997). Besides robust estimates of the parameters of the external drift and of the variogram, the method also provides standard errors for the estimated parameters, robustified kriging predictions at both sampled and non-sampled locations and kriging variances. Apart from presenting our modelling framework, we shall present selected simulation results by which we explored the properties of the new method. This will be complemented by an analysis of a data set on heavy metal contamination of the soil in the vicinity of a metal smelter. Marchant, B.P. and Lark, R.M. 2007. Robust estimation of the variogram by residual maximum likelihood. Geoderma 140: 62-72. Richardson, A.M. and Welsh, A.H. 1995. Robust restricted maximum likelihood in mixed linear models. Biometrics 51: 1429-1439. Welsh, A.H. and Richardson, A.M. 1997. Approaches to the robust estimation of mixed models. In: Handbook of Statistics Vol. 15, Elsevier, pp. 343-384.
2013-01-01
are calculated from coherently detected fields, e.g., coherent Doppler lidar. Our CRB results reveal that the best-case mean-square error scales as 1…
A Bayesian estimation of the helioseismic solar age
NASA Astrophysics Data System (ADS)
Bonanno, A.; Fröhlich, H.-E.
2015-08-01
Context. The helioseismic determination of the solar age has been a subject of several studies because it provides us with an independent estimation of the age of the solar system. Aims: We present the Bayesian estimates of the helioseismic age of the Sun, which are determined by means of calibrated solar models that employ different equations of state and nuclear reaction rates. Methods: We use 17 frequency separation ratios r02(n) = (ν_{n,l=0} − ν_{n−1,l=2})/(ν_{n,l=1} − ν_{n−1,l=1}) from 8640 days of low-ℓ BiSON frequencies and consider three likelihood functions that depend on the handling of the errors of these r02(n) ratios. Moreover, we employ the 2010 CODATA recommended values for Newton's constant, solar mass, and radius to calibrate a large grid of solar models spanning a conceivable range of solar ages. Results: It is shown that the most constrained posterior distribution of the solar age for models employing Irwin EOS with NACRE reaction rates leads to t⊙ = 4.587 ± 0.007 Gyr, while models employing the Irwin EOS and Adelberger et al. (2011, Rev. Mod. Phys., 83, 195) reaction rate have t⊙ = 4.569 ± 0.006 Gyr. Implementing OPAL EOS in the solar models results in reduced evidence ratios (Bayes factors) and leads to an age that is not consistent with the meteoritic dating of the solar system. Conclusions: An estimate of the solar age that relies on a helioseismic age indicator such as r02(n) turns out to be essentially independent of the type of likelihood function. However, with respect to model selection, abandoning any information concerning the errors of the r02(n) ratios leads to inconclusive results, and this stresses the importance of evaluating the trustworthiness of error estimates.
Change-in-ratio estimators for populations with more than two subclasses
Udevitz, Mark S.; Pollock, Kenneth H.
1991-01-01
Change-in-ratio methods have been developed to estimate the size of populations with two or three population subclasses. Most of these methods require the often unreasonable assumption of equal sampling probabilities for individuals in all subclasses. This paper presents new models based on the weaker assumption that ratios of sampling probabilities are constant over time for populations with three or more subclasses. Estimation under these models requires that a value be assumed for one of these ratios when there are two samples. Explicit expressions are given for the maximum likelihood estimators under models for two samples with three or more subclasses and for three samples with two subclasses. A numerical method using readily available statistical software is described for obtaining the estimators and their standard errors under all of the models. Likelihood ratio tests that can be used in model selection are discussed. Emphasis is on the two-sample, three-subclass models for which Monte-Carlo simulation results and an illustrative example are presented.
A Global Carbon Assimilation System using a modified EnKF assimilation method
NASA Astrophysics Data System (ADS)
Zhang, S.; Zheng, X.; Chen, Z.; Dan, B.; Chen, J. M.; Yi, X.; Wang, L.; Wu, G.
2014-10-01
A Global Carbon Assimilation System based on Ensemble Kalman filter (GCAS-EK) is developed for assimilating atmospheric CO2 abundance data into an ecosystem model to simultaneously estimate the surface carbon fluxes and atmospheric CO2 distribution. This assimilation approach is based on the ensemble Kalman filter (EnKF), but with several new developments, including using analysis states to iteratively estimate ensemble forecast errors, and a maximum likelihood estimation of the inflation factors of the forecast and observation errors. The proposed assimilation approach is tested in observing system simulation experiments and then used to estimate the terrestrial ecosystem carbon fluxes and atmospheric CO2 distributions from 2002 to 2008. The results showed that this assimilation approach can effectively reduce the biases and uncertainties of the carbon fluxes simulated by the ecosystem model.
Human Error Analysis in a Permit to Work System: A Case Study in a Chemical Plant
Jahangiri, Mehdi; Hoboubi, Naser; Rostamabadi, Akbar; Keshavarzi, Sareh; Hosseini, Ali Akbar
2015-01-01
Background A permit to work (PTW) is a formal written system to control certain types of work which are identified as potentially hazardous. However, human error in PTW processes can lead to an accident. Methods This cross-sectional, descriptive study was conducted to estimate the probability of human errors in PTW processes in a chemical plant in Iran. In the first stage, through interviewing the personnel and studying the procedure in the plant, the PTW process was analyzed using the hierarchical task analysis technique. In doing so, PTW was considered as a goal and detailed tasks to achieve the goal were analyzed. In the next step, the standardized plant analysis risk-human (SPAR-H) reliability analysis method was applied for estimation of human error probability. Results The mean probability of human error in the PTW system was estimated to be 0.11. The highest probability of human error in the PTW process was related to flammable gas testing (50.7%). Conclusion The SPAR-H method applied in this study could analyze and quantify the potential human errors and extract the required measures for reducing the error probabilities in PTW system. Some suggestions to reduce the likelihood of errors, especially in the field of modifying the performance shaping factors and dependencies among tasks are provided. PMID:27014485
MIXREG: a computer program for mixed-effects regression analysis with autocorrelated errors.
Hedeker, D; Gibbons, R D
1996-05-01
MIXREG is a program that provides estimates for a mixed-effects regression model (MRM) for normally-distributed response data including autocorrelated errors. This model can be used for analysis of unbalanced longitudinal data, where individuals may be measured at a different number of timepoints, or even at different timepoints. Autocorrelated errors of a general form or following an AR(1), MA(1), or ARMA(1,1) form are allowable. This model can also be used for analysis of clustered data, where the mixed-effects model assumes data within clusters are dependent. The degree of dependency is estimated jointly with estimates of the usual model parameters, thus adjusting for clustering. MIXREG uses maximum marginal likelihood estimation, utilizing both the EM algorithm and a Fisher-scoring solution. For the scoring solution, the covariance matrix of the random effects is expressed in its Gaussian decomposition, and the diagonal matrix reparameterized using the exponential transformation. Estimation of the individual random effects is accomplished using an empirical Bayes approach. Examples illustrating usage and features of MIXREG are provided.
Shi, Fanrong; Tuo, Xianguo; Yang, Simon X.; Li, Huailiang; Shi, Rui
2017-01-01
Wireless sensor networks (WSNs) have been widely used to collect valuable information in Structural Health Monitoring (SHM) of bridges, using various sensors, such as temperature, vibration and strain sensors. Since multiple sensors are distributed on the bridge, accurate time synchronization is very important for multi-sensor data fusion and information processing. Based on shape of the bridge, a spanning tree is employed to build linear topology WSNs and achieve time synchronization in this paper. Two-way time message exchange (TTME) and maximum likelihood estimation (MLE) are employed for clock offset estimation. Multiple TTMEs are proposed to obtain a subset of TTME observations. The time out restriction and retry mechanism are employed to avoid the estimation errors that are caused by continuous clock offset and software latencies. The simulation results show that the proposed algorithm could avoid the estimation errors caused by clock drift and minimize the estimation error due to the large random variable delay jitter. The proposed algorithm is an accurate and low complexity time synchronization algorithm for bridge health monitoring. PMID:28471418
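A toy sketch of the offset-estimation step from multiple two-way time message exchanges: each exchange yields ((T2-T1)-(T4-T3))/2, and under symmetric Gaussian delay jitter the maximum-likelihood offset is simply the mean of these per-exchange values. The delay model and constants below are assumptions, not the protocol parameters used in the paper, and the timeout/retry logic is omitted.

```python
import numpy as np

rng = np.random.default_rng(11)
offset_true, prop_delay, jitter_sd, n_ttme = 0.004, 0.001, 0.0005, 40

# forward trip: T2 = T1 + delay + offset ; return trip: T4 = T3 + delay - offset
t1 = np.arange(n_ttme, dtype=float)                          # send times, node A clock
t2 = t1 + prop_delay + rng.normal(0, jitter_sd, n_ttme) + offset_true
t3 = t2 + 2e-4                                               # node B turnaround time
t4 = t3 + prop_delay + rng.normal(0, jitter_sd, n_ttme) - offset_true

per_exchange = ((t2 - t1) - (t4 - t3)) / 2.0
print("estimated offset (s):", per_exchange.mean())
print("true offset (s)     :", offset_true)
```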
NASA Astrophysics Data System (ADS)
Elshall, A. S.; Ye, M.; Niu, G. Y.; Barron-Gafford, G.
2016-12-01
Bayesian multimodel inference is increasingly being used in hydrology. Estimating Bayesian model evidence (BME) is of central importance in many Bayesian multimodel analyses such as Bayesian model averaging and model selection. BME is the overall probability of the model in reproducing the data, accounting for the trade-off between the goodness-of-fit and the model complexity. Yet estimating BME is challenging, especially for high dimensional problems with complex sampling space. Estimating BME using the Monte Carlo numerical methods is preferred, as the methods yield higher accuracy than semi-analytical solutions (e.g. Laplace approximations, BIC, KIC, etc.). However, numerical methods are prone to numerical demons arising from underflow and round-off errors. Although few studies alluded to this issue, to our knowledge this is the first study that illustrates these numerical demons. We show that the precision arithmetic can become a threshold on likelihood values and the Metropolis acceptance ratio, which results in trimming parameter regions (when the likelihood function is less than the smallest floating point number that a computer can represent) and corruption of the empirical measures of the random states of the MCMC sampler (when using the log-likelihood function). We consider two of the most powerful numerical estimators of BME, namely the path sampling method of thermodynamic integration (TI) and the importance sampling method of steppingstone sampling (SS). We also consider the two most widely used numerical estimators, which are the prior sampling arithmetic mean (AM) and posterior sampling harmonic mean (HM). We investigate the vulnerability of these four estimators to the numerical demons. Interestingly, the most biased estimator, namely the HM, turned out to be the least vulnerable. While it is generally assumed that AM is a bias-free estimator that will always approximate the true BME by investing in computational effort, we show that arithmetic underflow can hamper AM, resulting in severe underestimation of BME. TI turned out to be the most vulnerable, resulting in BME overestimation. Finally, we show how SS can be largely invariant to rounding errors, yielding the most accurate and computationally efficient results. These research results are useful for MC simulations to estimate Bayesian model evidence.
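The underflow problem described above, and the usual log-space remedy, can be shown in a few lines: the prior-sampling arithmetic-mean estimate of BME computed naively from likelihoods underflows to zero, while the same estimate computed with log-sum-exp does not. The Gaussian toy model is an assumption for illustration only.

```python
import numpy as np
from scipy.special import logsumexp

rng = np.random.default_rng(0)
data = rng.normal(0.3, 1.0, size=2000)
theta = rng.normal(0.0, 2.0, size=10000)            # draws from the prior

# Gaussian log-likelihood of the full data set for each prior draw
loglik = np.array([np.sum(-0.5 * np.log(2.0 * np.pi) - 0.5 * (data - t)**2)
                   for t in theta])

naive_bme = np.mean(np.exp(loglik))                  # underflows to 0.0
log_bme = logsumexp(loglik) - np.log(len(theta))     # stable log-space estimate
print("naive arithmetic-mean BME:", naive_bme)
print("log(BME) via log-sum-exp :", log_bme)
```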
On the validity of cosmological Fisher matrix forecasts
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wolz, Laura; Kilbinger, Martin; Weller, Jochen
2012-09-01
We present a comparison of Fisher matrix forecasts for cosmological probes with Monte Carlo Markov Chain (MCMC) posterior likelihood estimation methods. We analyse the performance of future Dark Energy Task Force (DETF) stage-III and stage-IV dark-energy surveys using supernovae, baryon acoustic oscillations and weak lensing as probes. We concentrate in particular on the dark-energy equation of state parameters w_0 and w_a. For purely geometrical probes, and especially when marginalising over w_a, we find considerable disagreement between the two methods, since in this case the Fisher matrix cannot reproduce the highly non-elliptical shape of the likelihood function. More quantitatively, the Fisher method underestimates the marginalized errors for purely geometrical probes by 30%-70%. For cases including structure formation such as weak lensing, we find that the posterior probability contours from the Fisher matrix estimation are in good agreement with the MCMC contours, with the forecasted errors changing only at the 5% level. We then explore non-linear transformations resulting in physically-motivated parameters and investigate whether these parameterisations exhibit a Gaussian behaviour. We conclude that for the purely geometrical probes and, more generally, in cases where it is not known whether the likelihood is close to Gaussian, the Fisher matrix is not the appropriate tool to produce reliable forecasts.
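For reference, the basic Fisher-matrix recipe being tested above is tiny: the forecast marginalized 1-sigma error on parameter i is the square root of the i-th diagonal element of the inverse Fisher matrix, which implicitly assumes an elliptical (Gaussian) posterior. The matrix below is an arbitrary illustration, not a DETF survey forecast.

```python
import numpy as np

F = np.array([[40.0, -12.0],               # toy 2x2 Fisher matrix for (w0, wa)
              [-12.0,  4.0]])
cov = np.linalg.inv(F)
marginalized = np.sqrt(np.diag(cov))       # 1-sigma errors after marginalizing
conditional = 1.0 / np.sqrt(np.diag(F))    # errors with the other parameter fixed
print("marginalized sigma(w0), sigma(wa):", marginalized)
print("conditional  sigma(w0), sigma(wa):", conditional)
```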
Extending the Applicability of the Generalized Likelihood Function for Zero-Inflated Data Series
NASA Astrophysics Data System (ADS)
Oliveira, Debora Y.; Chaffe, Pedro L. B.; Sá, João. H. M.
2018-03-01
Proper uncertainty estimation for data series with a high proportion of zero and near zero observations has been a challenge in hydrologic studies. This technical note proposes a modification to the Generalized Likelihood function that accounts for zero inflation of the error distribution (ZI-GL). We compare the performance of the proposed ZI-GL with the original Generalized Likelihood function using the entire data series (GL) and by simply suppressing zero observations (GLy>0). These approaches were applied to two interception modeling examples characterized by data series with a significant number of zeros. The ZI-GL produced better uncertainty ranges than the GL as measured by the precision, reliability and volumetric bias metrics. The comparison between ZI-GL and GLy>0 highlights the need for further improvement in the treatment of residuals from near zero simulations when a linear heteroscedastic error model is considered. Aside from the interception modeling examples illustrated herein, the proposed ZI-GL may be useful for other hydrologic studies, such as for the modeling of the runoff generation in hillslopes and ephemeral catchments.
Statistical modelling of thermal annealing of fission tracks in apatite
NASA Astrophysics Data System (ADS)
Laslett, G. M.; Galbraith, R. F.
1996-12-01
We develop an improved methodology for modelling the relationship between mean track length, temperature, and time in fission track annealing experiments. We consider "fanning Arrhenius" models, in which contours of constant mean length on an Arrhenius plot are straight lines meeting at a common point. Features of our approach are explicit use of subject matter knowledge, treating mean length as the response variable, modelling of the mean-variance relationship with two components of variance, improved modelling of the control sample, and using information from experiments in which no tracks are seen. This approach overcomes several weaknesses in previous models and provides a robust six parameter model that is widely applicable. Estimation is via direct maximum likelihood which can be implemented using a standard numerical optimisation package. Because the model is highly nonlinear, some reparameterisations are needed to achieve stable estimation and calculation of precisions. Experience suggests that precisions are more convincingly estimated from profile log-likelihood functions than from the information matrix. We apply our method to the B-5 and Sr fluorapatite data of Crowley et al. (1991) and obtain well-fitting models in both cases. For the B-5 fluorapatite, our model exhibits less fanning than that of Crowley et al. (1991), although fitted mean values above 12 μm are fairly similar. However, predictions can be different, particularly for heavy annealing at geological time scales, where our model is less retentive. In addition, the refined error structure of our model results in tighter prediction errors, and has components of error that are easier to verify or modify. For the Sr fluorapatite, our fitted model for mean lengths does not differ greatly from that of Crowley et al. (1991), but our error structure is quite different.
Role-modeling and medical error disclosure: a national survey of trainees.
Martinez, William; Hickson, Gerald B; Miller, Bonnie M; Doukas, David J; Buckley, John D; Song, John; Sehgal, Niraj L; Deitz, Jennifer; Braddock, Clarence H; Lehmann, Lisa Soleymani
2014-03-01
To measure trainees' exposure to negative and positive role-modeling for responding to medical errors and to examine the association between that exposure and trainees' attitudes and behaviors regarding error disclosure. Between May 2011 and June 2012, 435 residents at two large academic medical centers and 1,187 medical students from seven U.S. medical schools received anonymous, electronic questionnaires. The questionnaire asked respondents about (1) experiences with errors, (2) training for responding to errors, (3) behaviors related to error disclosure, (4) exposure to role-modeling for responding to errors, and (5) attitudes regarding disclosure. Using multivariate regression, the authors analyzed whether frequency of exposure to negative and positive role-modeling independently predicted two primary outcomes: (1) attitudes regarding disclosure and (2) nontransparent behavior in response to a harmful error. The response rate was 55% (884/1,622). Training on how to respond to errors had the largest independent, positive effect on attitudes (standardized effect estimate, 0.32, P < .001); negative role-modeling had the largest independent, negative effect (standardized effect estimate, -0.26, P < .001). Positive role-modeling had a positive effect on attitudes (standardized effect estimate, 0.26, P < .001). Exposure to negative role-modeling was independently associated with an increased likelihood of trainees' nontransparent behavior in response to an error (OR 1.37, 95% CI 1.15-1.64; P < .001). Exposure to role-modeling predicts trainees' attitudes and behavior regarding the disclosure of harmful errors. Negative role models may be a significant impediment to disclosure among trainees.
Lu, Dan; Ye, Ming; Meyer, Philip D.; Curtis, Gary P.; Shi, Xiaoqing; Niu, Xu-Feng; Yabusaki, Steve B.
2013-01-01
When conducting model averaging for assessing groundwater conceptual model uncertainty, the averaging weights are often evaluated using model selection criteria such as AIC, AICc, BIC, and KIC (Akaike Information Criterion, Corrected Akaike Information Criterion, Bayesian Information Criterion, and Kashyap Information Criterion, respectively). However, this method often leads to an unrealistic situation in which the best model receives overwhelmingly large averaging weight (close to 100%), which cannot be justified by available data and knowledge. It was found in this study that this problem was caused by using the covariance matrix, CE, of measurement errors for estimating the negative log likelihood function common to all the model selection criteria. This problem can be resolved by using the covariance matrix, Cek, of total errors (including model errors and measurement errors) to account for the correlation between the total errors. An iterative two-stage method was developed in the context of maximum likelihood inverse modeling to iteratively infer the unknown Cek from the residuals during model calibration. The inferred Cek was then used in the evaluation of model selection criteria and model averaging weights. While this method was limited to serial data using time series techniques in this study, it can be extended to spatial data using geostatistical techniques. The method was first evaluated in a synthetic study and then applied to an experimental study, in which alternative surface complexation models were developed to simulate column experiments of uranium reactive transport. It was found that the total errors of the alternative models were temporally correlated due to the model errors. The iterative two-stage method using Cek resolved the problem that the best model receives 100% model averaging weight, and the resulting model averaging weights were supported by the calibration results and physical understanding of the alternative models. Using Cek obtained from the iterative two-stage method also improved predictive performance of the individual models and model averaging in both synthetic and experimental studies.
Likelihood-Based Random-Effect Meta-Analysis of Binary Events.
Amatya, Anup; Bhaumik, Dulal K; Normand, Sharon-Lise; Greenhouse, Joel; Kaizar, Eloise; Neelon, Brian; Gibbons, Robert D
2015-01-01
Meta-analysis has been used extensively for evaluation of efficacy and safety of medical interventions. Its advantages and utilities are well known. However, recent studies have raised questions about the accuracy of the commonly used moment-based meta-analytic methods in general and for rare binary outcomes in particular. The issue is further complicated for studies with heterogeneous effect sizes. Likelihood-based mixed-effects modeling provides an alternative to moment-based methods such as inverse-variance weighted fixed- and random-effects estimators. In this article, we compare and contrast different mixed-effect modeling strategies in the context of meta-analysis. Their performance in estimation and testing of overall effect and heterogeneity are evaluated when combining results from studies with a binary outcome. Models that allow heterogeneity in both baseline rate and treatment effect across studies have low type I and type II error rates, and their estimates are the least biased among the models considered.
Raghunathan, Srinivasan; Patil, Sanjaykumar; Baxter, Eric J.; ...
2017-08-25
We develop a Maximum Likelihood estimator (MLE) to measure the masses of galaxy clusters through the impact of gravitational lensing on the temperature and polarization anisotropies of the cosmic microwave background (CMB). We show that, at low noise levels in temperature, this optimal estimator outperforms the standard quadratic estimator by a factor of two. For polarization, we show that the Stokes Q/U maps can be used instead of the traditional E- and B-mode maps without losing information. We test and quantify the bias in the recovered lensing mass for a comprehensive list of potential systematic errors. Using realistic simulations, we examine the cluster mass uncertainties from CMB-cluster lensing as a function of an experiment's beam size and noise level. We predict the cluster mass uncertainties will be 3 - 6% for SPT-3G, AdvACT, and Simons Array experiments with 10,000 clusters and less than 1% for the CMB-S4 experiment with a sample containing 100,000 clusters. The mass constraints from CMB polarization are very sensitive to the experimental beam size and map noise level: for a factor of three reduction in either the beam size or noise level, the lensing signal-to-noise improves by roughly a factor of two.
Luque-Fernandez, Miguel Angel; Belot, Aurélien; Quaresma, Manuela; Maringe, Camille; Coleman, Michel P; Rachet, Bernard
2016-10-01
In population-based cancer research, piecewise exponential regression models are used to derive adjusted estimates of excess mortality due to cancer within the Poisson generalized linear modelling framework. However, the assumption that the conditional mean and variance of the rate parameter given the covariates x_i are equal is strong and may fail to account for overdispersion, i.e. the variability of the rate parameter when its variance exceeds its mean. Using an empirical example, we aimed to describe simple methods to test and correct for overdispersion. We used a regression-based score test for overdispersion under the relative survival framework and proposed different approaches to correct for overdispersion, including quasi-likelihood, robust standard error estimation, negative binomial regression and flexible piecewise modelling. All piecewise exponential regression models showed significant inherent overdispersion (p-value < 0.001). However, the flexible piecewise exponential model showed the smallest overdispersion parameter (3.2, versus 21.3 for the non-flexible piecewise exponential models). We showed that there were no major differences between methods. However, flexible piecewise regression modelling, with either quasi-likelihood or robust standard errors, was the best approach, as it deals with both overdispersion due to model misspecification and true (inherent) overdispersion.
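A minimal sketch of an overdispersion check and two of the corrections mentioned above (robust standard errors and a negative binomial model), using simulated counts rather than the excess-mortality data analysed in the study; the statsmodels calls assume a fixed negative binomial ancillary parameter for simplicity:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)

# Simulated overdispersed counts with one covariate; illustrative only.
n = 500
x = rng.normal(size=n)
mu = np.exp(0.5 + 0.8 * x)
y = rng.negative_binomial(n=2, p=2.0 / (2.0 + mu))   # mean mu, variance mu + mu**2 / 2

X = sm.add_constant(x)
poisson_fit = sm.GLM(y, X, family=sm.families.Poisson()).fit()

# Pearson dispersion statistic: close to 1 under equidispersion,
# clearly above 1 when the counts are overdispersed.
print("dispersion =", poisson_fit.pearson_chi2 / poisson_fit.df_resid)

# Two simple corrections: sandwich (robust) standard errors for the Poisson
# fit, or a negative binomial GLM (ancillary parameter alpha fixed here).
robust_fit = sm.GLM(y, X, family=sm.families.Poisson()).fit(cov_type="HC0")
nb_fit = sm.GLM(y, X, family=sm.families.NegativeBinomial(alpha=0.5)).fit()
print(poisson_fit.bse, robust_fit.bse, nb_fit.bse)
```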
Magis, David
2014-11-01
In item response theory, the classical estimators of ability are highly sensitive to response disturbances and can return strongly biased estimates of the true underlying ability level. Robust methods were introduced to lessen the impact of such aberrant responses on the estimation process. The computation of asymptotic (i.e., large-sample) standard errors (ASE) for these robust estimators, however, has not yet been fully considered. This paper focuses on a broad class of robust ability estimators, defined by an appropriate selection of the weight function and the residual measure, for which the ASE is derived from the theory of estimating equations. The maximum likelihood (ML) and the robust estimators, together with their estimated ASEs, are then compared in a simulation study by generating random guessing disturbances. It is concluded that both the estimators and their ASE perform similarly in the absence of random guessing, while the robust estimator and its estimated ASE are less biased and outperform their ML counterparts in the presence of random guessing with large impact on the item response process. © 2013 The British Psychological Society.
Composite Linear Models | Division of Cancer Prevention
By Stuart G. Baker The composite linear models software is a matrix approach to compute maximum likelihood estimates and asymptotic standard errors for models for incomplete multinomial data. It implements the method described in Baker SG. Composite linear models for incomplete multinomial data. Statistics in Medicine 1994;13:609-622. The software includes a library of thirty
An Investigation of the Sample Performance of Two Nonnormality Corrections for RMSEA
ERIC Educational Resources Information Center
Brosseau-Liard, Patricia E.; Savalei, Victoria; Li, Libo
2012-01-01
The root mean square error of approximation (RMSEA) is a popular fit index in structural equation modeling (SEM). Typically, RMSEA is computed using the normal theory maximum likelihood (ML) fit function. Under nonnormality, the uncorrected sample estimate of the ML RMSEA tends to be inflated. Two robust corrections to the sample ML RMSEA have…
Williams, Robert C; Elston, Robert C; Kumar, Pankaj; Knowler, William C; Abboud, Hanna E; Adler, Sharon; Bowden, Donald W; Divers, Jasmin; Freedman, Barry I; Igo, Robert P; Ipp, Eli; Iyengar, Sudha K; Kimmel, Paul L; Klag, Michael J; Kohn, Orly; Langefeld, Carl D; Leehey, David J; Nelson, Robert G; Nicholas, Susanne B; Pahl, Madeleine V; Parekh, Rulan S; Rotter, Jerome I; Schelling, Jeffrey R; Sedor, John R; Shah, Vallabh O; Smith, Michael W; Taylor, Kent D; Thameem, Farook; Thornley-Brown, Denyse; Winkler, Cheryl A; Guo, Xiuqing; Zager, Phillip; Hanson, Robert L
2016-05-04
The presence of population structure in a sample may confound the search for important genetic loci associated with disease. Our four samples in the Family Investigation of Nephropathy and Diabetes (FIND) (European Americans, Mexican Americans, African Americans, and American Indians) are part of a genome-wide association study in which population structure might be particularly important. We therefore decided to study in detail one component of this, individual genetic ancestry (IGA). From SNPs present on the Affymetrix 6.0 Human SNP array, we identified 3 sets of ancestry informative markers (AIMs), each maximized for the information in one of the three contrasts among ancestral populations: Europeans (HAPMAP, CEU), Africans (HAPMAP, YRI and LWK), and Native Americans (full heritage Pima Indians). We estimate IGA and present an algorithm for their standard errors, compare IGA to principal components, emphasize the importance of balancing information in the ancestry informative markers (AIMs), and test the association of IGA with diabetic nephropathy in the combined sample. A fixed parental allele maximum likelihood algorithm was applied to the FIND to estimate IGA in four samples: 869 American Indians; 1385 African Americans; 1451 Mexican Americans; and 826 European Americans. When the information in the AIMs is unbalanced, the estimates are incorrect with large error. Individual genetic admixture is highly correlated with principal components for capturing population structure. It takes ~700 SNPs to reduce the average standard error of individual admixture below 0.01. When the samples are combined, the resulting population structure creates associations between IGA and diabetic nephropathy. The identified set of AIMs, which include American Indian parental allele frequencies, may be particularly useful for estimating genetic admixture in populations from the Americas. Failure to balance information in maximum likelihood poly-ancestry models creates biased estimates of individual admixture with large error. This also occurs when estimating IGA using the Bayesian clustering method as implemented in the program STRUCTURE. Odds ratios for the associations of IGA with disease are consistent with what is known about the incidence and prevalence of diabetic nephropathy in these populations.
Improving z-tracking accuracy in the two-photon single-particle tracking microscope.
Liu, C; Liu, Y-L; Perillo, E P; Jiang, N; Dunn, A K; Yeh, H-C
2015-10-12
Here, we present a method that can improve the z-tracking accuracy of the recently invented TSUNAMI (Tracking of Single particles Using Nonlinear And Multiplexed Illumination) microscope. This method utilizes a maximum likelihood estimator (MLE) to determine the particle's 3D position that maximizes the likelihood of the observed time-correlated photon count distribution. Our Monte Carlo simulations show that the MLE-based tracking scheme can improve the z-tracking accuracy of the TSUNAMI microscope by 1.7-fold. In addition, MLE is also found to reduce the temporal correlation of the z-tracking error. Taking advantage of the smaller and less temporally correlated z-tracking error, we have precisely recovered the hybridization-melting kinetics of a DNA model system from thousands of short single-particle trajectories in silico. Our method can be generally applied to other 3D single-particle tracking techniques.
Heading Estimation for Pedestrian Dead Reckoning Based on Robust Adaptive Kalman Filtering.
Wu, Dongjin; Xia, Linyuan; Geng, Jijun
2018-06-19
Pedestrian dead reckoning (PDR) using smartphone-embedded micro-electro-mechanical system (MEMS) sensors plays a key role in ubiquitous localization indoors and outdoors. However, as a relative localization method, it suffers from error accumulation, which prevents long-term independent operation. Heading estimation error is one of the main sources of location error; therefore, in order to improve the location tracking performance of the PDR method in complex environments, an approach based on robust adaptive Kalman filtering (RAKF) for estimating accurate headings is proposed. In our approach, outputs from gyroscope, accelerometer, and magnetometer sensors are fused in a Kalman filtering (KF) solution in which heading measurements derived from accelerations and magnetic field data are used to correct the states integrated from angular rates. In order to identify and control measurement outliers, a maximum likelihood-type estimator (M-estimator)-based model is used. Moreover, an adaptive factor is applied to resist the negative effects of state model disturbances. Extensive experiments under static and dynamic conditions were conducted in indoor environments. The experimental results demonstrate that the proposed approach provides more accurate heading estimates and supports more robust and dynamically adaptive location tracking, compared with methods based on conventional KF.
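The sketch below implements only the baseline one-state Kalman filter fusion of gyro-integrated heading with accelerometer/magnetometer-derived heading; the M-estimator and adaptive factor of the proposed RAKF are not reproduced, and all noise settings are invented for illustration:

```python
import numpy as np

def heading_kf(gyro_rates, mag_headings, dt, q=0.01, r=0.25):
    """One-state Kalman filter whose state is the heading angle (rad).

    Prediction integrates the gyro rate; the update corrects the state with
    the accelerometer/magnetometer-derived heading. This is only the baseline
    KF; the proposed RAKF adds an M-estimator and an adaptive factor on top.
    """
    theta, P = mag_headings[0], 1.0
    estimates = []
    for omega, z in zip(gyro_rates, mag_headings):
        # Predict: integrate the angular rate and inflate the variance by q.
        theta = theta + omega * dt
        P = P + q
        # Update: wrap the innovation to [-pi, pi] and blend it in via the gain.
        innovation = np.arctan2(np.sin(z - theta), np.cos(z - theta))
        K = P / (P + r)
        theta = theta + K * innovation
        P = (1.0 - K) * P
        estimates.append(theta)
    return np.array(estimates)

# Toy trajectory: constant turn rate with noisy gyro and heading measurements.
rng = np.random.default_rng(2)
dt, n = 0.01, 2000
true = 0.5 * dt * np.arange(n)
est = heading_kf(rng.normal(0.5, 0.05, n), true + rng.normal(0.0, 0.3, n), dt)
print("mean abs heading error (deg):", np.degrees(np.abs(est - true)).mean())
```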
Critically evaluating the theory and performance of Bayesian analysis of macroevolutionary mixtures
Moore, Brian R.; Höhna, Sebastian; May, Michael R.; Rannala, Bruce; Huelsenbeck, John P.
2016-01-01
Bayesian analysis of macroevolutionary mixtures (BAMM) has recently taken the study of lineage diversification by storm. BAMM estimates the diversification-rate parameters (speciation and extinction) for every branch of a study phylogeny and infers the number and location of diversification-rate shifts across branches of a tree. Our evaluation of BAMM reveals two major theoretical errors: (i) the likelihood function (which estimates the model parameters from the data) is incorrect, and (ii) the compound Poisson process prior model (which describes the prior distribution of diversification-rate shifts across branches) is incoherent. Using simulation, we demonstrate that these theoretical issues cause statistical pathologies; posterior estimates of the number of diversification-rate shifts are strongly influenced by the assumed prior, and estimates of diversification-rate parameters are unreliable. Moreover, the inability to correctly compute the likelihood or to correctly specify the prior for rate-variable trees precludes the use of Bayesian approaches for testing hypotheses regarding the number and location of diversification-rate shifts using BAMM. PMID:27512038
Small area estimation for estimating the number of infant mortality in West Java, Indonesia
NASA Astrophysics Data System (ADS)
Anggreyani, Arie; Indahwati, Kurnia, Anang
2016-02-01
Demographic and Health Survey Indonesia (DHSI) is a nationally designed survey to provide information regarding birth rate, mortality rate, family planning and health. DHSI was conducted by BPS in cooperation with the National Population and Family Planning Institution (BKKBN), the Indonesia Ministry of Health (KEMENKES) and USAID. Based on the publication of DHSI 2012, the infant mortality rate for the five-year period before the survey was 32 per 1000 live births. In this paper, Small Area Estimation (SAE) is used to estimate the number of infant mortality cases in districts of West Java. SAE is a special case of Generalized Linear Mixed Models (GLMM). In this case, the incidence of infant mortality follows a Poisson distribution, which carries an equidispersion assumption. The methods used to handle overdispersion are the negative binomial and quasi-likelihood models. Based on the results of the analysis, the quasi-likelihood model is the best model to overcome the overdispersion problem. The small area estimation used the basic area-level model. A mean square error (MSE) based on a resampling method is used to measure the accuracy of the small area estimates.
NASA Technical Reports Server (NTRS)
Carson, John M., III; Bayard, David S.
2006-01-01
G-SAMPLE is an in-flight dynamical method for use by sample collection missions to identify the presence and quantity of collected sample material. The G-SAMPLE method implements a maximum-likelihood estimator to identify the collected sample mass, based on onboard force sensor measurements, thruster firings, and a dynamics model of the spacecraft. With G-SAMPLE, sample mass identification becomes a computation rather than an extra hardware requirement; the added cost of cameras or other sensors for sample mass detection is avoided. Realistic simulation examples are provided for a spacecraft configuration with a sample collection device mounted on the end of an extended boom. In one representative example, a 1000 gram sample mass is estimated to within 110 grams (95% confidence) under realistic assumptions of thruster profile error, spacecraft parameter uncertainty, and sensor noise. For convenience to future mission design, an overall sample-mass estimation error budget is developed to approximate the effect of model uncertainty, sensor noise, data rate, and thrust profile error on the expected estimate of collected sample mass.
Using DNA fingerprints to infer familial relationships within NHANES III households
Katki, Hormuzd A.; Sanders, Christopher L.; Graubard, Barry I.; Bergen, Andrew W.
2009-01-01
Developing, targeting, and evaluating genomic strategies for population-based disease prevention require population-based data. In response to this urgent need, genotyping has been conducted within the Third National Health and Nutrition Examination Survey (NHANES III), the nationally representative household-interview health survey in the U.S. However, before these genetic analyses can occur, family relationships within households must be accurately ascertained. Unfortunately, reported family relationships within NHANES III households based on questionnaire data are incomplete and inconclusive with regard to the actual biological relatedness of family members. We inferred family relationships within households using DNA fingerprints (Identifiler®) that contain the DNA loci used by law enforcement agencies for forensic identification of individuals. However, the performance of these loci for relationship inference is not well understood. We evaluated two competing statistical methods for relationship inference on pairs of household members: an exact likelihood ratio that relies on allele frequencies, and an Identical By State (IBS) likelihood ratio that only requires matching alleles. We modified these methods to account for genotyping errors and population substructure. The two methods usually agree on the rankings of the most likely relationships. However, the IBS method underestimates the likelihood ratio by not accounting for the informativeness of matching rare alleles. The likelihood ratio is sensitive to estimates of population substructure, and parent-child relationships are sensitive to the specified genotyping error rate. These loci were unable to distinguish second-degree relationships and cousins from being unrelated. The genetic data are also useful for verifying reported relationships and identifying data quality issues. An important by-product is the first explicitly nationally representative estimates of allele frequencies at these ubiquitous forensic loci. PMID:20664713
A Likelihood-Based Framework for Association Analysis of Allele-Specific Copy Numbers.
Hu, Y J; Lin, D Y; Sun, W; Zeng, D
2014-10-01
Copy number variants (CNVs) and single nucleotide polymorphisms (SNPs) co-exist throughout the human genome and jointly contribute to phenotypic variations. Thus, it is desirable to consider both types of variants, as characterized by allele-specific copy numbers (ASCNs), in association studies of complex human diseases. Current SNP genotyping technologies capture the CNV and SNP information simultaneously via fluorescent intensity measurements. The common practice of calling ASCNs from the intensity measurements and then using the ASCN calls in downstream association analysis has important limitations. First, the association tests are prone to false-positive findings when differential measurement errors between cases and controls arise from differences in DNA quality or handling. Second, the uncertainties in the ASCN calls are ignored. We present a general framework for the integrated analysis of CNVs and SNPs, including the analysis of total copy numbers as a special case. Our approach combines the ASCN calling and the association analysis into a single step while allowing for differential measurement errors. We construct likelihood functions that properly account for case-control sampling and measurement errors. We establish the asymptotic properties of the maximum likelihood estimators and develop EM algorithms to implement the corresponding inference procedures. The advantages of the proposed methods over the existing ones are demonstrated through realistic simulation studies and an application to a genome-wide association study of schizophrenia. Extensions to next-generation sequencing data are discussed.
2013-03-21
[Abstract fragment] Cramér-Rao bound (CRB) results for frequency estimates calculated from coherently detected fields, e.g., coherent Doppler lidar.
Efficient estimation of Pareto model: Some modified percentile estimators.
Bhatti, Sajjad Haider; Hussain, Shahzad; Ahmad, Tanvir; Aslam, Muhammad; Aftab, Muhammad; Raza, Muhammad Ali
2018-01-01
The article proposes three modified percentile estimators for parameter estimation of the Pareto distribution. The modifications are based on the median, the geometric mean, and the expectation of the empirical cumulative distribution function of the first-order statistic. The proposed modified estimators are compared with traditional percentile estimators through a Monte Carlo simulation for different parameter combinations and varying sample sizes. The performance of the estimators is assessed in terms of total mean square error and total relative deviation. It is determined that the modified percentile estimator based on the expectation of the empirical cumulative distribution function of the first-order statistic provides efficient and precise parameter estimates compared to the other estimators considered. The simulation results were further confirmed using two real-life examples in which maximum likelihood and moment estimators were also considered.
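For concreteness, the sketch below contrasts the maximum likelihood estimates with a classical two-quantile percentile estimator for the Pareto distribution on simulated data; the paper's modified percentile estimators (median, geometric mean, first-order-statistic ECDF) are not reproduced here:

```python
import numpy as np

rng = np.random.default_rng(3)

alpha_true, xm_true, n = 2.5, 1.0, 200
# Pareto draws via the inverse CDF: X = xm * (1 - U)**(-1/alpha).
x = xm_true * (1.0 - rng.uniform(size=n)) ** (-1.0 / alpha_true)

# Maximum likelihood estimates.
xm_ml = x.min()
alpha_ml = n / np.sum(np.log(x / xm_ml))

# Classical percentile estimator from two sample quantiles (p1 < p2),
# using the quantile function Q(p) = xm * (1 - p)**(-1/alpha).
p1, p2 = 0.25, 0.75
q1, q2 = np.quantile(x, [p1, p2])
alpha_pct = (np.log(1 - p1) - np.log(1 - p2)) / np.log(q2 / q1)
xm_pct = q1 * (1 - p1) ** (1 / alpha_pct)

print(alpha_ml, xm_ml, alpha_pct, xm_pct)
```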
Henshall, John M; Dierens, Leanne; Sellars, Melony J
2014-09-02
While much attention has focused on the development of high-density single nucleotide polymorphism (SNP) assays, the costs of developing and running low-density assays have fallen dramatically. This makes it feasible to develop and apply SNP assays for agricultural species beyond the major livestock species. Although low-cost low-density assays may not have the accuracy of the high-density assays widely used in human and livestock species, we show that when combined with statistical analysis approaches that use quantitative instead of discrete genotypes, their utility may be improved. The data used in this study are from a 63-SNP marker Sequenom® iPLEX Platinum panel for the Black Tiger shrimp, for which high-density SNP assays are not currently available. For quantitative genotypes that could be estimated, in 5% of cases the most likely genotype for an individual at a SNP had a probability of less than 0.99. Matrix formulations of maximum likelihood equations for parentage assignment were developed for the quantitative genotypes and also for discrete genotypes perturbed by an assumed error term. Assignment rates that were based on maximum likelihood with quantitative genotypes were similar to those based on maximum likelihood with perturbed genotypes but, for more than 50% of cases, the two methods resulted in individuals being assigned to different families. Treating genotypes as quantitative values allows the same analysis framework to be used for pooled samples of DNA from multiple individuals. Resulting correlations between allele frequency estimates from pooled DNA and individual samples were consistently greater than 0.90, and as high as 0.97 for some pools. Estimates of family contributions to the pools based on quantitative genotypes in pooled DNA had a correlation of 0.85 with estimates of contributions from DNA-derived pedigree. Even with low numbers of SNPs of variable quality, parentage testing and family assignment from pooled samples are sufficiently accurate to provide useful information for a breeding program. Treating genotypes as quantitative values is an alternative to perturbing genotypes using an assumed error distribution, but can produce very different results. An understanding of the distribution of the error is required for SNP genotyping platforms.
Baele, Guy; Lemey, Philippe; Vansteelandt, Stijn
2013-03-06
Accurate model comparison requires extensive computation times, especially for parameter-rich models of sequence evolution. In the Bayesian framework, model selection is typically performed through the evaluation of a Bayes factor, the ratio of two marginal likelihoods (one for each model). Recently introduced techniques to estimate (log) marginal likelihoods, such as path sampling and stepping-stone sampling, offer increased accuracy over the traditional harmonic mean estimator at an increased computational cost. Most often, each model's marginal likelihood will be estimated individually, which leads the resulting Bayes factor to suffer from errors associated with each of these independent estimation processes. We here assess the original 'model-switch' path sampling approach for direct Bayes factor estimation in phylogenetics, as well as an extension that uses more samples, to construct a direct path between two competing models, thereby eliminating the need to calculate each model's marginal likelihood independently. Further, we provide a competing Bayes factor estimator using an adaptation of the recently introduced stepping-stone sampling algorithm and set out to determine appropriate settings for accurately calculating such Bayes factors, with context-dependent evolutionary models as an example. While we show that modest efforts are required to roughly identify the increase in model fit, only drastically increased computation times ensure the accuracy needed to detect more subtle details of the evolutionary process. We show that our adaptation of stepping-stone sampling for direct Bayes factor calculation outperforms the original path sampling approach as well as an extension that exploits more samples. Our proposed approach for Bayes factor estimation also has preferable statistical properties over the use of individual marginal likelihood estimates for both models under comparison. Assuming a sigmoid function to determine the path between two competing models, we provide evidence that a single well-chosen sigmoid shape value requires less computational efforts in order to approximate the true value of the (log) Bayes factor compared to the original approach. We show that the (log) Bayes factors calculated using path sampling and stepping-stone sampling differ drastically from those estimated using either of the harmonic mean estimators, supporting earlier claims that the latter systematically overestimate the performance of high-dimensional models, which we show can lead to erroneous conclusions. Based on our results, we argue that highly accurate estimation of differences in model fit for high-dimensional models requires much more computational effort than suggested in recent studies on marginal likelihood estimation.
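A minimal sketch of the stepping-stone idea on a conjugate normal toy model, chosen so the true log marginal likelihood is available in closed form; this is not the phylogenetic setting of the study, and the power-posterior draws are exact (by conjugacy) rather than produced by MCMC:

```python
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(4)

# Normal data with known sigma and a normal prior on the mean theta.
n, sigma, mu0, tau0 = 30, 1.0, 0.0, 3.0
y = rng.normal(1.0, sigma, size=n)

def loglik(theta):
    return -0.5 * n * np.log(2.0 * np.pi * sigma**2) - 0.5 * np.sum((y - theta) ** 2) / sigma**2

def power_posterior_draws(beta, size):
    # The power posterior L(theta)**beta * prior(theta) stays Gaussian, so it
    # can be sampled exactly here instead of running a sampler.
    prec = 1.0 / tau0**2 + beta * n / sigma**2
    mean = (mu0 / tau0**2 + beta * np.sum(y) / sigma**2) / prec
    return rng.normal(mean, np.sqrt(1.0 / prec), size=size)

# Stepping-stone estimator of log Z over a ladder 0 = b0 < ... < bK = 1.
betas = np.linspace(0.0, 1.0, 33) ** 3   # concentrate steps near the prior end
log_z = 0.0
for b_low, b_high in zip(betas[:-1], betas[1:]):
    logL = np.array([loglik(t) for t in power_posterior_draws(b_low, 2000)])
    w = (b_high - b_low) * logL
    log_z += np.logaddexp.reduce(w) - np.log(w.size)   # log of the mean of exp(w)

exact = multivariate_normal.logpdf(y, mean=np.full(n, mu0),
                                   cov=sigma**2 * np.eye(n) + tau0**2 * np.ones((n, n)))
print(log_z, exact)
```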
Tests of Independence in Contingency Tables with Small Samples: A Comparison of Statistical Power.
ERIC Educational Resources Information Center
Parshall, Cynthia G.; Kromrey, Jeffrey D.
1996-01-01
Power and Type I error rates were estimated for contingency tables with small sample sizes for the following four types of tests: (1) Pearson's chi-square; (2) chi-square with Yates's continuity correction; (3) the likelihood ratio test; and (4) Fisher's Exact Test. Various marginal distributions, sample sizes, and effect sizes were examined. (SLD)
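The four tests compared above are available in scipy; the sketch below applies them to an invented small-sample 2x2 table:

```python
import numpy as np
from scipy.stats import chi2_contingency, fisher_exact

# Small-sample 2x2 table (counts invented for illustration).
table = np.array([[3, 9],
                  [7, 4]])

chi2, p_pearson, _, _ = chi2_contingency(table, correction=False)   # Pearson chi-square
_, p_yates, _, _ = chi2_contingency(table, correction=True)         # Yates continuity correction
_, p_lr, _, _ = chi2_contingency(table, lambda_="log-likelihood",
                                 correction=False)                   # likelihood ratio (G) test
_, p_fisher = fisher_exact(table)                                    # Fisher's Exact Test

print(p_pearson, p_yates, p_lr, p_fisher)
```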
NASA Astrophysics Data System (ADS)
Ghosh, S.; Lopez-Coto, I.; Prasad, K.; Karion, A.; Mueller, K.; Gourdji, S.; Martin, C.; Whetstone, J. R.
2017-12-01
The National Institute of Standards and Technology (NIST) supports the North-East Corridor Baltimore Washington (NEC-B/W) project and the Indianapolis Flux Experiment (INFLUX), which aim to quantify sources of greenhouse gas (GHG) emissions as well as their uncertainties. These projects employ different flux estimation methods, including top-down inversion approaches. The traditional Bayesian inversion method estimates emission distributions by updating prior information using atmospheric GHG observations coupled to an atmospheric transport and dispersion model. The magnitude of the update depends on the observed enhancement along with the assumed errors, such as those associated with the prior information and the atmospheric transport and dispersion model. These errors are specified within the inversion covariance matrices. The assumed structure and magnitude of the specified errors can have a large impact on the emission estimates from the inversion. The main objective of this work is to build a data-adaptive model for these covariance matrices. We construct a synthetic data experiment using a Kalman Filter inversion framework (Lopez et al., 2017) employing different configurations of the transport and dispersion model and an assumed prior. Unlike previous traditional Bayesian approaches, we estimate posterior emissions using regularized sample covariance matrices associated with prior errors to investigate whether the structure of the matrices helps to better recover our hypothetical true emissions. To incorporate transport model error, we use an ensemble of transport models combined with a space-time analytical covariance to construct a covariance that accounts for errors in space and time. A Kalman Filter is then run using these covariances along with Maximum Likelihood Estimates (MLE) of the involved parameters. Preliminary results indicate that specifying spatio-temporally varying errors in the error covariances can improve the flux estimates and uncertainties. We also demonstrate that differences between the modeled and observed meteorology can be used to predict uncertainties associated with atmospheric transport and dispersion modeling, which can help improve the skill of an inversion at urban scales.
NASA Technical Reports Server (NTRS)
Howell, Leonard W., Jr.; Six, N. Frank (Technical Monitor)
2002-01-01
This paper develops the Maximum Likelihood (ML) statistical theory required to estimate spectral information from an arbitrary number of astrophysics data sets produced by vastly different science instruments. This theory and its successful implementation will facilitate the interpretation of spectral information from multiple astrophysics missions and thereby permit the derivation of superior spectral information from the combination of data sets. The procedure is of significant value both to existing data sets and to those produced by future astrophysics missions with two or more detectors: it allows instrument developers to optimize each detector's design parameters through simulation studies and so design and build complementary detectors that maximize the precision with which the science objectives may be obtained. The benefits of this ML theory and its application are measured in terms of the reduction of the statistical errors (standard deviations) of the spectral information when the multiple data sets are used in concert, compared with the statistical errors when the data sets are considered separately, as well as the reduction of any biases resulting from poor statistics in one or more of the individual data sets when the data sets are combined.
The Inverse Problem for Confined Aquifer Flow: Identification and Estimation With Extensions
NASA Astrophysics Data System (ADS)
Loaiciga, Hugo A.; Mariño, Miguel A.
1987-01-01
The contributions of this work are twofold. First, a methodology for estimating the elements of parameter matrices in the governing equation of flow in a confined aquifer is developed. The estimation techniques for the distributed-parameter inverse problem pertain to linear least squares and generalized least squares methods. The linear relationship among the known heads and unknown parameters of the flow equation provides the background for developing criteria for determining the identifiability status of unknown parameters. Under conditions of exact or overidentification it is possible to develop statistically consistent parameter estimators and their asymptotic distributions. The estimation techniques, namely two-stage least squares and three-stage least squares, are applied to a specific groundwater inverse problem and compared with each other and with an ordinary least squares estimator. The three-stage estimator provides the closest approximation to the actual parameter values, but it also shows relatively large standard errors compared to the ordinary and two-stage estimators. The estimation techniques provide the parameter matrices required to simulate the unsteady groundwater flow equation. Second, a nonlinear maximum likelihood estimation approach to the inverse problem is presented. The statistical properties of maximum likelihood estimators are derived, and a procedure to construct confidence intervals and perform hypothesis testing is given. The relative merits of the linear and maximum likelihood estimators are analyzed. Other topics relevant to the identification and estimation methodologies, i.e., a continuous-time solution to the flow equation, coping with noise-corrupted head measurements, and extension of the developed theory to nonlinear cases, are also discussed. A simulation study is used to evaluate the methods developed in this study.
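A generic illustration of the two-stage least squares mechanics on simulated data with an endogenous regressor; the aquifer-specific design matrices and the three-stage extension are not reproduced here:

```python
import numpy as np

rng = np.random.default_rng(5)

# Simulated setting: x is endogenous (correlated with the structural error u),
# z holds two exogenous instruments correlated with x.
n = 500
z = rng.normal(size=(n, 2))
u = rng.normal(size=n)
x = z @ np.array([1.0, -0.5]) + 0.8 * u + rng.normal(size=n)
y = 2.0 * x + u + rng.normal(size=n)

Z = np.column_stack([np.ones(n), z])
X = np.column_stack([np.ones(n), x])

# Stage 1: project the regressors onto the instrument space.
X_hat = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]
# Stage 2: regress y on the fitted values.
beta_2sls, *_ = np.linalg.lstsq(X_hat, y, rcond=None)

beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta_2sls, beta_ols)   # 2SLS slope is ~2.0; OLS is biased upward by the endogeneity
```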
Scanning linear estimation: improvements over region of interest (ROI) methods
NASA Astrophysics Data System (ADS)
Kupinski, Meredith K.; Clarkson, Eric W.; Barrett, Harrison H.
2013-03-01
In tomographic medical imaging, a signal activity is typically estimated by summing voxels from a reconstructed image. We introduce an alternative estimation scheme that operates on the raw projection data and offers a substantial improvement, as measured by the ensemble mean-square error (EMSE), when compared to using voxel values from a maximum-likelihood expectation-maximization (MLEM) reconstruction. The scanning-linear (SL) estimator operates on the raw projection data and is derived as a special case of maximum-likelihood estimation with a series of approximations to make the calculation tractable. The approximated likelihood accounts for background randomness, measurement noise and variability in the parameters to be estimated. When signal size and location are known, the SL estimate of signal activity is unbiased, i.e. the average estimate equals the true value. By contrast, unpredictable bias arising from the null functions of the imaging system affect standard algorithms that operate on reconstructed data. The SL method is demonstrated for two different tasks: (1) simultaneously estimating a signal’s size, location and activity; (2) for a fixed signal size and location, estimating activity. Noisy projection data are realistically simulated using measured calibration data from the multi-module multi-resolution small-animal SPECT imaging system. For both tasks, the same set of images is reconstructed using the MLEM algorithm (80 iterations), and the average and maximum values within the region of interest (ROI) are calculated for comparison. This comparison shows dramatic improvements in EMSE for the SL estimates. To show that the bias in ROI estimates affects not only absolute values but also relative differences, such as those used to monitor the response to therapy, the activity estimation task is repeated for three different signal sizes.
Occupancy Modeling Species-Environment Relationships with Non-ignorable Survey Designs.
Irvine, Kathryn M; Rodhouse, Thomas J; Wright, Wilson J; Olsen, Anthony R
2018-05-26
Statistical models supporting inferences about species occurrence patterns in relation to environmental gradients are fundamental to ecology and conservation biology. A common implicit assumption is that the sampling design is ignorable and does not need to be formally accounted for in analyses. The analyst assumes data are representative of the desired population and statistical modeling proceeds. However, if datasets from probability and non-probability surveys are combined or unequal selection probabilities are used, the design may be non-ignorable. We outline the use of pseudo-maximum likelihood estimation for site-occupancy models to account for such non-ignorable survey designs. This estimation method accounts for the survey design by properly weighting the pseudo-likelihood equation. In our empirical example, legacy and newer randomly selected locations were surveyed for bats to bridge a historic statewide effort with an ongoing nationwide program. We provide a worked example using bat acoustic detection/non-detection data and show how analysts can diagnose whether their design is ignorable. Using simulations, we assessed whether our approach is viable for modeling datasets composed of sites contributed outside of a probability design. Pseudo-maximum likelihood estimates differed from the usual maximum likelihood occupancy estimates for some bat species. Using simulations, we show that the maximum likelihood estimator of species-environment relationships with non-ignorable sampling designs was biased, whereas the pseudo-likelihood estimator was design-unbiased. However, in our simulation study the designs composed of a large proportion of legacy or non-probability sites resulted in estimation issues for standard errors. These issues were likely a result of highly variable weights confounded by small sample sizes (5% or 10% sampling intensity and 4 revisits). Aggregating datasets from multiple sources logically supports larger sample sizes and potentially increases the spatial extent of statistical inferences. Our results suggest that ignoring the mechanism by which locations were selected for data collection (i.e., the sampling design) could result in erroneous model-based conclusions. Therefore, in order to ensure robust and defensible recommendations for evidence-based conservation decision-making, the survey design information, in addition to the data themselves, must be available for analysts. Details for constructing the weights used in estimation and code for implementation are provided. This article is protected by copyright. All rights reserved.
Yiu, Sean; Tom, Brian Dm
2017-01-01
Several researchers have described two-part models with patient-specific stochastic processes for analysing longitudinal semicontinuous data. In theory, such models can offer greater flexibility than the standard two-part model with patient-specific random effects. However, in practice, the high dimensional integrations involved in the marginal likelihood (i.e. integrated over the stochastic processes) significantly complicates model fitting. Thus, non-standard computationally intensive procedures based on simulating the marginal likelihood have so far only been proposed. In this paper, we describe an efficient method of implementation by demonstrating how the high dimensional integrations involved in the marginal likelihood can be computed efficiently. Specifically, by using a property of the multivariate normal distribution and the standard marginal cumulative distribution function identity, we transform the marginal likelihood so that the high dimensional integrations are contained in the cumulative distribution function of a multivariate normal distribution, which can then be efficiently evaluated. Hence, maximum likelihood estimation can be used to obtain parameter estimates and asymptotic standard errors (from the observed information matrix) of model parameters. We describe our proposed efficient implementation procedure for the standard two-part model parameterisation and when it is of interest to directly model the overall marginal mean. The methodology is applied on a psoriatic arthritis data set concerning functional disability.
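The computational point above, that the awkward high-dimensional integral can be pushed into a multivariate normal cumulative distribution function, can be illustrated with scipy's numerical MVN CDF; the dimension and covariance below are invented for illustration:

```python
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(6)

# The computational core is a rectangle probability of a multivariate normal;
# scipy evaluates this numerically rather than by brute-force quadrature.
d = 15
A = rng.normal(size=(d, d))
cov = A @ A.T + d * np.eye(d)     # a positive-definite covariance matrix
upper = rng.normal(size=d)        # upper integration limits

p = multivariate_normal(mean=np.zeros(d), cov=cov).cdf(upper)
print(p)
```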
Statistical approaches to account for false-positive errors in environmental DNA samples.
Lahoz-Monfort, José J; Guillera-Arroita, Gurutzeta; Tingley, Reid
2016-05-01
Environmental DNA (eDNA) sampling is prone to both false-positive and false-negative errors. We review statistical methods to account for such errors in the analysis of eDNA data and use simulations to compare the performance of different modelling approaches. Our simulations illustrate that even low false-positive rates can produce biased estimates of occupancy and detectability. We further show that removing or classifying single PCR detections in an ad hoc manner under the suspicion that such records represent false positives, as sometimes advocated in the eDNA literature, also results in biased estimation of occupancy, detectability and false-positive rates. We advocate alternative approaches to account for false-positive errors that rely on prior information, or the collection of ancillary detection data at a subset of sites using a sampling method that is not prone to false-positive errors. We illustrate the advantages of these approaches over ad hoc classifications of detections and provide practical advice and code for fitting these models in maximum likelihood and Bayesian frameworks. Given the severe bias induced by false-negative and false-positive errors, the methods presented here should be more routinely adopted in eDNA studies. © 2015 John Wiley & Sons Ltd.
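A minimal sketch of the site-level mixture likelihood that underlies occupancy models with false-positive detections, fitted by maximum likelihood on simulated data; this illustrates the general idea, not the specific formulations, priors, or ancillary-data designs reviewed in the paper:

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit
from scipy.stats import binom

rng = np.random.default_rng(7)

# Simulated detection histories with false positives: psi = occupancy,
# p11 = detection given presence, p10 = false-positive detection rate.
S, K = 200, 6
psi_true, p11_true, p10_true = 0.4, 0.7, 0.05
occupied = rng.uniform(size=S) < psi_true
y = rng.binomial(K, np.where(occupied, p11_true, p10_true))

def negloglik(params):
    psi, p11, p10 = expit(params)   # keep all three probabilities in (0, 1)
    site_lik = psi * binom.pmf(y, K, p11) + (1.0 - psi) * binom.pmf(y, K, p10)
    return -np.sum(np.log(site_lik))

# Starting values encode p11 > p10 to steer away from the mirrored solution.
fit = minimize(negloglik, x0=np.array([0.0, 1.0, -2.0]), method="Nelder-Mead")
print("psi, p11, p10 estimates:", expit(fit.x))
```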
Parameter Estimation and Model Selection for Indoor Environments Based on Sparse Observations
NASA Astrophysics Data System (ADS)
Dehbi, Y.; Loch-Dehbi, S.; Plümer, L.
2017-09-01
This paper presents a novel method for the parameter estimation and model selection for the reconstruction of indoor environments based on sparse observations. While most approaches for the reconstruction of indoor models rely on dense observations, we predict scenes of the interior with high accuracy in the absence of indoor measurements. We use a model-based top-down approach and incorporate strong but profound prior knowledge. The latter includes probability density functions for model parameters and sparse observations such as room areas and the building footprint. The floorplan model is characterized by linear and bi-linear relations with discrete and continuous parameters. We focus on the stochastic estimation of model parameters based on a topological model derived by combinatorial reasoning in a first step. A Gauss-Markov model is applied for estimation and simulation of the model parameters. Symmetries are represented and exploited during the estimation process. Background knowledge as well as observations are incorporated in a maximum likelihood estimation and model selection is performed with AIC/BIC. The likelihood is also used for the detection and correction of potential errors in the topological model. Estimation results are presented and discussed.
Building unbiased estimators from non-Gaussian likelihoods with application to shear estimation
Madhavacheril, Mathew S.; McDonald, Patrick; Sehgal, Neelima; ...
2015-01-15
We develop a general framework for generating estimators of a given quantity which are unbiased to a given order in the difference between the true value of the underlying quantity and the fiducial position in theory space around which we expand the likelihood. We apply this formalism to rederive the optimal quadratic estimator and show how the replacement of the second derivative matrix with the Fisher matrix is a generic way of creating an unbiased estimator (assuming choice of the fiducial model is independent of data). Next we apply the approach to estimation of shear lensing, closely following the work of Bernstein and Armstrong (2014). Our first-order estimator reduces to their estimator in the limit of zero shear, but it also naturally allows for the case of non-constant shear and the easy calculation of correlation functions or power spectra using standard methods. Both our first-order estimator and Bernstein and Armstrong's estimator exhibit a bias which is quadratic in true shear. Our third-order estimator is, at least in the realm of the toy problem of Bernstein and Armstrong, unbiased to 0.1% in relative shear errors Δg/g for shears up to |g| = 0.2.
Maximum likelihood estimation in calibrating a stereo camera setup.
Muijtjens, A M; Roos, J M; Arts, T; Hasman, A
1999-02-01
Motion and deformation of the cardiac wall may be measured by following the positions of implanted radiopaque markers in three dimensions, using two x-ray cameras simultaneously. Regularly, calibration of the position measurement system is obtained by registration of the images of a calibration object, containing 10-20 radiopaque markers at known positions. Unfortunately, an accidental change of the position of a camera after calibration requires complete recalibration. Alternatively, redundant information in the measured image positions of stereo pairs can be used for calibration. Thus, a separate calibration procedure can be avoided. In the current study a model is developed that describes the geometry of the camera setup by five dimensionless parameters. Maximum Likelihood (ML) estimates of these parameters were obtained in an error analysis. It is shown that the ML estimates can be found by application of a nonlinear least squares procedure. Compared to the standard unweighted least squares procedure, the ML method resulted in more accurate estimates without noticeable bias. The accuracy of the ML method was investigated in relation to the object aperture. The reconstruction problem appeared well conditioned as long as the object aperture is larger than 0.1 rad. The angle between the two viewing directions appeared to be the parameter that was most likely to cause major inaccuracies in the reconstruction of the 3-D positions of the markers. Hence, attempts to improve the robustness of the method should primarily focus on reduction of the error in this parameter.
Temporal rainfall estimation using input data reduction and model inversion
NASA Astrophysics Data System (ADS)
Wright, A. J.; Vrugt, J. A.; Walker, J. P.; Pauwels, V. R. N.
2016-12-01
Floods are devastating natural hazards. To provide accurate, precise and timely flood forecasts there is a need to understand the uncertainties associated with temporal rainfall and model parameters. The estimation of temporal rainfall and model parameter distributions from streamflow observations in complex dynamic catchments adds skill to current areal rainfall estimation methods, allows for the uncertainty of rainfall input to be considered when estimating model parameters and provides the ability to estimate rainfall from poorly gauged catchments. Current methods to estimate temporal rainfall distributions from streamflow are unable to adequately explain and invert complex non-linear hydrologic systems. This study uses the Discrete Wavelet Transform (DWT) to reduce rainfall dimensionality for the catchment of Warwick, Queensland, Australia. The reduction of rainfall to DWT coefficients allows the input rainfall time series to be simultaneously estimated along with model parameters. The estimation process is conducted using multi-chain Markov chain Monte Carlo simulation with the DREAMZS algorithm. The use of a likelihood function that considers both rainfall and streamflow error allows for model parameter and temporal rainfall distributions to be estimated. Estimation of the wavelet approximation coefficients of lower order decomposition structures was able to estimate the most realistic temporal rainfall distributions. These rainfall estimates were all able to simulate streamflow that was superior to the results of a traditional calibration approach. It is shown that the choice of wavelet has a considerable impact on the robustness of the inversion. The results demonstrate that streamflow data contains sufficient information to estimate temporal rainfall and model parameter distributions. The extent and variance of rainfall time series that are able to simulate streamflow that is superior to that simulated by a traditional calibration approach is a demonstration of equifinality. The use of a likelihood function that considers both rainfall and streamflow error combined with the use of the DWT as a model data reduction technique allows the joint inference of hydrologic model parameters along with rainfall.
Berglund, Lars; Garmo, Hans; Lindbäck, Johan; Svärdsudd, Kurt; Zethelius, Björn
2008-09-30
The least-squares estimator of the slope in a simple linear regression model is biased towards zero when the predictor is measured with random error. A corrected slope may be estimated by adding data from a reliability study, which comprises a subset of subjects from the main study. The precision of this corrected slope depends on the design of the reliability study and the choice of estimator. Previous work has assumed that the reliability study constitutes a random sample from the main study. A more efficient design is to use subjects with extreme values on their first measurement. Previously, we published a variance formula for the corrected slope, when the correction factor is the slope in the regression of the second measurement on the first. In this paper we show that both designs are improved by maximum likelihood estimation (MLE). The precision gain is explained by the inclusion of data from all subjects for estimation of the predictor's variance and by the use of the second measurement for estimation of the covariance between response and predictor. The gain from MLE increases with a stronger true relationship between response and predictor and with lower precision in the predictor measurements. We present a real data example on the relationship between fasting insulin, a surrogate marker, and true insulin sensitivity measured by a gold-standard euglycaemic insulin clamp, and simulations, where the behavior of profile-likelihood-based confidence intervals is examined. MLE was shown to be a robust estimator for non-normal distributions and efficient for small sample situations. Copyright (c) 2008 John Wiley & Sons, Ltd.
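A minimal numerical sketch of the attenuation problem and the simple moment-based correction described above (the naive slope divided by the slope of the second measurement on the first, estimated in a reliability subset). Variable names and the simulated data are illustrative assumptions; the paper's point is that full maximum likelihood estimation is more efficient than this simple estimator.

```python
# Sketch of the slope correction for a predictor measured with error:
# the naive slope is divided by a correction factor estimated from a
# reliability substudy in which the predictor is measured twice.
# This is the simple moment-based version; the paper shows that full
# maximum likelihood estimation is more efficient.
import numpy as np

rng = np.random.default_rng(0)
n, n_rel = 1000, 200
x_true = rng.normal(0, 1, n)
y = 0.5 * x_true + rng.normal(0, 1, n)
w1 = x_true + rng.normal(0, 0.8, n)              # first error-prone measurement
w2 = x_true[:n_rel] + rng.normal(0, 0.8, n_rel)  # second measurement, reliability subset

naive_slope = np.cov(w1, y)[0, 1] / np.var(w1, ddof=1)
# Correction factor: slope of the second measurement on the first
lam = np.cov(w1[:n_rel], w2)[0, 1] / np.var(w1[:n_rel], ddof=1)
corrected_slope = naive_slope / lam
print(naive_slope, corrected_slope)   # corrected value should be near 0.5
```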
NASA Astrophysics Data System (ADS)
Kargoll, Boris; Omidalizarandi, Mohammad; Loth, Ina; Paffenholz, Jens-André; Alkhatib, Hamza
2018-03-01
In this paper, we investigate a linear regression time series model of possibly outlier-afflicted observations and autocorrelated random deviations. This colored noise is represented by a covariance-stationary autoregressive (AR) process, in which the independent error components follow a scaled (Student's) t-distribution. This error model allows for the stochastic modeling of multiple outliers and for an adaptive robust maximum likelihood (ML) estimation of the unknown regression and AR coefficients, the scale parameter, and the degrees of freedom of the t-distribution. This approach is meant to be an extension of known estimators, which tend to focus only on the regression model, or on the AR error model, or on normally distributed errors. For the purpose of ML estimation, we derive an expectation conditional maximization either (ECME) algorithm, which leads to an easy-to-implement version of iteratively reweighted least squares. The estimation performance of the algorithm is evaluated via Monte Carlo simulations for a Fourier as well as a spline model in connection with AR colored noise models of different orders and with three different sampling distributions generating the white noise components. We apply the algorithm to a vibration dataset recorded by a high-accuracy, single-axis accelerometer, focusing on the evaluation of the estimated AR colored noise model.
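The sketch below shows the iteratively reweighted least squares core that such an ECME algorithm reduces to, stripped down to an ordinary linear regression with independent scaled t errors and a fixed degree of freedom; the AR colored-noise part and the estimation of the degree of freedom are omitted, and all names and data are illustrative assumptions.

```python
# Stripped-down iteratively reweighted least squares for a linear regression
# with independent scaled t-distributed errors. The AR(p) colored-noise part
# and the estimation of the degree of freedom from the paper are omitted;
# nu is held fixed here for illustration.
import numpy as np

rng = np.random.default_rng(2)
n, nu = 300, 4.0
X = np.column_stack([np.ones(n), np.linspace(0, 1, n)])
beta_true = np.array([1.0, 2.0])
y = X @ beta_true + 0.3 * rng.standard_t(nu, n)   # heavy-tailed noise with outliers

beta = np.linalg.lstsq(X, y, rcond=None)[0]
sigma2 = np.var(y - X @ beta)
for _ in range(50):
    r = y - X @ beta
    w = (nu + 1.0) / (nu + r**2 / sigma2)              # E-step: downweight outliers
    W = np.diag(w)
    beta = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)   # M-step: weighted least squares
    sigma2 = np.sum(w * (y - X @ beta) ** 2) / n       # M-step: scale update
print(beta, np.sqrt(sigma2))
```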
Gilliom, Robert J.; Helsel, Dennis R.
1986-01-01
A recurring difficulty encountered in investigations of many metals and organic contaminants in ambient waters is that a substantial portion of water sample concentrations are below limits of detection established by analytical laboratories. Several methods were evaluated for estimating distributional parameters for such censored data sets using only uncensored observations. Their reliabilities were evaluated by a Monte Carlo experiment in which small samples were generated from a wide range of parent distributions and censored at varying levels. Eight methods were used to estimate the mean, standard deviation, median, and interquartile range. Criteria were developed, based on the distribution of uncensored observations, for determining the best performing parameter estimation method for any particular data set. The most robust method for minimizing error in censored-sample estimates of the four distributional parameters over all simulation conditions was the log-probability regression method. With this method, censored observations are assumed to follow the zero-to-censoring level portion of a lognormal distribution obtained by a least squares regression between logarithms of uncensored concentration observations and their z scores. When method performance was separately evaluated for each distributional parameter over all simulation conditions, the log-probability regression method still had the smallest errors for the mean and standard deviation, but the lognormal maximum likelihood method had the smallest errors for the median and interquartile range. When data sets were classified prior to parameter estimation into groups reflecting their probable parent distributions, the ranking of estimation methods was similar, but the accuracy of error estimates was markedly improved over those without classification.
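A compact sketch of the log-probability regression idea for a single detection limit: regress the logarithms of the uncensored observations on their normal scores, impute the censored portion from the fitted line, and then compute summary statistics. The Weibull plotting positions i/(n+1) and the simulated data are assumptions for illustration; the original study's exact conventions may differ.

```python
# Sketch of the log-probability regression method for left-censored data:
# uncensored observations are regressed (in log space) against their normal
# scores, and the fitted line is used to impute values below the single
# detection limit before computing summary statistics.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(3)
conc = rng.lognormal(mean=0.0, sigma=1.0, size=50)
dl = 0.5                                    # detection limit
censored = conc < dl
n, m = conc.size, censored.sum()

uncens_sorted = np.sort(conc[~censored])
ranks = np.arange(m + 1, n + 1)             # uncensored observations occupy the highest ranks
z = norm.ppf(ranks / (n + 1.0))             # Weibull plotting positions -> normal scores
slope, intercept = np.polyfit(z, np.log(uncens_sorted), 1)

z_cens = norm.ppf(np.arange(1, m + 1) / (n + 1.0))
imputed = np.exp(intercept + slope * z_cens)   # fill-in values below the detection limit
full = np.concatenate([imputed, uncens_sorted])
print(full.mean(), full.std(ddof=1), np.median(full))
```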
Estimation of brood and nest survival: Comparative methods in the presence of heterogeneity
Manly, Bryan F.J.; Schmutz, Joel A.
2001-01-01
The Mayfield method has been widely used for estimating survival of nests and young animals, especially when data are collected at irregular observation intervals. However, this method assumes survival is constant throughout the study period, which often ignores biologically relevant variation and may lead to biased survival estimates. We examined the bias and accuracy of 1 modification to the Mayfield method that allows for temporal variation in survival, and we developed and similarly tested 2 additional methods. One of these 2 new methods is simply an iterative extension of Klett and Johnson's method, which we refer to as the Iterative Mayfield method and which bears similarity to Kaplan-Meier methods. The other method uses maximum likelihood techniques for estimation and is best applied to survival of animals in groups or families, rather than as independent individuals. We also examined how robust these estimators are to heterogeneity in the data, which can arise from such sources as dependent survival probabilities among siblings, inherent differences among families, and adoption. Testing of estimator performance with respect to bias, accuracy, and heterogeneity was done using simulations that mimicked a study of survival of emperor goose (Chen canagica) goslings. Assuming constant survival for inappropriately long periods of time or use of Klett and Johnson's methods resulted in large bias or poor accuracy (often >5% bias or root mean square error) compared to our Iterative Mayfield or maximum likelihood methods. Overall, estimator performance was slightly better with our Iterative Mayfield than our maximum likelihood method, but the maximum likelihood method provides a more rigorous framework for testing covariates and explicitly models a heterogeneity factor. We demonstrated use of all estimators with data from emperor goose goslings. We advocate that future studies use the new methods outlined here rather than the traditional Mayfield method or its previous modifications.
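For reference, the baseline these methods improve on is the classical Mayfield estimator, which assumes a constant daily survival rate. A minimal version (ignoring the usual midpoint adjustment for failed nests) is sketched below with made-up exposure data.

```python
# The classical Mayfield estimator that the newer methods extend: daily
# survival rate (DSR) is 1 - deaths / exposure-days, and period survival is
# DSR raised to the period length. Values below are made up for illustration,
# and the usual midpoint convention for failed nests is omitted.
exposure_days = [12, 30, 7, 30, 18, 30, 30, 25]   # days each nest was under observation
failed =        [ 1,  0, 1,  0,  1,  0,  0,  1]   # 1 if the nest failed during observation

deaths = sum(failed)
total_exposure = sum(exposure_days)
dsr = 1.0 - deaths / total_exposure                # daily survival rate
period_survival = dsr ** 30                        # survival over a 30-day nesting period
print(round(dsr, 4), round(period_survival, 3))
```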
Leak Detection and Location of Water Pipes Using Vibration Sensors and Modified ML Prefilter.
Choi, Jihoon; Shin, Joonho; Song, Choonggeun; Han, Suyong; Park, Doo Il
2017-09-13
This paper proposes a new leak detection and location method based on vibration sensors and generalised cross-correlation techniques. Considering the estimation errors of the power spectral densities (PSDs) and the cross-spectral density (CSD), the proposed method employs a modified maximum-likelihood (ML) prefilter with a regularisation factor. We derive a theoretical variance of the time difference estimation error through summation in the discrete-frequency domain, and find the optimal regularisation factor that minimises the theoretical variance in practical water pipe channels. The proposed method is compared with conventional correlation-based techniques via numerical simulations using a water pipe channel model, and it is shown through field measurement that the proposed modified ML prefilter outperforms conventional prefilters for the generalised cross-correlation. In addition, we provide a formula to calculate the leak location using the time difference estimate when different types of pipes are connected.
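A bare-bones sketch of generalized cross-correlation time-delay estimation between two sensor signals, using a simple PHAT-style whitening weight as a stand-in for the paper's modified ML prefilter with a regularisation factor; the signals, sampling rate and true delay are simulated assumptions.

```python
# Generalized cross-correlation sketch for leak localisation: estimate the
# time difference of arrival between two vibration signals using a
# frequency-domain weighting. The simple PHAT weighting below stands in for
# the paper's modified ML prefilter with a regularisation factor.
import numpy as np

rng = np.random.default_rng(4)
fs, n = 1000, 4096
true_delay = 25                                # samples
s = rng.normal(size=n)                         # leak-like broadband source
x1 = s + 0.2 * rng.normal(size=n)
x2 = np.roll(s, true_delay) + 0.2 * rng.normal(size=n)

X1, X2 = np.fft.rfft(x1), np.fft.rfft(x2)
csd = np.conj(X1) * X2                         # cross-spectral density estimate
weight = 1.0 / (np.abs(csd) + 1e-12)           # PHAT-style prefilter (regularised)
cc = np.fft.irfft(weight * csd, n)
lags = np.arange(n)
lags[lags > n // 2] -= n                       # wrap to negative lags
print("estimated delay (samples):", lags[np.argmax(cc)])  # positive lag: x2 lags x1
```

The delay estimate divided by the sampling rate, together with the propagation speed in the pipe, then gives the leak position along the pipe.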
Accounting for dropout bias using mixed-effects models.
Mallinckrodt, C H; Clark, W S; David, S R
2001-01-01
Treatment effects are often evaluated by comparing change over time in outcome measures. However, valid analyses of longitudinal data can be problematic when subjects discontinue (dropout) prior to completing the study. This study assessed the merits of likelihood-based repeated measures analyses (MMRM) compared with fixed-effects analysis of variance where missing values were imputed using the last observation carried forward approach (LOCF) in accounting for dropout bias. Comparisons were made in simulated data and in data from a randomized clinical trial. Subject dropout was introduced in the simulated data to generate ignorable and nonignorable missingness. Estimates of treatment group differences in mean change from baseline to endpoint from MMRM were, on average, markedly closer to the true value than estimates from LOCF in every scenario simulated. Standard errors and confidence intervals from MMRM accurately reflected the uncertainty of the estimates, whereas standard errors and confidence intervals from LOCF underestimated uncertainty.
Zero-inflated Poisson model based likelihood ratio test for drug safety signal detection.
Huang, Lan; Zheng, Dan; Zalkikar, Jyoti; Tiwari, Ram
2017-02-01
In recent decades, numerous methods have been developed for data mining of large drug safety databases, such as the Food and Drug Administration's (FDA's) Adverse Event Reporting System, where data matrices are formed with drugs as columns and adverse events as rows. Often, a large number of cells in these data matrices have zero counts. Some of these are "true zeros", indicating that the drug-adverse event pair cannot occur; they are distinguished from the remaining zero counts, which are modeled zeros and simply indicate that the drug-adverse event pair has not occurred yet or has not been reported yet. In this paper, a zero-inflated Poisson model based likelihood ratio test method is proposed to identify drug-adverse event pairs that have disproportionately high reporting rates, which are also called signals. The maximum likelihood estimates of the model parameters of the zero-inflated Poisson model based likelihood ratio test are obtained using the expectation-maximization (EM) algorithm. The zero-inflated Poisson model based likelihood ratio test is also modified to handle stratified analyses for binary and categorical covariates (e.g. gender and age) in the data. The proposed zero-inflated Poisson model based likelihood ratio test method is shown to asymptotically control the type I error and false discovery rate, and its finite sample performance for signal detection is evaluated through a simulation study. The simulation results show that the zero-inflated Poisson model based likelihood ratio test method performs similarly to the Poisson model based likelihood ratio test method when the estimated percentage of true zeros in the database is small. Both the zero-inflated Poisson model based likelihood ratio test and likelihood ratio test methods are applied to six selected drugs, from the 2006 to 2011 Adverse Event Reporting System database, with varying percentages of observed zero-count cells.
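The sketch below fits a zero-inflated Poisson by direct numerical maximization of its log-likelihood on simulated counts; the paper's method instead uses an EM algorithm and embeds the fit in a likelihood ratio test over the drug-adverse-event table, so this is only the basic building block, with illustrative parameter names and data.

```python
# Minimal zero-inflated Poisson fit by direct maximization of the
# log-likelihood. Counts are simulated; pi is the probability of a
# structural ("true") zero and lambda the Poisson mean.
import numpy as np
from scipy.optimize import minimize
from scipy.stats import poisson

rng = np.random.default_rng(5)
n, pi_true, lam_true = 2000, 0.3, 1.8
y = rng.poisson(lam_true, n)
y[rng.random(n) < pi_true] = 0                 # structural ("true") zeros

def neg_log_lik(theta):
    pi = 1 / (1 + np.exp(-theta[0]))            # keep pi in (0, 1)
    lam = np.exp(theta[1])                      # keep lambda positive
    logp_zero = np.log(pi + (1 - pi) * np.exp(-lam))
    logp_pos = np.log(1 - pi) + poisson.logpmf(y[y > 0], lam)
    return -(np.sum(y == 0) * logp_zero + logp_pos.sum())

fit = minimize(neg_log_lik, x0=[0.0, 0.0], method="BFGS")
pi_hat, lam_hat = 1 / (1 + np.exp(-fit.x[0])), np.exp(fit.x[1])
print(pi_hat, lam_hat)
```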
Object recognition and localization from 3D point clouds by maximum-likelihood estimation
NASA Astrophysics Data System (ADS)
Dantanarayana, Harshana G.; Huntley, Jonathan M.
2017-08-01
We present an algorithm based on maximum-likelihood analysis for the automated recognition of objects, and estimation of their pose, from 3D point clouds. Surfaces segmented from depth images are used as the features, unlike 'interest point'-based algorithms which normally discard such data. Compared to the 6D Hough transform, it has negligible memory requirements, and is computationally efficient compared to iterative closest point algorithms. The same method is applicable to both the initial recognition/pose estimation problem as well as subsequent pose refinement through appropriate choice of the dispersion of the probability density functions. This single unified approach therefore avoids the usual requirement for different algorithms for these two tasks. In addition to the theoretical description, a simple 2 degrees of freedom (d.f.) example is given, followed by a full 6 d.f. analysis of 3D point cloud data from a cluttered scene acquired by a projected fringe-based scanner, which demonstrated an RMS alignment error as low as 0.3 mm.
Balakrishnan, Narayanaswamy; Pal, Suvra
2016-08-01
Recently, a flexible cure rate survival model has been developed by assuming the number of competing causes of the event of interest to follow the Conway-Maxwell-Poisson distribution. This model includes some of the well-known cure rate models discussed in the literature as special cases. Data obtained from cancer clinical trials are often right censored and the expectation maximization algorithm can be used in this case to efficiently estimate the model parameters based on right censored data. In this paper, we consider the competing cause scenario and, assuming the time-to-event to follow the Weibull distribution, we derive the necessary steps of the expectation maximization algorithm for estimating the parameters of different cure rate survival models. The standard errors of the maximum likelihood estimates are obtained by inverting the observed information matrix. The method of inference developed here is examined by means of an extensive Monte Carlo simulation study. Finally, we illustrate the proposed methodology with real data on cancer recurrence. © The Author(s) 2013.
A simple, remote, video based breathing monitor.
Regev, Nir; Wulich, Dov
2017-07-01
Breathing monitors have become the all-important cornerstone of a wide variety of commercial and personal safety applications, ranging from elderly care to baby monitoring. Many such monitors exist in the market, some, with vital signs monitoring capabilities, but none remote. This paper presents a simple, yet efficient, real time method of extracting the subject's breathing sinus rhythm. Points of interest are detected on the subject's body, and the corresponding optical flow is estimated and tracked using the well known Lucas-Kanade algorithm on a frame by frame basis. A generalized likelihood ratio test is then utilized on each of the many interest points to detect which is moving in harmonic fashion. Finally, a spectral estimation algorithm based on Pisarenko harmonic decomposition tracks the harmonic frequency in real time, and a fusion maximum likelihood algorithm optimally estimates the breathing rate using all points considered. The results show a maximal error of 1 BPM between the true breathing rate and the algorithm's calculated rate, based on experiments on two babies and three adults.
Estimation of correlation functions by stochastic approximation.
NASA Technical Reports Server (NTRS)
Habibi, A.; Wintz, P. A.
1972-01-01
The estimation of the autocorrelation function of a zero-mean stationary random process is considered. The techniques are applicable to processes with nonzero mean provided the mean is estimated first and subtracted. Two recursive techniques are proposed, both of which are based on the method of stochastic approximation and assume a functional form for the correlation function that depends on a number of parameters that are recursively estimated from successive records. One technique uses a standard point estimator of the correlation function to provide estimates of the parameters that minimize the mean-square error between the point estimates and the parametric function. The other technique provides estimates of the parameters that maximize a likelihood function relating the parameters of the function to the random process. Examples are presented.
Simultaneous Control of Error Rates in fMRI Data Analysis
Kang, Hakmook; Blume, Jeffrey; Ombao, Hernando; Badre, David
2015-01-01
The key idea of statistical hypothesis testing is to fix, and thereby control, the Type I error (false positive) rate across samples of any size. Multiple comparisons inflate the global (family-wise) Type I error rate and the traditional solution to maintaining control of the error rate is to increase the local (comparison-wise) Type II error (false negative) rates. However, in the analysis of human brain imaging data, the number of comparisons is so large that this solution breaks down: the local Type II error rate ends up being so large that scientifically meaningful analysis is precluded. Here we propose a novel solution to this problem: allow the Type I error rate to converge to zero along with the Type II error rate. It works because when the Type I error rate per comparison is very small, the accumulated (global) Type I error rate is also small. This solution is achieved by employing the Likelihood paradigm, which uses likelihood ratios to measure the strength of evidence on a voxel-by-voxel basis. In this paper, we provide theoretical and empirical justification for a likelihood approach to the analysis of human brain imaging data. In addition, we present extensive simulations that show the likelihood approach is viable, leading to ‘cleaner’ looking brain maps and operational superiority (lower average error rate). Finally, we include a case study on cognitive control related activation in the prefrontal cortex of the human brain. PMID:26272730
Multistage classification of multispectral Earth observational data: The design approach
NASA Technical Reports Server (NTRS)
Bauer, M. E. (Principal Investigator); Muasher, M. J.; Landgrebe, D. A.
1981-01-01
An algorithm is proposed which predicts the optimal features at every node in a binary tree procedure. The algorithm estimates the probability of error by approximating the area under the likelihood ratio function for two classes and taking into account the number of training samples used in estimating each of these two classes. Some results on feature selection techniques, particularly in the presence of a very limited set of training samples, are presented. Results comparing probabilities of error predicted by the proposed algorithm as a function of dimensionality as compared to experimental observations are shown for aircraft and LANDSAT data. Results are obtained for both real and simulated data. Finally, two binary tree examples which use the algorithm are presented to illustrate the usefulness of the procedure.
Estimating pole/zero errors in GSN-IRIS/USGS network calibration metadata
Ringler, A.T.; Hutt, C.R.; Aster, R.; Bolton, H.; Gee, L.S.; Storm, T.
2012-01-01
Mapping the digital record of a seismograph into true ground motion requires the correction of the data by some description of the instrument's response. For the Global Seismographic Network (Butler et al., 2004), as well as many other networks, this instrument response is represented as a Laplace domain pole–zero model and published in the Standard for the Exchange of Earthquake Data (SEED) format. This Laplace representation assumes that the seismometer behaves as a linear system, with any abrupt changes described adequately via multiple time-invariant epochs. The SEED format allows for published instrument response errors as well, but these typically have not been estimated or provided to users. We present an iterative three-step method to estimate the instrument response parameters (poles and zeros) and their associated errors using random calibration signals. First, we solve a coarse nonlinear inverse problem using a least-squares grid search to yield a first approximation to the solution. This approach reduces the likelihood of poorly estimated parameters (a local-minimum solution) caused by noise in the calibration records and enhances algorithm convergence. Second, we iteratively solve a nonlinear parameter estimation problem to obtain the least-squares best-fit Laplace pole–zero–gain model. Third, by applying the central limit theorem, we estimate the errors in this pole–zero model by solving the inverse problem at each frequency in a two-thirds octave band centered at each best-fit pole–zero frequency. This procedure yields error estimates of the 99% confidence interval. We demonstrate the method by applying it to a number of recent Incorporated Research Institutions in Seismology/United States Geological Survey (IRIS/USGS) network calibrations (network code IU).
Quantifying uncertainty in geoacoustic inversion. II. Application to broadband, shallow-water data.
Dosso, Stan E; Nielsen, Peter L
2002-01-01
This paper applies the new method of fast Gibbs sampling (FGS) to estimate the uncertainties of seabed geoacoustic parameters in a broadband, shallow-water acoustic survey, with the goal of interpreting the survey results and validating the method for experimental data. FGS applies a Bayesian approach to geoacoustic inversion based on sampling the posterior probability density to estimate marginal probability distributions and parameter covariances. This requires knowledge of the statistical distribution of the data errors, including both measurement and theory errors, which is generally not available. Invoking the simplifying assumption of independent, identically distributed Gaussian errors allows a maximum-likelihood estimate of the data variance and leads to a practical inversion algorithm. However, it is necessary to validate these assumptions, i.e., to verify that the parameter uncertainties obtained represent meaningful estimates. To this end, FGS is applied to a geoacoustic experiment carried out at a site off the west coast of Italy where previous acoustic and geophysical studies have been performed. The parameter uncertainties estimated via FGS are validated by comparison with: (i) the variability in the results of inverting multiple independent data sets collected during the experiment; (ii) the results of FGS inversion of synthetic test cases designed to simulate the experiment and data errors; and (iii) the available geophysical ground truth. Comparisons are carried out for a number of different source bandwidths, ranges, and levels of prior information, and indicate that FGS provides reliable and stable uncertainty estimates for the geoacoustic inverse problem.
Rayan, D Mark; Mohamad, Shariff Wan; Dorward, Leejiah; Aziz, Sheema Abdul; Clements, Gopalasamy Reuben; Christopher, Wong Chai Thiam; Traeholt, Carl; Magintan, David
2012-12-01
The endangered Asian tapir (Tapirus indicus) is threatened by large-scale habitat loss, forest fragmentation and increased hunting pressure. Conservation planning for this species, however, is hampered by a severe paucity of information on its ecology and population status. We present the first Asian tapir population density estimate from a camera trapping study targeting tigers in a selectively logged forest within Peninsular Malaysia using a spatially explicit capture-recapture maximum likelihood based framework. With a trap effort of 2496 nights, 17 individuals were identified corresponding to a density (standard error) estimate of 9.49 (2.55) adult tapirs/100 km². Although our results include several caveats, we believe that our density estimate still serves as an important baseline to facilitate the monitoring of tapir population trends in Peninsular Malaysia. Our study also highlights the potential of extracting vital ecological and population information for other cryptic individually identifiable animals from tiger-centric studies, especially with the use of a spatially explicit capture-recapture maximum likelihood based framework. © 2012 Wiley Publishing Asia Pty Ltd, ISZS and IOZ/CAS.
Eberhard, Wynn L
2017-04-01
The maximum likelihood estimator (MLE) is derived for retrieving the extinction coefficient and zero-range intercept in the lidar slope method in the presence of random and independent Gaussian noise. Least-squares fitting, weighted by the inverse of the noise variance, is equivalent to the MLE. Monte Carlo simulations demonstrate that two traditional least-squares fitting schemes, which use different weights, are less accurate. Alternative fitting schemes that have some positive attributes are introduced and evaluated. The principal factors governing accuracy of all these schemes are elucidated. Applying these schemes to data with Poisson rather than Gaussian noise alters accuracy little, even when the signal-to-noise ratio is low. Methods to estimate optimum weighting factors in actual data are presented. Even when the weighting estimates are coarse, retrieval accuracy declines only modestly. Mathematical tools are described for predicting retrieval accuracy. Least-squares fitting with inverse variance weighting has optimum accuracy for retrieval of parameters from single-wavelength lidar measurements when noise, errors, and uncertainties are Gaussian distributed, or close to optimum when only approximately Gaussian.
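A small sketch of the slope method with inverse-variance weighting, which the abstract identifies as equivalent to the MLE under Gaussian noise: the logarithm of the range-corrected signal is fitted linearly in range, with each gate weighted by the inverse variance of its log-transformed noise. The simulated profile, noise level and constants below are illustrative assumptions.

```python
# Sketch of the lidar slope method with inverse-variance weighting: fit
# ln(S * r^2) = ln C - 2 * alpha * r by weighted least squares, where the
# weight of each gate is the inverse variance of its log-transformed noise.
import numpy as np

rng = np.random.default_rng(6)
r = np.linspace(200.0, 1000.0, 100)                 # range gates (m)
alpha_true, C = 1e-3, 1e8                           # extinction (1/m), system constant
signal = C * np.exp(-2 * alpha_true * r) / r**2
noise_sd = 1.0
obs = signal + rng.normal(0.0, noise_sd, r.size)    # additive Gaussian detector noise

y = np.log(obs * r**2)
w = (obs / noise_sd) ** 2                           # 1 / var(log S), using obs as proxy for S
A = np.column_stack([np.ones_like(r), -2 * r])      # model: ln C - 2 * alpha * r
coef = np.linalg.solve(A.T @ (w[:, None] * A), A.T @ (w * y))
print("alpha estimate:", coef[1])                   # should be close to 1e-3
```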
Optimal designs based on the maximum quasi-likelihood estimator
Shen, Gang; Hyun, Seung Won; Wong, Weng Kee
2016-01-01
We use optimal design theory and construct locally optimal designs based on the maximum quasi-likelihood estimator (MqLE), which is derived under less stringent conditions than those required for the MLE method. We show that the proposed locally optimal designs are asymptotically as efficient as those based on the MLE when the error distribution is from an exponential family, and they perform just as well or better than optimal designs based on any other asymptotically linear unbiased estimators such as the least square estimator (LSE). In addition, we show current algorithms for finding optimal designs can be directly used to find optimal designs based on the MqLE. As an illustrative application, we construct a variety of locally optimal designs based on the MqLE for the 4-parameter logistic (4PL) model and study their robustness properties to misspecifications in the model using asymptotic relative efficiency. The results suggest that optimal designs based on the MqLE can be easily generated and they are quite robust to mis-specification in the probability distribution of the responses. PMID:28163359
Estimation of distributional parameters for censored trace-level water-quality data
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gilliom, R.J.; Helsel, D.R.
1984-01-01
A recurring difficulty encountered in investigations of many metals and organic contaminants in ambient waters is that a substantial portion of water-sample concentrations are below limits of detection established by analytical laboratories. Several methods were evaluated for estimating distributional parameters for such censored data sets using only uncensored observations. Their reliabilities were evaluated by a Monte Carlo experiment in which small samples were generated from a wide range of parent distributions and censored at varying levels. Eight methods were used to estimate the mean, standard deviation, median, and interquartile range. Criteria were developed, based on the distribution of uncensored observations, for determining the best-performing parameter estimation method for any particular data set. The most robust method for minimizing error in censored-sample estimates of the four distributional parameters over all simulation conditions was the log-probability regression method. With this method, censored observations are assumed to follow the zero-to-censoring level portion of a lognormal distribution obtained by a least-squares regression between logarithms of uncensored concentration observations and their z scores. When method performance was separately evaluated for each distributional parameter over all simulation conditions, the log-probability regression method still had the smallest errors for the mean and standard deviation, but the lognormal maximum likelihood method had the smallest errors for the median and interquartile range. When data sets were classified prior to parameter estimation into groups reflecting their probable parent distributions, the ranking of estimation methods was similar, but the accuracy of error estimates was markedly improved over those without classification. 6 figs., 6 tabs.
Ringing Artefact Reduction By An Efficient Likelihood Improvement Method
NASA Astrophysics Data System (ADS)
Fuderer, Miha
1989-10-01
In MR imaging, the extent of the acquired spatial frequencies of the object is necessarily finite. The resulting image shows artefacts caused by "truncation" of its Fourier components. These are known as Gibbs artefacts or ringing artefacts. These artefacts are particularly visible when the time-saving reduced acquisition method is used, say, when scanning only the lowest 70% of the 256 data lines. Filtering the data results in loss of resolution. A method is described that estimates the high-frequency data from the low-frequency data lines, with the likelihood of the image as criterion. It is a computationally very efficient method, since it requires practically only two extra Fourier transforms in addition to the normal reconstruction. The results of this method on MR images of human subjects are promising. Evaluations on a 70% acquisition image show about 20% decrease of the error energy after processing. "Error energy" is defined as the total power of the difference to a 256-data-lines reference image. The elimination of ringing artefacts then appears almost complete.
Encircling the dark: constraining dark energy via cosmic density in spheres
NASA Astrophysics Data System (ADS)
Codis, S.; Pichon, C.; Bernardeau, F.; Uhlemann, C.; Prunet, S.
2016-08-01
The recently published analytic probability density function for the mildly non-linear cosmic density field within spherical cells is used to build a simple but accurate maximum likelihood estimate for the redshift evolution of the variance of the density, which, as expected, is shown to have smaller relative error than the sample variance. This estimator provides a competitive probe for the equation of state of dark energy, reaching a few per cent accuracy on w_p and w_a for a Euclid-like survey. The corresponding likelihood function can take into account the configuration of the cells via their relative separations. A code to compute one-cell-density probability density functions for arbitrary initial power spectrum, top-hat smoothing and various spherical-collapse dynamics is made available online, so as to provide straightforward means of testing the effect of alternative dark energy models and initial power spectra on the low-redshift matter distribution.
Statistical inference for template aging
NASA Astrophysics Data System (ADS)
Schuckers, Michael E.
2006-04-01
A change in classification error rates for a biometric device is often referred to as template aging. Here we offer two methods for determining whether the effect of time is statistically significant. The first of these is the use of a generalized linear model to determine if these error rates change linearly over time. This approach generalizes previous work assessing the impact of covariates using generalized linear models. The second approach uses likelihood ratio test methodology. The focus here is on statistical methods for estimation, not the underlying cause of the change in error rates over time. These methodologies are applied to data from the National Institute of Standards and Technology Biometric Score Set Release 1. The results of these applications are discussed.
NASA Astrophysics Data System (ADS)
Wu, Guocan; Zheng, Xiaogu; Dan, Bo
2016-04-01
Shallow soil moisture observations are assimilated into the Common Land Model (CoLM) to estimate the soil moisture in different layers. The forecast error is inflated to improve the accuracy of the analysis state, and a water balance constraint is adopted to reduce the water budget residual in the assimilation procedure. The experiment results illustrate that adaptive forecast error inflation can reduce the analysis error, and that the proper inflation layer can be selected based on the -2 log-likelihood function of the innovation statistic. The water balance constraint substantially reduces the water budget residual, at a small cost in assimilation accuracy. The assimilation scheme can potentially be applied to assimilate remote sensing data.
NASA Technical Reports Server (NTRS)
Davis, John H.
1993-01-01
Lunar spherical harmonic gravity coefficients are estimated from simulated observations of a near-circular low altitude polar orbiter disturbed by lunar mascons. Lunar gravity sensing missions using earth-based nearside observations with and without satellite-based far-side observations are simulated and least squares maximum likelihood estimates are developed for spherical harmonic expansion fit models. Simulations and parameter estimations are performed by a modified version of the Smithsonian Astrophysical Observatory's Planetary Ephemeris Program. Two different lunar spacecraft mission phases are simulated to evaluate the estimated fit models. Results for predicting state covariances one orbit ahead are presented along with the state errors resulting from the mismodeled gravity field. The position errors from planning a lunar landing maneuver with a mismodeled gravity field are also presented. These simulations clearly demonstrate the need to include observations of satellite motion over the far side in estimating the lunar gravity field. The simulations also illustrate that the eighth degree and order expansions used in the simulated fits were unable to adequately model lunar mascons.
Variational Bayesian Parameter Estimation Techniques for the General Linear Model
Starke, Ludger; Ostwald, Dirk
2017-01-01
Variational Bayes (VB), variational maximum likelihood (VML), restricted maximum likelihood (ReML), and maximum likelihood (ML) are cornerstone parametric statistical estimation techniques in the analysis of functional neuroimaging data. However, the theoretical underpinnings of these model parameter estimation techniques are rarely covered in introductory statistical texts. Because of the widespread practical use of VB, VML, ReML, and ML in the neuroimaging community, we reasoned that a theoretical treatment of their relationships and their application in a basic modeling scenario may be helpful for both neuroimaging novices and practitioners alike. In this technical study, we thus revisit the conceptual and formal underpinnings of VB, VML, ReML, and ML and provide a detailed account of their mathematical relationships and implementational details. We further apply VB, VML, ReML, and ML to the general linear model (GLM) with non-spherical error covariance as commonly encountered in the first-level analysis of fMRI data. To this end, we explicitly derive the corresponding free energy objective functions and ensuing iterative algorithms. Finally, in the applied part of our study, we evaluate the parameter and model recovery properties of VB, VML, ReML, and ML, first in an exemplary setting and then in the analysis of experimental fMRI data acquired from a single participant under visual stimulation. PMID:28966572
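The simplest concrete contrast among these estimators is the variance of a linear model with spherical errors: ML divides the residual sum of squares by n and is biased downward, while ReML divides by n - p. The toy simulation below illustrates only that point; the paper develops the full free-energy treatment with non-spherical covariance, and all numbers here are made up.

```python
# Toy contrast between ML and ReML variance estimates in a linear model with
# spherical errors: ML divides the residual sum of squares by n (biased low),
# ReML divides by n - p.
import numpy as np

rng = np.random.default_rng(7)
n, p, sigma2_true = 40, 5, 2.0
X = rng.normal(size=(n, p))
beta = rng.normal(size=p)
ml, reml = [], []
for _ in range(2000):
    y = X @ beta + rng.normal(0.0, np.sqrt(sigma2_true), n)
    resid = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]
    rss = resid @ resid
    ml.append(rss / n)            # maximum likelihood estimate of sigma^2
    reml.append(rss / (n - p))    # restricted maximum likelihood estimate
print(np.mean(ml), np.mean(reml), "true:", sigma2_true)   # ML is biased low
```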
Bayesian Methods for the Physical Sciences. Learning from Examples in Astronomy and Physics.
NASA Astrophysics Data System (ADS)
Andreon, Stefano; Weaver, Brian
2015-05-01
Chapter 1: This chapter presents some basic steps for performing a good statistical analysis, all summarized in about one page.
Chapter 2: This short chapter introduces the basics of probability theory in an intuitive fashion using simple examples. It also illustrates, again with examples, how to propagate errors and the difference between marginal and profile likelihoods.
Chapter 3: This chapter introduces the computational tools and methods that we use for sampling from the posterior distribution. Since all numerical computations, and Bayesian ones are no exception, may end in errors, we also provide a few tips to check that the numerical computation is sampling from the posterior distribution.
Chapter 4: Many of the concepts of building, running, and summarizing the results of a Bayesian analysis are described with this step-by-step guide using a basic (Gaussian) model. The chapter also introduces examples using Poisson and Binomial likelihoods, and how to combine repeated independent measurements.
Chapter 5: All statistical analyses make assumptions, and Bayesian analyses are no exception. This chapter emphasizes that results depend on data and priors (assumptions). We illustrate this concept with examples where the prior plays greatly different roles, from major to negligible. We also provide some advice on how to look for information useful for sculpting the prior.
Chapter 6: In this chapter we consider examples for which we want to estimate more than a single parameter. These common problems include estimating location and spread. We also consider examples that require the modeling of two populations (one we are interested in and a nuisance population) or averaging incompatible measurements. We also introduce quite complex examples dealing with upper limits and with a larger-than-expected scatter.
Chapter 7: Rarely is a sample randomly selected from the population we wish to study. Often, samples are affected by selection effects, e.g., easier-to-collect events or objects are over-represented in samples and difficult-to-collect ones are under-represented if not missing altogether. In this chapter we show how to account for non-random data collection to infer the properties of the population from the studied sample.
Chapter 8: In this chapter we introduce regression models, i.e., how to fit (regress) one or more quantities against each other through a functional relationship and estimate any unknown parameters that dictate this relationship. Questions of interest include: how to deal with samples affected by selection effects? How does a rich data structure influence the fitted parameters? And what about non-linear multiple-predictor fits, upper/lower limits, measurement errors of different amplitudes, and an intrinsic variety in the studied populations or an extra source of variability? A number of examples illustrate how to answer these questions and how to predict the value of an unavailable quantity by exploiting the existence of a trend with another, available, quantity.
Chapter 9: This chapter provides some advice on how the careful scientist should perform model checking and sensitivity analysis, i.e., how to answer the following questions: is the considered model at odds with the currently available data (the fitted data), for example because it is over-simplified compared to some specific complexity pointed out by the data? Furthermore, are the data informative about the quantity being measured, or are results sensibly dependent on details of the fitted model? And, finally, what if assumptions are uncertain? A number of examples illustrate how to answer these questions.
Chapter 10: This chapter compares the performance of Bayesian methods against simple, non-Bayesian alternatives, such as maximum likelihood, minimal chi square, ordinary and weighted least squares, bivariate correlated errors and intrinsic scatter, and robust estimates of location and scale. Performances are evaluated in terms of quality of the prediction, accuracy of the estimates, and fairness and noisiness of the quoted errors. We also focus on three failures of maximum likelihood methods occurring with small samples, with mixtures, and with regressions with errors in the predictor quantity.
Quantum State Tomography via Linear Regression Estimation
Qi, Bo; Hou, Zhibo; Li, Li; Dong, Daoyi; Xiang, Guoyong; Guo, Guangcan
2013-01-01
A simple yet efficient state reconstruction algorithm of linear regression estimation (LRE) is presented for quantum state tomography. In this method, quantum state reconstruction is converted into a parameter estimation problem of a linear regression model and the least-squares method is employed to estimate the unknown parameters. An asymptotic mean squared error (MSE) upper bound for all possible states to be estimated is given analytically, which depends explicitly upon the involved measurement bases. This analytical MSE upper bound can guide one to choose optimal measurement sets. The computational complexity of LRE is O(d⁴), where d is the dimension of the quantum state. Numerical examples show that LRE is much faster than maximum-likelihood estimation for quantum state tomography. PMID:24336519
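A one-qubit sketch of the linear regression idea: outcome frequencies for a few measurement directions are linear in the Bloch vector, so the state follows from ordinary least squares. The measurement directions, shot counts, and true state below are made-up assumptions, and the reconstructed matrix is not forced to be positive semidefinite, an issue the full method addresses.

```python
# One-qubit tomography by linear regression estimation: frequencies are
# linear in the Bloch vector r, so r is recovered by least squares and the
# density matrix assembled from the Pauli basis.
import numpy as np

rng = np.random.default_rng(8)
r_true = np.array([0.4, -0.3, 0.7])            # Bloch vector of the unknown state
dirs = np.array([[1, 1, 1], [1, -1, -1], [-1, 1, -1], [-1, -1, 1]]) / np.sqrt(3)
shots = 5000

p_plus = 0.5 * (1 + dirs @ r_true)             # Born rule for spin along each direction
freq = rng.binomial(shots, p_plus) / shots     # measured frequencies

# Linear model: freq ~ 0.5 + 0.5 * dirs @ r  ->  least squares for r
r_hat, *_ = np.linalg.lstsq(0.5 * dirs, freq - 0.5, rcond=None)

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
sz = np.array([[1, 0], [0, -1]], dtype=complex)
rho = 0.5 * (np.eye(2) + r_hat[0] * sx + r_hat[1] * sy + r_hat[2] * sz)
print(r_hat)
print(rho)
```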
Abstract: Inference and Interval Estimation for Indirect Effects With Latent Variable Models.
Falk, Carl F; Biesanz, Jeremy C
2011-11-30
Models specifying indirect effects (or mediation) and structural equation modeling are both popular in the social sciences. Yet relatively little research has compared methods that test for indirect effects among latent variables and provided precise estimates of the effectiveness of different methods. This simulation study provides an extensive comparison of methods for constructing confidence intervals and for making inferences about indirect effects with latent variables. We compared the percentile (PC) bootstrap, bias-corrected (BC) bootstrap, bias-corrected accelerated (BCa) bootstrap, likelihood-based confidence intervals (Neale & Miller, 1997), partial posterior predictive (Biesanz, Falk, and Savalei, 2010), and joint significance tests based on Wald tests or likelihood ratio tests. All models included three reflective latent variables representing the independent, dependent, and mediating variables. The design included the following fully crossed conditions: (a) sample size: 100, 200, and 500; (b) number of indicators per latent variable: 3 versus 5; (c) reliability per set of indicators: .7 versus .9; (d) and 16 different path combinations for the indirect effect (α = 0, .14, .39, or .59; and β = 0, .14, .39, or .59). Simulations were performed using a WestGrid cluster of 1680 3.06 GHz Intel Xeon processors running R and OpenMx. Results based on 1,000 replications per cell and 2,000 resamples per bootstrap method indicated that the BC and BCa bootstrap methods have inflated Type I error rates. Likelihood-based confidence intervals and the PC bootstrap emerged as methods that adequately control Type I error and have good coverage rates.
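An observed-variable analogue of the percentile bootstrap interval for the indirect effect a·b in a simple X → M → Y mediation model is sketched below; the study itself evaluates these intervals for latent-variable models fit in OpenMx, so the plain regressions and simulated data here are only illustrative stand-ins.

```python
# Percentile bootstrap confidence interval for the indirect effect a*b in a
# simple X -> M -> Y mediation model with observed variables.
import numpy as np

rng = np.random.default_rng(9)
n, a_true, b_true = 200, 0.39, 0.39
x = rng.normal(size=n)
m = a_true * x + rng.normal(size=n)
y = b_true * m + rng.normal(size=n)

def indirect(x, m, y):
    a = np.polyfit(x, m, 1)[0]                       # slope of M on X
    Z = np.column_stack([np.ones_like(x), m, x])     # Y on M, controlling for X
    b = np.linalg.lstsq(Z, y, rcond=None)[0][1]
    return a * b

boot = np.empty(2000)
for i in range(boot.size):
    idx = rng.integers(0, n, n)                      # resample cases with replacement
    boot[i] = indirect(x[idx], m[idx], y[idx])
lo, hi = np.percentile(boot, [2.5, 97.5])
print(indirect(x, m, y), (lo, hi))
```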
Approximate maximum likelihood decoding of block codes
NASA Technical Reports Server (NTRS)
Greenberger, H. J.
1979-01-01
Approximate maximum likelihood decoding algorithms, based upon selecting a small set of candidate code words with the aid of the estimated probability of error of each received symbol, can give performance close to optimum with a reasonable amount of computation. By combining the best features of various algorithms and taking care to perform each step as efficiently as possible, a decoding scheme was developed which can decode codes which have better performance than those presently in use and yet not require an unreasonable amount of computation. The discussion of the details and tradeoffs of presently known efficient optimum and near optimum decoding algorithms leads, naturally, to the one which embodies the best features of all of them.
Kuwabara, Masaru; Mansouri, Farshad A.; Buckley, Mark J.
2014-01-01
Monkeys were trained to select one of three targets by matching in color or matching in shape to a sample. Because the matching rule frequently changed and there were no cues for the currently relevant rule, monkeys had to maintain the relevant rule in working memory to select the correct target. We found that monkeys' error commission was not limited to the period after the rule change and occasionally occurred even after several consecutive correct trials, indicating that the task was cognitively demanding. In trials immediately after such error trials, monkeys' speed of selecting targets was slower. Additionally, in trials following consecutive correct trials, the monkeys' target selections for erroneous responses were slower than those for correct responses. We further found evidence for the involvement of the cortex in the anterior cingulate sulcus (ACCs) in these error-related behavioral modulations. First, ACCs cell activity differed between after-error and after-correct trials. In another group of ACCs cells, the activity differed depending on whether the monkeys were making a correct or erroneous decision in target selection. Second, bilateral ACCs lesions significantly abolished the response slowing both in after-error trials and in error trials. The error likelihood in after-error trials could be inferred by the error feedback in the previous trial, whereas the likelihood of erroneous responses after consecutive correct trials could be monitored only internally. These results suggest that ACCs represent both context-dependent and internally detected error likelihoods and promote modes of response selections in situations that involve these two types of error likelihood. PMID:24872558
Martire, Kristy A; Growns, Bethany; Navarro, Danielle J
2018-04-17
Forensic handwriting examiners currently testify to the origin of questioned handwriting for legal purposes. However, forensic scientists are increasingly being encouraged to assign probabilities to their observations in the form of a likelihood ratio. This study is the first to examine whether handwriting experts are able to estimate the frequency of US handwriting features more accurately than novices. The results indicate that the absolute error for experts was lower than for novices, but the size of the effect is modest, and the overall error rate even for experts is large enough to raise questions about whether their estimates can be sufficiently trustworthy for presentation in courts. When errors are separated into effects caused by miscalibration and those caused by imprecision, we find systematic differences between individuals. Finally, we consider several ways of aggregating predictions from multiple experts, suggesting that quite substantial improvements in expert predictions are possible when a suitable aggregation method is used.
An, Lihua; Fung, Karen Y; Krewski, Daniel
2010-09-01
Spontaneous adverse event reporting systems are widely used to identify adverse reactions to drugs following their introduction into the marketplace. In this article, a James-Stein type shrinkage estimation strategy was developed in a Bayesian logistic regression model to analyze pharmacovigilance data. This method is effective in detecting signals as it combines information and borrows strength across medically related adverse events. Computer simulation demonstrated that the shrinkage estimator is uniformly better than the maximum likelihood estimator in terms of mean squared error. This method was used to investigate the possible association of a series of diabetic drugs and the risk of cardiovascular events using data from the Canada Vigilance Online Database.
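The borrowing-of-strength idea can be seen in the classic James-Stein estimator for several normal means, which shrinks each estimate toward the grand mean; the paper embeds this kind of shrinkage in a Bayesian logistic regression across related adverse events rather than the plain normal-means form sketched here, where the sampling variance is assumed known and the data are simulated.

```python
# Classic positive-part James-Stein shrinkage of several independent normal
# estimates toward their grand mean. sigma2 is the (assumed known) sampling
# variance of each estimate; effects and data are simulated.
import numpy as np

rng = np.random.default_rng(10)
k, sigma2 = 12, 1.0
theta = rng.normal(0.0, 0.5, k)                 # true effects for k related events
y = theta + rng.normal(0.0, np.sqrt(sigma2), k) # one noisy estimate per event

grand = y.mean()
s = np.sum((y - grand) ** 2)
shrink = max(0.0, 1.0 - (k - 3) * sigma2 / s)   # positive-part James-Stein factor
js = grand + shrink * (y - grand)

# Shrinkage typically lowers the total squared error relative to the raw estimates.
print(np.mean((y - theta) ** 2), np.mean((js - theta) ** 2))
```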
NASA Technical Reports Server (NTRS)
Tsaoussi, Lucia S.; Koblinsky, Chester J.
1994-01-01
In order to facilitate the use of satellite-derived sea surface topography and velocity in oceanographic models, methodology is presented for deriving the total error covariance and its geographic distribution from TOPEX/POSEIDON measurements. The model is formulated using a parametric model fit to the altimeter range observations. The topography and velocity are modeled with spherical harmonic expansions whose coefficients are found through optimal adjustment to the altimeter range residuals using Bayesian statistics. All other parameters, including the orbit, geoid, surface models, and range corrections, are provided as unadjusted parameters. The maximum likelihood estimates and errors are derived from the probability density function of the altimeter range residuals conditioned with a priori information. Estimates of model errors for the unadjusted parameters are obtained from the TOPEX/POSEIDON postlaunch verification results and the error covariances for the orbit and the geoid, except for the ocean tides. The error in the ocean tides is modeled, first, as the difference between two global tide models and, second, as the correction to the present tide model, the correction derived from the TOPEX/POSEIDON data. A formal error covariance propagation scheme is used to derive the total error. Our global total error estimate for the TOPEX/POSEIDON topography relative to the geoid for one 10-day period is found to be 11 cm RMS. When the error in the geoid is removed, thereby providing an estimate of the time-dependent error, the uncertainty in the topography is 3.5 cm root mean square (RMS). This level of accuracy is consistent with direct comparisons of TOPEX/POSEIDON altimeter heights with tide gauge measurements at 28 stations. In addition, the error correlation length scales are derived globally in both east-west and north-south directions, which should prove useful for data assimilation. The largest error correlation length scales are found in the tropics. Errors in the velocity field are smallest in midlatitude regions. For both variables, the largest errors are caused by uncertainty in the geoid. More accurate representations of the geoid await a dedicated geopotential satellite mission. Substantial improvements in the accuracy of ocean tide models are expected in the very near future from research with TOPEX/POSEIDON data.
The early maximum likelihood estimation model of audiovisual integration in speech perception.
Andersen, Tobias S
2015-05-01
Speech perception is facilitated by seeing the articulatory mouth movements of the talker. This is due to perceptual audiovisual integration, which also causes the McGurk-MacDonald illusion, and for which a comprehensive computational account is still lacking. Decades of research have largely focused on the fuzzy logical model of perception (FLMP), which provides excellent fits to experimental observations but also has been criticized for being too flexible, post hoc and difficult to interpret. The current study introduces the early maximum likelihood estimation (MLE) model of audiovisual integration to speech perception along with three model variations. In early MLE, integration is based on a continuous internal representation before categorization, which can make the model more parsimonious by imposing constraints that reflect experimental designs. The study also shows that cross-validation can evaluate models of audiovisual integration based on typical data sets taking both goodness-of-fit and model flexibility into account. All models were tested on a published data set previously used for testing the FLMP. Cross-validation favored the early MLE while more conventional error measures favored more complex models. This difference between conventional error measures and cross-validation was found to be indicative of over-fitting in more complex models such as the FLMP.
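As a small worked illustration of the core idea behind early MLE integration (fusing the cues on a continuous internal axis before categorization), the sketch below shows standard reliability-weighted cue combination; the cue means and variances are illustrative values, not parameters fitted in the study.

```python
import numpy as np

def mle_fuse(x_a, var_a, x_v, var_v):
    """Combine auditory and visual cues by inverse-variance weighting."""
    w_a = (1.0 / var_a) / (1.0 / var_a + 1.0 / var_v)
    x_av = w_a * x_a + (1.0 - w_a) * x_v          # fused estimate
    var_av = 1.0 / (1.0 / var_a + 1.0 / var_v)    # fused variance, never larger
    return x_av, var_av

# Illustrative cues: a noisy auditory cue and a more reliable visual cue.
print(mle_fuse(x_a=0.2, var_a=1.0, x_v=0.9, var_v=0.25))
```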
Dynamic Method for Identifying Collected Sample Mass
NASA Technical Reports Server (NTRS)
Carson, John
2008-01-01
G-Sample is designed for sample collection missions to identify the presence and quantity of sample material gathered by spacecraft equipped with end effectors. The software method uses a maximum-likelihood estimator to identify the collected sample's mass based on onboard force-sensor measurements, thruster firings, and a dynamics model of the spacecraft. This makes sample mass identification a computation rather than a process requiring additional hardware. Simulation examples of G-Sample are provided for spacecraft model configurations with a sample collection device mounted on the end of an extended boom. In the absence of thrust knowledge errors, the results indicate that G-Sample can identify the amount of collected sample mass to within 10 grams (with 95-percent confidence) by using a force sensor with a noise and quantization floor of 50 micrometers. These results hold even in the presence of realistic parametric uncertainty in actual spacecraft inertia, center-of-mass offset, and first flexibility modes. Thrust profile knowledge is shown to be a dominant sensitivity for G-Sample, entering in a nearly one-to-one relationship with the final mass estimation error. This means thrust profiles should be well characterized with onboard accelerometers prior to sample collection. An overall sample-mass estimation error budget has been developed to approximate the effect of model uncertainty, sensor noise, data rate, and thrust profile error on the expected estimate of collected sample mass.
NASA Astrophysics Data System (ADS)
Winiarek, Victor; Bocquet, Marc; Saunier, Olivier; Mathieu, Anne
2012-03-01
A major difficulty when inverting the source term of an atmospheric tracer dispersion problem is the estimation of the prior errors: those of the atmospheric transport model, those ascribed to the representativity of the measurements, those that are instrumental, and those attached to the prior knowledge on the variables one seeks to retrieve. In the case of an accidental release of pollutant, the reconstructed source is sensitive to these assumptions. This sensitivity makes the quality of the retrieval dependent on the methods used to model and estimate the prior errors of the inverse modeling scheme. We propose to use an estimation method for the errors' amplitude based on the maximum likelihood principle. Under semi-Gaussian assumptions, it takes into account, without approximation, the positivity assumption on the source. We apply the method to the estimation of the Fukushima Daiichi source term using activity concentrations in the air. The results are compared to an L-curve estimation technique and to Desroziers's scheme. The total reconstructed activities significantly depend on the chosen method. Because of the poor observability of the Fukushima Daiichi emissions, these methods provide lower bounds for cesium-137 and iodine-131 reconstructed activities. These lower bound estimates, 1.2 × 10¹⁶ Bq for cesium-137, with an estimated standard deviation range of 15%-20%, and 1.9-3.8 × 10¹⁷ Bq for iodine-131, with an estimated standard deviation range of 5%-10%, are of the same order of magnitude as those provided by the Japanese Nuclear and Industrial Safety Agency and about 5 to 10 times less than the Chernobyl atmospheric releases.
NASA Astrophysics Data System (ADS)
Khaleghi, Mohammad Reza; Varvani, Javad
2018-02-01
The complex and variable nature of river sediment yield causes many problems in estimating long-term sediment yield and the sediment input into reservoirs. Sediment Rating Curves (SRCs) are generally used to estimate the suspended sediment load of rivers and drainage watersheds. Since the regression equations of the SRCs are obtained by logarithmic retransformation and include few independent variables, they tend to overestimate or underestimate the true sediment load of the rivers. To evaluate the bias correction factors in the Kalshor and Kashafroud watersheds, seven hydrometric stations of this region with suitable upstream watersheds and spatial distribution were selected. Investigation of the accuracy index (ratio of estimated sediment yield to observed sediment yield) and the precision index of different bias correction factors, namely those of the FAO, the Quasi-Maximum Likelihood Estimator (QMLE), smearing, and the Minimum-Variance Unbiased Estimator (MVUE), with an LSD test showed that the FAO coefficient increases the estimation error at all of the stations. Application of the MVUE to the linear and mean load rating curves has no statistically meaningful effect. The QMLE and smearing factors increased the estimation error for the mean load rating curve but had no effect on the linear rating curve estimates.
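The sketch below illustrates the two most common retransformation corrections for a log-log rating curve, the QMLE factor exp(s²/2) and Duan's smearing factor; it uses synthetic discharge and load data rather than the gauged records of the study, and it does not reproduce the FAO or MVUE corrections.

```python
import numpy as np

# Synthetic rating-curve data with lognormal scatter (not gauged records).
rng = np.random.default_rng(1)
Q = rng.uniform(5, 500, 200)                       # discharge
true_load = 0.05 * Q ** 1.6                        # "true" sediment load
load = true_load * np.exp(rng.normal(0, 0.5, Q.size))

# Ordinary least squares fit in log space: log(load) = a + b*log(Q).
X = np.column_stack([np.ones(Q.size), np.log(Q)])
coef, *_ = np.linalg.lstsq(X, np.log(load), rcond=None)
resid = np.log(load) - X @ coef
s2 = resid.var(ddof=2)

naive = np.exp(X @ coef)                   # retransformed without correction
qmle = naive * np.exp(s2 / 2.0)            # quasi-maximum-likelihood factor
smearing = naive * np.mean(np.exp(resid))  # Duan's smearing factor

for name, est in [("naive", naive), ("QMLE", qmle), ("smearing", smearing)]:
    # Ratio of estimated to true total yield; 1.0 would be unbiased.
    print(name, round(est.sum() / true_load.sum(), 3))
```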
Comparison of estimators of standard deviation for hydrologic time series
Tasker, Gary D.; Gilroy, Edward J.
1982-01-01
Unbiasing factors as a function of serial correlation, ρ, and sample size, n, for the sample standard deviation of a lag-one autoregressive model were generated by random number simulation. Monte Carlo experiments were used to compare the performance of several alternative methods for estimating the standard deviation σ of a lag-one autoregressive model in terms of bias, root mean square error, probability of underestimation, and expected opportunity design loss. Three methods provided estimates of σ which were much less biased, but had greater mean square errors, than the usual estimate of σ: s = [(1/(n − 1)) ∑(xᵢ − x̄)²]^(1/2). The three methods may be briefly characterized as (1) a method using a maximum likelihood estimate of the unbiasing factor, (2) a method using an empirical Bayes estimate of the unbiasing factor, and (3) a robust nonparametric estimate of σ suggested by Quenouille. Because s tends to underestimate σ, its use as an estimate of a model parameter results in a tendency to underdesign. If underdesign losses are considered more serious than overdesign losses, then the choice of one of the less biased methods may be wise.
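A short Monte Carlo check in the same spirit is sketched below: it shows that the usual sample standard deviation s underestimates σ for a lag-one autoregressive series and reports an empirical unbiasing factor. The choices of ρ, n and the number of replicates are illustrative, not those of the study.

```python
import numpy as np

def ar1_series(n, rho, sigma=1.0, rng=None):
    """Simulate a stationary lag-one autoregressive series with SD sigma."""
    rng = rng or np.random.default_rng()
    x = np.empty(n)
    x[0] = rng.normal(0, sigma)
    innov_sd = sigma * np.sqrt(1.0 - rho ** 2)
    for t in range(1, n):
        x[t] = rho * x[t - 1] + rng.normal(0, innov_sd)
    return x

rng = np.random.default_rng(42)
n, rho, reps = 30, 0.6, 20000
s = np.array([ar1_series(n, rho, rng=rng).std(ddof=1) for _ in range(reps)])
print("mean of s:", round(s.mean(), 3), " true sigma: 1.0")  # mean of s < 1
print("empirical unbiasing factor:", round(1.0 / s.mean(), 3))
```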
Chou, C P; Bentler, P M; Satorra, A
1991-11-01
Research studying robustness of maximum likelihood (ML) statistics in covariance structure analysis has concluded that test statistics and standard errors are biased under severe non-normality. An estimation procedure known as asymptotic distribution free (ADF), making no distributional assumption, has been suggested to avoid these biases. Corrections to the normal theory statistics to yield more adequate performance have also been proposed. This study compares the performance of a scaled test statistic and robust standard errors for two models under several non-normal conditions and also compares these with the results from ML and ADF methods. Both ML and ADF test statistics performed rather well in one model and considerably worse in the other. In general, the scaled test statistic seemed to behave better than the ML test statistic and the ADF statistic performed the worst. The robust and ADF standard errors yielded more appropriate estimates of sampling variability than the ML standard errors, which were usually downward biased, in both models under most of the non-normal conditions. ML test statistics and standard errors were found to be quite robust to the violation of the normality assumption when data had either symmetric and platykurtic distributions, or non-symmetric and zero kurtotic distributions.
Accounting for Sampling Error in Genetic Eigenvalues Using Random Matrix Theory.
Sztepanacz, Jacqueline L; Blows, Mark W
2017-07-01
The distribution of genetic variance in multivariate phenotypes is characterized by the empirical spectral distribution of the eigenvalues of the genetic covariance matrix. Empirical estimates of genetic eigenvalues from random effects linear models are known to be overdispersed by sampling error, where large eigenvalues are biased upward, and small eigenvalues are biased downward. The overdispersion of the leading eigenvalues of sample covariance matrices has been demonstrated to conform to the Tracy-Widom (TW) distribution. Here we show that genetic eigenvalues estimated using restricted maximum likelihood (REML) in a multivariate random effects model with an unconstrained genetic covariance structure will also conform to the TW distribution after empirical scaling and centering. However, where estimation procedures using either REML or MCMC impose boundary constraints, the resulting genetic eigenvalues tend not to be TW distributed. We show how using confidence intervals from sampling distributions of genetic eigenvalues without reference to the TW distribution is insufficient protection against mistaking sampling error for genetic variance, particularly when eigenvalues are small. By scaling such sampling distributions to the appropriate TW distribution, the critical value of the TW statistic can be used to determine if the magnitude of a genetic eigenvalue exceeds the sampling error for each eigenvalue in the spectral distribution of a given genetic covariance matrix. Copyright © 2017 by the Genetics Society of America.
Reverse Transcription Errors and RNA-DNA Differences at Short Tandem Repeats.
Fungtammasan, Arkarachai; Tomaszkiewicz, Marta; Campos-Sánchez, Rebeca; Eckert, Kristin A; DeGiorgio, Michael; Makova, Kateryna D
2016-10-01
Transcript variation has important implications for organismal function in health and disease. Most transcriptome studies focus on assessing variation in gene expression levels and isoform representation. Variation at the level of transcript sequence is caused by RNA editing and transcription errors, and leads to nongenetically encoded transcript variants, or RNA-DNA differences (RDDs). Such variation has been understudied, in part because its detection is obscured by reverse transcription (RT) and sequencing errors. It has only been evaluated for intertranscript base substitution differences. Here, we investigated transcript sequence variation for short tandem repeats (STRs). We developed the first maximum-likelihood estimator (MLE) to infer RT error and RDD rates, taking next generation sequencing error rates into account. Using the MLE, we empirically evaluated RT error and RDD rates for STRs in a large-scale DNA and RNA replicated sequencing experiment conducted in a primate species. The RT error rates increased exponentially with STR length and were biased toward expansions. The RDD rates were approximately 1 order of magnitude lower than the RT error rates. The RT error rates estimated with the MLE from a primate data set were concordant with those estimated with an independent method, barcoded RNA sequencing, from a Caenorhabditis elegans data set. Our results have important implications for medical genomics, as STR allelic variation is associated with >40 diseases. STR nonallelic transcript variation can also contribute to disease phenotype. The MLE and empirical rates presented here can be used to evaluate the probability of disease-associated transcripts arising due to RDD. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Shimura, Masashi; Maruo, Kazushi; Gosho, Masahiko
2018-04-23
Two-stage designs are widely used to determine whether a clinical trial should be terminated early. In such trials, a maximum likelihood estimate is often adopted to describe the difference in efficacy between the experimental and reference treatments; however, this method is known to display conditional bias. To reduce such bias, a conditional mean-adjusted estimator (CMAE) has been proposed, although the remaining bias may be nonnegligible when a trial is stopped for efficacy at the interim analysis. We propose a new estimator for adjusting the conditional bias of the treatment effect by extending the idea of the CMAE. This estimator is calculated by weighting the maximum likelihood estimate obtained at the interim analysis and the effect size prespecified when calculating the sample size. We evaluate the performance of the proposed estimator through analytical and simulation studies in various settings in which a trial is stopped for efficacy or futility at the interim analysis. We find that the conditional bias of the proposed estimator is smaller than that of the CMAE when the information time at the interim analysis is small. In addition, the mean-squared error of the proposed estimator is also smaller than that of the CMAE. In conclusion, we recommend the use of the proposed estimator for trials that are terminated early for efficacy or futility. Copyright © 2018 John Wiley & Sons, Ltd.
Wang, Chaolong; Schroeder, Kari B.; Rosenberg, Noah A.
2012-01-01
Allelic dropout is a commonly observed source of missing data in microsatellite genotypes, in which one or both allelic copies at a locus fail to be amplified by the polymerase chain reaction. Especially for samples with poor DNA quality, this problem causes a downward bias in estimates of observed heterozygosity and an upward bias in estimates of inbreeding, owing to mistaken classifications of heterozygotes as homozygotes when one of the two copies drops out. One general approach for avoiding allelic dropout involves repeated genotyping of homozygous loci to minimize the effects of experimental error. Existing computational alternatives often require replicate genotyping as well. These approaches, however, are costly and are suitable only when enough DNA is available for repeated genotyping. In this study, we propose a maximum-likelihood approach together with an expectation-maximization algorithm to jointly estimate allelic dropout rates and allele frequencies when only one set of nonreplicated genotypes is available. Our method considers estimates of allelic dropout caused by both sample-specific factors and locus-specific factors, and it allows for deviation from Hardy–Weinberg equilibrium owing to inbreeding. Using the estimated parameters, we correct the bias in the estimation of observed heterozygosity through the use of multiple imputations of alleles in cases where dropout might have occurred. With simulated data, we show that our method can (1) effectively reproduce patterns of missing data and heterozygosity observed in real data; (2) correctly estimate model parameters, including sample-specific dropout rates, locus-specific dropout rates, and the inbreeding coefficient; and (3) successfully correct the downward bias in estimating the observed heterozygosity. We find that our method is fairly robust to violations of model assumptions caused by population structure and by genotyping errors from sources other than allelic dropout. Because the data sets imputed under our model can be investigated in additional subsequent analyses, our method will be useful for preparing data for applications in diverse contexts in population genetics and molecular ecology. PMID:22851645
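Below is a toy, single-locus EM sketch of the joint estimation idea described above (allele frequency and a dropout rate from nonreplicated genotype counts). It is a heavily simplified stand-in: the full method handles many loci, sample- and locus-specific dropout rates, and inbreeding, and the genotype counts used here are invented for illustration.

```python
import numpy as np

# Observed genotype counts at one biallelic locus (heterozygotes depleted
# relative to Hardy-Weinberg because of dropout). Hypothetical numbers.
n_AA, n_Aa, n_aa = 420, 310, 270
N = n_AA + n_Aa + n_aa

p, beta = 0.5, 0.1   # initial allele frequency and dropout rate
for _ in range(200):
    # E-step: probability that an observed homozygote is really an Aa
    # heterozygote in which one allelic copy dropped out.
    pr_AA, pr_Aa, pr_aa = p**2, 2*p*(1-p), (1-p)**2
    post_Aa_given_AA = (pr_Aa * beta/2) / (pr_AA + pr_Aa * beta/2)
    post_Aa_given_aa = (pr_Aa * beta/2) / (pr_aa + pr_Aa * beta/2)
    e_Aa = n_Aa + n_AA * post_Aa_given_AA + n_aa * post_Aa_given_aa
    e_AA = n_AA * (1 - post_Aa_given_AA)
    # M-step: update allele frequency and dropout rate from expected counts.
    p = (2 * e_AA + e_Aa) / (2 * N)
    beta = (e_Aa - n_Aa) / e_Aa
print("allele frequency:", round(p, 3), " dropout rate:", round(beta, 3))
```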
Corrected score estimation in the proportional hazards model with misclassified discrete covariates
Zucker, David M.; Spiegelman, Donna
2013-01-01
We consider Cox proportional hazards regression when the covariate vector includes error-prone discrete covariates along with error-free covariates, which may be discrete or continuous. The misclassification in the discrete error-prone covariates is allowed to be of any specified form. Building on the work of Nakamura and his colleagues, we present a corrected score method for this setting. The method can handle all three major study designs (internal validation design, external validation design, and replicate measures design), both functional and structural error models, and time-dependent covariates satisfying a certain ‘localized error’ condition. We derive the asymptotic properties of the method and indicate how to adjust the covariance matrix of the regression coefficient estimates to account for estimation of the misclassification matrix. We present the results of a finite-sample simulation study under Weibull survival with a single binary covariate having known misclassification rates. The performance of the method described here was similar to that of related methods we have examined in previous works. Specifically, our new estimator performed as well as or, in a few cases, better than the full Weibull maximum likelihood estimator. We also present simulation results for our method for the case where the misclassification probabilities are estimated from an external replicate measures study. Our method generally performed well in these simulations. The new estimator has a broader range of applicability than many other estimators proposed in the literature, including those described in our own earlier work, in that it can handle time-dependent covariates with an arbitrary misclassification structure. We illustrate the method on data from a study of the relationship between dietary calcium intake and distal colon cancer. PMID:18219700
NASA Astrophysics Data System (ADS)
Zhou, X.; Albertson, J. D.
2016-12-01
Natural gas is considered a bridge fuel towards clean energy due to its potentially lower greenhouse gas emissions compared with other fossil fuels. Despite numerous efforts, an efficient and cost-effective approach to monitor fugitive methane emissions along the natural gas production-supply chain has not been developed yet. Recently, mobile methane measurement has been introduced, which applies a Bayesian approach to probabilistically infer methane emission rates and update estimates recursively when new measurements become available. However, the likelihood function, especially the error term which determines the shape of the estimate uncertainty, is not rigorously defined and evaluated with field data. To address this issue, we performed a series of near-source (< 30 m) controlled methane release experiments using a specialized vehicle mounted with fast response methane analyzers and a GPS unit. Methane concentrations were measured at two different heights along mobile traversals downwind of the sources, and concurrent wind and temperature data were recorded by nearby 3-D sonic anemometers. With known methane release rates, the measurements were used to determine the functional form and the parameterization of the likelihood function in the Bayesian inference scheme under different meteorological conditions.
Kimura, Akatsuki; Celani, Antonio; Nagao, Hiromichi; Stasevich, Timothy; Nakamura, Kazuyuki
2015-01-01
Construction of quantitative models is a primary goal of quantitative biology, which aims to understand cellular and organismal phenomena in a quantitative manner. In this article, we introduce optimization procedures to search for parameters in a quantitative model that can reproduce experimental data. The aim of optimization is to minimize the sum of squared errors (SSE) in a prediction or to maximize likelihood. A (local) maximum of likelihood or (local) minimum of the SSE can efficiently be identified using gradient approaches. Addition of a stochastic process enables us to identify the global maximum/minimum without becoming trapped in local maxima/minima. Sampling approaches take advantage of increasing computational power to test numerous sets of parameters in order to determine the optimum set. By combining Bayesian inference with gradient or sampling approaches, we can estimate both the optimum parameters and the form of the likelihood function related to the parameters. Finally, we introduce four examples of research that utilize parameter optimization to obtain biological insights from quantified data: transcriptional regulation, bacterial chemotaxis, morphogenesis, and cell cycle regulation. With practical knowledge of parameter optimization, cell and developmental biologists can develop realistic models that reproduce their observations and thus, obtain mechanistic insights into phenomena of interest.
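A minimal sketch of the optimization workflow described above follows: fitting the parameters of a simple kinetic model by minimizing the sum of squared errors (SSE). The model, data and starting values are synthetic placeholders; any model with a predict(params, t) function could be substituted, and a stochastic or sampling method (basin hopping, MCMC) could be layered on top to avoid local minima.

```python
import numpy as np
from scipy.optimize import minimize

# Synthetic "measurements" from a saturating exponential with added noise.
t = np.linspace(0, 10, 25)
rng = np.random.default_rng(0)
data = 2.0 * (1 - np.exp(-0.7 * t)) + rng.normal(0, 0.05, t.size)

def predict(params, t):
    amplitude, rate = params
    return amplitude * (1 - np.exp(-rate * t))

def sse(params):
    return np.sum((data - predict(params, t)) ** 2)

# Local search from an initial guess; equivalent to maximizing a Gaussian
# likelihood with constant error variance.
fit = minimize(sse, x0=[1.0, 1.0], method="Nelder-Mead")
print(fit.x)   # estimated amplitude and rate
```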
A Bayesian model for estimating multi-state disease progression.
Shen, Shiwen; Han, Simon X; Petousis, Panayiotis; Weiss, Robert E; Meng, Frank; Bui, Alex A T; Hsu, William
2017-02-01
A growing number of individuals who are considered at high risk of cancer are now routinely undergoing population screening. However, noted harms such as radiation exposure, overdiagnosis, and overtreatment underscore the need for better temporal models that predict who should be screened and at what frequency. The mean sojourn time (MST), the average period during which a tumor can be detected by imaging but shows no observable clinical symptoms, is a critical variable for formulating screening policy. Estimation of the MST has long been studied using a continuous Markov model (CMM) with maximum likelihood estimation (MLE). However, many traditional methods assume no observation error in the imaging data, an assumption that is unlikely to hold and can bias the estimation of the MST. In addition, the MLE may not be stably estimated when data are sparse. Addressing these shortcomings, we present a probabilistic modeling approach for periodic cancer screening data. We first model the cancer state transition using a three-state CMM, while simultaneously considering observation error. We then jointly estimate the MST and observation error within a Bayesian framework. We also consider the inclusion of covariates to estimate individualized rates of disease progression. Our approach is demonstrated on participants who underwent chest x-ray screening in the National Lung Screening Trial (NLST) and validated using posterior predictive p-values and Pearson's chi-square test. Our model demonstrates more accurate and sensible estimates of MST in comparison to MLE. Copyright © 2016 Elsevier Ltd. All rights reserved.
Pedigree reconstruction from SNP data: parentage assignment, sibship clustering and beyond.
Huisman, Jisca
2017-09-01
Data on hundreds or thousands of single nucleotide polymorphisms (SNPs) provide detailed information about the relationships between individuals, but currently few tools can turn this information into a multigenerational pedigree. I present the r package sequoia, which assigns parents, clusters half-siblings sharing an unsampled parent and assigns grandparents to half-sibships. Assignments are made after consideration of the likelihoods of all possible first-, second- and third-degree relationships between the focal individuals, as well as the traditional alternative of being unrelated. This careful exploration of the local likelihood surface is implemented in a fast, heuristic hill-climbing algorithm. Distinction between the various categories of second-degree relatives is possible when likelihoods are calculated conditional on at least one parent of each focal individual. Performance was tested on simulated data sets with realistic genotyping error rate and missingness, based on three different large pedigrees (N = 1000-2000). This included a complex pedigree with overlapping generations, occasional close inbreeding and some unknown birth years. Parentage assignment was highly accurate down to about 100 independent SNPs (error rate <0.1%) and fast (<1 min) as most pairs can be excluded from being parent-offspring based on opposite homozygosity. For full pedigree reconstruction, 40% of parents were assumed nongenotyped. Reconstruction resulted in low error rates (<0.3%), high assignment rates (>99%) in limited computation time (typically <1 h) when at least 200 independent SNPs were used. In three empirical data sets, relatedness estimated from the inferred pedigree was strongly correlated to genomic relatedness. © 2017 The Authors. Molecular Ecology Resources Published by John Wiley & Sons Ltd.
CubeSat mission design software tool for risk estimating relationships
NASA Astrophysics Data System (ADS)
Gamble, Katharine Brumbaugh; Lightsey, E. Glenn
2014-09-01
In an effort to make the CubeSat risk estimation and management process more scientific, a software tool has been created that enables mission designers to estimate mission risks. CubeSat mission designers are able to input mission characteristics, such as form factor, mass, development cycle, and launch information, in order to determine the mission risk root causes which historically present the highest risk for their mission. Historical data was collected from the CubeSat community and analyzed to provide a statistical background to characterize these Risk Estimating Relationships (RERs). This paper develops and validates the mathematical model based on the same cost estimating relationship methodology used by the Unmanned Spacecraft Cost Model (USCM) and the Small Satellite Cost Model (SSCM). The RER development uses general error regression models to determine the best fit relationship between root cause consequence and likelihood values and the input factors of interest. These root causes are combined into seven overall CubeSat mission risks which are then graphed on the industry-standard 5×5 Likelihood-Consequence (L-C) chart to help mission designers quickly identify areas of concern within their mission. This paper is the first to document not only the creation of a historical database of CubeSat mission risks, but, more importantly, the scientific representation of Risk Estimating Relationships.
NASA Astrophysics Data System (ADS)
Debski, Wojciech
2015-06-01
The spatial location of sources of seismic waves is one of the first tasks when transient waves from natural (uncontrolled) sources are analysed in many branches of physics, including seismology and oceanology, to name a few. Source activity and its spatial variability in time, the geometry of the recording network, and the complexity and heterogeneity of the wave velocity distribution are all factors influencing the performance of location algorithms and the accuracy of the achieved results. Although estimating the location of earthquake foci is relatively simple, quantitative estimation of the location accuracy is a challenging task even if a probabilistic inverse method is used, because it requires knowledge of the statistics of observational, modelling and a priori uncertainties. In this paper, we addressed this task when statistics of observational and/or modelling errors are unknown. This common situation requires the introduction of a priori constraints on the likelihood (misfit) function, which significantly influence the estimated errors. Based on the results of an analysis of 120 seismic events from the Rudna copper mine operating in southwestern Poland, we propose an approach based on an analysis of Shannon's entropy calculated for the a posteriori distribution. We show that this meta-characteristic of the a posteriori distribution carries some information on the uncertainties of the solution found.
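The sketch below computes the meta-characteristic discussed, Shannon's entropy of an a posteriori location probability map. The grid and the toy misfit surface are synthetic stand-ins; in practice the posterior would come from the probabilistic location procedure applied to picked arrival times.

```python
import numpy as np

# Toy posterior over a 2-D location grid (synthetic misfit surface).
x, y = np.meshgrid(np.linspace(0, 10, 101), np.linspace(0, 10, 101))
misfit = ((x - 4.2) ** 2 + (y - 6.1) ** 2) / 2.0
posterior = np.exp(-misfit)
posterior /= posterior.sum()            # normalize over the grid

# Shannon entropy in nats; broader (more uncertain) posteriors give larger values.
p = posterior[posterior > 0]
entropy = -np.sum(p * np.log(p))
print("Shannon entropy of the posterior (nats):", round(entropy, 3))
```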
Estimation of parameters of dose volume models and their confidence limits
NASA Astrophysics Data System (ADS)
van Luijk, P.; Delvigne, T. C.; Schilstra, C.; Schippers, J. M.
2003-07-01
Predictions of the normal-tissue complication probability (NTCP) for the ranking of treatment plans are based on fits of dose-volume models to clinical and/or experimental data. In the literature several different fit methods are used. In this work frequently used methods and techniques to fit NTCP models to dose response data for establishing dose-volume effects, are discussed. The techniques are tested for their usability with dose-volume data and NTCP models. Different methods to estimate the confidence intervals of the model parameters are part of this study. From a critical-volume (CV) model with biologically realistic parameters a primary dataset was generated, serving as the reference for this study and describable by the NTCP model. The CV model was fitted to this dataset. From the resulting parameters and the CV model, 1000 secondary datasets were generated by Monte Carlo simulation. All secondary datasets were fitted to obtain 1000 parameter sets of the CV model. Thus the 'real' spread in fit results due to statistical spreading in the data is obtained and has been compared with estimates of the confidence intervals obtained by different methods applied to the primary dataset. The confidence limits of the parameters of one dataset were estimated using the methods, employing the covariance matrix, the jackknife method and directly from the likelihood landscape. These results were compared with the spread of the parameters, obtained from the secondary parameter sets. For the estimation of confidence intervals on NTCP predictions, three methods were tested. Firstly, propagation of errors using the covariance matrix was used. Secondly, the meaning of the width of a bundle of curves that resulted from parameters that were within the one standard deviation region in the likelihood space was investigated. Thirdly, many parameter sets and their likelihood were used to create a likelihood-weighted probability distribution of the NTCP. It is concluded that for the type of dose response data used here, only a full likelihood analysis will produce reliable results. The often-used approximations, such as the usage of the covariance matrix, produce inconsistent confidence limits on both the parameter sets and the resulting NTCP values.
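As an illustration of reading a confidence interval directly off the likelihood landscape (the approach the study finds most reliable), the sketch below fits a one-parameter probit dose-response curve to synthetic binary complication data and takes the 95% interval where the negative log-likelihood rises by 1.92 units. Real NTCP fits involve several parameters and would profile each one while re-optimizing the others; everything here is an assumed toy setup.

```python
import numpy as np
from scipy.stats import norm, chi2
from scipy.optimize import minimize_scalar

# Synthetic dose-response data: 20 subjects per dose level, probit response.
doses = np.repeat(np.arange(40, 81, 5), 20)
rng = np.random.default_rng(3)
responses = rng.random(doses.size) < norm.cdf((doses - 62.0) / 8.0)

def negloglik(td50):
    p = norm.cdf((doses - td50) / 8.0).clip(1e-9, 1 - 1e-9)
    return -np.sum(responses * np.log(p) + (~responses) * np.log(1 - p))

fit = minimize_scalar(negloglik, bounds=(45, 75), method="bounded")
threshold = fit.fun + chi2.ppf(0.95, df=1) / 2.0   # 1.92 log-likelihood units
grid = np.linspace(45, 75, 601)
inside = np.array([negloglik(g) for g in grid]) <= threshold
print("TD50 estimate:", round(fit.x, 2),
      "95% CI:", (round(grid[inside].min(), 2), round(grid[inside].max(), 2)))
```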
Location error uncertainties - an advanced using of probabilistic inverse theory
NASA Astrophysics Data System (ADS)
Debski, Wojciech
2016-04-01
The spatial location of sources of seismic waves is one of the first tasks when transient waves from natural (uncontrolled) sources are analyzed in many branches of physics, including seismology and oceanology, to name a few. Source activity and its spatial variability in time, the geometry of the recording network, and the complexity and heterogeneity of the wave velocity distribution are all factors influencing the performance of location algorithms and the accuracy of the achieved results. While estimating the location of earthquake foci is relatively simple, quantitative estimation of the location accuracy is a challenging task even if a probabilistic inverse method is used, because it requires knowledge of the statistics of observational, modeling, and a priori uncertainties. In this presentation we addressed this task when statistics of observational and/or modeling errors are unknown. This common situation requires the introduction of a priori constraints on the likelihood (misfit) function, which significantly influence the estimated errors. Based on the results of an analysis of 120 seismic events from the Rudna copper mine operating in southwestern Poland, we illustrate an approach based on an analysis of Shannon's entropy calculated for the a posteriori distribution. We show that this meta-characteristic of the a posteriori distribution carries some information on the uncertainties of the solution found.
Rogers, Paul; Fisk, John E; Lowrie, Emma
2017-11-01
The present study examines the extent to which stronger belief in either extrasensory perception, psychokinesis or life-after-death is associated with a proneness to making conjunction errors (CEs). One hundred and sixty members of the UK public read eight hypothetical scenarios and for each estimated the likelihood that two constituent events alone plus their conjunction would occur. The impact of paranormal belief, the constituents' conditional relatedness type, estimates of the subjectively less likely and more likely constituents, plus relevant interaction terms were tested via three Generalized Linear Mixed Models. General qualification levels were controlled for. As expected, stronger PK beliefs and depiction of positively conditionally related (versus conditionally unrelated) constituent pairs predicted higher CE generation. ESP and LAD beliefs had no impact with, surprisingly, higher estimates of the less likely constituent predicting fewer - not more - CEs. Theoretical implications, methodological issues and ideas for future research are discussed. Copyright © 2017 Elsevier Inc. All rights reserved.
Inventory and mapping of flood inundation using interactive digital image analysis techniques
Rohde, Wayne G.; Nelson, Charles A.; Taranik, J.V.
1979-01-01
LANDSAT digital data and color infra-red photographs were used in a multiphase sampling scheme to estimate the area of agricultural land affected by a flood. The LANDSAT data were classified with a maximum likelihood algorithm. Stratification of the LANDSAT data, prior to classification, greatly reduced misclassification errors. The classification results were used to prepare a map overlay showing the areal extent of flooding. These data also provided statistics required to estimate sample size in a two-phase sampling scheme, and provided quick, accurate estimates of areas flooded for the first phase. The measurements made in the second phase, based on ground data and photo-interpretation, were used with two-phase sampling statistics to estimate the area of agricultural land affected by flooding. These results show that LANDSAT digital data can be used to prepare map overlays showing the extent of flooding on agricultural land and, with two-phase sampling procedures, can provide acreage estimates with sampling errors of about 5 percent. This procedure provides a technique for rapidly assessing the areal extent of flood conditions on agricultural land and would provide a basis for designing a sampling framework to estimate the impact of flooding on crop production.
Absolute magnitude calibration using trigonometric parallax - Incomplete, spectroscopic samples
NASA Technical Reports Server (NTRS)
Ratnatunga, Kavan U.; Casertano, Stefano
1991-01-01
A new numerical algorithm is used to calibrate the absolute magnitude of spectroscopically selected stars from their observed trigonometric parallax. This procedure, based on maximum-likelihood estimation, can retrieve unbiased estimates of the intrinsic absolute magnitude and its dispersion even from incomplete samples suffering from selection biases in apparent magnitude and color. It can also make full use of low accuracy and negative parallaxes and incorporate censorship on reported parallax values. Accurate error estimates are derived for each of the fitted parameters. The algorithm allows an a posteriori check of whether the fitted model gives a good representation of the observations. The procedure is described in general and applied to both real and simulated data.
Degradation data analysis based on a generalized Wiener process subject to measurement error
NASA Astrophysics Data System (ADS)
Li, Junxing; Wang, Zhihua; Zhang, Yongbo; Fu, Huimin; Liu, Chengrui; Krishnaswamy, Sridhar
2017-09-01
Wiener processes have received considerable attention in degradation modeling over the last two decades. In this paper, we propose a generalized Wiener process degradation model that takes unit-to-unit variation, time-correlated structure and measurement error into considerations simultaneously. The constructed methodology subsumes a series of models studied in the literature as limiting cases. A simple method is given to determine the transformed time scale forms of the Wiener process degradation model. Then model parameters can be estimated based on a maximum likelihood estimation (MLE) method. The cumulative distribution function (CDF) and the probability distribution function (PDF) of the Wiener process with measurement errors are given based on the concept of the first hitting time (FHT). The percentiles of performance degradation (PD) and failure time distribution (FTD) are also obtained. Finally, a comprehensive simulation study is accomplished to demonstrate the necessity of incorporating measurement errors in the degradation model and the efficiency of the proposed model. Two illustrative real applications involving the degradation of carbon-film resistors and the wear of sliding metal are given. The comparative results show that the constructed approach can derive a reasonable result and an enhanced inference precision.
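A minimal sketch of the basic building block is given below: MLE of the drift and diffusion of a simple Wiener degradation model X(t) = μt + σB(t) from equally spaced increments, and the failure-time distribution via the first-hitting-time (inverse Gaussian) law. It deliberately omits the measurement error, unit-to-unit variation and transformed time scales of the full model, and the simulated path and failure threshold are illustrative assumptions.

```python
import numpy as np
from scipy.stats import invgauss

# Synthetic degradation increments over equally spaced inspections.
rng = np.random.default_rng(7)
dt, n = 1.0, 200
increments = rng.normal(0.05 * dt, 0.2 * np.sqrt(dt), n)

mu_hat = increments.mean() / dt                               # MLE of drift
sigma2_hat = ((increments - mu_hat * dt) ** 2).mean() / dt    # MLE of diffusion

threshold = 10.0   # assumed failure threshold D
# The first hitting time of a drifted Wiener process is inverse Gaussian with
# mean D/mu and shape D^2/sigma^2; scipy's invgauss uses mu = mean/shape.
mean_T = threshold / mu_hat
shape = threshold ** 2 / sigma2_hat
T50 = invgauss.ppf(0.5, mu=mean_T / shape, scale=shape)
print("drift:", round(mu_hat, 4), "diffusion:", round(sigma2_hat, 4),
      "median failure time:", round(T50, 1))
```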
History, Epidemic Evolution, and Model Burn-In for a Network of Annual Invasion: Soybean Rust.
Sanatkar, M R; Scoglio, C; Natarajan, B; Isard, S A; Garrett, K A
2015-07-01
Ecological history may be an important driver of epidemics and disease emergence. We evaluated the role of history and two related concepts, the evolution of epidemics and the burn-in period required for fitting a model to epidemic observations, for the U.S. soybean rust epidemic (caused by Phakopsora pachyrhizi). This disease allows evaluation of replicate epidemics because the pathogen reinvades the United States each year. We used a new maximum likelihood estimation approach for fitting the network model based on observed U.S. epidemics. We evaluated the model burn-in period by comparing model fit based on each combination of other years of observation. When the miss error rates were weighted by 0.9 and false alarm error rates by 0.1, the mean error rate did decline, for most years, as more years were used to construct models. Models based on observations in years closer in time to the season being estimated gave lower miss error rates for later epidemic years. The weighted mean error rate was lower in backcasting than in forecasting, reflecting how the epidemic had evolved. Ongoing epidemic evolution, and potential model failure, can occur because of changes in climate, host resistance and spatial patterns, or pathogen evolution.
NASA Astrophysics Data System (ADS)
Degaudenzi, Riccardo; Vanghi, Vieri
1994-02-01
An all-digital Trellis-Coded 8PSK (TC-8PSK) demodulator well suited for VLSI implementation, including maximum likelihood estimation decision-directed (MLE-DD) carrier phase and clock timing recovery, is introduced and analyzed. By simply removing the trellis decoder, the demodulator can efficiently cope with uncoded 8PSK signals. The proposed MLE-DD synchronization algorithm requires one sample per symbol for the phase loop and two samples per symbol for the timing loop. The joint phase and timing discriminator characteristics are analytically derived and the numerical results checked by means of computer simulations. An approximate expression for the steady-state carrier phase and clock timing mean square error has been derived and successfully checked against simulation findings. Synchronizer deviation from the Cramér-Rao bound is also discussed. The mean acquisition time for the digital synchronizer has also been computed and checked using the Monte Carlo simulation technique. Finally, TC-8PSK digital demodulator performance in terms of bit error rate and mean time to lose lock, including digital interpolators and synchronization loops, is presented.
NASA Astrophysics Data System (ADS)
Behnabian, Behzad; Mashhadi Hossainali, Masoud; Malekzadeh, Ahad
2018-02-01
The cross-validation technique is a popular method to assess and improve the quality of prediction by least squares collocation (LSC). We present a formula for direct estimation of the vector of cross-validation errors (CVEs) in LSC which is much faster than element-wise CVE computation. We show that a quadratic form of CVEs follows a Chi-squared distribution. Furthermore, an a posteriori noise variance factor is derived from the quadratic form of CVEs. In order to detect blunders in the observations, the estimated standardized CVE is proposed as the test statistic, which can be applied when noise variances are known or unknown. We use LSC together with the methods proposed in this research for interpolation of crustal subsidence along the northern coast of the Gulf of Mexico. The results show that after detection and removal of outliers, the root mean square (RMS) of the CVEs and the estimated noise standard deviation are reduced by about 51 and 59%, respectively. In addition, the RMS of the LSC prediction error at data points and the RMS of the estimated observation noise are decreased by 39 and 67%, respectively. However, the RMS of the LSC prediction error on a regular grid of interpolation points covering the area is reduced by only about 4%, which is a consequence of the sparse distribution of data points in this case study. The influence of gross errors on LSC prediction results is also investigated using lower cutoff CVEs. It is indicated that after elimination of outliers, the RMS of this type of error is also reduced, by 19.5% for a 5 km radius of vicinity. We propose a method using standardized CVEs for classification of the dataset into three groups with presumed different noise variances. The noise variance components for each of the groups are estimated using a restricted maximum-likelihood method via the Fisher scoring technique. Finally, LSC assessment measures were computed for the estimated heterogeneous noise variance model and compared with those of the homogeneous model. The advantage of the proposed method is the reduction in estimated noise levels for the groups with fewer noisy data points.
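For orientation, the sketch below computes all leave-one-out cross-validation errors at once for simple collocation/kriging without a trend using the classical identity CVE_i = [K⁻¹y]_i / [K⁻¹]_ii, where K is the signal-plus-noise covariance of the observations. This is the generic linear-predictor shortcut and may differ in detail from the authors' formula for the full LSC setting; the covariance model and data points are synthetic assumptions.

```python
import numpy as np

rng = np.random.default_rng(5)
pts = rng.uniform(0, 100, (60, 2))                       # observation coordinates (km)
d = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
K = 4.0 * np.exp(-d / 20.0) + 0.25 * np.eye(len(pts))    # signal + noise covariance
y = rng.multivariate_normal(np.zeros(len(pts)), K)       # synthetic observations

K_inv = np.linalg.inv(K)
cve = (K_inv @ y) / np.diag(K_inv)     # all leave-one-out errors in one shot
print("RMS of cross-validation errors:", round(np.sqrt(np.mean(cve ** 2)), 3))
```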
Unifying distance-based goodness-of-fit indicators for hydrologic model assessment
NASA Astrophysics Data System (ADS)
Cheng, Qinbo; Reinhardt-Imjela, Christian; Chen, Xi; Schulte, Achim
2014-05-01
The goodness-of-fit indicator, i.e. the efficiency criterion, is very important for model calibration. However, current knowledge about goodness-of-fit indicators is largely empirical and lacks theoretical support. Based on likelihood theory, a unified distance-based goodness-of-fit indicator termed the BC-GED model is proposed, which uses the Box-Cox (BC) transformation to remove the heteroscedasticity of model errors and a zero-mean generalized error distribution (GED) to fit the distribution of model errors after BC. The BC-GED model can unify all recent distance-based goodness-of-fit indicators, and it reveals that the widely used mean square error (MSE) and mean absolute error (MAE) imply the statistical assumptions that the model errors follow a Gaussian distribution and a zero-mean Laplace distribution, respectively. Empirical knowledge about goodness-of-fit indicators can also be easily interpreted with the BC-GED model; e.g. the sensitivity to high flow of indicators with a large power of model errors results from the low probability of large model errors under the distribution these indicators assume. In order to assess the effect of the parameters of the BC-GED model (i.e. the BC transformation parameter λ and the GED kurtosis coefficient β, also termed the power of model errors) on hydrologic model calibration, six cases of the BC-GED model were applied in the Baocun watershed (East China) with the SWAT-WB-VSA model. Comparison of the inferred model parameters and model simulation results among the six indicators demonstrates that these indicators can be clearly separated into two classes by the GED kurtosis β: β > 1 and β ≤ 1. SWAT-WB-VSA calibrated with the class β > 1 of distance-based goodness-of-fit indicators captures high flow very well but mimics the baseflow badly, whereas calibration with the class β ≤ 1 mimics the baseflow very well, because, first, the larger the value of β, the greater the emphasis put on high flow, and second, the derivative of the GED probability density function at zero is zero for β > 1 but discontinuous for β ≤ 1, and even infinite for β < 1, in which case maximum likelihood estimation drives the model errors as close to zero as possible. The BC-GED, which estimates the parameters of the BC-GED model (i.e. λ and β) as well as the hydrologic model parameters, is the best distance-based goodness-of-fit indicator, because not only is the model validation using groundwater levels very good, but the model errors also fulfill the statistical assumption best. However, in some cases of model calibration with few observations, e.g. calibration of a single-event model, the MAE, i.e. the boundary indicator (β = 1) between the two classes, can replace the BC-GED in order to avoid estimating the parameters of the BC-GED model, because the model validation of the MAE is best.
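The sketch below writes out the BC-GED objective as described: Box-Cox transform the observed and simulated flows, then score the residuals with a zero-mean GED whose scale is profiled out by maximum likelihood. The tiny flow arrays and the values of λ and β are placeholders; setting β = 2 recovers a Gaussian (MSE-like) criterion and β = 1 a Laplace (MAE-like) one.

```python
import numpy as np
from scipy.special import gammaln

def boxcox(y, lam):
    return np.log(y) if lam == 0 else (y ** lam - 1.0) / lam

def bc_ged_negloglik(obs, sim, lam, beta):
    e = boxcox(obs, lam) - boxcox(sim, lam)               # model errors after BC
    alpha = (beta * np.mean(np.abs(e) ** beta)) ** (1.0 / beta)  # MLE of GED scale
    n = e.size
    # Negative log-likelihood of a zero-mean GED with scale alpha and shape beta.
    return -(n * (np.log(beta) - np.log(2 * alpha) - gammaln(1.0 / beta))
             - np.sum((np.abs(e) / alpha) ** beta))

obs = np.array([1.2, 3.4, 8.0, 2.2, 15.0, 0.9])   # illustrative observed flows
sim = np.array([1.0, 3.9, 7.1, 2.5, 13.2, 1.1])   # illustrative simulated flows
print(bc_ged_negloglik(obs, sim, lam=0.3, beta=1.0))
```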
Elashoff, Robert M.; Li, Gang; Li, Ning
2009-01-01
In this article we study a joint model for longitudinal measurements and competing risks survival data. Our joint model provides a flexible approach to handle possible nonignorable missing data in the longitudinal measurements due to dropout. It is also an extension of previous joint models with a single failure type, offering a possible way to model informatively censored events as a competing risk. Our model consists of a linear mixed effects submodel for the longitudinal outcome and a proportional cause-specific hazards frailty submodel (Prentice et al., 1978, Biometrics 34, 541-554) for the competing risks survival data, linked together by some latent random effects. We propose to obtain the maximum likelihood estimates of the parameters by an expectation maximization (EM) algorithm and estimate their standard errors using a profile likelihood method. The developed method works well in our simulation studies and is applied to a clinical trial for the scleroderma lung disease. PMID:18162112
Efficient logistic regression designs under an imperfect population identifier.
Albert, Paul S; Liu, Aiyi; Nansel, Tonja
2014-03-01
Motivated by actual study designs, this article considers efficient logistic regression designs where the population is identified with a binary test that is subject to diagnostic error. We consider the case where the imperfect test is obtained on all participants, while the gold standard test is measured on a small chosen subsample. Under maximum-likelihood estimation, we evaluate the optimal design in terms of sample selection as well as verification. We show that there may be substantial efficiency gains by choosing a small percentage of individuals who test negative on the imperfect test for inclusion in the sample (e.g., verifying 90% test-positive cases). We also show that a two-stage design may be a good practical alternative to a fixed design in some situations. Under optimal and nearly optimal designs, we compare maximum-likelihood and semi-parametric efficient estimators under correct and misspecified models with simulations. The methodology is illustrated with an analysis from a diabetes behavioral intervention trial. © 2013, The International Biometric Society.
Onuk, A. Emre; Akcakaya, Murat; Bardhan, Jaydeep P.; Erdogmus, Deniz; Brooks, Dana H.; Makowski, Lee
2015-01-01
In this paper, we describe a model for maximum likelihood estimation (MLE) of the relative abundances of different conformations of a protein in a heterogeneous mixture from small angle X-ray scattering (SAXS) intensities. To consider cases where the solution includes intermediate or unknown conformations, we develop a subset selection method based on k-means clustering and the Cramér-Rao bound on the mixture coefficient estimation error to find a sparse basis set that represents the space spanned by the measured SAXS intensities of the known conformations of a protein. Then, using the selected basis set and the assumptions on the model for the intensity measurements, we show that the MLE model can be expressed as a constrained convex optimization problem. Employing the adenylate kinase (ADK) protein and its known conformations as an example, and using Monte Carlo simulations, we demonstrate the performance of the proposed estimation scheme. Here, although we use 45 crystallographically determined experimental structures and we could generate many more using, for instance, molecular dynamics calculations, the clustering technique indicates that the data cannot support the determination of relative abundances for more than 5 conformations. The estimation of this maximum number of conformations is intrinsic to the methodology we have used here. PMID:26924916
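A small sketch of the constrained fit described above follows: nonnegative mixture weights summing to one are estimated so that a combination of basis scattering profiles reproduces a measured intensity curve. The Gaussian-shaped basis profiles and the "measurement" are synthetic stand-ins for real conformer SAXS intensities, and the clustering-based basis selection step is not reproduced.

```python
import numpy as np
from scipy.optimize import minimize

# Three synthetic conformer profiles over a scattering-vector grid q.
q = np.linspace(0.01, 0.5, 120)
basis = np.stack([np.exp(-(q * s) ** 2) for s in (8.0, 12.0, 20.0)])
true_w = np.array([0.6, 0.3, 0.1])
rng = np.random.default_rng(11)
intensity = true_w @ basis + rng.normal(0, 0.002, q.size)   # mock measurement

def loss(w):
    return np.sum((intensity - w @ basis) ** 2)

# Convex least-squares objective with simplex constraints on the abundances.
res = minimize(loss, x0=np.full(3, 1 / 3), method="SLSQP",
               bounds=[(0, 1)] * 3,
               constraints={"type": "eq", "fun": lambda w: w.sum() - 1.0})
print("estimated abundances:", np.round(res.x, 3))
```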
NASA Astrophysics Data System (ADS)
Li, Xinya; Deng, Zhiqun Daniel; Rauchenstein, Lynn T.; Carlson, Thomas J.
2016-04-01
Locating the position of fixed or mobile sources (i.e., transmitters) based on measurements obtained from sensors (i.e., receivers) is an important research area that is attracting much interest. In this paper, we review several representative localization algorithms that use time of arrivals (TOAs) and time difference of arrivals (TDOAs) to achieve high signal source position estimation accuracy when a transmitter is in the line-of-sight of a receiver. Circular (TOA) and hyperbolic (TDOA) position estimation approaches both use nonlinear equations that relate the known locations of receivers and unknown locations of transmitters. Estimation of the location of transmitters using the standard nonlinear equations may not be very accurate because of receiver location errors, receiver measurement errors, and computational efficiency challenges that result in high computational burdens. Least squares and maximum likelihood based algorithms have become the most popular computational approaches to transmitter location estimation. In this paper, we summarize the computational characteristics and position estimation accuracies of various positioning algorithms. By improving methods for estimating the time-of-arrival of transmissions at receivers and transmitter location estimation algorithms, transmitter location estimation may be applied across a range of applications and technologies such as radar, sonar, the Global Positioning System, wireless sensor networks, underwater animal tracking, mobile communications, and multimedia.
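The sketch below shows the simplest version of the circular (TOA-based) approach discussed: nonlinear least squares on the range equations relating known receiver positions to an unknown transmitter position. The geometry, propagation speed and noise level are illustrative assumptions; TDOA and maximum-likelihood variants would difference the arrival times and weight the residuals by the measurement error covariance.

```python
import numpy as np
from scipy.optimize import least_squares

c = 1500.0                                            # assumed propagation speed (m/s)
receivers = np.array([[0, 0], [100, 0], [0, 100], [100, 100], [50, 120]], float)
source = np.array([37.0, 64.0])                       # true position (for simulation)
rng = np.random.default_rng(2)
toa = np.linalg.norm(receivers - source, axis=1) / c + rng.normal(0, 1e-5, 5)

def residuals(xy):
    # Difference between predicted and measured times of arrival.
    return np.linalg.norm(receivers - xy, axis=1) / c - toa

fit = least_squares(residuals, x0=np.array([50.0, 50.0]))
print("estimated source position:", np.round(fit.x, 2))
```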
2016-01-01
Background: It is often thought that random measurement error has a minor effect upon the results of an epidemiological survey. Theoretically, errors of measurement should always increase the spread of a distribution. Defining an illness by having a measurement outside an established healthy range will lead to an inflated prevalence of that condition if there are measurement errors. Methods and results: A Monte Carlo simulation was conducted of anthropometric assessment of children with malnutrition. Random errors of increasing magnitude were imposed upon the populations; each imposed error increased the standard deviation, and the increase became exponentially greater with the magnitude of the error. The potential magnitude of the resulting error in reported prevalence of malnutrition was compared with published international data and found to be sufficient to make a number of surveys, and the numerous reports and analyses that used these data, unreliable. Conclusions: The effect of random error in public health surveys and the data upon which diagnostic cut-off points are derived to define "health" has been underestimated. Even quite modest random errors can more than double the reported prevalence of conditions such as malnutrition. Increasing sample size does not address this problem, and may even result in less accurate estimates. More attention needs to be paid to the selection, calibration and maintenance of instruments, measurer selection, training & supervision, routine estimation of the likely magnitude of errors using standardization tests, use of statistical likelihood of error to exclude data from analysis and full reporting of these procedures in order to judge the reliability of survey reports. PMID:28030627
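A compact Monte Carlo illustration of the central point follows: adding random measurement error widens the distribution of z-scores and inflates the prevalence of children classified below the -2 cut-off. The population mean, the error standard deviations and the sample size are assumed values chosen only to make the effect visible, not figures from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
true_z = rng.normal(-0.8, 1.0, 1_000_000)      # "true" weight-for-height z-scores
true_prev = np.mean(true_z < -2)

for err_sd in (0.0, 0.2, 0.4, 0.6):
    observed = true_z + rng.normal(0, err_sd, true_z.size)   # add random error
    print(f"error SD {err_sd:.1f}: prevalence {np.mean(observed < -2):.3%}"
          f" (true {true_prev:.3%})")
```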
Espino-Hernandez, Gabriela; Gustafson, Paul; Burstyn, Igor
2011-05-14
In epidemiological studies explanatory variables are frequently subject to measurement error. The aim of this paper is to develop a Bayesian method to correct for measurement error in multiple continuous exposures in individually matched case-control studies. This is a topic that has not been widely investigated. The new method is illustrated using data from an individually matched case-control study of the association between thyroid hormone levels during pregnancy and exposure to perfluorinated acids. The objective of the motivating study was to examine the risk of maternal hypothyroxinemia due to exposure to three perfluorinated acids measured on a continuous scale. Results from the proposed method are compared with those obtained from a naive analysis. Using a Bayesian approach, the developed method considers a classical measurement error model for the exposures, as well as the conditional logistic regression likelihood as the disease model, together with a random-effect exposure model. Proper and diffuse prior distributions are assigned, and results from a quality control experiment are used to estimate the perfluorinated acids' measurement error variability. As a result, posterior distributions and 95% credible intervals of the odds ratios are computed. A sensitivity analysis of method's performance in this particular application with different measurement error variability was performed. The proposed Bayesian method to correct for measurement error is feasible and can be implemented using statistical software. For the study on perfluorinated acids, a comparison of the inferences which are corrected for measurement error to those which ignore it indicates that little adjustment is manifested for the level of measurement error actually exhibited in the exposures. Nevertheless, a sensitivity analysis shows that more substantial adjustments arise if larger measurement errors are assumed. In individually matched case-control studies, the use of conditional logistic regression likelihood as a disease model in the presence of measurement error in multiple continuous exposures can be justified by having a random-effect exposure model. The proposed method can be successfully implemented in WinBUGS to correct individually matched case-control studies for several mismeasured continuous exposures under a classical measurement error model.
NASA Technical Reports Server (NTRS)
da Silva, Arlindo; Redder, Christopher
2010-01-01
MERRA is a NASA reanalysis for the satellite era using a major new version of the Goddard Earth Observing System Data Assimilation System Version 5 (GEOS-5). The project focuses on historical analyses of the hydrological cycle on a broad range of weather and climate time scales and places the NASA EOS suite of observations in a climate context. The characterization of uncertainty in reanalysis fields is a commonly requested feature by users of such data. While intercomparison with reference data sets is common practice for ascertaining the realism of the datasets, such studies typically are restricted to long-term climatological statistics and seldom provide state-dependent measures of the uncertainties involved. In principle, variational data assimilation algorithms are able to produce error estimates for the analysis variables (typically surface pressure, winds, temperature, moisture and ozone) consistent with the assumed background and observation error statistics. However, these "perceived error estimates" are expensive to obtain and are limited by the somewhat simplistic errors assumed in the algorithm. The observation-minus-forecast residuals (innovations), a by-product of any assimilation system, constitute a powerful tool for estimating the systematic and random errors in the analysis fields. Unfortunately, such data are usually not readily available with reanalysis products, often requiring the tedious decoding of large datasets and not-so-user-friendly file formats. With MERRA we have introduced a gridded version of the observations/innovations used in the assimilation process, using the same grid and data formats as the regular datasets. Such a dataset empowers the user to conveniently perform observing-system-related analyses and error estimates. The scope of this dataset will be briefly described. We will present a systematic analysis of MERRA innovation time series for the conventional observing system, including maximum-likelihood estimates of background and observation errors, as well as global bias estimates. Starting with the joint PDF of innovations and analysis increments at observation locations, we propose a technique for diagnosing bias among the observing systems, and document how these contextual biases have evolved during the satellite era covered by MERRA.
BAO from Angular Clustering: Optimization and Mitigation of Theoretical Systematics
DOE Office of Scientific and Technical Information (OSTI.GOV)
Crocce, M.; et al.
We study the theoretical systematics and optimize the methodology in Baryon Acoustic Oscillations (BAO) detections using the angular correlation function with tomographic bins. We calibrate and optimize the pipeline for the Dark Energy Survey Year 1 dataset using 1800 mocks. We compare the BAO fitting results obtained with three estimators: the Maximum Likelihood Estimator (MLE), Profile Likelihood, and Markov Chain Monte Carlo. The MLE method yields the least bias in the fit results (bias/spread $\sim 0.02$), and the error bar derived is the closest to the Gaussian results (1% from the 68% Gaussian expectation). When there is a mismatch between the template and the data, either due to incorrect fiducial cosmology or photo-$z$ error, the MLE again gives the least-biased results. The BAO angular shift estimated from the sound horizon and the angular diameter distance agrees with the numerical fit. Various analysis choices are further tested: the number of redshift bins, cross-correlations, and angular binning. We propose two methods to correct the mock covariance when the final sample properties are slightly different from those used to create the mock. We show that the sample changes can be accommodated with the help of the Gaussian covariance matrix or more effectively using the eigenmode expansion of the mock covariance. The eigenmode expansion is significantly less susceptible to statistical fluctuations relative to the direct measurements of the covariance matrix because the number of free parameters is substantially reduced [$p$ parameters versus $p(p+1)/2$ from direct measurement].
Accuracy of latent-variable estimation in Bayesian semi-supervised learning.
Yamazaki, Keisuke
2015-09-01
Hierarchical probabilistic models, such as Gaussian mixture models, are widely used for unsupervised learning tasks. These models consist of observable and latent variables, which represent the observable data and the underlying data-generation process, respectively. Unsupervised learning tasks, such as cluster analysis, are regarded as estimations of latent variables based on the observable ones. The estimation of latent variables in semi-supervised learning, where some labels are observed, will be more precise than that in unsupervised learning, and one of the concerns is to clarify the effect of the labeled data. However, there has not been sufficient theoretical analysis of the accuracy of the estimation of latent variables. In a previous study, a distribution-based error function was formulated, and its asymptotic form was calculated for unsupervised learning with generative models. It has been shown that, for the estimation of latent variables, the Bayes method is more accurate than the maximum-likelihood method. The present paper reveals the asymptotic forms of the error function in Bayesian semi-supervised learning for both discriminative and generative models. The results show that the generative model, which uses all of the given data, performs better when the model is well specified. Copyright © 2015 Elsevier Ltd. All rights reserved.
2013-01-01
Background Falls among the elderly are a major public health concern. Therefore, the possibility of a modeling technique which could better estimate fall probability is both timely and needed. Using biomedical, pharmacological and demographic variables as predictors, latent class analysis (LCA) is demonstrated as a tool for the prediction of falls among community dwelling elderly. Methods Using a retrospective data-set, a two-step LCA modeling approach was employed. First, we looked for the optimal number of latent classes for the seven medical indicators, along with the patients’ prescription medication and three covariates (age, gender, and number of medications). Second, the appropriate latent class structure, with the covariates, was modeled on the distal outcome (fall/no fall). The default estimator was maximum likelihood with robust standard errors. The Pearson chi-square, likelihood ratio chi-square, BIC, Lo-Mendell-Rubin Adjusted Likelihood Ratio test and the bootstrap likelihood ratio test were used for model comparisons. Results A review of the model fit indices with covariates shows that a six-class solution was preferred. The predictive probability for latent classes ranged from 84% to 97%. Entropy, a measure of classification accuracy, was good at 90%. Specific prescription medications were found to strongly influence group membership. Conclusions The LCA method was effective at finding relevant subgroups within a heterogeneous at-risk population for falling. This study demonstrated that LCA offers researchers a valuable tool to model medical data. PMID:23705639
Genetic modelling of test day records in dairy sheep using orthogonal Legendre polynomials.
Kominakis, A; Volanis, M; Rogdakis, E
2001-03-01
Test day milk yields of three lactations in Sfakia sheep were analyzed fitting a random regression (RR) model, regressing on orthogonal polynomials of the stage of the lactation period, i.e. days in milk. Univariate (UV) and multivariate (MV) analyses were also performed for four stages of the lactation period, represented by average days in milk, i.e. 15, 45, 70 and 105 days, to compare estimates obtained from RR models with estimates from UV and MV analyses. The total number of test day records was 790, 1314 and 1041 obtained from 214, 342 and 303 ewes in the first, second and third lactation, respectively. Error variances and covariances between regression coefficients were estimated by restricted maximum likelihood. Models were compared using likelihood ratio tests (LRTs). Log likelihoods were not significantly reduced when the rank of the orthogonal Legendre polynomials (LPs) of lactation stage was reduced from 4 to 2 and homogeneous variances for lactation stages within lactations were considered. Mean weighted heritability estimates with RR models were 0.19, 0.09 and 0.08 for first, second and third lactation, respectively. The corresponding estimates obtained from UV analyses were 0.14, 0.12 and 0.08. Mean permanent environmental variance, as a proportion of the total, was high at all stages and lactations, ranging from 0.54 to 0.71. Within lactations, genetic and permanent environmental correlations between lactation stages were in the range from 0.36 to 0.99 and 0.76 to 0.99, respectively. Genetic parameters for additive genetic and permanent environmental effects obtained from RR models were different from those obtained from UV and MV analyses.
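The abstract does not include code; as a rough illustration of the regression basis involved (not the authors' implementation), the Python sketch below evaluates orthogonal Legendre polynomials of the lactation stage after mapping days in milk onto [-1, 1]. The polynomial order and the stage boundaries are illustrative assumptions.

```python
import numpy as np
from numpy.polynomial import legendre

def legendre_basis(days_in_milk, dim_min=15, dim_max=105, order=2):
    """Evaluate Legendre polynomials P_0..P_order at standardized days in milk."""
    # Map days in milk onto [-1, 1], the natural domain of Legendre polynomials.
    x = 2.0 * (np.asarray(days_in_milk, dtype=float) - dim_min) / (dim_max - dim_min) - 1.0
    # Column j holds P_j(x); legvander returns the basis matrix row by row.
    return legendre.legvander(x, order)

# Example: basis covariates for the four average lactation stages named in the abstract.
stages = [15, 45, 70, 105]
Phi = legendre_basis(stages, order=2)
print(Phi)  # one row per test day, columns P0, P1, P2
```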
On Correlated-noise Analyses Applied to Exoplanet Light Curves
NASA Astrophysics Data System (ADS)
Cubillos, Patricio; Harrington, Joseph; Loredo, Thomas J.; Lust, Nate B.; Blecic, Jasmina; Stemm, Madison
2017-01-01
Time-correlated noise is a significant source of uncertainty when modeling exoplanet light-curve data. A correct assessment of correlated noise is fundamental to determine the true statistical significance of our findings. Here, we review three of the most widely used correlated-noise estimators in the exoplanet field: the time-averaging, residual-permutation, and wavelet-likelihood methods. We argue that the residual-permutation method is unsound in estimating the uncertainty of parameter estimates. We thus recommend refraining from this method altogether. We characterize the behavior of the time-averaging method's rms-versus-bin-size curves at bin sizes similar to the total observation duration, which may lead to underestimated uncertainties. For the wavelet-likelihood method, we note errors in the published equations and provide a list of corrections. We further assess the performance of these techniques by injecting and retrieving eclipse signals into synthetic and real Spitzer light curves, analyzing the results in terms of the relative-accuracy and coverage-fraction statistics. Both the time-averaging and wavelet-likelihood methods significantly improve the estimate of the eclipse depth over a white-noise analysis (a Markov-chain Monte Carlo exploration assuming uncorrelated noise). However, the corrections are not perfect: when retrieving the eclipse depth from Spitzer data sets, these methods covered the true (injected) depth within the 68% credible region in only ~45%-65% of the trials. Lastly, we present our open-source model-fitting tool, Multi-Core Markov-Chain Monte Carlo (MC3). This package uses Bayesian statistics to estimate the best-fitting values and credible regions of the parameters of a (user-provided) model. MC3 is a Python/C code, available at https://github.com/pcubillos/MCcubed.
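A minimal sketch of the time-averaging diagnostic mentioned above (not the MC3 implementation): the residuals are binned at increasing bin sizes, the rms of the bin means is computed, and the curve is compared with an approximate white-noise expectation. The finite-bin correction factor used here is an assumption of this sketch.

```python
import numpy as np

def time_averaging(residuals, max_bin=None):
    """rms of binned residuals versus bin size, with a white-noise expectation."""
    r = np.asarray(residuals, dtype=float)
    n = r.size
    max_bin = max_bin or n // 10          # keep a reasonable number of bins
    rms1 = np.sqrt(np.mean(r**2))
    bins, rms, expected = [], [], []
    for m in range(1, max_bin + 1):       # m = number of points per bin
        nbins = n // m
        means = r[:nbins * m].reshape(nbins, m).mean(axis=1)
        bins.append(m)
        rms.append(np.sqrt(np.mean(means**2)))
        # For uncorrelated noise the rms of bin means falls off roughly as 1/sqrt(m).
        expected.append(rms1 / np.sqrt(m) * np.sqrt(nbins / (nbins - 1.0)))
    return np.array(bins), np.array(rms), np.array(expected)

rng = np.random.default_rng(1)
bins, rms, exp_rms = time_averaging(rng.normal(size=2000))
print(rms[:5] / exp_rms[:5])   # ratios near 1 indicate little time correlation
```

When the measured rms curve stays well above the white-noise expectation, the excess is usually attributed to time-correlated noise and used to inflate the parameter uncertainties.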
Aarts, Esther; Roelofs, Ardi; van Turennout, Miranda
2008-04-30
Previous studies have found no agreement on whether anticipatory activity in the anterior cingulate cortex (ACC) reflects upcoming conflict, error likelihood, or actual control adjustments. Using event-related functional magnetic resonance imaging, we investigated the nature of preparatory activity in the ACC. Informative cues told the participants whether an upcoming target would or would not involve conflict in a Stroop-like task. Uninformative cues provided no such information. Behavioral responses were faster after informative than after uninformative cues, indicating cue-based adjustments in control. ACC activity was larger after informative than uninformative cues, as would be expected if the ACC is involved in anticipatory control. Importantly, this activation in the ACC was observed for informative cues even when the information conveyed by the cue was that the upcoming target evokes no response conflict and has low error likelihood. This finding demonstrates that the ACC is involved in anticipatory control processes independent of upcoming response conflict or error likelihood. Moreover, the response of the ACC to the target stimuli was critically dependent on whether the cue was informative or not. ACC activity differed among target conditions after uninformative cues only, indicating ACC involvement in actual control adjustments. Together, these findings argue strongly for a role of the ACC in anticipatory control independent of anticipated conflict and error likelihood, and also show that such control can eliminate conflict-related ACC activity during target processing. Models of frontal cortex conflict-detection and conflict-resolution mechanisms require modification to include consideration of these anticipatory control properties of the ACC.
Brown, Joshua W.
2009-01-01
The error likelihood computational model of anterior cingulate cortex (ACC) (Brown & Braver, 2005) has successfully predicted error likelihood effects, risk prediction effects, and how individual differences in conflict and error likelihood effects vary with trait differences in risk aversion. The same computational model now makes a further prediction that apparent conflict effects in ACC may result in part from an increasing number of simultaneously active responses, regardless of whether or not the cued responses are mutually incompatible. In Experiment 1, the model prediction was tested with a modification of the Eriksen flanker task, in which some task conditions require two otherwise mutually incompatible responses to be generated simultaneously. In that case, the two response processes are no longer in conflict with each other. The results showed small but significant medial PFC effects in the incongruent vs. congruent contrast, despite the absence of response conflict, consistent with model predictions. This is the multiple response effect. Nonetheless, actual response conflict led to greater ACC activation, suggesting that conflict effects are specific to particular task contexts. In Experiment 2, results from a change signal task suggested that the context dependence of conflict signals does not depend on error likelihood effects. Instead, inputs to ACC may reflect complex and task specific representations of motor acts, such as bimanual responses. Overall, the results suggest the existence of a richer set of motor signals monitored by medial PFC and are consistent with distinct effects of multiple responses, conflict, and error likelihood in medial PFC. PMID:19375509
Applications of non-standard maximum likelihood techniques in energy and resource economics
NASA Astrophysics Data System (ADS)
Moeltner, Klaus
Two important types of non-standard maximum likelihood techniques, Simulated Maximum Likelihood (SML) and Pseudo-Maximum Likelihood (PML), have only recently found consideration in the applied economic literature. The objective of this thesis is to demonstrate how these methods can be successfully employed in the analysis of energy and resource models. Chapter I focuses on SML. It constitutes the first application of this technique in the field of energy economics. The framework is as follows: Surveys on the cost of power outages to commercial and industrial customers usually capture multiple observations on the dependent variable for a given firm. The resulting pooled data set is censored and exhibits cross-sectional heterogeneity. We propose a model that addresses these issues by allowing regression coefficients to vary randomly across respondents and by using the Geweke-Hajivassiliou-Keane simulator and Halton sequences to estimate high-order cumulative distribution terms. This adjustment requires the use of SML in the estimation process. Our framework allows for a more comprehensive analysis of outage costs than existing models, which rely on the assumptions of parameter constancy and cross-sectional homogeneity. Our results strongly reject both of these restrictions. The central topic of the second Chapter is the use of PML, a robust estimation technique, in count data analysis of visitor demand for a system of recreation sites. PML has been popular with researchers in this context, since it guards against many types of mis-specification errors. We demonstrate, however, that estimation results will generally be biased even if derived through PML if the recreation model is based on aggregate, or zonal data. To countervail this problem, we propose a zonal model of recreation that captures some of the underlying heterogeneity of individual visitors by incorporating distributional information on per-capita income into the aggregate demand function. This adjustment eliminates the unrealistic constraint of constant income across zonal residents, and thus reduces the risk of aggregation bias in estimated macro-parameters. The corrected aggregate specification reinstates the applicability of PML. It also increases model efficiency and allows for the generation of welfare estimates for population subgroups.
Sethi, Suresh; Linden, Daniel; Wenburg, John; Lewis, Cara; Lemons, Patrick R.; Fuller, Angela K.; Hare, Matthew P.
2016-01-01
Error-tolerant likelihood-based match calling presents a promising technique to accurately identify recapture events in genetic mark–recapture studies by combining probabilities of latent genotypes and probabilities of observed genotypes, which may contain genotyping errors. Combined with clustering algorithms to group samples into sets of recaptures based upon pairwise match calls, these tools can be used to reconstruct accurate capture histories for mark–recapture modelling. Here, we assess the performance of a recently introduced error-tolerant likelihood-based match-calling model and sample clustering algorithm for genetic mark–recapture studies. We assessed both biallelic (i.e. single nucleotide polymorphisms; SNP) and multiallelic (i.e. microsatellite; MSAT) markers using a combination of simulation analyses and case study data on Pacific walrus (Odobenus rosmarus divergens) and fishers (Pekania pennanti). A novel two-stage clustering approach is demonstrated for genetic mark–recapture applications. First, repeat captures within a sampling occasion are identified. Subsequently, recaptures across sampling occasions are identified. The likelihood-based matching protocol performed well in simulation trials, demonstrating utility for use in a wide range of genetic mark–recapture studies. Moderately sized SNP (64+) and MSAT (10–15) panels produced accurate match calls for recaptures and accurate non-match calls for samples from closely related individuals in the face of low to moderate genotyping error. Furthermore, matching performance remained stable or increased as the number of genetic markers increased, genotyping error notwithstanding.
A Regional Analysis of Non-Methane Hydrocarbons And Meteorology of The Rural Southeast United States
1996-01-01
Zt is an ARIMA time series. This is a typical regression model, except that it allows for autocorrelation in the error term Zt. In this work, an ARMA error structure is fitted to the regression residuals by maximum likelihood (via the ARIMA procedure), and the 1992 regression model is applied to the 1993 ozone data at each of the sites; the effect of synoptic meteorology on high ozone is shown by examining NOAA daily weather maps and climatic data.
Improving estimates of genetic maps: a meta-analysis-based approach.
Stewart, William C L
2007-07-01
Inaccurate genetic (or linkage) maps can reduce the power to detect linkage, increase type I error, and distort haplotype and relationship inference. To improve the accuracy of existing maps, I propose a meta-analysis-based method that combines independent map estimates into a single estimate of the linkage map. The method uses the variance of each independent map estimate to combine them efficiently, whether the map estimates use the same set of markers or not. As compared with a joint analysis of the pooled genotype data, the proposed method is attractive for three reasons: (1) it has comparable efficiency to the maximum likelihood map estimate when the pooled data are homogeneous; (2) relative to existing map estimation methods, it can have increased efficiency when the pooled data are heterogeneous; and (3) it avoids the practical difficulties of pooling human subjects data. On the basis of simulated data modeled after two real data sets, the proposed method can reduce the sampling variation of linkage maps commonly used in whole-genome linkage scans. Furthermore, when the independent map estimates are also maximum likelihood estimates, the proposed method performs as well as or better than when they are estimated by the program CRIMAP. Since variance estimates of maps may not always be available, I demonstrate the feasibility of three different variance estimators. Overall, the method should prove useful to investigators who need map positions for markers not contained in publicly available maps, and to those who wish to minimize the negative effects of inaccurate maps. Copyright 2007 Wiley-Liss, Inc.
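A minimal sketch of the precision-weighted combination step that underlies such a meta-analysis, under the simplifying assumption that each independent study supplies an estimate of the same inter-marker distance together with its variance; the numbers are illustrative and none of this reproduces the author's software.

```python
import numpy as np

def combine_estimates(estimates, variances):
    """Precision-weighted combination of independent map-distance estimates."""
    est = np.asarray(estimates, dtype=float)
    var = np.asarray(variances, dtype=float)
    w = 1.0 / var                      # weight each study by its precision
    combined = np.sum(w * est) / np.sum(w)
    combined_var = 1.0 / np.sum(w)     # variance of the combined estimate
    return combined, combined_var

# Illustrative inter-marker distances (cM) and variances from three studies.
d, v = combine_estimates([2.1, 1.8, 2.4], [0.09, 0.16, 0.25])
print(f"combined = {d:.3f} cM, SE = {v**0.5:.3f}")
```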
A Parametric k-Means Algorithm
Tarpey, Thaddeus
2007-01-01
Summary The k points that optimally represent a distribution (usually in terms of a squared error loss) are called the k principal points. This paper presents a computationally intensive method that automatically determines the principal points of a parametric distribution. Cluster means from the k-means algorithm are nonparametric estimators of principal points. A parametric k-means approach is introduced for estimating principal points by running the k-means algorithm on a very large simulated data set from a distribution whose parameters are estimated using maximum likelihood. Theoretical and simulation results are presented comparing the parametric k-means algorithm to the usual k-means algorithm and an example on determining sizes of gas masks is used to illustrate the parametric k-means algorithm. PMID:17917692
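A minimal sketch of the parametric k-means idea for a univariate normal model, assuming scipy and scikit-learn are available: fit the distribution by maximum likelihood, simulate a large sample from the fitted distribution, and run k-means on the simulation to approximate the k principal points.

```python
import numpy as np
from scipy import stats
from sklearn.cluster import KMeans

def parametric_kmeans(data, k=3, n_sim=100_000, seed=0):
    """Approximate the k principal points of a normal distribution fitted to data."""
    mu, sigma = stats.norm.fit(data)                 # maximum likelihood fit
    rng = np.random.default_rng(seed)
    sim = rng.normal(mu, sigma, size=(n_sim, 1))     # large simulated sample
    km = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(sim)
    return np.sort(km.cluster_centers_.ravel())      # estimated principal points

rng = np.random.default_rng(42)
sample = rng.normal(loc=10.0, scale=2.0, size=500)
print(parametric_kmeans(sample, k=3))
# For a normal distribution the 3 principal points are approximately mu and mu +/- 1.224*sigma,
# which provides a quick sanity check on the output.
```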
Parameter estimation of an ARMA model for river flow forecasting using goal programming
NASA Astrophysics Data System (ADS)
Mohammadi, Kourosh; Eslami, H. R.; Kahawita, Rene
2006-11-01
River flow forecasting constitutes one of the most important applications in hydrology. Several methods have been developed for this purpose, and one of the most well-known techniques is the autoregressive moving average (ARMA) model. In the research reported here, the goal was to minimize the error for a specific season of the year as well as for the complete series. Goal programming (GP) was used to estimate the ARMA model parameters. Shaloo Bridge station on the Karun River with 68 years of observed stream flow data was selected to evaluate the performance of the proposed method. When compared with the usual method of maximum likelihood estimation, the results were favorable to the newly proposed algorithm.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Miller, Erin A.; Robinson, Sean M.; Anderson, Kevin K.
2015-01-19
Here we present a novel technique for the localization of radiological sources in urban or rural environments from an aerial platform. The technique is based on a Bayesian approach to localization, in which measured count rates in a time series are compared with predicted count rates from a series of pre-calculated test sources to define likelihood. Furthermore, this technique is expanded by using a localized treatment with a limited field of view (FOV), coupled with a likelihood ratio reevaluation, allowing for real-time computation on commodity hardware for arbitrarily complex detector models and terrain. In particular, detectors with inherent asymmetry of response (such as those employing internal collimation or self-shielding for enhanced directional awareness) are leveraged by this approach to provide improved localization. Our results from the localization technique are shown for simulated flight data using monolithic as well as directionally-aware detector models, and the capability of the methodology to locate radioisotopes is estimated for several test cases. This localization technique is shown to facilitate urban search by allowing quick and adaptive estimates of source location, in many cases from a single flyover near a source. In particular, this method represents a significant advancement from earlier methods like full-field Bayesian likelihood, which is not generally fast enough to allow for broad-field search in real time, and highest-net-counts estimation, which has a localization error that depends strongly on flight path and cannot generally operate without exhaustive search.
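A minimal sketch of the likelihood comparison at the core of such an approach (not the authors' real-time, limited-FOV implementation): for each candidate source position on a grid, predicted mean counts along the flight path are scored against the measured counts with a Poisson log-likelihood. The isotropic 1/r² response, the known source strength, and the flat background are simplifying assumptions.

```python
import numpy as np
from scipy.stats import poisson

def localize(detector_xy, counts, grid_xy, strength, background=5.0):
    """Poisson log-likelihood of each candidate source position on a grid."""
    loglik = np.zeros(len(grid_xy))
    for j, src in enumerate(grid_xy):
        d2 = np.sum((detector_xy - src) ** 2, axis=1) + 1.0   # avoid r = 0
        expected = background + strength / d2                 # simple 1/r^2 falloff
        loglik[j] = poisson.logpmf(counts, expected).sum()
    return loglik

rng = np.random.default_rng(0)
path = np.column_stack([np.linspace(-50, 50, 200), np.full(200, 10.0)])  # flight line
true_src, strength = np.array([12.0, -20.0]), 4000.0
truth = 5.0 + strength / (np.sum((path - true_src) ** 2, axis=1) + 1.0)
counts = rng.poisson(truth)                                   # simulated count series

gx, gy = np.meshgrid(np.linspace(-50, 50, 41), np.linspace(-50, 50, 41))
grid = np.column_stack([gx.ravel(), gy.ravel()])
best = grid[np.argmax(localize(path, counts, grid, strength))]
print("estimated source position:", best)
```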
DOE Office of Scientific and Technical Information (OSTI.GOV)
Vrugt, Jasper A; Robinson, Bruce A; Ter Braak, Cajo J F
In recent years, a strong debate has emerged in the hydrologic literature regarding what constitutes an appropriate framework for uncertainty estimation. Particularly, there is strong disagreement whether an uncertainty framework should have its roots within a proper statistical (Bayesian) context, or whether such a framework should be based on a different philosophy and implement informal measures and weaker inference to summarize parameter and predictive distributions. In this paper, we compare a formal Bayesian approach using Markov Chain Monte Carlo (MCMC) with generalized likelihood uncertainty estimation (GLUE) for assessing uncertainty in conceptual watershed modeling. Our formal Bayesian approach is implemented using the recently developed differential evolution adaptive metropolis (DREAM) MCMC scheme with a likelihood function that explicitly considers model structural, input and parameter uncertainty. Our results demonstrate that DREAM and GLUE can generate very similar estimates of total streamflow uncertainty. This suggests that formal and informal Bayesian approaches have more common ground than the hydrologic literature and ongoing debate might suggest. The main advantage of formal approaches is, however, that they attempt to disentangle the effect of forcing, parameter and model structural error on total predictive uncertainty. This is key to improving hydrologic theory and to better understand and predict the flow of water through catchments.
Harland, Karisa K; Carney, Cher; McGehee, Daniel
2016-07-03
The objective of this study was to estimate the prevalence and odds of fleet driver errors and potentially distracting behaviors just prior to rear-end versus angle crashes. Naturalistic driving videos from fleet services drivers were analyzed for errors and potentially distracting behaviors occurring in the 6 s before crash impact. Categorical variables were examined using Pearson's chi-square test, and continuous variables, such as eyes-off-road time, were compared using Student's t-test. Multivariable logistic regression was used to estimate the odds of a driver error or potentially distracting behavior being present in the seconds before rear-end versus angle crashes. Of the 229 crashes analyzed, 101 (44%) were rear-end and 128 (56%) were angle crashes. Driver age, gender, and presence of passengers did not differ significantly by crash type. Over 95% of rear-end crashes involved inadequate surveillance compared to only 52% of angle crashes (P < .0001). Almost 65% of rear-end crashes involved a potentially distracting driver behavior, whereas less than 40% of angle crashes involved these behaviors (P < .01). On average, drivers spent 4.4 s with their eyes off the road while operating or manipulating their cell phone. Drivers in rear-end crashes had 3.06 (95% confidence interval [CI], 1.73-5.44) times higher adjusted odds of being potentially distracted than those in angle crashes. Fleet driver driving errors and potentially distracting behaviors are frequent. This analysis provides data to inform safe driving interventions for fleet services drivers. Further research is needed on effective interventions to reduce the likelihood of drivers' distracting behaviors and errors, thereby potentially reducing crashes.
Design and analysis of three-arm trials with negative binomially distributed endpoints.
Mütze, Tobias; Munk, Axel; Friede, Tim
2016-02-20
A three-arm clinical trial design with an experimental treatment, an active control, and a placebo control, commonly referred to as the gold standard design, enables testing of non-inferiority or superiority of the experimental treatment compared with the active control. In this paper, we propose methods for designing and analyzing three-arm trials with negative binomially distributed endpoints. In particular, we develop a Wald-type test with a restricted maximum-likelihood variance estimator for testing non-inferiority or superiority. For this test, sample size and power formulas as well as optimal sample size allocations will be derived. The performance of the proposed test will be assessed in an extensive simulation study with regard to type I error rate, power, sample size, and sample size allocation. For the purpose of comparison, Wald-type statistics with a sample variance estimator and an unrestricted maximum-likelihood estimator are included in the simulation study. We found that the proposed Wald-type test with a restricted variance estimator performed well across the considered scenarios and is therefore recommended for application in clinical trials. The methods proposed are motivated and illustrated by a recent clinical trial in multiple sclerosis. The R package ThreeArmedTrials, which implements the methods discussed in this paper, is available on CRAN. Copyright © 2015 John Wiley & Sons, Ltd.
Information matrix estimation procedures for cognitive diagnostic models.
Liu, Yanlou; Xin, Tao; Andersson, Björn; Tian, Wei
2018-03-06
Two new methods to estimate the asymptotic covariance matrix for marginal maximum likelihood estimation of cognitive diagnosis models (CDMs), the inverse of the observed information matrix and the sandwich-type estimator, are introduced. Unlike several previous covariance matrix estimators, the new methods take into account both the item and structural parameters. The relationships between the observed information matrix, the empirical cross-product information matrix, the sandwich-type covariance matrix and the two approaches proposed by de la Torre (2009, J. Educ. Behav. Stat., 34, 115) are discussed. Simulation results show that, for a correctly specified CDM and Q-matrix or with a slightly misspecified probability model, the observed information matrix and the sandwich-type covariance matrix exhibit good performance with respect to providing consistent standard errors of item parameter estimates. However, with substantial model misspecification only the sandwich-type covariance matrix exhibits robust performance. © 2018 The British Psychological Society.
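For readers unfamiliar with the two covariance estimators being compared, the sketch below illustrates them on a plain logistic regression rather than a cognitive diagnosis model (the item and structural parameters of a CDM are outside this sketch): the inverse of the observed information matrix and the sandwich-type estimator built from the empirical cross-product of per-observation scores.

```python
import numpy as np

def logistic_fit_covariances(X, y, n_iter=50):
    """MLE of a logistic model with observed-information and sandwich covariances."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):                            # Newton-Raphson iterations
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        score = X.T @ (y - p)                          # gradient of the log-likelihood
        info = X.T @ (X * (p * (1 - p))[:, None])      # observed information matrix
        beta += np.linalg.solve(info, score)
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    A_inv = np.linalg.inv(X.T @ (X * (p * (1 - p))[:, None]))
    scores_i = X * (y - p)[:, None]                    # per-observation score vectors
    B = scores_i.T @ scores_i                          # empirical cross-product matrix
    cov_info = A_inv                                   # inverse observed information
    cov_sandwich = A_inv @ B @ A_inv                   # robust "sandwich" estimator
    return beta, cov_info, cov_sandwich

rng = np.random.default_rng(3)
X = np.column_stack([np.ones(500), rng.normal(size=500)])
y = rng.binomial(1, 1 / (1 + np.exp(-(0.5 + 1.0 * X[:, 1]))))
b, ci, cs = logistic_fit_covariances(X, y)
print(np.sqrt(np.diag(ci)), np.sqrt(np.diag(cs)))      # two sets of standard errors
```

Under a correctly specified model the two sets of standard errors should be close; under misspecification the sandwich estimator is the one that retains a valid interpretation, which mirrors the simulation finding summarized above.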
Estimating the variance for heterogeneity in arm-based network meta-analysis.
Piepho, Hans-Peter; Madden, Laurence V; Roger, James; Payne, Roger; Williams, Emlyn R
2018-04-19
Network meta-analysis can be implemented by using arm-based or contrast-based models. Here we focus on arm-based models and fit them using generalized linear mixed model procedures. Full maximum likelihood (ML) estimation leads to biased trial-by-treatment interaction variance estimates for heterogeneity. Thus, our objective is to investigate alternative approaches to variance estimation that reduce bias compared with full ML. Specifically, we use penalized quasi-likelihood/pseudo-likelihood and hierarchical (h) likelihood approaches. In addition, we consider a novel model modification that yields estimators akin to the residual maximum likelihood estimator for linear mixed models. The proposed methods are compared by simulation, and 2 real datasets are used for illustration. Simulations show that penalized quasi-likelihood/pseudo-likelihood and h-likelihood reduce bias and yield satisfactory coverage rates. Sum-to-zero restriction and baseline contrasts for random trial-by-treatment interaction effects, as well as a residual ML-like adjustment, also reduce bias compared with an unconstrained model when ML is used, but coverage rates are not quite as good. Penalized quasi-likelihood/pseudo-likelihood and h-likelihood are therefore recommended. Copyright © 2018 John Wiley & Sons, Ltd.
Use of historical control data for assessing treatment effects in clinical trials.
Viele, Kert; Berry, Scott; Neuenschwander, Beat; Amzal, Billy; Chen, Fang; Enas, Nathan; Hobbs, Brian; Ibrahim, Joseph G; Kinnersley, Nelson; Lindborg, Stacy; Micallef, Sandrine; Roychoudhury, Satrajit; Thompson, Laura
2014-01-01
Clinical trials rarely, if ever, occur in a vacuum. Generally, large amounts of clinical data are available prior to the start of a study, particularly on the current study's control arm. There is obvious appeal in using (i.e., 'borrowing') this information. With historical data providing information on the control arm, more trial resources can be devoted to the novel treatment while retaining accurate estimates of the current control arm parameters. This can result in more accurate point estimates, increased power, and reduced type I error in clinical trials, provided the historical information is sufficiently similar to the current control data. If this assumption of similarity is not satisfied, however, one can acquire increased mean square error of point estimates due to bias and either reduced power or increased type I error depending on the direction of the bias. In this manuscript, we review several methods for historical borrowing, illustrating how key parameters in each method affect borrowing behavior, and then, we compare these methods on the basis of mean square error, power and type I error. We emphasize two main themes. First, we discuss the idea of 'dynamic' (versus 'static') borrowing. Second, we emphasize the decision process involved in determining whether or not to include historical borrowing in terms of the perceived likelihood that the current control arm is sufficiently similar to the historical data. Our goal is to provide a clear review of the key issues involved in historical borrowing and provide a comparison of several methods useful for practitioners. Copyright © 2013 John Wiley & Sons, Ltd.
Airborne radar technology for windshear detection
NASA Technical Reports Server (NTRS)
Hibey, Joseph L.; Khalaf, Camille S.
1988-01-01
The objectives and accomplishments of the two-and-a-half year effort to describe how returns from on-board Doppler radar are to be used to detect the presence of a wind shear are reported. The problem is modeled as one of first passage in terms of state variables, the state estimates are generated by a bank of extended Kalman filters working in parallel, and the decision strategy involves the use of a voting algorithm for a series of likelihood ratio tests. The performance issue for filtering is addressed in terms of error-covariance reduction and filter divergence, and the performance issue for detection is addressed in terms of using a probability measure transformation to derive theoretical expressions for the error probabilities of a false alarm and a miss.
Parameter recovery, bias and standard errors in the linear ballistic accumulator model.
Visser, Ingmar; Poessé, Rens
2017-05-01
The linear ballistic accumulator (LBA) model (Brown & Heathcote, Cogn. Psychol., 57, 153) is increasingly popular in modelling response times from experimental data. An R package, glba, has been developed to fit the LBA model using maximum likelihood estimation, which is validated by means of a parameter recovery study. At sufficient sample sizes, parameter recovery is good, whereas at smaller sample sizes there can be large bias in parameters. In a second simulation study, two methods for computing parameter standard errors are compared. The Hessian-based method is found to be adequate and is (much) faster than the alternative bootstrap method. The use of parameter standard errors in model selection and inference is illustrated in an example using data from an implicit learning experiment (Visser et al., Mem. Cogn., 35, 1502). It is shown that typical implicit learning effects are captured by different parameters of the LBA model. © 2017 The British Psychological Society.
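The glba package itself is R code; as a generic illustration of the two standard-error methods being compared, the Python sketch below contrasts Hessian-based and bootstrap standard errors for a simple normal maximum likelihood fit (not the LBA likelihood).

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

def neg_loglik(theta, x):
    mu, log_sigma = theta                       # log-scale keeps sigma positive
    return -norm.logpdf(x, loc=mu, scale=np.exp(log_sigma)).sum()

rng = np.random.default_rng(7)
x = rng.normal(2.0, 1.5, size=200)

# Hessian-based standard errors: diagonal of the (numerically approximated) inverse Hessian.
fit = minimize(neg_loglik, x0=[0.0, 0.0], args=(x,), method="BFGS")
se_hessian = np.sqrt(np.diag(fit.hess_inv))

# Bootstrap standard errors: refit the model on resampled data sets.
boot = []
for _ in range(500):
    xb = rng.choice(x, size=x.size, replace=True)
    boot.append(minimize(neg_loglik, x0=fit.x, args=(xb,), method="BFGS").x)
se_boot = np.asarray(boot).std(axis=0)

print("Hessian-based SEs:", se_hessian)
print("Bootstrap SEs:    ", se_boot)
```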
Ménard, Richard; Deshaies-Jacques, Martin; Gasset, Nicolas
2016-09-01
An objective analysis is one of the main components of data assimilation. By combining observations with the output of a predictive model we combine the best features of each source of information: the complete spatial and temporal coverage provided by models, with a close representation of the truth provided by observations. The process of combining observations with a model output is called an analysis. To produce an analysis requires the knowledge of observation and model errors, as well as their spatial correlation. This paper is devoted to the development of methods of estimation of these error variances and the characteristic length-scale of the model error correlation for its operational use in the Canadian objective analysis system. We first argue in favor of using compact support correlation functions, and then introduce three estimation methods: the Hollingsworth-Lönnberg (HL) method in local and global form, the maximum likelihood method (ML), and the χ² diagnostic method. We perform one-dimensional (1D) simulation studies where the error variance and true correlation length are known, and perform an estimation of both error variances and correlation length where both are non-uniform. We show that a local version of the HL method can accurately capture the error variances and correlation length at each observation site, provided that spatial variability is not too strong. However, the operational objective analysis requires only a single and globally valid correlation length. We examine whether any statistics of the local HL correlation lengths could be a useful estimate, or whether other global estimation methods such as the global HL, ML, or χ² methods should be used. We found in both 1D simulation and using real data that the ML method is able to capture physically significant aspects of the correlation length, while most other estimates give unphysical and larger length-scale values. This paper describes a proposed improvement of the objective analysis of surface pollutants at Environment and Climate Change Canada (formerly known as Environment Canada). Objective analyses are essentially surface maps of air pollutants that are obtained by combining observations with an air quality model output, and are thought to provide a complete and more accurate representation of the air quality. The highlight of this study is an analysis of methods to estimate the model (or background) error correlation length-scale. The error statistics are an important and critical component to the analysis scheme.
Inference from single occasion capture experiments using genetic markers.
Hettiarachchige, Chathurika K H; Huggins, Richard M
2018-05-01
Accurate estimation of the size of animal populations is an important task in ecological science. Recent advances in molecular genetics research allow the use of genetic data to estimate the size of a population from a single capture occasion rather than repeated occasions as in the usual capture-recapture experiments. Estimating the population size using genetic data has, however, sometimes led to estimates that differ markedly from each other and from classical capture-recapture estimates. Here, we develop a closed-form estimator that uses genetic information to estimate the size of a population consisting of mothers and daughters, focusing on estimating the number of mothers, using data from a single sample. We demonstrate the estimator is consistent and propose a parametric bootstrap to estimate the standard errors. The estimator is evaluated in a simulation study and applied to real data. We also consider maximum likelihood in this setting and discover problems that preclude its general use. © 2018 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Using beta binomials to estimate classification uncertainty for ensemble models.
Clark, Robert D; Liang, Wenkel; Lee, Adam C; Lawless, Michael S; Fraczkiewicz, Robert; Waldman, Marvin
2014-01-01
Quantitative structure-activity relationship (QSAR) models have enormous potential for reducing drug discovery and development costs as well as the need for animal testing. Great strides have been made in estimating their overall reliability, but to fully realize that potential, researchers and regulators need to know how confident they can be in individual predictions. Submodels in an ensemble model that have been trained on different subsets of a shared training pool represent multiple samples of the model space, and the degree of agreement among them contains information on the reliability of ensemble predictions. For artificial neural network ensembles (ANNEs) using two different methods for determining ensemble classification - one using vote tallies and the other averaging individual network outputs - we have found that the distribution of predictions across positive vote tallies can be reasonably well-modeled as a beta binomial distribution, as can the distribution of errors. Together, these two distributions can be used to estimate the probability that a given predictive classification will be in error. Large data sets comprising logP, Ames mutagenicity, and CYP2D6 inhibition data are used to illustrate and validate the method. The distributions of predictions and errors for the training pool accurately predicted the distribution of predictions and errors for large external validation sets, even when the numbers of positive and negative examples in the training pool were not balanced. Moreover, the likelihood of a given compound being prospectively misclassified as a function of the degree of consensus between networks in the ensemble could in most cases be estimated accurately from the fitted beta binomial distributions for the training pool. Confidence in an individual predictive classification by an ensemble model can be accurately assessed by examining the distributions of predictions and errors as a function of the degree of agreement among the constituent submodels. Further, ensemble uncertainty estimation can often be improved by adjusting the voting or classification threshold based on the parameters of the error distribution. Finally, the profiles for models whose predictive uncertainty estimates are not reliable provide clues to that effect without the need for comparison to an external test set.
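A minimal sketch of the vote-tally layer described above, assuming an ensemble of N voting submodels: the positive-vote tallies are fitted with a beta-binomial distribution by maximum likelihood using scipy, and the fitted distribution is queried for the probability of clearing a hypothetical vote threshold. The separate error distribution of the published method is not reproduced here, and all counts are simulated.

```python
import numpy as np
from scipy.stats import betabinom
from scipy.optimize import minimize

def fit_betabinom(votes, n_models):
    """Maximum likelihood fit of a beta-binomial to positive-vote tallies."""
    def nll(log_ab):
        a, b = np.exp(log_ab)          # optimize on the log scale to keep a, b > 0
        return -betabinom.logpmf(votes, n_models, a, b).sum()
    res = minimize(nll, x0=[0.0, 0.0], method="Nelder-Mead")
    return np.exp(res.x)

rng = np.random.default_rng(5)
n_models = 25
# Simulated vote tallies: per-compound positive probabilities drawn from a Beta(2, 5).
votes = rng.binomial(n_models, rng.beta(2.0, 5.0, size=400))

a_hat, b_hat = fit_betabinom(votes, n_models)
print("fitted alpha, beta:", a_hat, b_hat)
# Estimated fraction of compounds expected to clear a hypothetical vote threshold of 13/25.
print("P(votes >= 13) =", betabinom.sf(12, n_models, a_hat, b_hat))
```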
Kumar, Poornima; Eickhoff, Simon B.; Dombrovski, Alexandre Y.
2015-01-01
Reinforcement learning describes motivated behavior in terms of two abstract signals. The representation of discrepancies between expected and actual rewards/punishments – prediction error – is thought to update the expected value of actions and predictive stimuli. Electrophysiological and lesion studies suggest that mesostriatal prediction error signals control behavior through synaptic modification of cortico-striato-thalamic networks. Signals in the ventromedial prefrontal and orbitofrontal cortex are implicated in representing expected value. To obtain unbiased maps of these representations in the human brain, we performed a meta-analysis of functional magnetic resonance imaging studies that employed algorithmic reinforcement learning models, across a variety of experimental paradigms. We found that the ventral striatum (medial and lateral) and midbrain/thalamus represented reward prediction errors, consistent with animal studies. Prediction error signals were also seen in the frontal operculum/insula, particularly for social rewards. In Pavlovian studies, striatal prediction error signals extended into the amygdala, while instrumental tasks engaged the caudate. Prediction error maps were sensitive to the model-fitting procedure (fixed or individually-estimated) and to the extent of spatial smoothing. A correlate of expected value was found in a posterior region of the ventromedial prefrontal cortex, caudal and medial to the orbitofrontal regions identified in animal studies. These findings highlight a reproducible motif of reinforcement learning in the cortico-striatal loops and identify methodological dimensions that may influence the reproducibility of activation patterns across studies. PMID:25665667
NASA Astrophysics Data System (ADS)
Zeng, X.
2015-12-01
A large number of model executions are required to obtain alternative conceptual models' predictions and their posterior probabilities in Bayesian model averaging (BMA). The posterior model probability is estimated through the models' marginal likelihoods and prior probabilities. The heavy computational burden hinders the implementation of BMA prediction, especially for elaborate marginal likelihood estimators. To overcome the computational burden of BMA, an adaptive sparse grid (SG) stochastic collocation method is used to build surrogates for alternative conceptual models in a numerical experiment with a synthetic groundwater model. BMA predictions depend on model posterior weights (or marginal likelihoods), and this study also evaluated four marginal likelihood estimators, including the arithmetic mean estimator (AME), the harmonic mean estimator (HME), the stabilized harmonic mean estimator (SHME), and the thermodynamic integration estimator (TIE). The results demonstrate that TIE is accurate in estimating conceptual models' marginal likelihoods. BMA-TIE has better predictive performance than the other BMA predictions. TIE is also highly stable: conceptual models' marginal likelihoods repeatedly estimated by TIE show significantly less variability than those estimated by the other estimators. In addition, the SG surrogates efficiently facilitate BMA predictions, especially for BMA-TIE. The number of model executions needed for building surrogates is 4.13%, 6.89%, 3.44%, and 0.43% of the required model executions of BMA-AME, BMA-HME, BMA-SHME, and BMA-TIE, respectively.
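As a toy illustration of two of the marginal likelihood estimators named above, the sketch below compares the arithmetic mean estimator (prior sampling) and the harmonic mean estimator (posterior sampling) against the closed-form marginal likelihood of a conjugate normal-mean model; the thermodynamic integration estimator and the sparse-grid surrogates are not reproduced, and the model is purely illustrative.

```python
import numpy as np
from scipy.stats import norm, multivariate_normal

rng = np.random.default_rng(11)
sigma, tau, mu0 = 1.0, 2.0, 0.0          # known data sd, prior sd, prior mean
y = rng.normal(1.5, sigma, size=20)      # observed data
n = y.size

def loglik(theta):
    """Log-likelihood of the data for each candidate mean in `theta`."""
    return norm.logpdf(y[:, None], loc=theta, scale=sigma).sum(axis=0)

# Exact log marginal likelihood for this conjugate model.
cov = sigma**2 * np.eye(n) + tau**2 * np.ones((n, n))
log_ml_exact = multivariate_normal.logpdf(y, mean=np.full(n, mu0), cov=cov)

m = 50_000
# Arithmetic mean estimator: average the likelihood over draws from the prior.
ll_prior = loglik(rng.normal(mu0, tau, size=m))
log_ame = np.logaddexp.reduce(ll_prior) - np.log(m)

# Harmonic mean estimator: harmonic mean of the likelihood over posterior draws.
post_var = 1.0 / (n / sigma**2 + 1.0 / tau**2)
post_mean = post_var * (y.sum() / sigma**2 + mu0 / tau**2)
ll_post = loglik(rng.normal(post_mean, np.sqrt(post_var), size=m))
log_hme = -(np.logaddexp.reduce(-ll_post) - np.log(m))

print("exact:", log_ml_exact, " AME:", log_ame, " HME:", log_hme)
```

In runs like this the harmonic mean estimator typically overestimates the marginal likelihood and is far more variable across repetitions, which is part of the motivation for the more expensive thermodynamic integration estimator favoured in the study.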
Locally Weighted Score Estimation for Quantile Classification in Binary Regression Models
Rice, John D.; Taylor, Jeremy M. G.
2016-01-01
One common use of binary response regression methods is classification based on an arbitrary probability threshold dictated by the particular application. Since this is given to us a priori, it is sensible to incorporate the threshold into our estimation procedure. Specifically, for the linear logistic model, we solve a set of locally weighted score equations, using a kernel-like weight function centered at the threshold. The bandwidth for the weight function is selected by cross validation of a novel hybrid loss function that combines classification error and a continuous measure of divergence between observed and fitted values; other possible cross-validation functions based on more common binary classification metrics are also examined. This work has much in common with robust estimation, but differs from previous approaches in this area in its focus on prediction, specifically classification into high- and low-risk groups. Simulation results are given showing the reduction in error rates that can be obtained with this method when compared with maximum likelihood estimation, especially under certain forms of model misspecification. Analysis of a melanoma data set is presented to illustrate the use of the method in practice. PMID:28018492
NASA Astrophysics Data System (ADS)
Zin, Wan Zawiah Wan; Shinyie, Wendy Ling; Jemain, Abdul Aziz
2015-02-01
In this study, two series of data for extreme rainfall events are generated based on Annual Maximum and Partial Duration Methods, derived from 102 rain-gauge stations in Peninsular Malaysia from 1982-2012. To determine the optimal threshold for each station, several requirements must be satisfied, and the Adapted Hill estimator is employed for this purpose. A semi-parametric bootstrap is then used to estimate the mean square error (MSE) of the estimator at each threshold, and the optimal threshold is selected based on the smallest MSE. The mean annual frequency is also checked to ensure that it lies in the range of one to five, and the resulting data are de-clustered to ensure independence. The two data series are then fitted to Generalized Extreme Value and Generalized Pareto distributions for annual maximum and partial duration series, respectively. The parameter estimation methods used are the Maximum Likelihood and the L-moment methods. Two goodness of fit tests are then used to evaluate the best-fitted distribution. The results showed that the Partial Duration series with Generalized Pareto distribution and Maximum Likelihood parameter estimation provides the best representation for extreme rainfall events in Peninsular Malaysia for the majority of the stations studied. Based on these findings, several return levels are also derived and spatial maps are constructed to identify the distributional characteristics of extreme rainfall in Peninsular Malaysia.
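A minimal sketch of the partial-duration side of such an analysis, with a fixed (hypothetical) high quantile standing in for the adapted-Hill threshold selection: exceedances over the threshold are fitted with a Generalized Pareto distribution by maximum likelihood, and a return level is read off. The synthetic rainfall series, the threshold, and the return period are illustrative assumptions.

```python
import numpy as np
from scipy.stats import genpareto

rng = np.random.default_rng(2015)
rain = rng.gamma(shape=0.4, scale=12.0, size=30 * 365)   # synthetic daily rainfall (mm)

threshold = np.quantile(rain, 0.98)                      # hypothetical high threshold
exceed = rain[rain > threshold] - threshold              # exceedances over the threshold

# Maximum likelihood fit of the Generalized Pareto distribution to the exceedances.
shape, loc, scale = genpareto.fit(exceed, floc=0.0)

# T-year return level, assuming an average of `rate` exceedances per year.
rate = exceed.size / 30.0
T = 50.0
return_level = threshold + genpareto.ppf(1.0 - 1.0 / (T * rate), shape, loc=0.0, scale=scale)
print(f"xi = {shape:.3f}, sigma = {scale:.3f}, 50-year level = {return_level:.1f} mm")
```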
Determining the accuracy of maximum likelihood parameter estimates with colored residuals
NASA Technical Reports Server (NTRS)
Morelli, Eugene A.; Klein, Vladislav
1994-01-01
An important part of building high fidelity mathematical models based on measured data is calculating the accuracy associated with statistical estimates of the model parameters. Indeed, without some idea of the accuracy of parameter estimates, the estimates themselves have limited value. In this work, an expression based on theoretical analysis was developed to properly compute parameter accuracy measures for maximum likelihood estimates with colored residuals. This result is important because experience from the analysis of measured data reveals that the residuals from maximum likelihood estimation are almost always colored. The calculations involved can be appended to conventional maximum likelihood estimation algorithms. Simulated data runs were used to show that the parameter accuracy measures computed with this technique accurately reflect the quality of the parameter estimates from maximum likelihood estimation without the need for analysis of the output residuals in the frequency domain or heuristically determined multiplication factors. The result is general, although the application studied here is maximum likelihood estimation of aerodynamic model parameters from flight test data.
Network Model-Assisted Inference from Respondent-Driven Sampling Data
Gile, Krista J.; Handcock, Mark S.
2015-01-01
Summary Respondent-Driven Sampling is a widely-used method for sampling hard-to-reach human populations by link-tracing over their social networks. Inference from such data requires specialized techniques because the sampling process is both partially beyond the control of the researcher, and partially implicitly defined. Therefore, it is not generally possible to directly compute the sampling weights for traditional design-based inference, and likelihood inference requires modeling the complex sampling process. As an alternative, we introduce a model-assisted approach, resulting in a design-based estimator leveraging a working network model. We derive a new class of estimators for population means and a corresponding bootstrap standard error estimator. We demonstrate improved performance compared to existing estimators, including adjustment for an initial convenience sample. We also apply the method and an extension to the estimation of HIV prevalence in a high-risk population. PMID:26640328
Multi crop area estimation in Idaho using EDITOR
NASA Technical Reports Server (NTRS)
Sheffner, E. J.
1984-01-01
The use of LANDSAT multispectral scanner digital data for multi-crop acreage estimation in the central Snake River Plain of Idaho was examined. Two acquisitions of LANDSAT data covering ground sample units selected from a U.S. Department of Agriculture sampling frame in a four-county study site were used to train a maximum likelihood classifier, which subsequently classified all picture elements in the study site. Acreage estimates for six major crops, by county and for the four counties combined, were generated from the classification using the Battese-Fuller model for estimation by regression in small areas. Results from the regression analysis were compared to those obtained by direct expansion of the ground data. Using the LANDSAT data significantly decreased the errors associated with the estimates for the three largest acreage crops. The late date of the second LANDSAT acquisition may have contributed to the poor results for three summer crops.
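A minimal sketch of the per-pixel Gaussian maximum likelihood classification step (the Battese-Fuller small-area regression is not reproduced): class means and covariances are estimated from training pixels, and each pixel is assigned to the class with the highest Gaussian log-likelihood. The band values and crop labels are synthetic.

```python
import numpy as np
from scipy.stats import multivariate_normal

def train_ml_classifier(pixels, labels):
    """Estimate per-class mean vectors and covariance matrices from training pixels."""
    classes = np.unique(labels)
    return {c: (pixels[labels == c].mean(axis=0), np.cov(pixels[labels == c].T))
            for c in classes}

def classify(pixels, params):
    """Assign each pixel to the class maximizing the Gaussian log-likelihood."""
    classes = sorted(params)
    ll = np.column_stack([multivariate_normal.logpdf(pixels, *params[c]) for c in classes])
    return np.asarray(classes)[np.argmax(ll, axis=1)]

rng = np.random.default_rng(8)
# Synthetic 4-band spectral values for two crop classes.
train = np.vstack([rng.normal([40, 60, 30, 90], 5, (200, 4)),
                   rng.normal([70, 50, 55, 60], 5, (200, 4))])
y = np.repeat(["wheat", "potato"], 200)
params = train_ml_classifier(train, y)
print(classify(rng.normal([40, 60, 30, 90], 5, (3, 4)), params))
```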
Offline handwritten word recognition using MQDF-HMMs
NASA Astrophysics Data System (ADS)
Ramachandrula, Sitaram; Hambarde, Mangesh; Patial, Ajay; Sahoo, Dushyant; Kochar, Shaivi
2015-01-01
We propose an improved HMM formulation for offline handwriting recognition (HWR). The main contribution of this work is the use of the modified quadratic discriminant function (MQDF) [1] within the HMM framework. In an MQDF-HMM the state observation likelihood is calculated by a weighted combination of MQDF likelihoods of the individual Gaussians of a Gaussian mixture model (GMM). The quadratic discriminant function (QDF) of a multivariate Gaussian can be rewritten to avoid the inverse of the covariance matrix by using its eigenvalues and eigenvectors. The MQDF is derived from the QDF by substituting a few of the badly estimated lowest eigenvalues with an appropriate constant. This approach controls the estimation errors of the non-dominant eigenvectors and eigenvalues of the covariance matrix when the training data are insufficient. MQDF has been successfully shown to improve the character recognition performance [1]. Using MQDF in an HMM improves the computation, storage, and modeling power of the HMM when training data are limited. We obtained encouraging results on offline handwritten character (NIST database) and word recognition in English using MQDF-HMMs.
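A minimal sketch of the MQDF computation described above for a single Gaussian (the HMM state-likelihood weighting is not reproduced): the covariance matrix is eigendecomposed and the eigenvalues beyond the first k are replaced by a constant before the quadratic form and log-determinant are evaluated. The choice of k and of the constant are illustrative assumptions.

```python
import numpy as np

def mqdf_log_likelihood(x, mean, cov, k, delta=None):
    """MQDF: quadratic discriminant with the minor eigenvalues clamped to a constant."""
    vals, vecs = np.linalg.eigh(cov)
    order = np.argsort(vals)[::-1]               # dominant eigenvalues first
    vals, vecs = vals[order], vecs[:, order]
    if delta is None:
        delta = vals[k:].mean()                  # one common choice for the constant
    lam = np.concatenate([vals[:k], np.full(vals.size - k, delta)])
    z = vecs.T @ (x - mean)                      # projection onto the eigenvectors
    quad = np.sum(z**2 / lam)                    # Mahalanobis-like quadratic form
    logdet = np.sum(np.log(lam))
    return -0.5 * (quad + logdet + x.size * np.log(2 * np.pi))

rng = np.random.default_rng(4)
data = rng.normal(size=(50, 20))                 # limited training data, 20-dim features
mean, cov = data.mean(axis=0), np.cov(data.T)
print(mqdf_log_likelihood(data[0], mean, cov, k=5))
```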
Estimating linear-nonlinear models using Rényi divergences
Kouh, Minjoon; Sharpee, Tatyana O.
2009-01-01
This paper compares a family of methods for characterizing neural feature selectivity using natural stimuli in the framework of the linear-nonlinear model. In this model, the spike probability depends in a nonlinear way on a small number of stimulus dimensions. The relevant stimulus dimensions can be found by optimizing a Rényi divergence that quantifies a change in the stimulus distribution associated with the arrival of single spikes. Generally, good reconstructions can be obtained based on optimization of Rényi divergence of any order, even in the limit of small numbers of spikes. However, the smallest error is obtained when the Rényi divergence of order 1 is optimized. This type of optimization is equivalent to information maximization, and is shown to saturate the Cramér-Rao bound describing the smallest error allowed for any unbiased method. We also discuss conditions under which information maximization provides a convenient way to perform maximum likelihood estimation of linear-nonlinear models from neural data. PMID:19568981
Estimating linear-nonlinear models using Renyi divergences.
Kouh, Minjoon; Sharpee, Tatyana O
2009-01-01
This article compares a family of methods for characterizing neural feature selectivity using natural stimuli in the framework of the linear-nonlinear model. In this model, the spike probability depends in a nonlinear way on a small number of stimulus dimensions. The relevant stimulus dimensions can be found by optimizing a Rényi divergence that quantifies a change in the stimulus distribution associated with the arrival of single spikes. Generally, good reconstructions can be obtained based on optimization of Rényi divergence of any order, even in the limit of small numbers of spikes. However, the smallest error is obtained when the Rényi divergence of order 1 is optimized. This type of optimization is equivalent to information maximization, and is shown to saturate the Cramer-Rao bound describing the smallest error allowed for any unbiased method. We also discuss conditions under which information maximization provides a convenient way to perform maximum likelihood estimation of linear-nonlinear models from neural data.
On the existence of maximum likelihood estimates for presence-only data
Hefley, Trevor J.; Hooten, Mevin B.
2015-01-01
It is important to identify conditions for which maximum likelihood estimates are unlikely to be identifiable from presence-only data. In data sets where the maximum likelihood estimates do not exist, penalized likelihood and Bayesian methods will produce coefficient estimates, but these are sensitive to the choice of estimation procedure and prior or penalty term. When sample size is small or it is thought that habitat preferences are strong, we propose a suite of estimation procedures researchers can consider using.
Static shape control for flexible structures
NASA Technical Reports Server (NTRS)
Rodriguez, G.; Scheid, R. E., Jr.
1986-01-01
An integrated methodology is described for defining static shape control laws for large flexible structures. The techniques include modeling, identifying and estimating the control laws of distributed systems characterized in terms of infinite-dimensional state and parameter spaces. The models are expressed as interconnected elliptic partial differential equations governing a range of static loads, with the capability of analyzing electromagnetic fields around antenna systems. A second-order analysis is carried out for statistical errors, and model parameters are determined by maximizing an appropriately defined likelihood functional which adjusts the model to observational data. The parameter estimates are derived from the conditional mean of the observational data, resulting in a least-squares superposition of shape functions obtained from the structural model.
Underestimating numerosity of items in visual search tasks.
Cassenti, Daniel N; Kelley, Troy D; Ghirardelli, Thomas G
2010-10-01
Previous research on numerosity judgments addressed attended items, while the present research addresses underestimation for unattended items in visual search tasks. One potential cause of underestimation for unattended items is that estimates of quantity may depend on viewing a large portion of the display within foveal vision. Another theory follows from the occupancy model: estimating quantity of items in greater proximity to one another increases the likelihood of an underestimation error. Three experimental manipulations addressed aspects of underestimation for unattended items: the size of the distracters, the distance of the target from fixation, and whether items were clustered together. Results suggested that the underestimation effect for unattended items was best explained within a Gestalt grouping framework.
Adaptive Offset Correction for Intracortical Brain Computer Interfaces
Homer, Mark L.; Perge, János A.; Black, Michael J.; Harrison, Matthew T.; Cash, Sydney S.; Hochberg, Leigh R.
2014-01-01
Intracortical brain computer interfaces (iBCIs) decode intended movement from neural activity for the control of external devices such as a robotic arm. Standard approaches include a calibration phase to estimate decoding parameters. During iBCI operation, the statistical properties of the neural activity can depart from those observed during calibration, sometimes hindering a user’s ability to control the iBCI. To address this problem, we adaptively correct the offset terms within a Kalman filter decoder via penalized maximum likelihood estimation. The approach can handle rapid shifts in neural signal behavior (on the order of seconds) and requires no knowledge of the intended movement. The algorithm, called MOCA, was tested using simulated neural activity and evaluated retrospectively using data collected from two people with tetraplegia operating an iBCI. In 19 clinical research test cases, where a nonadaptive Kalman filter yielded relatively high decoding errors, MOCA significantly reduced these errors (10.6 ±10.1%; p<0.05, pairwise t-test). MOCA did not significantly change the error in the remaining 23 cases where a nonadaptive Kalman filter already performed well. These results suggest that MOCA provides more robust decoding than the standard Kalman filter for iBCIs. PMID:24196868
Adaptive offset correction for intracortical brain-computer interfaces.
Homer, Mark L; Perge, Janos A; Black, Michael J; Harrison, Matthew T; Cash, Sydney S; Hochberg, Leigh R
2014-03-01
Intracortical brain-computer interfaces (iBCIs) decode intended movement from neural activity for the control of external devices such as a robotic arm. Standard approaches include a calibration phase to estimate decoding parameters. During iBCI operation, the statistical properties of the neural activity can depart from those observed during calibration, sometimes hindering a user's ability to control the iBCI. To address this problem, we adaptively correct the offset terms within a Kalman filter decoder via penalized maximum likelihood estimation. The approach can handle rapid shifts in neural signal behavior (on the order of seconds) and requires no knowledge of the intended movement. The algorithm, called multiple offset correction algorithm (MOCA), was tested using simulated neural activity and evaluated retrospectively using data collected from two people with tetraplegia operating an iBCI. In 19 clinical research test cases, where a nonadaptive Kalman filter yielded relatively high decoding errors, MOCA significantly reduced these errors ( 10.6 ± 10.1% ; p < 0.05, pairwise t-test). MOCA did not significantly change the error in the remaining 23 cases where a nonadaptive Kalman filter already performed well. These results suggest that MOCA provides more robust decoding than the standard Kalman filter for iBCIs.
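As a heavily simplified illustration of the idea of adaptively re-estimating an observation offset by penalized maximum likelihood (this is not the published MOCA algorithm; the state-space dynamics are stripped out, and the window length, noise variance, and penalty weight are assumed values):

```python
import numpy as np

def penalized_offset(innovations, prior_offset, noise_var, penalty):
    """Penalized ML update of an observation offset from recent innovations.

    Returns the offset maximizing the Gaussian likelihood of the innovations,
    shrunk toward a prior value by a quadratic (L2) penalty.
    """
    n = len(innovations)
    return ((np.sum(innovations) / noise_var + penalty * prior_offset)
            / (n / noise_var + penalty))

# toy usage: track a slowly drifting baseline in simulated observations
rng = np.random.default_rng(1)
b_true, b_hat, window = 0.0, 0.0, []
for t in range(500):
    b_true += 0.01                                # slow drift in the baseline
    y = b_true + rng.normal(scale=0.5)            # observation = offset + noise
    window.append(y - b_hat)                      # innovation under current offset
    if len(window) == 20:                         # periodic penalized re-estimation
        b_hat += penalized_offset(np.array(window), 0.0, 0.25, penalty=1.0)
        window = []
print(b_true, b_hat)                              # the estimate tracks the drift
```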
NASA Astrophysics Data System (ADS)
Sun, Ruochen; Yuan, Huiling; Liu, Xiaoli
2017-11-01
The heteroscedasticity treatment in residual error models directly impacts model calibration and prediction uncertainty estimation. This study compares three methods of dealing with heteroscedasticity: the explicit linear modeling (LM) method, the nonlinear modeling (NL) method using a hyperbolic tangent function, and the implicit Box-Cox transformation (BC). A combined approach (CA) is then proposed that combines the advantages of the LM and BC methods. In conjunction with a first-order autoregressive model and the skew exponential power (SEP) distribution, four residual error models are generated, namely LM-SEP, NL-SEP, BC-SEP and CA-SEP, and their corresponding likelihood functions are applied to the Variable Infiltration Capacity (VIC) hydrologic model over the Huaihe River basin, China. Results show that LM-SEP yields the poorest streamflow predictions, with the widest uncertainty band and unrealistic negative flows. The NL and BC methods better handle the heteroscedasticity and hence improve predictive performance, yet the negative flows cannot be avoided. The CA-SEP produces the most accurate predictions with the highest reliability and effectively avoids negative flows, because the CA approach is capable of addressing the complicated heteroscedasticity over the study basin.
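A minimal sketch of the implicit Box-Cox (BC) treatment of heteroscedasticity is shown below; it assumes iid Gaussian errors on the transformed scale instead of the paper's AR(1) plus skew exponential power formulation, and the parameter values and synthetic flows are illustrative only:

```python
import numpy as np

def boxcox(y, lam):
    """Box-Cox transformation used to stabilize heteroscedastic residuals."""
    return np.log(y) if lam == 0 else (y**lam - 1.0) / lam

def bc_loglik(obs, sim, lam, sigma):
    """Gaussian log-likelihood of Box-Cox-transformed residuals (with Jacobian)."""
    resid = boxcox(obs, lam) - boxcox(sim, lam)
    n = len(obs)
    loglik = (-0.5 * np.sum(resid**2) / sigma**2
              - n * np.log(sigma) - 0.5 * n * np.log(2 * np.pi))
    jacobian = (lam - 1.0) * np.sum(np.log(obs))  # log |d(transformed)/d(obs)|
    return loglik + jacobian

# toy usage with synthetic flows whose error variance grows with magnitude
rng = np.random.default_rng(2)
sim = rng.gamma(shape=2.0, scale=50.0, size=300)
obs = sim * np.exp(rng.normal(scale=0.2, size=300))
lam = 0.3
sigma = np.std(boxcox(obs, lam) - boxcox(sim, lam))
print(bc_loglik(obs, sim, lam, sigma))
```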
NASA Astrophysics Data System (ADS)
Hincks, Ian; Granade, Christopher; Cory, David G.
2018-01-01
The analysis of photon count data from the standard nitrogen vacancy (NV) measurement process is treated as a statistical inference problem. This has applications toward gaining better and more rigorous error bars for tasks such as parameter estimation (e.g. magnetometry), tomography, and randomized benchmarking. We start by providing a summary of the standard phenomenological model of the NV optical process in terms of Lindblad jump operators. This model is used to derive random variables describing emitted photons during measurement, to which finite visibility, dark counts, and imperfect state preparation are added. NV spin-state measurement is then stated as an abstract statistical inference problem consisting of an underlying biased coin obstructed by three Poisson rates. Relevant frequentist and Bayesian estimators are provided, discussed, and quantitatively compared. We show numerically that the risk of the maximum likelihood estimator is well approximated by the Cramér-Rao bound, for which we provide a simple formula. Of the estimators, we in particular promote the Bayes estimator, owing to its slightly better risk performance, and straightforward error propagation into more complex experiments. This is illustrated on experimental data, where quantum Hamiltonian learning is performed and cross-validated in a fully Bayesian setting, and compared to a more traditional weighted least squares fit.
Disclosure of Medical Errors: What Factors Influence How Patients Respond?
Mazor, Kathleen M; Reed, George W; Yood, Robert A; Fischer, Melissa A; Baril, Joann; Gurwitz, Jerry H
2006-01-01
BACKGROUND Disclosure of medical errors is encouraged, but research on how patients respond to specific practices is limited. OBJECTIVE This study sought to determine whether full disclosure, an existing positive physician-patient relationship, an offer to waive associated costs, and the severity of the clinical outcome influenced patients' responses to medical errors. PARTICIPANTS Four hundred and seven health plan members participated in a randomized experiment in which they viewed video depictions of medical error and disclosure. DESIGN Subjects were randomly assigned to experimental condition. Conditions varied in type of medication error, level of disclosure, reference to a prior positive physician-patient relationship, an offer to waive costs, and clinical outcome. MEASURES Self-reported likelihood of changing physicians and of seeking legal advice; satisfaction, trust, and emotional response. RESULTS Nondisclosure increased the likelihood of changing physicians, and reduced satisfaction and trust in both error conditions. Nondisclosure increased the likelihood of seeking legal advice and was associated with a more negative emotional response in the missed allergy error condition, but did not have a statistically significant impact on seeking legal advice or emotional response in the monitoring error condition. Neither the existence of a positive relationship nor an offer to waive costs had a statistically significant impact. CONCLUSIONS This study provides evidence that full disclosure is likely to have a positive effect or no effect on how patients respond to medical errors. The clinical outcome also influences patients' responses. The impact of an existing positive physician-patient relationship, or of waiving costs associated with the error remains uncertain. PMID:16808770
Estimating Function Approaches for Spatial Point Processes
NASA Astrophysics Data System (ADS)
Deng, Chong
Spatial point pattern data consist of locations of events that are often of interest in biological and ecological studies. Such data are commonly viewed as a realization of a stochastic process called a spatial point process. To fit a parametric spatial point process model to such data, likelihood-based methods have been widely studied. However, while maximum likelihood estimation is often too computationally intensive for Cox and cluster processes, pairwise likelihood methods such as composite likelihood and Palm likelihood usually suffer from a loss of information because the correlation among pairs is ignored. For many types of correlated data other than spatial point processes, when likelihood-based approaches are not desirable, estimating functions have been widely used for model fitting. In this dissertation, we explore estimating function approaches for fitting spatial point process models. These approaches, which are based on asymptotically optimal estimating function theory, can incorporate the correlation among data and yield more efficient estimators. We conducted a series of studies to demonstrate that these estimating function approaches are good alternatives for balancing the trade-off between computational complexity and estimation efficiency. First, we propose a new estimating procedure that improves the efficiency of the pairwise composite likelihood method in estimating clustering parameters. Our approach combines estimating functions derived from pairwise composite likelihood estimation with estimating functions that account for correlations among the pairwise contributions. Our method can be used to fit a variety of parametric spatial point process models and can yield more efficient estimators for the clustering parameters than pairwise composite likelihood estimation. We demonstrate its efficacy through a simulation study and an application to the longleaf pine data. Second, we further explore the quasi-likelihood approach for fitting the second-order intensity function of spatial point processes. The original second-order quasi-likelihood is barely feasible because of the intense computation and high memory required to solve a large linear system. Motivated by the existence of geometric regular patterns in stationary point processes, we find a lower-dimensional representation of the optimal weight function and propose a reduced second-order quasi-likelihood approach. Through a simulation study, we show that the proposed method not only demonstrates superior performance in fitting the clustering parameter but also relaxes the constraint on the tuning parameter H. Third, we study the quasi-likelihood-type estimating function that is optimal in a certain class of first-order estimating functions for estimating the regression parameter in spatial point process models. Then, using a novel spectral representation, we construct an implementation that is computationally much more efficient and can be applied to more general settings than the original quasi-likelihood method.
Radar modulation classification using time-frequency representation and nonlinear regression
NASA Astrophysics Data System (ADS)
De Luigi, Christophe; Arques, Pierre-Yves; Lopez, Jean-Marc; Moreau, Eric
1999-09-01
In the naval electronic environment, pulses emitted by radars are collected by ESM receivers. For most of them, the intrapulse signal is modulated by a particular law. To aid the classical identification process, classification and estimation of this modulation law is applied to the intrapulse signal measurements. To estimate the time-varying frequency of a signal corrupted by additive noise with good accuracy, one method has been chosen: the Wigner distribution is calculated, and the instantaneous frequency is then estimated from the peak location of the distribution. The bias and variance of the estimator are evaluated by computer simulations. In an estimated sequence of frequencies, we assume the presence of both falsely and correctly estimated values, and the errors are assumed to be Gaussian. A robust nonlinear regression method, based on the Levenberg-Marquardt algorithm, is then applied to these estimated frequencies using a maximum likelihood estimator. The performance of the method is tested using various modulation laws and different signal-to-noise ratios.
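The regression stage can be sketched as below; a hypothetical linear-chirp modulation law f(t) = f0 + k t is assumed, synthetic noisy frequency estimates stand in for the Wigner-based estimates, and SciPy's robust least_squares (trust-region with a soft-L1 loss) is used as a stand-in for the paper's Levenberg-Marquardt maximum likelihood fit:

```python
import numpy as np
from scipy.optimize import least_squares

rng = np.random.default_rng(3)
t = np.linspace(0.0, 1.0, 200)
f_est = 5.0 + 3.0 * t + rng.normal(scale=0.1, size=t.size)      # good estimates
outliers = rng.choice(t.size, 15, replace=False)
f_est[outliers] += rng.normal(scale=2.0, size=15)               # false estimates

def residuals(params):
    f0, k = params
    return f_est - (f0 + k * t)

# the robust loss down-weights the falsely estimated frequencies
fit = least_squares(residuals, x0=[1.0, 1.0], loss="soft_l1", f_scale=0.3)
print(fit.x)   # recovered (f0, k), close to (5, 3)
```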
Finite mixture model: A maximum likelihood estimation approach on time series data
NASA Astrophysics Data System (ADS)
Yen, Phoong Seuk; Ismail, Mohd Tahir; Hamzah, Firdaus Mohamad
2014-09-01
Recently, statisticians have emphasized fitting finite mixture models by maximum likelihood estimation because of its asymptotic properties. In particular, the estimator is consistent as the sample size increases to infinity, so that maximum likelihood estimation is asymptotically unbiased. Moreover, the parameter estimates obtained by maximum likelihood estimation have the smallest variance compared with other statistical methods as the sample size increases. Thus, maximum likelihood estimation is adopted in this paper to fit a two-component mixture model in order to explore the relationship between rubber price and exchange rate for Malaysia, Thailand, the Philippines and Indonesia. Results show a negative relationship between rubber price and exchange rate for all selected countries.
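A minimal sketch of maximum likelihood fitting of a two-component mixture via the EM algorithm is given below; the Gaussian component form, starting values, and synthetic data are assumptions for illustration, not the paper's rubber-price and exchange-rate series:

```python
import numpy as np

def fit_two_component_mixture(x, n_iter=200):
    """Maximum likelihood fit of a two-component Gaussian mixture via EM."""
    w = 0.5
    mu = np.array([x.min(), x.max()])
    sd = np.array([x.std(), x.std()])
    for _ in range(n_iter):
        # E-step: responsibilities of component 1
        p1 = w * np.exp(-0.5 * ((x - mu[0]) / sd[0])**2) / sd[0]
        p2 = (1 - w) * np.exp(-0.5 * ((x - mu[1]) / sd[1])**2) / sd[1]
        r = p1 / (p1 + p2)
        # M-step: update mixing weight, means, and standard deviations
        w = r.mean()
        mu = np.array([np.average(x, weights=r), np.average(x, weights=1 - r)])
        sd = np.sqrt([np.average((x - mu[0])**2, weights=r),
                      np.average((x - mu[1])**2, weights=1 - r)])
    return w, mu, sd

rng = np.random.default_rng(4)
data = np.concatenate([rng.normal(0, 1, 300), rng.normal(4, 0.5, 200)])
print(fit_two_component_mixture(data))
```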
Sangnawakij, Patarawan; Böhning, Dankmar; Adams, Stephen; Stanton, Michael; Holling, Heinz
2017-04-30
Statistical inference for analyzing the results from several independent studies on the same quantity of interest has been investigated frequently in recent decades. Typically, any meta-analytic inference requires that the quantity of interest is available from each study together with an estimate of its variability. The current work is motivated by a meta-analysis comparing two treatments (thoracoscopic and open) of congenital lung malformations in young children. Quantities of interest include continuous end-points such as length of operation or number of chest tube days. As studies only report mean values (and no standard errors or confidence intervals), the question arises of how meta-analytic inference can be developed. We suggest two methods to estimate study-specific variances in such a meta-analysis, where only sample means and sample sizes are available in the treatment arms. A general likelihood ratio test is derived for testing equality of variances in two groups. By means of simulation studies, the bias and estimated standard error of the overall mean difference from both methodologies are evaluated and compared with two existing approaches: complete study analysis only and partial variance information. The performance of the test is evaluated in terms of type I error. Additionally, we illustrate these methods in the meta-analysis comparing thoracoscopic and open surgery for congenital lung malformations and in a meta-analysis on the change in renal function after kidney donation. Copyright © 2017 John Wiley & Sons, Ltd.
Design and simulation of sensor networks for tracking Wifi users in outdoor urban environments
NASA Astrophysics Data System (ADS)
Thron, Christopher; Tran, Khoi; Smith, Douglas; Benincasa, Daniel
2017-05-01
We present a proof-of-concept investigation into the use of sensor networks for tracking WiFi users in outdoor urban environments. Sensors are fixed and are capable of measuring signal power from users' WiFi devices. We derive a maximum likelihood estimate of user location based on instantaneous sensor power measurements. The algorithm takes into account the effects of power control and is self-calibrating, in that the signal power model used by the location algorithm is adjusted and improved as part of the operation of the network. Simulation results verifying the system's performance are presented. The simulation scenario is based on a 1.5 km2 area of lower Manhattan. The self-calibration mechanism was verified for initial rms (root mean square) errors of up to 12 dB in the channel power estimates: rms errors were reduced by over 60% in 300 track-hours, in systems with limited power control. Under typical operating conditions with (without) power control, location rms errors are about 8.5 (5) meters, with 90% accuracy within 9 (13) meters, for both pedestrian and vehicular users. The distance error distributions for smaller distances (<30 m) are well approximated by an exponential distribution, while the distributions for large distance errors have fat tails. The issue of optimal sensor placement in the sensor network is also addressed. We specify a linear programming algorithm for determining sensor placement for networks with a reduced number of sensors. In our test case, the algorithm produces a network with 18.5% fewer sensors and comparable estimation accuracy. Finally, we discuss future research directions for improving the accuracy and capabilities of sensor network systems in urban environments.
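A hedged sketch of maximum likelihood localization from sensor power measurements follows; it assumes a log-distance path-loss model with log-normal shadowing and known channel parameters, ignores power control and self-calibration, and all positions and parameter values are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(5)
sensors = rng.uniform(0.0, 500.0, size=(10, 2))   # fixed sensor positions (m)
true_user = np.array([220.0, 330.0])
p0, n_exp, sigma_db = -30.0, 3.0, 4.0             # assumed channel parameters

def mean_power(user):
    d = np.linalg.norm(sensors - user, axis=1)
    return p0 - 10.0 * n_exp * np.log10(np.maximum(d, 1.0))

measured = mean_power(true_user) + rng.normal(scale=sigma_db, size=len(sensors))

# under Gaussian (dB) errors, maximum likelihood reduces to least squares over a grid
xs = np.arange(0.0, 500.0, 5.0)
grid = np.array([[x, y] for x in xs for y in xs])
sse = ((measured - np.array([mean_power(g) for g in grid]))**2).sum(axis=1)
print(grid[np.argmin(sse)])   # ML location estimate, near (220, 330)
```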
Lobach, Iryna; Mallick, Bani; Carroll, Raymond J
2011-01-01
Case-control studies are widely used to detect gene-environment interactions in the etiology of complex diseases. Many variables that are of interest to biomedical researchers are difficult to measure on an individual level, e.g. nutrient intake, cigarette smoking exposure, long-term toxic exposure. Measurement error causes bias in parameter estimates, thus masking key features of the data and leading to loss of power and spurious or masked associations. We develop a Bayesian methodology for the analysis of case-control studies in which measurement error is present in an environmental covariate and the genetic variable has missing data. This approach offers several advantages. It allows prior information to enter the model to make estimation and inference more precise. The environmental covariates measured exactly are modeled completely nonparametrically. Further, information about the probability of disease can be incorporated in the estimation procedure to improve the quality of parameter estimates, which cannot be done in conventional case-control studies. A unique feature of the procedure under investigation is that the analysis is based on a pseudo-likelihood function; therefore, conventional Bayesian techniques may not be technically correct. We propose an approach using Markov Chain Monte Carlo sampling as well as a computationally simple method based on an asymptotic posterior distribution. Simulation experiments demonstrated that our method produces parameter estimates that are nearly unbiased even for small sample sizes. An application of our method is illustrated using a population-based case-control study of the association between calcium intake and the risk of colorectal adenoma development.
Genetic mapping in the presence of genotyping errors.
Cartwright, Dustin A; Troggio, Michela; Velasco, Riccardo; Gutin, Alexander
2007-08-01
Genetic maps are built using the genotypes of many related individuals. Genotyping errors in these data sets can distort genetic maps, especially by inflating the distances. We have extended the traditional likelihood model used for genetic mapping to include the possibility of genotyping errors. Each individual marker is assigned an error rate, which is inferred from the data, just as the genetic distances are. We have developed a software package, called TMAP, which uses this model to find maximum-likelihood maps for phase-known pedigrees. We have tested our methods using a data set in Vitis and on simulated data and confirmed that our method dramatically reduces the inflationary effect caused by increasing the number of markers and leads to more accurate orders.
Genetic Mapping in the Presence of Genotyping Errors
Cartwright, Dustin A.; Troggio, Michela; Velasco, Riccardo; Gutin, Alexander
2007-01-01
Genetic maps are built using the genotypes of many related individuals. Genotyping errors in these data sets can distort genetic maps, especially by inflating the distances. We have extended the traditional likelihood model used for genetic mapping to include the possibility of genotyping errors. Each individual marker is assigned an error rate, which is inferred from the data, just as the genetic distances are. We have developed a software package, called TMAP, which uses this model to find maximum-likelihood maps for phase-known pedigrees. We have tested our methods using a data set in Vitis and on simulated data and confirmed that our method dramatically reduces the inflationary effect caused by increasing the number of markers and leads to more accurate orders. PMID:17277374
Minimizing treatment planning errors in proton therapy using failure mode and effects analysis
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zheng, Yuanshui, E-mail: yuanshui.zheng@okc.procure.com; Johnson, Randall; Larson, Gary
Purpose: Failure mode and effects analysis (FMEA) is a widely used tool to evaluate safety or reliability in conventional photon radiation therapy. However, reports about FMEA application in proton therapy are scarce. The purpose of this study is to apply FMEA in safety improvement of proton treatment planning at their center. Methods: The authors performed an FMEA analysis of their proton therapy treatment planning process using uniform scanning proton beams. The authors identified possible failure modes in various planning processes, including image fusion, contouring, beam arrangement, dose calculation, plan export, documents, billing, and so on. For each error, the authors estimated the frequency of occurrence, the likelihood of being undetected, and the severity of the error if it went undetected and calculated the risk priority number (RPN). The FMEA results were used to design their quality management program. In addition, the authors created a database to track the identified dosimetric errors. Periodically, the authors reevaluated the risk of errors by reviewing the internal error database and improved their quality assurance program as needed. Results: In total, the authors identified over 36 possible treatment planning related failure modes and estimated the associated occurrence, detectability, and severity to calculate the overall risk priority number. Based on the FMEA, the authors implemented various safety improvement procedures into their practice, such as education, peer review, and automatic check tools. The ongoing error tracking database provided realistic data on the frequency of occurrence with which to reevaluate the RPNs for various failure modes. Conclusions: The FMEA technique provides a systematic method for identifying and evaluating potential errors in proton treatment planning before they result in an error in patient dose delivery. The application of FMEA framework and the implementation of an ongoing error tracking system at their clinic have proven to be useful in error reduction in proton treatment planning, thus improving the effectiveness and safety of proton therapy.
Minimizing treatment planning errors in proton therapy using failure mode and effects analysis.
Zheng, Yuanshui; Johnson, Randall; Larson, Gary
2016-06-01
Failure mode and effects analysis (FMEA) is a widely used tool to evaluate safety or reliability in conventional photon radiation therapy. However, reports about FMEA application in proton therapy are scarce. The purpose of this study is to apply FMEA in safety improvement of proton treatment planning at their center. The authors performed an FMEA analysis of their proton therapy treatment planning process using uniform scanning proton beams. The authors identified possible failure modes in various planning processes, including image fusion, contouring, beam arrangement, dose calculation, plan export, documents, billing, and so on. For each error, the authors estimated the frequency of occurrence, the likelihood of being undetected, and the severity of the error if it went undetected and calculated the risk priority number (RPN). The FMEA results were used to design their quality management program. In addition, the authors created a database to track the identified dosimetric errors. Periodically, the authors reevaluated the risk of errors by reviewing the internal error database and improved their quality assurance program as needed. In total, the authors identified over 36 possible treatment planning related failure modes and estimated the associated occurrence, detectability, and severity to calculate the overall risk priority number. Based on the FMEA, the authors implemented various safety improvement procedures into their practice, such as education, peer review, and automatic check tools. The ongoing error tracking database provided realistic data on the frequency of occurrence with which to reevaluate the RPNs for various failure modes. The FMEA technique provides a systematic method for identifying and evaluating potential errors in proton treatment planning before they result in an error in patient dose delivery. The application of FMEA framework and the implementation of an ongoing error tracking system at their clinic have proven to be useful in error reduction in proton treatment planning, thus improving the effectiveness and safety of proton therapy.
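The risk priority number computation at the core of FMEA can be sketched as follows; the failure modes and ratings are invented for illustration and are not the values reported in the study:

```python
# each failure mode is rated for occurrence, detectability (likelihood of
# going undetected), and severity; the risk priority number is their product
failure_modes = {
    "wrong image fusion":           (3, 4, 8),
    "incorrect beam arrangement":   (2, 3, 7),
    "plan exported to wrong chart": (1, 2, 10),
}

rpns = {name: occ * det * sev for name, (occ, det, sev) in failure_modes.items()}
for name, rpn in sorted(rpns.items(), key=lambda kv: -kv[1]):
    print(f"{name}: RPN = {rpn}")
```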
E-prescribing errors in community pharmacies: exploring consequences and contributing factors.
Odukoya, Olufunmilola K; Stone, Jamie A; Chui, Michelle A
2014-06-01
To explore types of e-prescribing errors in community pharmacies and their potential consequences, as well as the factors that contribute to e-prescribing errors. Data collection involved performing 45 total hours of direct observations in five pharmacies. Follow-up interviews were conducted with 20 study participants. Transcripts from observations and interviews were subjected to content analysis using NVivo 10. Pharmacy staff detected 75 e-prescription errors during the 45 h observation in pharmacies. The most common e-prescribing errors were wrong drug quantity, wrong dosing directions, wrong duration of therapy, and wrong dosage formulation. Participants estimated that 5 in 100 e-prescriptions have errors. Drug classes that were implicated in e-prescribing errors were antiinfectives, inhalers, ophthalmic, and topical agents. The potential consequences of e-prescribing errors included increased likelihood of the patient receiving incorrect drug therapy, poor disease management for patients, additional work for pharmacy personnel, increased cost for pharmacies and patients, and frustrations for patients and pharmacy staff. Factors that contribute to errors included: technology incompatibility between pharmacy and clinic systems, technology design issues such as use of auto-populate features and dropdown menus, and inadvertently entering incorrect information. Study findings suggest that a wide range of e-prescribing errors is encountered in community pharmacies. Pharmacists and technicians perceive that causes of e-prescribing errors are multidisciplinary and multifactorial, that is to say e-prescribing errors can originate from technology used in prescriber offices and pharmacies. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
E-Prescribing Errors in Community Pharmacies: Exploring Consequences and Contributing Factors
Stone, Jamie A.; Chui, Michelle A.
2014-01-01
Objective To explore types of e-prescribing errors in community pharmacies and their potential consequences, as well as the factors that contribute to e-prescribing errors. Methods Data collection involved performing 45 total hours of direct observations in five pharmacies. Follow-up interviews were conducted with 20 study participants. Transcripts from observations and interviews were subjected to content analysis using NVivo 10. Results Pharmacy staff detected 75 e-prescription errors during the 45 hour observation in pharmacies. The most common e-prescribing errors were wrong drug quantity, wrong dosing directions, wrong duration of therapy, and wrong dosage formulation. Participants estimated that 5 in 100 e-prescriptions have errors. Drug classes that were implicated in e-prescribing errors were antiinfectives, inhalers, ophthalmic, and topical agents. The potential consequences of e-prescribing errors included increased likelihood of the patient receiving incorrect drug therapy, poor disease management for patients, additional work for pharmacy personnel, increased cost for pharmacies and patients, and frustrations for patients and pharmacy staff. Factors that contribute to errors included: technology incompatibility between pharmacy and clinic systems, technology design issues such as use of auto-populate features and dropdown menus, and inadvertently entering incorrect information. Conclusion Study findings suggest that a wide range of e-prescribing errors are encountered in community pharmacies. Pharmacists and technicians perceive that causes of e-prescribing errors are multidisciplinary and multifactorial, that is to say e-prescribing errors can originate from technology used in prescriber offices and pharmacies. PMID:24657055
NASA Astrophysics Data System (ADS)
Winiarek, Victor; Bocquet, Marc; Duhanyan, Nora; Roustan, Yelva; Saunier, Olivier; Mathieu, Anne
2013-04-01
A major difficulty when inverting the source term of an atmospheric tracer dispersion problem is the estimation of the prior errors: those of the atmospheric transport model, those ascribed to the representativeness of the measurements, the instrumental errors, and those attached to the prior knowledge on the variables one seeks to retrieve. In the case of an accidental release of pollutant, and especially in a situation of sparse observability, the reconstructed source is sensitive to these assumptions. This sensitivity makes the quality of the retrieval dependent on the methods used to model and estimate the prior errors of the inverse modeling scheme. In Winiarek et al. (2012), we proposed an estimation method for the errors' amplitude based on the maximum likelihood principle. Under semi-Gaussian assumptions, it takes into account, without approximation, the positivity assumption on the source. We applied the method to the estimation of the Fukushima Daiichi cesium-137 and iodine-131 source terms using activity concentrations in the air. The results were compared to an L-curve estimation technique and to Desroziers's scheme. In addition to the estimates of released activities, we provided the related uncertainties (12 PBq with a std. of 15 - 20 % for cesium-137 and 190 - 380 PBq with a std. of 5 - 10 % for iodine-131). We also showed that, because of the low number of available observations (a few hundred), and even though the orders of magnitude were consistent, the reconstructed activities depended significantly on the method used to estimate the prior errors. In order to use more data, we propose to extend the methods to several data types, such as activity concentrations in the air and fallout measurements. The idea is to simultaneously estimate the prior errors related to each dataset, in order to fully exploit the information content of each one. Using the activity concentration measurements, but also daily fallout data from prefectures and cumulated deposition data over a region lying approximately 150 km around the nuclear power plant, we can use a few thousand data points in our inverse modeling algorithm to reconstruct the cesium-137 source term. To improve the parameterization of removal processes, rainfall fields have also been corrected using outputs from the mesoscale meteorological model WRF and ground station rainfall data. As expected, the different methods yield closer results as the number of data increases. Reference: Winiarek, V., M. Bocquet, O. Saunier, A. Mathieu (2012), Estimation of errors in the inverse modeling of accidental release of atmospheric pollutant: Application to the reconstruction of the cesium-137 and iodine-131 source terms from the Fukushima Daiichi power plant, J. Geophys. Res., 117, D05122, doi:10.1029/2011JD016932.
Dynamic Financial Constraints: Distinguishing Mechanism Design from Exogenously Incomplete Regimes*
Karaivanov, Alexander; Townsend, Robert M.
2014-01-01
We formulate and solve a range of dynamic models of constrained credit/insurance that allow for moral hazard and limited commitment. We compare them to full insurance and exogenously incomplete financial regimes (autarky, saving only, borrowing and lending in a single asset). We develop computational methods based on mechanism design, linear programming, and maximum likelihood to estimate, compare, and statistically test these alternative dynamic models with financial/information constraints. Our methods can use both cross-sectional and panel data and allow for measurement error and unobserved heterogeneity. We estimate the models using data on Thai households running small businesses from two separate samples. We find that in the rural sample, the exogenously incomplete saving only and borrowing regimes provide the best fit using data on consumption, business assets, investment, and income. Family and other networks help consumption smoothing there, as in a moral hazard constrained regime. In contrast, in urban areas, we find mechanism design financial/information regimes that are decidedly less constrained, with the moral hazard model fitting best combined business and consumption data. We perform numerous robustness checks in both the Thai data and in Monte Carlo simulations and compare our maximum likelihood criterion with results from other metrics and data not used in the estimation. A prototypical counterfactual policy evaluation exercise using the estimation results is also featured. PMID:25246710
NASA Astrophysics Data System (ADS)
Krestyannikov, E.; Tohka, J.; Ruotsalainen, U.
2008-06-01
This paper presents a novel statistical approach for joint estimation of regions-of-interest (ROIs) and the corresponding time-activity curves (TACs) from dynamic positron emission tomography (PET) brain projection data. It is based on optimizing the joint objective function that consists of a data log-likelihood term and two penalty terms reflecting the available a priori information about the human brain anatomy. The developed local optimization strategy iteratively updates both the ROI and TAC parameters and is guaranteed to monotonically increase the objective function. The quantitative evaluation of the algorithm is performed with numerically and Monte Carlo-simulated dynamic PET brain data of the 11C-Raclopride and 18F-FDG tracers. The results demonstrate that the method outperforms the existing sequential ROI quantification approaches in terms of accuracy, and can noticeably reduce the errors in TACs arising due to the finite spatial resolution and ROI delineation.
Errors as a Means of Reducing Impulsive Food Choice.
Sellitto, Manuela; di Pellegrino, Giuseppe
2016-06-05
Nowadays, the increasing incidence of eating disorders due to poor self-control has given rise to increased obesity and other chronic weight problems, and ultimately, to reduced life expectancy. The capacity to refrain from automatic responses is usually high in situations in which making errors is highly likely. The protocol described here aims at reducing imprudent preference in women during hypothetical intertemporal choices about appetitive food by associating it with errors. First, participants undergo an error task where two different edible stimuli are associated with two different error likelihoods (high and low). Second, they make intertemporal choices about the two edible stimuli, separately. As a result, this method decreases the discount rate for future amounts of the edible reward that cued higher error likelihood, selectively. This effect is under the influence of the self-reported hunger level. The present protocol demonstrates that errors, well known as motivationally salient events, can induce the recruitment of cognitive control, thus being ultimately useful in reducing impatient choices for edible commodities.
Errors as a Means of Reducing Impulsive Food Choice
Sellitto, Manuela; di Pellegrino, Giuseppe
2016-01-01
Nowadays, the increasing incidence of eating disorders due to poor self-control has given rise to increased obesity and other chronic weight problems, and ultimately, to reduced life expectancy. The capacity to refrain from automatic responses is usually high in situations in which making errors is highly likely. The protocol described here aims at reducing imprudent preference in women during hypothetical intertemporal choices about appetitive food by associating it with errors. First, participants undergo an error task where two different edible stimuli are associated with two different error likelihoods (high and low). Second, they make intertemporal choices about the two edible stimuli, separately. As a result, this method decreases the discount rate for future amounts of the edible reward that cued higher error likelihood, selectively. This effect is under the influence of the self-reported hunger level. The present protocol demonstrates that errors, well known as motivationally salient events, can induce the recruitment of cognitive control, thus being ultimately useful in reducing impatient choices for edible commodities. PMID:27341281
Bayesian inference of nonlinear unsteady aerodynamics from aeroelastic limit cycle oscillations
NASA Astrophysics Data System (ADS)
Sandhu, Rimple; Poirel, Dominique; Pettit, Chris; Khalil, Mohammad; Sarkar, Abhijit
2016-07-01
A Bayesian model selection and parameter estimation algorithm is applied to investigate the influence of nonlinear and unsteady aerodynamic loads on the limit cycle oscillation (LCO) of a pitching airfoil in the transitional Reynolds number regime. At small angles of attack, laminar boundary layer trailing edge separation causes negative aerodynamic damping leading to the LCO. The fluid-structure interaction of the rigid, but elastically mounted, airfoil and nonlinear unsteady aerodynamics is represented by two coupled nonlinear stochastic ordinary differential equations containing uncertain parameters and model approximation errors. Several plausible aerodynamic models with increasing complexity are proposed to describe the aeroelastic system leading to LCO. The likelihood in the posterior parameter probability density function (pdf) is available semi-analytically using the extended Kalman filter for the state estimation of the coupled nonlinear structural and unsteady aerodynamic model. The posterior parameter pdf is sampled using a parallel and adaptive Markov Chain Monte Carlo (MCMC) algorithm. The posterior probability of each model is estimated using the Chib-Jeliazkov method that directly uses the posterior MCMC samples for evidence (marginal likelihood) computation. The Bayesian algorithm is validated through a numerical study and then applied to model the nonlinear unsteady aerodynamic loads using wind-tunnel test data at various Reynolds numbers.
Bayesian inference of nonlinear unsteady aerodynamics from aeroelastic limit cycle oscillations
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sandhu, Rimple; Poirel, Dominique; Pettit, Chris
2016-07-01
A Bayesian model selection and parameter estimation algorithm is applied to investigate the influence of nonlinear and unsteady aerodynamic loads on the limit cycle oscillation (LCO) of a pitching airfoil in the transitional Reynolds number regime. At small angles of attack, laminar boundary layer trailing edge separation causes negative aerodynamic damping leading to the LCO. The fluid–structure interaction of the rigid, but elastically mounted, airfoil and nonlinear unsteady aerodynamics is represented by two coupled nonlinear stochastic ordinary differential equations containing uncertain parameters and model approximation errors. Several plausible aerodynamic models with increasing complexity are proposed to describe the aeroelastic system leading to LCO. The likelihood in the posterior parameter probability density function (pdf) is available semi-analytically using the extended Kalman filter for the state estimation of the coupled nonlinear structural and unsteady aerodynamic model. The posterior parameter pdf is sampled using a parallel and adaptive Markov Chain Monte Carlo (MCMC) algorithm. The posterior probability of each model is estimated using the Chib–Jeliazkov method that directly uses the posterior MCMC samples for evidence (marginal likelihood) computation. The Bayesian algorithm is validated through a numerical study and then applied to model the nonlinear unsteady aerodynamic loads using wind-tunnel test data at various Reynolds numbers.
SEMIPARAMETRIC EFFICIENT ESTIMATION FOR SHARED-FRAILTY MODELS WITH DOUBLY-CENSORED CLUSTERED DATA
Wang, Jane-Ling
2018-01-01
In this paper, we investigate frailty models for clustered survival data that are subject to both left- and right-censoring, termed "doubly-censored data". This model extends the current survival literature by broadening the application of frailty models from right-censoring to the more complicated situation with additional left censoring. Our approach is motivated by a recent Hepatitis B study where the sample consists of families. We adopt a likelihood approach that targets the nonparametric maximum likelihood estimator (NPMLE). A new algorithm is proposed, which not only works well for clustered data but also improves on the existing algorithm for independent, doubly-censored data, a special case in which the frailty variable is a constant equal to one. This special case is well known to be a computational challenge due to the left-censoring feature of the data. The new algorithm not only resolves this challenge but also accommodates the additional frailty variable effectively. Asymptotic properties of the NPMLE are established, along with the semiparametric efficiency of the NPMLE for the finite-dimensional parameters. The consistency of bootstrap estimators for the standard errors of the NPMLE is also discussed. We conducted simulations to illustrate the numerical performance and robustness of the proposed algorithm, which is also applied to the Hepatitis B data. PMID:29527068
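A generic sketch of a bootstrap standard error for clustered data (resampling whole clusters, such as families, with replacement) is shown below; it uses the sample mean as a stand-in estimator rather than the paper's NPMLE, and the data are synthetic:

```python
import numpy as np

def cluster_bootstrap_se(estimator, clusters, n_boot=500, seed=0):
    """Bootstrap standard error for clustered data.

    Resamples whole clusters with replacement and re-applies the estimator
    to the pooled resampled observations.
    """
    rng = np.random.default_rng(seed)
    stats = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(clusters), size=len(clusters))
        resample = [obs for i in idx for obs in clusters[i]]
        stats.append(estimator(np.array(resample)))
    return np.std(stats, ddof=1)

# toy usage: SE of the mean with 50 clusters of correlated observations
rng = np.random.default_rng(6)
clusters = [list(rng.normal(loc=rng.normal(), size=5)) for _ in range(50)]
print(cluster_bootstrap_se(np.mean, clusters))
```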
Maximum likelihood convolutional decoding (MCD) performance due to system losses
NASA Technical Reports Server (NTRS)
Webster, L.
1976-01-01
A model for predicting the computational performance of a maximum likelihood convolutional decoder (MCD) operating in a noisy carrier reference environment is described. This model is used to develop a subroutine that will be utilized by the Telemetry Analysis Program to compute the MCD bit error rate. When this computational model is averaged over noisy reference phase errors using a high-rate interpolation scheme, the results are found to agree quite favorably with experimental measurements.
Fast and accurate read-out of interferometric optical fiber sensors
NASA Astrophysics Data System (ADS)
Bartholsen, Ingebrigt; Hjelme, Dag R.
2016-03-01
We present results from an evaluation of phase and frequency estimation algorithms for read-out instrumentation of interferometric sensors. Tests interrogating a micro Fabry-Perot sensor, made of a semi-spherical stimuli-responsive hydrogel immobilized on a single-mode fiber end face, show that an iterative quadrature demodulation technique (IQDT) implemented on a 32-bit microcontroller unit can achieve an absolute length accuracy of ±50 nm and a length change accuracy of ±3 nm, using an 80 nm SLED source and a grating spectrometer for interrogation. The mean absolute error of the frequency estimator is a factor of 3 larger than the theoretical lower bound for a maximum likelihood estimator. The corresponding factor for the phase estimator is 1.3. The computation time for the IQDT algorithm is reduced by a factor of 1000 compared to the full QDT for the same accuracy requirement.
Creel, Scott; Creel, Michael
2009-11-01
1. Sampling error in annual estimates of population size creates two widely recognized problems for the analysis of population growth. First, if sampling error is mistakenly treated as process error, one obtains inflated estimates of the variation in true population trajectories (Staples, Taper & Dennis 2004). Second, treating sampling error as process error is thought to overestimate the importance of density dependence in population growth (Viljugrein et al. 2005; Dennis et al. 2006). 2. In ecology, state-space models are used to account for sampling error when estimating the effects of density and other variables on population growth (Staples et al. 2004; Dennis et al. 2006). In econometrics, regression with instrumental variables is a well-established method that addresses the problem of correlation between regressors and the error term, but requires fewer assumptions than state-space models (Davidson & MacKinnon 1993; Cameron & Trivedi 2005). 3. We used instrumental variables to account for sampling error and fit a generalized linear model to 472 annual observations of population size for 35 Elk Management Units in Montana, from 1928 to 2004. We compared this model with state-space models fit with the likelihood function of Dennis et al. (2006). We discuss the general advantages and disadvantages of each method. Briefly, regression with instrumental variables is valid with fewer distributional assumptions, but state-space models are more efficient when their distributional assumptions are met. 4. Both methods found that population growth was negatively related to population density and winter snow accumulation. Summer rainfall and wolf (Canis lupus) presence had much weaker effects on elk (Cervus elaphus) dynamics [though limitation by wolves is strong in some elk populations with well-established wolf populations (Creel et al. 2007; Creel & Christianson 2008)]. 5. Coupled with predictions for Montana from global and regional climate models, our results predict a substantial reduction in the limiting effect of snow accumulation on Montana elk populations in the coming decades. If other limiting factors do not operate with greater force, population growth rates would increase substantially.
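A minimal sketch of regression with instrumental variables via two-stage least squares is given below; the lagged noisy count serves as an instrument for the current noisy count, the data are simulated, and this is only a generic illustration of the approach, not the authors' elk analysis:

```python
import numpy as np

def two_stage_least_squares(y, X, Z):
    """Two-stage least squares: regress X on instruments Z, then y on fitted X.

    Handles a regressor (here, observed log population size) that is
    correlated with the error term through sampling error.
    """
    X_hat = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]   # first stage
    beta = np.linalg.lstsq(X_hat, y, rcond=None)[0]    # second stage
    return beta

# toy usage: true abundance N_t observed with multiplicative sampling error;
# the lagged observation instruments the current observation
rng = np.random.default_rng(7)
N = 1000.0 * np.exp(np.cumsum(rng.normal(0, 0.05, 200)))
obs = N * np.exp(rng.normal(0, 0.1, 200))              # noisy counts
growth = np.log(N[1:] / N[:-1])                        # growth from t to t+1
X = np.column_stack([np.ones(198), np.log(obs[1:-1])]) # density regressor at t
Z = np.column_stack([np.ones(198), np.log(obs[:-2])])  # lagged instrument at t-1
print(two_stage_least_squares(growth[1:], X, Z))       # intercept and density effect
```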
NASA Technical Reports Server (NTRS)
Walker, H. F.
1976-01-01
Likelihood equations determined by the two types of samples, which are necessary conditions for a maximum-likelihood estimate, were considered. These equations suggest certain successive-approximation iterative procedures for obtaining maximum likelihood estimates. The procedures, which are generalized steepest ascent (deflected gradient) procedures, contain those of Hosmer as a special case.
NASA Astrophysics Data System (ADS)
Ran, Youhua; Li, Xin; Jin, Rui; Kang, Jian; Cosh, Michael H.
2017-01-01
Monitoring and estimating grid-mean soil moisture is very important for assessing many hydrological, biological, and biogeochemical processes and for validating remotely sensed surface soil moisture products. Temporal stability analysis (TSA) is a valuable tool for identifying a small number of representative sampling points to estimate the grid-mean soil moisture content. This analysis was evaluated and improved using high-quality surface soil moisture data that were acquired by a wireless sensor network in a high-intensity irrigated agricultural landscape in an arid region of northwestern China. The performance of the TSA was limited in areas where the representative error was dominated by random events, such as irrigation events. This shortcoming can be effectively mitigated by using a stratified TSA (STSA) method, proposed in this paper. In addition, the following methods were proposed for rapidly and efficiently identifying representative sampling points when using TSA. (1) Instantaneous measurements can be used to identify representative sampling points to some extent; however, the error resulting from this method is significant when validating remotely sensed soil moisture products. Thus, additional representative sampling points should be considered to reduce this error. (2) The calibration period can be determined from the time span of the full range of the grid-mean soil moisture content during the monitoring period. (3) The representative error is sensitive to the number of calibration sampling points, especially when only a few representative sampling points are used. Multiple sampling points are recommended to reduce data loss and improve the likelihood of representativeness at two scales.
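A minimal sketch of temporal stability analysis (TSA) for selecting representative sampling points is shown below; the network data are synthetic, and the ranking criterion (absolute mean relative difference plus its standard deviation) is one common choice assumed here for illustration:

```python
import numpy as np

# for each sampling point, compute the relative difference from the grid
# mean at every time step, then rank points by how small and how stable
# that relative difference is
rng = np.random.default_rng(8)
n_points, n_times = 40, 120
bias = rng.normal(scale=0.03, size=n_points)           # persistent wet/dry bias
grid = 0.25 + 0.1 * np.sin(np.linspace(0, 6, n_times)) # grid-mean dynamics
sm = grid[None, :] * (1 + bias[:, None]) + rng.normal(scale=0.01, size=(n_points, n_times))

rel_diff = (sm - sm.mean(axis=0)) / sm.mean(axis=0)    # relative difference
mean_rd, std_rd = rel_diff.mean(axis=1), rel_diff.std(axis=1)

# representative points: small absolute mean relative difference and small spread
rank = np.argsort(np.abs(mean_rd) + std_rd)
print("most representative points:", rank[:3])
```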
Holden, Richard J.; Patel, Neal R.; Scanlon, Matthew C.; Shalaby, Theresa M.; Arnold, Judi M.; Karsh, Ben-Tzion
2009-01-01
Background Pharmacy workload is a modifiable work system factor believed to affect both medication safety outcomes and employee outcomes such as job satisfaction. Objectives This study sought to measure the effect of workload on safety and employee outcomes in two pediatric hospitals and to do so using a novel approach to pharmacy workload measurement. Methods Rather than measuring prescription volume or other similar indicators, this study measured the type and intensity of mental demands experienced during the medication dispensing tasks. The effects of external (interruptions, divided attention, rushing) and internal (concentration, effort) task demands on perceived medication error likelihood, adverse drug event likelihood, job dissatisfaction, and burnout were statistically estimated using multiple linear and logistic regression. Results Pharmacists and pharmacy technicians reported high levels of external and internal mental demands during dispensing. The study supported the hypothesis that external demands (interruptions, divided attention, rushing) negatively impacted medication safety and employee well being outcomes. However, as hypothesized, increasing levels of internal demands (concentration and effort) were not associated with greater perceived likelihood of error, adverse drug events, or burnout, and even had a positive effect on job satisfaction. Conclusion Replicating a prior study in nursing, this study shows that new conceptualizations and measures of workload can generate important new findings about both detrimental and beneficial effects of workload on patient safety and employee well being. This study discusses what those findings imply for policy, management, and design concerning automation, cognition, and staffing. PMID:21111387
Holden, Richard J; Patel, Neal R; Scanlon, Matthew C; Shalaby, Theresa M; Arnold, Judi M; Karsh, Ben-Tzion
2010-12-01
Pharmacy workload is a modifiable work system factor believed to affect both medication safety outcomes and employee outcomes, such as job satisfaction. This study sought to measure the effect of workload on safety and employee outcomes in 2 pediatric hospitals and to do so using a novel approach to pharmacy workload measurement. Rather than measuring prescription volume or other similar indicators, this study measured the type and intensity of mental demands experienced during the medication dispensing tasks. The effects of external (interruptions, divided attention, and rushing) and internal (concentration and effort) task demands on perceived medication error likelihood, adverse drug event likelihood, job dissatisfaction, and burnout were statistically estimated using multiple linear and logistic regression. Pharmacists and pharmacy technicians reported high levels of external and internal mental demands during dispensing. The study supported the hypothesis that external demands (interruptions, divided attention, and rushing) negatively impacted medication safety and employee well-being outcomes. However, as hypothesized, increasing levels of internal demands (concentration and effort) were not associated with greater perceived likelihood of error, adverse drug events, or burnout and even had a positive effect on job satisfaction. Replicating a prior study in nursing, this study shows that new conceptualizations and measures of workload can generate important new findings about both detrimental and beneficial effects of workload on patient safety and employee well-being. This study discusses what those findings imply for policy, management, and design concerning automation, cognition, and staffing. Copyright © 2010 Elsevier Inc. All rights reserved.
A regularization corrected score method for nonlinear regression models with covariate error.
Zucker, David M; Gorfine, Malka; Li, Yi; Tadesse, Mahlet G; Spiegelman, Donna
2013-03-01
Many regression analyses involve explanatory variables that are measured with error, and failing to account for this error is well known to lead to biased point and interval estimates of the regression coefficients. We present here a new general method for adjusting for covariate error. Our method consists of an approximate version of the Stefanski-Nakamura corrected score approach, using the method of regularization to obtain an approximate solution of the relevant integral equation. We develop the theory in the setting of classical likelihood models; this setting covers, for example, linear regression, nonlinear regression, logistic regression, and Poisson regression. The method is extremely general in terms of the types of measurement error models covered, and is a functional method in the sense of not involving assumptions on the distribution of the true covariate. We discuss the theoretical properties of the method and present simulation results in the logistic regression setting (univariate and multivariate). For illustration, we apply the method to data from the Harvard Nurses' Health Study concerning the relationship between physical activity and breast cancer mortality in the period following a diagnosis of breast cancer. Copyright © 2013, The International Biometric Society.
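As a minimal illustration of the bias the authors set out to correct (not their regularization-based corrected score method), the following Python sketch simulates classical measurement error in a covariate of a linear regression; the naive slope is attenuated by the reliability ratio var(X)/[var(X)+var(U)]. All numbers are hypothetical.

```python
import numpy as np

# Illustrative only: simulate classical measurement error in a covariate and
# show the attenuation (bias toward zero) of the naive slope estimate.
rng = np.random.default_rng(0)
n = 20000
x_true = rng.normal(0.0, 1.0, n)           # true covariate
u = rng.normal(0.0, 0.7, n)                # measurement error (variance 0.49)
x_obs = x_true + u                         # error-prone covariate
y = 1.0 + 2.0 * x_true + rng.normal(0.0, 1.0, n)

naive_slope = np.polyfit(x_obs, y, 1)[0]
reliability = 1.0 / (1.0 + 0.49)           # var(x) / [var(x) + var(u)] for this setup
print(naive_slope, 2.0 * reliability)      # naive slope ~ true slope times reliability
```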
NASA Technical Reports Server (NTRS)
Murphy, Patrick Charles
1985-01-01
An algorithm for maximum likelihood (ML) estimation is developed with an efficient method for approximating the sensitivities. The algorithm was developed for airplane parameter estimation problems but is well suited for most nonlinear, multivariable, dynamic systems. The ML algorithm relies on a new optimization method referred to as a modified Newton-Raphson with estimated sensitivities (MNRES). MNRES determines sensitivities by using slope information from local surface approximations of each output variable in parameter space. The fitted surface allows sensitivity information to be updated at each iteration with a significant reduction in computational effort. MNRES determines the sensitivities with less computational effort than using either a finite-difference method or integrating the analytically determined sensitivity equations. MNRES eliminates the need to derive sensitivity equations for each new model, thus eliminating algorithm reformulation with each new model and providing flexibility to use model equations in any format that is convenient. A random search technique for determining the confidence limits of ML parameter estimates is applied to nonlinear estimation problems for airplanes. The confidence intervals obtained by the search are compared with Cramer-Rao (CR) bounds at the same confidence level. It is observed that the degree of nonlinearity in the estimation problem is an important factor in the relationship between CR bounds and the error bounds determined by the search technique. The CR bounds were found to be close to the bounds determined by the search when the degree of nonlinearity was small. Beale's measure of nonlinearity is developed in this study for airplane identification problems; it is used to empirically correct confidence levels for the parameter confidence limits. The primary utility of the measure, however, was found to be in predicting the degree of agreement between Cramer-Rao bounds and search estimates.
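The sketch below shows the baseline approach that MNRES is designed to improve upon: output-error maximum likelihood estimation (Gaussian noise) with a modified Newton-Raphson step whose sensitivities come from finite differences. The surface-fitting update that defines MNRES is not reproduced here, and the model and data are hypothetical.

```python
import numpy as np

def finite_diff_sensitivities(model, theta, t, eps=1e-6):
    """Approximate d(output)/d(theta) by forward differences (the costly step MNRES avoids)."""
    y0 = model(theta, t)
    S = np.empty((y0.size, theta.size))
    for j in range(theta.size):
        tp = theta.copy()
        tp[j] += eps
        S[:, j] = (model(tp, t) - y0) / eps
    return y0, S

def gauss_newton(model, theta, t, y_meas, n_iter=20):
    """Output-error ML (Gaussian noise) via a modified Newton-Raphson step."""
    for _ in range(n_iter):
        y_hat, S = finite_diff_sensitivities(model, theta, t)
        r = y_meas - y_hat
        theta = theta + np.linalg.solve(S.T @ S, S.T @ r)
    return theta

# toy example: exponential decay y = a * exp(-b * t)
model = lambda th, t: th[0] * np.exp(-th[1] * t)
t = np.linspace(0, 5, 50)
rng = np.random.default_rng(1)
y = model(np.array([2.0, 0.7]), t) + 0.01 * rng.normal(size=t.size)
print(gauss_newton(model, np.array([1.0, 1.0]), t, y))
```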
A comparison of three approaches to non-stationary flood frequency analysis
NASA Astrophysics Data System (ADS)
Debele, S. E.; Strupczewski, W. G.; Bogdanowicz, E.
2017-08-01
Non-stationary flood frequency analysis (FFA) is applied to statistical analysis of seasonal flow maxima from Polish and Norwegian catchments. Three non-stationary estimation methods, namely, maximum likelihood (ML), two stage (WLS/TS) and GAMLSS (generalized additive model for location, scale and shape parameters), are compared in the context of capturing the effect of non-stationarity on the estimation of time-dependent moments and design quantiles. A multi-model approach is recommended to reduce quantile errors arising from model misspecification. The results of calculations based on observed seasonal daily flow maxima and computer simulation experiments showed that GAMLSS gave the best results with respect to the relative bias and root mean square error in the estimates of trend in the standard deviation and the constant shape parameter, while WLS/TS provided better accuracy in the estimates of trend in the mean value. Among the three compared methods, WLS/TS is recommended for dealing with non-stationarity in short time series. Some practical aspects of applying the GAMLSS package are also presented. A detailed discussion of general issues related to the consequences of climate change for FFA is presented in the second part of the article, entitled "Around and about an application of the GAMLSS package in non-stationary flood frequency analysis".
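A minimal, hypothetical sketch of the kind of non-stationary ML fit compared in the paper: a Gumbel model whose location parameter carries a linear time trend, fitted by direct minimization of the negative log-likelihood. This is not the WLS/TS or GAMLSS implementation, and all values are synthetic.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import gumbel_r

# Hypothetical sketch: ML fit of a Gumbel model with a linear trend in the
# location parameter, mu(t) = b0 + b1*t, and constant scale.
rng = np.random.default_rng(2)
t = np.arange(50)                       # years
x = gumbel_r.rvs(loc=100 + 0.8 * t, scale=20, size=t.size, random_state=rng)

def nll(params):
    b0, b1, log_scale = params
    scale = np.exp(log_scale)           # keep scale positive
    z = (x - (b0 + b1 * t)) / scale
    return np.sum(np.log(scale) + z + np.exp(-z))

fit = minimize(nll, x0=[np.mean(x), 0.0, np.log(np.std(x))], method="Nelder-Mead")
b0, b1, log_scale = fit.x
# time-dependent design quantile, e.g. the 100-year flood in year t
q100 = (b0 + b1 * t) - np.exp(log_scale) * np.log(-np.log(1 - 1.0 / 100))
print(b1, q100[-1])
```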
High-Order Model and Dynamic Filtering for Frame Rate Up-Conversion.
Bao, Wenbo; Zhang, Xiaoyun; Chen, Li; Ding, Lianghui; Gao, Zhiyong
2018-08-01
This paper proposes a novel frame rate up-conversion method through high-order model and dynamic filtering (HOMDF) for video pixels. Unlike the constant brightness and linear motion assumptions in traditional methods, the intensity and position of the video pixels are both modeled with high-order polynomials in terms of time. The key problem of our method is then to estimate the polynomial coefficients that represent the pixel's intensity variation, velocity, and acceleration. We propose to solve it with two energy objectives: one minimizes the auto-regressive prediction error of intensity variation by its past samples, and the other minimizes the video frame's reconstruction error along the motion trajectory. To efficiently address the optimization problem for these coefficients, we propose a dynamic filtering solution inspired by video's temporal coherence. The optimal estimation of these coefficients is reformulated into a dynamic fusion of the prior estimate from the pixel's temporal predecessor and the maximum likelihood estimate from the current new observation. Finally, frame rate up-conversion is implemented using motion-compensated interpolation by pixel-wise intensity variation and motion trajectory. Benefiting from the advanced model and dynamic filtering, the interpolated frame has much better visual quality. Extensive experiments on natural and synthesized videos demonstrate the superiority of HOMDF over the state-of-the-art methods in both subjective and objective comparisons.
Error-Rate Bounds for Coded PPM on a Poisson Channel
NASA Technical Reports Server (NTRS)
Moision, Bruce; Hamkins, Jon
2009-01-01
Equations for computing tight bounds on error rates for coded pulse-position modulation (PPM) on a Poisson channel at high signal-to-noise ratio have been derived. These equations and elements of the underlying theory are expected to be especially useful in designing codes for PPM optical communication systems. The equations and the underlying theory apply, more specifically, to a case in which a) At the transmitter, a linear outer code is concatenated with an inner code that includes an accumulator and a bit-to-PPM-symbol mapping (see figure) [this concatenation is known in the art as "accumulate-PPM" (abbreviated "APPM")]; b) The transmitted signal propagates on a memoryless binary-input Poisson channel; and c) At the receiver, near-maximum-likelihood (ML) decoding is effected through an iterative process. Such a coding/modulation/decoding scheme is a variation on the concept of turbo codes, which have complex structures, such that an exact analytical expression for the performance of a particular code is intractable. However, techniques for accurately estimating the performances of turbo codes have been developed. The performance of a typical turbo code includes (1) a "waterfall" region consisting of a steep decrease of error rate with increasing signal-to-noise ratio (SNR) at low to moderate SNR, and (2) an "error floor" region with a less steep decrease of error rate with increasing SNR at moderate to high SNR. The techniques used heretofore for estimating performance in the waterfall region have differed from those used for estimating performance in the error-floor region. For coded PPM, prior to the present derivations, equations for accurate prediction of the performance of coded PPM at high SNR did not exist, so that it was necessary to resort to time-consuming simulations in order to make such predictions. The present derivation makes it unnecessary to perform such time-consuming simulations.
Doi, Suhail A R; Barendregt, Jan J; Khan, Shahjahan; Thalib, Lukman; Williams, Gail M
2015-11-01
This article examines an improved alternative to the random effects (RE) model for meta-analysis of heterogeneous studies. It is shown that the known issues of underestimation of the statistical error and spuriously overconfident estimates with the RE model can be resolved by the use of an estimator under the fixed effect model assumption with a quasi-likelihood based variance structure - the IVhet model. Extensive simulations confirm that this estimator retains a correct coverage probability and a lower observed variance than the RE model estimator, regardless of heterogeneity. When the proposed IVhet method is applied to the controversial meta-analysis of intravenous magnesium for the prevention of mortality after myocardial infarction, the pooled OR is 1.01 (95% CI 0.71-1.46) which not only favors the larger studies but also indicates more uncertainty around the point estimate. In comparison, under the RE model the pooled OR is 0.71 (95% CI 0.57-0.89) which, given the simulation results, reflects underestimation of the statistical error. Given the compelling evidence generated, we recommend that the IVhet model replace both the FE and RE models. To facilitate this, it has been implemented into free meta-analysis software called MetaXL which can be downloaded from www.epigear.com. Copyright © 2015 Elsevier Inc. All rights reserved.
Moderation analysis using a two-level regression model.
Yuan, Ke-Hai; Cheng, Ying; Maxwell, Scott
2014-10-01
Moderation analysis is widely used in social and behavioral research. The most commonly used model for moderation analysis is moderated multiple regression (MMR) in which the explanatory variables of the regression model include product terms, and the model is typically estimated by least squares (LS). This paper argues for a two-level regression model in which the regression coefficients of a criterion variable on predictors are further regressed on moderator variables. An algorithm for estimating the parameters of the two-level model by normal-distribution-based maximum likelihood (NML) is developed. Formulas for the standard errors (SEs) of the parameter estimates are provided and studied. Results indicate that, when heteroscedasticity exists, NML with the two-level model gives more efficient and more accurate parameter estimates than the LS analysis of the MMR model. When error variances are homoscedastic, NML with the two-level model leads to essentially the same results as LS with the MMR model. Most importantly, the two-level regression model permits estimating the percentage of variance of each regression coefficient that is due to moderator variables. When applied to data from General Social Surveys 1991, NML with the two-level model identified a significant moderation effect of race on the regression of job prestige on years of education while LS with the MMR model did not. An R package is also developed and documented to facilitate the application of the two-level model.
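For reference, here is a minimal sketch of the conventional MMR fit that the paper compares against: least-squares regression on the predictor, the moderator, and their product term (the two-level NML estimator itself is not shown). The data are simulated with heteroscedastic errors to mirror the situation where NML is claimed to be more efficient.

```python
import numpy as np

# Minimal sketch of moderated multiple regression (MMR) by least squares:
# y regressed on x, the moderator z, and their product term x*z.
rng = np.random.default_rng(3)
n = 500
x = rng.normal(size=n)
z = rng.normal(size=n)
y = 1.0 + 0.5 * x + 0.3 * z + 0.4 * x * z + rng.normal(scale=1.0 + 0.5 * np.abs(z), size=n)

X = np.column_stack([np.ones(n), x, z, x * z])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta)   # LS estimates; the paper argues NML under a two-level model is
              # more efficient than this when the errors are heteroscedastic
```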
From least squares to multilevel modeling: A graphical introduction to Bayesian inference
NASA Astrophysics Data System (ADS)
Loredo, Thomas J.
2016-01-01
This tutorial presentation will introduce some of the key ideas and techniques involved in applying Bayesian methods to problems in astrostatistics. The focus will be on the big picture: understanding the foundations (interpreting probability, Bayes's theorem, the law of total probability and marginalization), making connections to traditional methods (propagation of errors, least squares, chi-squared, maximum likelihood, Monte Carlo simulation), and highlighting problems where a Bayesian approach can be particularly powerful (Poisson processes, density estimation and curve fitting with measurement error). The "graphical" component of the title reflects an emphasis on pictorial representations of some of the math, but also on the use of graphical models (multilevel or hierarchical models) for analyzing complex data. Code for some examples from the talk will be available to participants, in Python and in the Stan probabilistic programming language.
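In the spirit of the tutorial (though not taken from it), here is a short grid-based example of Bayes' theorem and marginalization for a straight-line fit with Gaussian measurement errors; all quantities are toy values.

```python
import numpy as np

# Toy illustration: Bayes' theorem on a grid for a straight-line fit
# y = a + b*x with Gaussian measurement errors, then marginalization over a.
rng = np.random.default_rng(4)
x = np.linspace(0, 10, 15)
sigma = 0.5
y = 1.0 + 0.8 * x + rng.normal(0, sigma, x.size)

a_grid = np.linspace(-1, 3, 201)
b_grid = np.linspace(0.2, 1.4, 201)
A, B = np.meshgrid(a_grid, b_grid, indexing="ij")
# log-likelihood = -0.5 * chi^2 ; flat priors over the grid
chi2 = np.sum((y - (A[..., None] + B[..., None] * x)) ** 2, axis=-1) / sigma**2
post = np.exp(-0.5 * (chi2 - chi2.min()))
post /= post.sum()
marg_b = post.sum(axis=0)               # marginalize over the intercept a
print(b_grid[np.argmax(marg_b)])        # posterior mode for the slope
```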
A comparative study of optimum and suboptimum direct-detection laser ranging receivers
NASA Technical Reports Server (NTRS)
Abshire, J. B.
1978-01-01
A summary of previously proposed receiver strategies for direct-detection laser ranging receivers is presented. Computer simulations are used to compare performance of candidate implementation strategies in the 1- to 100-photoelectron region. Under the condition of no background radiation, the maximum-likelihood and minimum mean-square error estimators were found to give the same performance for both bell-shaped and rectangular optical-pulse shapes. For signal energies greater than 100 photoelectrons, the root-mean-square range error is shown to decrease as Q to the -1/2 power for bell-shaped pulses and Q to the -1 power for rectangular pulses, where Q represents the average pulse energy. Of several receiver implementations presented, the matched-filter peak detector was found to be preferable. A similar configuration, using a constant-fraction discriminator, exhibited a signal-level dependent time bias.
TDRS orbit determination by radio interferometry
NASA Technical Reports Server (NTRS)
Pavloff, Michael S.
1994-01-01
In support of a NASA study on the application of radio interferometry to satellite orbit determination, MITRE developed a simulation tool for assessing interferometry tracking accuracy. The Orbit Determination Accuracy Estimator (ODAE) models the general batch maximum likelihood orbit determination algorithms of the Goddard Trajectory Determination System (GTDS) with the group and phase delay measurements from radio interferometry. ODAE models the statistical properties of tracking error sources, including inherent observable imprecision, atmospheric delays, clock offsets, station location uncertainty, and measurement biases, and through Monte Carlo simulation, ODAE calculates the statistical properties of errors in the predicted satellites state vector. This paper presents results from ODAE application to orbit determination of the Tracking and Data Relay Satellite (TDRS) by radio interferometry. Conclusions about optimal ground station locations for interferometric tracking of TDRS are presented, along with a discussion of operational advantages of radio interferometry.
NASA Astrophysics Data System (ADS)
Lin, Tsungpo
Performance engineers face a major challenge in modeling and simulating after-market power systems due to system degradation and measurement errors. Currently, most of the power generation industry utilizes deterministic data matching to calibrate the model and cascade system degradation, which causes significant calibration uncertainty and increases the risk associated with providing performance guarantees. In this research work, a maximum-likelihood based simultaneous data reconciliation and model calibration (SDRMC) is used for power system modeling and simulation. By replacing the current deterministic data matching with SDRMC, one can reduce the calibration uncertainty and mitigate the error propagation to the performance simulation. A modeling and simulation environment for a complex power system with certain degradation has been developed. In this environment, multiple data sets are imported when carrying out simultaneous data reconciliation and model calibration. Calibration uncertainties are estimated through error analyses and populated to the performance simulation by using the principle of error propagation. System degradation is then quantified by performance comparison between the calibrated model and its expected new & clean status. To mitigate smearing effects caused by gross errors, gross error detection (GED) is carried out in two stages. The first stage is a screening stage, in which serious gross errors are eliminated in advance. The GED techniques used in the screening stage are based on multivariate data analysis (MDA), including multivariate data visualization and principal component analysis (PCA). Subtle gross errors are treated at the second stage, in which serial bias compensation or a robust M-estimator is engaged. To achieve better efficiency in the combined scheme of least-squares based data reconciliation and the GED technique based on hypothesis testing, the Levenberg-Marquardt (LM) algorithm is utilized as the optimizer. To reduce the computation time and stabilize the problem solving for a complex power system such as a combined cycle power plant, meta-modeling using the response surface equation (RSE) and system/process decomposition are incorporated with the simultaneous scheme of SDRMC. The goal of this research work is to reduce the calibration uncertainties and, thus, the risks of providing performance guarantees arising from uncertainties in performance simulation.
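A highly simplified, hypothetical sketch of the reconciliation idea: adjust redundant measurements, weighted by their uncertainties, so that a process model (here a single mass balance) is satisfied, using the Levenberg-Marquardt optimizer mentioned above. The real SDRMC scheme additionally calibrates model parameters, detects gross errors, and uses RSE meta-models; none of that is reproduced here.

```python
import numpy as np
from scipy.optimize import least_squares

# Simplified, hypothetical sketch of maximum-likelihood data reconciliation:
# adjust redundant flow measurements (weighted by their standard deviations)
# so that they satisfy a process model, here a single mass balance f1 + f2 = f3.
meas = np.array([10.2, 5.1, 14.7])      # measured flows
sigma = np.array([0.2, 0.1, 0.3])       # measurement standard deviations

def residuals(flows):
    recon = (flows - meas) / sigma       # weighted adjustments (ML under Gaussian errors)
    balance = np.array([flows[0] + flows[1] - flows[2]]) / 1e-3  # constraint as a stiff penalty
    return np.concatenate([recon, balance])

sol = least_squares(residuals, meas, method="lm")  # Levenberg-Marquardt optimizer
print(sol.x, sol.x[0] + sol.x[1] - sol.x[2])       # reconciled flows and residual imbalance
```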
Jeon, Jihyoun; Hsu, Li; Gorfine, Malka
2012-07-01
Frailty models are useful for measuring unobserved heterogeneity in risk of failures across clusters, providing cluster-specific risk prediction. In a frailty model, the latent frailties shared by members within a cluster are assumed to act multiplicatively on the hazard function. In order to obtain parameter and frailty variate estimates, we consider the hierarchical likelihood (H-likelihood) approach (Ha, Lee and Song, 2001. Hierarchical-likelihood approach for frailty models. Biometrika 88, 233-243) in which the latent frailties are treated as "parameters" and estimated jointly with other parameters of interest. We find that the H-likelihood estimators perform well when the censoring rate is low, however, they are substantially biased when the censoring rate is moderate to high. In this paper, we propose a simple and easy-to-implement bias correction method for the H-likelihood estimators under a shared frailty model. We also extend the method to a multivariate frailty model, which incorporates complex dependence structure within clusters. We conduct an extensive simulation study and show that the proposed approach performs very well for censoring rates as high as 80%. We also illustrate the method with a breast cancer data set. Since the H-likelihood is the same as the penalized likelihood function, the proposed bias correction method is also applicable to the penalized likelihood estimators.
Grinband, Jack; Savitskaya, Judith; Wager, Tor D; Teichert, Tobias; Ferrera, Vincent P; Hirsch, Joy
2011-07-15
The dorsal medial frontal cortex (dMFC) is highly active during choice behavior. Though many models have been proposed to explain dMFC function, the conflict monitoring model is the most influential. It posits that dMFC is primarily involved in detecting interference between competing responses thus signaling the need for control. It accurately predicts increased neural activity and response time (RT) for incompatible (high-interference) vs. compatible (low-interference) decisions. However, it has been shown that neural activity can increase with time on task, even when no decisions are made. Thus, the greater dMFC activity on incompatible trials may stem from longer RTs rather than response conflict. This study shows that (1) the conflict monitoring model fails to predict the relationship between error likelihood and RT, and (2) the dMFC activity is not sensitive to congruency, error likelihood, or response conflict, but is monotonically related to time on task. Copyright © 2010 Elsevier Inc. All rights reserved.
Simulating the effect of non-linear mode coupling in cosmological parameter estimation
NASA Astrophysics Data System (ADS)
Kiessling, A.; Taylor, A. N.; Heavens, A. F.
2011-09-01
Fisher Information Matrix methods are commonly used in cosmology to estimate the accuracy that cosmological parameters can be measured with a given experiment and to optimize the design of experiments. However, the standard approach usually assumes both data and parameter estimates are Gaussian-distributed. Further, for survey forecasts and optimization it is usually assumed that the power-spectrum covariance matrix is diagonal in Fourier space. However, in the low-redshift Universe, non-linear mode coupling will tend to correlate small-scale power, moving information from lower to higher order moments of the field. This movement of information will change the predictions of cosmological parameter accuracy. In this paper we quantify this loss of information by comparing naïve Gaussian Fisher matrix forecasts with a maximum likelihood parameter estimation analysis of a suite of mock weak lensing catalogues derived from N-body simulations, based on the SUNGLASS pipeline, for a 2D and tomographic shear analysis of a Euclid-like survey. In both cases, we find that the 68 per cent confidence area of the Ωm-σ8 plane increases by a factor of 5. However, the marginal errors increase by just 20-40 per cent. We propose a new method to model the effects of non-linear shear-power mode coupling in the Fisher matrix by approximating the shear-power distribution as a multivariate Gaussian with a covariance matrix derived from the mock weak lensing survey. We find that this approximation can reproduce the 68 per cent confidence regions of the full maximum likelihood analysis in the Ωm-σ8 plane to high accuracy for both 2D and tomographic weak lensing surveys. Finally, we perform a multiparameter analysis of Ωm, σ8, h, ns, w0 and wa to compare the Gaussian and non-linear mode-coupled Fisher matrix contours. The 6D volume of the 1σ error contours for the non-linear Fisher analysis is a factor of 3 larger than for the Gaussian case, and the shape of the 68 per cent confidence volume is modified. We propose that future Fisher matrix estimates of cosmological parameter accuracies should include mode-coupling effects.
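A toy version of the "naive Gaussian" Fisher forecast that the paper tests against mock catalogues, assuming a diagonal data covariance and a hypothetical two-parameter power spectrum model (this is not the lensing pipeline used in the paper):

```python
import numpy as np

# Sketch of a naive Gaussian Fisher forecast for a toy power spectrum model
# C(l) = A * l**(-n), assuming a diagonal data covariance (uncorrelated modes).
ell = np.arange(10, 1000)
A_fid, n_fid = 1.0, 1.5
C_fid = A_fid * ell ** (-n_fid)
var = 2.0 * C_fid**2 / (2 * ell + 1.0)          # Gaussian sample variance per mode

def dC(theta_index, eps=1e-5):
    # finite-difference derivative of the model with respect to A or n
    p = np.array([A_fid, n_fid], dtype=float)
    p[theta_index] += eps
    return (p[0] * ell ** (-p[1]) - C_fid) / eps

grads = np.array([dC(0), dC(1)])                 # shape (2, n_ell)
F = (grads / var) @ grads.T                      # Fisher matrix
cov = np.linalg.inv(F)                           # forecast parameter covariance
print(np.sqrt(np.diag(cov)))                     # marginal 1-sigma errors on A and n
```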
A fast least-squares algorithm for population inference
2013-01-01
Background Population inference is an important problem in genetics used to remove population stratification in genome-wide association studies and to detect migration patterns or shared ancestry. An individual’s genotype can be modeled as a probabilistic function of ancestral population memberships, Q, and the allele frequencies in those populations, P. The parameters, P and Q, of this binomial likelihood model can be inferred using slow sampling methods such as Markov Chain Monte Carlo methods or faster gradient based approaches such as sequential quadratic programming. This paper proposes a least-squares simplification of the binomial likelihood model motivated by a Euclidean interpretation of the genotype feature space. This results in a faster algorithm that easily incorporates the degree of admixture within the sample of individuals and improves estimates without requiring trial-and-error tuning. Results We show that the expected value of the least-squares solution across all possible genotype datasets is equal to the true solution when part of the problem has been solved, and that the variance of the solution approaches zero as its size increases. The Least-squares algorithm performs nearly as well as Admixture for these theoretical scenarios. We compare least-squares, Admixture, and FRAPPE for a variety of problem sizes and difficulties. For particularly hard problems with a large number of populations, small number of samples, or greater degree of admixture, least-squares performs better than the other methods. On simulated mixtures of real population allele frequencies from the HapMap project, Admixture estimates sparsely mixed individuals better than Least-squares. The least-squares approach, however, performs within 1.5% of the Admixture error. On individual genotypes from the HapMap project, Admixture and least-squares perform qualitatively similarly and within 1.2% of each other. Significantly, the least-squares approach nearly always converges 1.5- to 6-times faster. Conclusions The computational advantage of the least-squares approach along with its good estimation performance warrants further research, especially for very large datasets. As problem sizes increase, the difference in estimation performance between all algorithms decreases. In addition, when prior information is known, the least-squares approach easily incorporates the expected degree of admixture to improve the estimate. PMID:23343408
A fast least-squares algorithm for population inference.
Parry, R Mitchell; Wang, May D
2013-01-23
Population inference is an important problem in genetics used to remove population stratification in genome-wide association studies and to detect migration patterns or shared ancestry. An individual's genotype can be modeled as a probabilistic function of ancestral population memberships, Q, and the allele frequencies in those populations, P. The parameters, P and Q, of this binomial likelihood model can be inferred using slow sampling methods such as Markov Chain Monte Carlo methods or faster gradient based approaches such as sequential quadratic programming. This paper proposes a least-squares simplification of the binomial likelihood model motivated by a Euclidean interpretation of the genotype feature space. This results in a faster algorithm that easily incorporates the degree of admixture within the sample of individuals and improves estimates without requiring trial-and-error tuning. We show that the expected value of the least-squares solution across all possible genotype datasets is equal to the true solution when part of the problem has been solved, and that the variance of the solution approaches zero as its size increases. The Least-squares algorithm performs nearly as well as Admixture for these theoretical scenarios. We compare least-squares, Admixture, and FRAPPE for a variety of problem sizes and difficulties. For particularly hard problems with a large number of populations, small number of samples, or greater degree of admixture, least-squares performs better than the other methods. On simulated mixtures of real population allele frequencies from the HapMap project, Admixture estimates sparsely mixed individuals better than Least-squares. The least-squares approach, however, performs within 1.5% of the Admixture error. On individual genotypes from the HapMap project, Admixture and least-squares perform qualitatively similarly and within 1.2% of each other. Significantly, the least-squares approach nearly always converges 1.5- to 6-times faster. The computational advantage of the least-squares approach along with its good estimation performance warrants further research, especially for very large datasets. As problem sizes increase, the difference in estimation performance between all algorithms decreases. In addition, when prior information is known, the least-squares approach easily incorporates the expected degree of admixture to improve the estimate.
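A deliberately simplified alternating least-squares sketch of the factorization idea (the paper's algorithm, its admixture prior, and its convergence safeguards are not reproduced): genotypes G are approximated by 2·Q·P, with the rows of Q kept on the simplex and P clipped to [0, 1].

```python
import numpy as np

# Simplified alternating least squares: approximate a genotype matrix G
# (n x m, entries 0/1/2) by 2 * Q @ P, where rows of Q are ancestry proportions
# and P holds population allele frequencies in [0, 1].
rng = np.random.default_rng(5)
n, m, k = 100, 500, 3
Q_true = rng.dirichlet(np.ones(k), size=n)
P_true = rng.uniform(0.05, 0.95, size=(k, m))
G = rng.binomial(2, Q_true @ P_true)

Q = rng.dirichlet(np.ones(k), size=n)
P = rng.uniform(size=(k, m))
for _ in range(50):
    P = np.clip(np.linalg.lstsq(2 * Q, G, rcond=None)[0], 1e-4, 1 - 1e-4)
    Q = np.linalg.lstsq(2 * P.T, G.T, rcond=None)[0].T
    Q = np.clip(Q, 1e-4, None)
    Q /= Q.sum(axis=1, keepdims=True)     # project rows back onto the simplex

print(np.sqrt(np.mean((2 * Q @ P - G) ** 2)))  # fit error (components may be permuted)
```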
Maximum entropy approach to statistical inference for an ocean acoustic waveguide.
Knobles, D P; Sagers, J D; Koch, R A
2012-02-01
A conditional probability distribution suitable for estimating the statistical properties of ocean seabed parameter values inferred from acoustic measurements is derived from a maximum entropy principle. The specification of the expectation value for an error function constrains the maximization of an entropy functional. This constraint determines the sensitivity factor (β) to the error function of the resulting probability distribution, which is a canonical form that provides a conservative estimate of the uncertainty of the parameter values. From the conditional distribution, marginal distributions for individual parameters can be determined from integration over the other parameters. The approach is an alternative to obtaining the posterior probability distribution without an intermediary determination of the likelihood function followed by an application of Bayes' rule. In this paper the expectation value that specifies the constraint is determined from the values of the error function for the model solutions obtained from a sparse number of data samples. The method is applied to ocean acoustic measurements taken on the New Jersey continental shelf. The marginal probability distribution for the values of the sound speed ratio at the surface of the seabed and the source levels of a towed source are examined for different geoacoustic model representations. © 2012 Acoustical Society of America
NASA Astrophysics Data System (ADS)
Hui, Z.; Cheng, P.; Ziggah, Y. Y.; Nie, Y.
2018-04-01
Filtering is a key step for most applications of airborne LiDAR point clouds. Although many filtering algorithms have been put forward in recent years, most of them require parameter setting or threshold adjustment, which is time-consuming and reduces the degree of automation of the algorithm. To overcome this problem, this paper proposes a threshold-free filtering algorithm based on expectation-maximization (EM). The algorithm assumes that the point cloud can be modeled as a mixture of Gaussian distributions, so that separating ground points from non-ground points becomes a matter of separating the components of a Gaussian mixture model. EM is used to calculate maximum likelihood estimates of the mixture parameters; using these estimates, the likelihood of each point belonging to ground or object can be computed, and after several iterations each point is labelled with the component of larger likelihood. Furthermore, intensity information is utilized to refine the filtering results obtained with the EM method. The proposed algorithm was tested on two different datasets used in practice, and the experimental results showed that it can filter non-ground points effectively. For quantitative evaluation, this paper adopted the dataset provided by the ISPRS; the proposed algorithm obtains a 4.48% total error, which is much lower than most of the eight classical filtering algorithms reported by the ISPRS.
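A one-dimensional toy version of the core idea: treat point elevations as a two-component Gaussian mixture and separate ground from non-ground with EM, without any height threshold. The real method works on the full point cloud and also uses intensity; the data here are synthetic.

```python
import numpy as np

# Minimal sketch: model point elevations as a two-component Gaussian mixture
# (ground vs. non-ground) and separate them with EM, with no height threshold.
rng = np.random.default_rng(6)
z = np.concatenate([rng.normal(2.0, 0.3, 4000),      # ground returns
                    rng.normal(8.0, 2.0, 1000)])     # vegetation/buildings

w = np.array([0.5, 0.5]); mu = np.array([z.min(), z.max()]); sd = np.array([1.0, 1.0])
for _ in range(100):
    # E-step: responsibilities (likelihood of each point under each component)
    pdf = w * np.exp(-0.5 * ((z[:, None] - mu) / sd) ** 2) / (sd * np.sqrt(2 * np.pi))
    r = pdf / pdf.sum(axis=1, keepdims=True)
    # M-step: update mixture weights, means and standard deviations
    nk = r.sum(axis=0)
    w, mu = nk / z.size, (r * z[:, None]).sum(axis=0) / nk
    sd = np.sqrt((r * (z[:, None] - mu) ** 2).sum(axis=0) / nk)

ground = r[:, np.argmin(mu)] > 0.5       # label each point by the larger responsibility
print(mu, sd, ground.sum())
```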
Likelihood ratio meta-analysis: New motivation and approach for an old method.
Dormuth, Colin R; Filion, Kristian B; Platt, Robert W
2016-03-01
A 95% confidence interval (CI) in an updated meta-analysis may not have the expected 95% coverage. If a meta-analysis is simply updated with additional data, then the resulting 95% CI will be wrong because it will not have accounted for the fact that the earlier meta-analysis failed or succeeded to exclude the null. This situation can be avoided by using the likelihood ratio (LR) as a measure of evidence that does not depend on type-1 error. We show how an LR-based approach, first advanced by Goodman, can be used in a meta-analysis to pool data from separate studies to quantitatively assess where the total evidence points. The method works by estimating the log-likelihood ratio (LogLR) function from each study. Those functions are then summed to obtain a combined function, which is then used to retrieve the total effect estimate, and a corresponding 'intrinsic' confidence interval. Using as illustrations the CAPRIE trial of clopidogrel versus aspirin in the prevention of ischemic events, and our own meta-analysis of higher potency statins and the risk of acute kidney injury, we show that the LR-based method yields the same point estimate as the traditional analysis, but with an intrinsic confidence interval that is appropriately wider than the traditional 95% CI. The LR-based method can be used to conduct both fixed effect and random effects meta-analyses, it can be applied to old and new meta-analyses alike, and results can be presented in a format that is familiar to a meta-analytic audience. Copyright © 2016 Elsevier Inc. All rights reserved.
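A hypothetical sketch of the combination step: approximate each study's log-likelihood for a common log odds ratio by a normal curve, sum the curves, and read off the maximum and a likelihood-based interval. The 1.92 cutoff below is an assumption standing in for the paper's exact definition of the intrinsic confidence interval, and the study inputs are invented.

```python
import numpy as np

# Sum per-study normal log-likelihoods for a common log OR, then find the MLE
# and a likelihood-based ("intrinsic") interval where the combined log-LR stays
# within 1.92 of its peak (cutoff assumed here for illustration).
est = np.array([-0.20, -0.45, 0.05])        # per-study log ORs (hypothetical)
se = np.array([0.15, 0.30, 0.20])           # per-study standard errors

theta = np.linspace(-1.0, 1.0, 2001)
loglik = -0.5 * ((theta[:, None] - est) / se) ** 2    # normal log-likelihood per study
combined = loglik.sum(axis=1)
combined -= combined.max()                  # log-likelihood ratio vs. the MLE
mle = theta[np.argmax(combined)]
interval = theta[combined > -1.92]
print(mle, interval.min(), interval.max())
```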
LaCroix, Arianna N; Diaz, Alvaro F; Rogalsky, Corianne
2015-01-01
The relationship between the neurobiology of speech and music has been investigated for more than a century. There remains no widespread agreement regarding how (or to what extent) music perception utilizes the neural circuitry that is engaged in speech processing, particularly at the cortical level. Prominent models such as Patel's Shared Syntactic Integration Resource Hypothesis (SSIRH) and Koelsch's neurocognitive model of music perception suggest a high degree of overlap, particularly in the frontal lobe, but also perhaps more distinct representations in the temporal lobe with hemispheric asymmetries. The present meta-analysis study used activation likelihood estimate analyses to identify the brain regions consistently activated for music as compared to speech across the functional neuroimaging (fMRI and PET) literature. Eighty music and 91 speech neuroimaging studies of healthy adult control subjects were analyzed. Peak activations reported in the music and speech studies were divided into four paradigm categories: passive listening, discrimination tasks, error/anomaly detection tasks and memory-related tasks. We then compared activation likelihood estimates within each category for music vs. speech, and each music condition with passive listening. We found that listening to music and to speech preferentially activate distinct temporo-parietal bilateral cortical networks. We also found music and speech to have shared resources in the left pars opercularis but speech-specific resources in the left pars triangularis. The extent to which music recruited speech-activated frontal resources was modulated by task. While there are certainly limitations to meta-analysis techniques particularly regarding sensitivity, this work suggests that the extent of shared resources between speech and music may be task-dependent and highlights the need to consider how task effects may be affecting conclusions regarding the neurobiology of speech and music.
Klim, Søren; Mortensen, Stig Bousgaard; Kristensen, Niels Rode; Overgaard, Rune Viig; Madsen, Henrik
2009-06-01
The extension from ordinary to stochastic differential equations (SDEs) in pharmacokinetic and pharmacodynamic (PK/PD) modelling is an emerging field and has been motivated in a number of articles [N.R. Kristensen, H. Madsen, S.H. Ingwersen, Using stochastic differential equations for PK/PD model development, J. Pharmacokinet. Pharmacodyn. 32 (February(1)) (2005) 109-141; C.W. Tornøe, R.V. Overgaard, H. Agersø, H.A. Nielsen, H. Madsen, E.N. Jonsson, Stochastic differential equations in NONMEM: implementation, application, and comparison with ordinary differential equations, Pharm. Res. 22 (August(8)) (2005) 1247-1258; R.V. Overgaard, N. Jonsson, C.W. Tornøe, H. Madsen, Non-linear mixed-effects models with stochastic differential equations: implementation of an estimation algorithm, J. Pharmacokinet. Pharmacodyn. 32 (February(1)) (2005) 85-107; U. Picchini, S. Ditlevsen, A. De Gaetano, Maximum likelihood estimation of a time-inhomogeneous stochastic differential model of glucose dynamics, Math. Med. Biol. 25 (June(2)) (2008) 141-155]. PK/PD models are traditionally based on ordinary differential equations (ODEs) with an observation link that incorporates noise. This state-space formulation only allows for observation noise and not for system noise. Extending to SDEs allows for a Wiener noise component in the system equations. This additional noise component enables handling of autocorrelated residuals originating from natural variation or systematic model error. Autocorrelated residuals are often partly ignored in PK/PD modelling, although they violate the assumptions of many standard statistical tests. This article presents a package for the statistical program R that is able to handle SDEs in a mixed-effects setting. The estimation method implemented is the FOCE(1) approximation to the population likelihood, which is generated from the individual likelihoods that are approximated using the Extended Kalman Filter's one-step predictions.
LaCroix, Arianna N.; Diaz, Alvaro F.; Rogalsky, Corianne
2015-01-01
The relationship between the neurobiology of speech and music has been investigated for more than a century. There remains no widespread agreement regarding how (or to what extent) music perception utilizes the neural circuitry that is engaged in speech processing, particularly at the cortical level. Prominent models such as Patel's Shared Syntactic Integration Resource Hypothesis (SSIRH) and Koelsch's neurocognitive model of music perception suggest a high degree of overlap, particularly in the frontal lobe, but also perhaps more distinct representations in the temporal lobe with hemispheric asymmetries. The present meta-analysis study used activation likelihood estimate analyses to identify the brain regions consistently activated for music as compared to speech across the functional neuroimaging (fMRI and PET) literature. Eighty music and 91 speech neuroimaging studies of healthy adult control subjects were analyzed. Peak activations reported in the music and speech studies were divided into four paradigm categories: passive listening, discrimination tasks, error/anomaly detection tasks and memory-related tasks. We then compared activation likelihood estimates within each category for music vs. speech, and each music condition with passive listening. We found that listening to music and to speech preferentially activate distinct temporo-parietal bilateral cortical networks. We also found music and speech to have shared resources in the left pars opercularis but speech-specific resources in the left pars triangularis. The extent to which music recruited speech-activated frontal resources was modulated by task. While there are certainly limitations to meta-analysis techniques particularly regarding sensitivity, this work suggests that the extent of shared resources between speech and music may be task-dependent and highlights the need to consider how task effects may be affecting conclusions regarding the neurobiology of speech and music. PMID:26321976
Kim, Kyungsoo; Lim, Sung-Ho; Lee, Jaeseok; Kang, Won-Seok; Moon, Cheil; Choi, Ji-Woong
2016-01-01
Electroencephalograms (EEGs) measure a brain signal that contains abundant information about the human brain function and health. For this reason, recent clinical brain research and brain computer interface (BCI) studies use EEG signals in many applications. Due to the significant noise in EEG traces, signal processing to enhance the signal to noise power ratio (SNR) is necessary for EEG analysis, especially for non-invasive EEG. A typical method to improve the SNR is averaging many trials of event related potential (ERP) signal that represents a brain’s response to a particular stimulus or a task. The averaging, however, is very sensitive to variable delays. In this study, we propose two time delay estimation (TDE) schemes based on a joint maximum likelihood (ML) criterion to compensate the uncertain delays which may be different in each trial. We evaluate the performance for different types of signals such as random, deterministic, and real EEG signals. The results show that the proposed schemes provide better performance than other conventional schemes employing averaged signal as a reference, e.g., up to 4 dB gain at the expected delay error of 10°. PMID:27322267
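A single-trial, single-channel simplification (not the paper's joint multi-trial ML estimator): under white Gaussian noise, the ML delay of a trial relative to a reference template is the lag that maximizes their cross-correlation. The template and delay below are synthetic.

```python
import numpy as np

# ML delay estimation for one noisy trial against an idealized ERP template:
# pick the lag that maximizes the cross-correlation.
rng = np.random.default_rng(7)
t = np.arange(512)
template = np.exp(-0.5 * ((t - 200) / 20.0) ** 2)       # idealized ERP component
true_delay = 17
trial = np.roll(template, true_delay) + 0.3 * rng.normal(size=t.size)

lags = np.arange(-50, 51)
xcorr = [np.dot(trial, np.roll(template, k)) for k in lags]
print(lags[int(np.argmax(xcorr))])                       # estimated delay (samples)
```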
Analysis of Sequence Data Under Multivariate Trait-Dependent Sampling.
Tao, Ran; Zeng, Donglin; Franceschini, Nora; North, Kari E; Boerwinkle, Eric; Lin, Dan-Yu
2015-06-01
High-throughput DNA sequencing allows for the genotyping of common and rare variants for genetic association studies. At the present time and for the foreseeable future, it is not economically feasible to sequence all individuals in a large cohort. A cost-effective strategy is to sequence those individuals with extreme values of a quantitative trait. We consider the design under which the sampling depends on multiple quantitative traits. Under such trait-dependent sampling, standard linear regression analysis can result in bias of parameter estimation, inflation of type I error, and loss of power. We construct a likelihood function that properly reflects the sampling mechanism and utilizes all available data. We implement a computationally efficient EM algorithm and establish the theoretical properties of the resulting maximum likelihood estimators. Our methods can be used to perform separate inference on each trait or simultaneous inference on multiple traits. We pay special attention to gene-level association tests for rare variants. We demonstrate the superiority of the proposed methods over standard linear regression through extensive simulation studies. We provide applications to the Cohorts for Heart and Aging Research in Genomic Epidemiology Targeted Sequencing Study and the National Heart, Lung, and Blood Institute Exome Sequencing Project.
Estimating population diversity with CatchAll
Bunge, John; Woodard, Linda; Böhning, Dankmar; Foster, James A.; Connolly, Sean; Allen, Heather K.
2012-01-01
Motivation: The massive data produced by next-generation sequencing require advanced statistical tools. We address estimating the total diversity or species richness in a population. To date, only relatively simple methods have been implemented in available software. There is a need for software employing modern, computationally intensive statistical analyses including error, goodness-of-fit and robustness assessments. Results: We present CatchAll, a fast, easy-to-use, platform-independent program that computes maximum likelihood estimates for finite-mixture models, weighted linear regression-based analyses and coverage-based non-parametric methods, along with outlier diagnostics. Given sample ‘frequency count’ data, CatchAll computes 12 different diversity estimates and applies a model-selection algorithm. CatchAll also derives discounted diversity estimates to adjust for possibly uncertain low-frequency counts. It is accompanied by an Excel-based graphics program. Availability: Free executable downloads for Linux, Windows and Mac OS, with manual and source code, at www.northeastern.edu/catchall. Contact: jab18@cornell.edu PMID:22333246
NASA Astrophysics Data System (ADS)
Amini, Changeez; Taherpour, Abbas; Khattab, Tamer; Gazor, Saeed
2017-01-01
This paper presents an improved propagation channel model for visible light in indoor environments. We employ this model to derive an enhanced positioning algorithm based on the relation between time-of-arrivals (TOAs) and distances, for two cases in which the transmitter-receiver vertical distance is either known or unknown. We propose two estimators, namely a maximum likelihood estimator and an estimator based on the method of moments. To provide an evaluation baseline for these methods, we calculate the Cramer-Rao lower bound (CRLB) on estimation performance. We show that, when the transmitter and receiver are perfectly synchronized, the proposed model and estimators yield superior positioning performance compared with existing state-of-the-art counterparts. Moreover, the CRLB of the proposed model represents roughly a 20 dB reduction in the localization error bound relative to the previous model for some practical scenarios.
On-Orbit Multi-Field Wavefront Control with a Kalman Filter
NASA Technical Reports Server (NTRS)
Lou, John; Sigrist, Norbert; Basinger, Scott; Redding, David
2008-01-01
A document describes a multi-field wavefront control (WFC) procedure for the James Webb Space Telescope (JWST) on-orbit optical telescope element (OTE) fine-phasing using wavefront measurements at the NIRCam pupil. The control is applied to JWST primary mirror (PM) segments and secondary mirror (SM) simultaneously with a carefully selected ordering. Through computer simulations, the multi-field WFC procedure shows that it can reduce the initial system wavefront error (WFE), as caused by random initial system misalignments within the JWST fine-phasing error budget, from a few dozen micrometers to below 50 nm across the entire NIRCam Field of View, and the WFC procedure is also computationally stable as the Monte-Carlo simulations indicate. With the incorporation of a Kalman Filter (KF) as an optical state estimator into the WFC process, the robustness of the JWST OTE alignment process can be further improved. In the presence of some large optical misalignments, the Kalman state estimator can provide a reasonable estimate of the optical state, especially for those degrees of freedom that have a significant impact on the system WFE. The state estimate allows for a few corrections to the optical state to push the system towards its nominal state, and the result is that a large part of the WFE can be eliminated in this step. When the multi-field WFC procedure is applied after Kalman state estimate and correction, the stability of fine-phasing control is much more certain. Kalman Filter has been successfully applied to diverse applications as a robust and optimal state estimator. In the context of space-based optical system alignment based on wavefront measurements, a KF state estimator can combine all available wavefront measurements, past and present, as well as measurement and actuation error statistics to generate a Maximum-Likelihood optimal state estimator. The strength and flexibility of the KF algorithm make it attractive for use in real-time optical system alignment when WFC alone cannot effectively align the system.
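For context, here is a generic linear Kalman filter predict/update step of the kind described above; the actual JWST state vector, sensitivity matrices, and noise statistics are not given in this document, so the matrices below are toy placeholders.

```python
import numpy as np

# Generic linear Kalman filter step, the kind of optimal state estimator the
# document describes (toy matrices; not the JWST alignment model).
def kalman_step(x, P, z, F, Q, H, R):
    # predict
    x_pred = F @ x
    P_pred = F @ P @ F.T + Q
    # update with a wavefront-style measurement z = H x + noise
    S = H @ P_pred @ H.T + R
    K = P_pred @ H.T @ np.linalg.inv(S)          # Kalman gain
    x_new = x_pred + K @ (z - H @ x_pred)
    P_new = (np.eye(len(x)) - K @ H) @ P_pred
    return x_new, P_new

# toy 2-state example with a 1-D measurement
x, P = np.zeros(2), np.eye(2)
F, Q = np.eye(2), 1e-4 * np.eye(2)
H, R = np.array([[1.0, 0.5]]), np.array([[0.01]])
x, P = kalman_step(x, P, np.array([0.3]), F, Q, H, R)
print(x, np.diag(P))
```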
A new LDPC decoding scheme for PDM-8QAM BICM coherent optical communication system
NASA Astrophysics Data System (ADS)
Liu, Yi; Zhang, Wen-bo; Xi, Li-xia; Tang, Xian-feng; Zhang, Xiao-guang
2015-11-01
A new log-likelihood ratio (LLR) message estimation method is proposed for polarization-division multiplexing eight quadrature amplitude modulation (PDM-8QAM) bit-interleaved coded modulation (BICM) optical communication systems. The formulation of the posterior probability is theoretically analyzed, and a way to reduce the pre-decoding bit error rate (BER) of the low-density parity check (LDPC) decoder for PDM-8QAM constellations is presented. Simulation results show that the new scheme outperforms the traditional one: the post-decoding BER is reduced to 50% of that of the traditional algorithm.
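A generic max-log LLR computation for one received symbol is sketched below for a small constellation (QPSK for brevity, not PDM-8QAM, and not the paper's improved estimator), to show where such LLR messages come from before LDPC decoding.

```python
import numpy as np

# Generic max-log LLR computation for one received symbol over an arbitrary
# constellation with a bit mapping (illustrative constellation and mapping).
constellation = np.array([1+1j, 1-1j, -1+1j, -1-1j]) / np.sqrt(2)   # QPSK for brevity
bits = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])                    # bit labels per point

def max_log_llr(r, sigma2):
    d2 = np.abs(r - constellation) ** 2 / sigma2
    llr = []
    for b in range(bits.shape[1]):
        # LLR = log P(bit=0) - log P(bit=1), max-log approximation
        llr.append(np.min(d2[bits[:, b] == 1]) - np.min(d2[bits[:, b] == 0]))
    return np.array(llr)

print(max_log_llr(0.8 + 0.6j, sigma2=0.1))   # positive values favour bit = 0
```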
Adaptive OFDM Radar Waveform Design for Improved Micro-Doppler Estimation
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sen, Satyabrata
Here we analyze the performance of a wideband orthogonal frequency division multiplexing (OFDM) signal in estimating the micro-Doppler frequency of a rotating target having multiple scattering centers. The use of a frequency-diverse OFDM signal enables us to independently analyze the micro-Doppler characteristics with respect to a set of orthogonal subcarrier frequencies. We characterize the accuracy of micro-Doppler frequency estimation by computing the Cramer-Rao bound (CRB) on the angular-velocity estimate of the target. Additionally, to improve the accuracy of the estimation procedure, we formulate and solve an optimization problem by minimizing the CRB on the angular-velocity estimate with respect to the OFDM spectral coefficients. We present several numerical examples to demonstrate the CRB variations with respect to the signal-to-noise ratio, number of temporal samples, and number of OFDM subcarriers. We also analyze numerically the improvement in estimation accuracy due to the adaptive waveform design. A grid-based maximum likelihood estimation technique is applied to evaluate the corresponding mean-squared error performance.
Curiale, Ariel H; Vegas-Sánchez-Ferrero, Gonzalo; Bosch, Johan G; Aja-Fernández, Santiago
2015-08-01
The strain and strain-rate measures are commonly used for the analysis and assessment of regional myocardial function. In echocardiography (EC), the strain analysis became possible using Tissue Doppler Imaging (TDI). Unfortunately, this modality shows an important limitation: the angle between the myocardial movement and the ultrasound beam should be small to provide reliable measures. This constraint makes it difficult to provide strain measures of the entire myocardium. Alternative non-Doppler techniques such as Speckle Tracking (ST) can provide strain measures without angle constraints. However, the spatial resolution and the noisy appearance of speckle still make the strain estimation a challenging task in EC. Several maximum likelihood approaches have been proposed to statistically characterize the behavior of speckle, which results in a better performance of speckle tracking. However, those models do not consider common transformations to achieve the final B-mode image (e.g. interpolation). This paper proposes a new maximum likelihood approach for speckle tracking which effectively characterizes the speckle of the final B-mode image. Its formulation provides a diffeomorphic scheme that can be efficiently optimized with a second-order method. The novelty of the method is threefold: First, the statistical characterization of speckle generalizes conventional speckle models (Rayleigh, Nakagami and Gamma) to a more versatile model for real data. Second, the formulation includes local correlation to increase the efficiency of frame-to-frame speckle tracking. Third, a probabilistic myocardial tissue characterization is used to automatically identify more reliable myocardial motions. Accuracy and agreement were evaluated on a set of 16 synthetic image sequences for three different scenarios: normal, acute ischemia and acute dyssynchrony. The proposed method was compared to six speckle tracking methods. Results revealed that the proposed method is the most accurate at measuring motion and strain, with an average median motion error of 0.42 mm and a median strain error of 2.0 ± 0.9%, 2.1 ± 1.3% and 7.1 ± 4.9% for circumferential, longitudinal and radial strain respectively. It also showed its capability to identify abnormal segments with reduced cardiac function and timing differences for the dyssynchrony cases. These results indicate that the proposed diffeomorphic speckle tracking method provides robust and accurate motion and strain estimation. Copyright © 2015. Published by Elsevier B.V.
On the error probability of general tree and trellis codes with applications to sequential decoding
NASA Technical Reports Server (NTRS)
Johannesson, R.
1973-01-01
An upper bound on the average error probability for maximum-likelihood decoding of the ensemble of random binary tree codes is derived and shown to be independent of the length of the tree. An upper bound on the average error probability for maximum-likelihood decoding of the ensemble of random L-branch binary trellis codes of rate R = 1/n is derived which separates the effects of the tail length T and the memory length M of the code. It is shown that the bound is independent of the length L of the information sequence. This implication is investigated by computer simulations of sequential decoding utilizing the stack algorithm. These simulations confirm the implication and further suggest an empirical formula for the true undetected decoding error probability with sequential decoding.
Human factors process failure modes and effects analysis (HF PFMEA) software tool
NASA Technical Reports Server (NTRS)
Chandler, Faith T. (Inventor); Relvini, Kristine M. (Inventor); Shedd, Nathaneal P. (Inventor); Valentino, William D. (Inventor); Philippart, Monica F. (Inventor); Bessette, Colette I. (Inventor)
2011-01-01
Methods, computer-readable media, and systems for automatically performing Human Factors Process Failure Modes and Effects Analysis for a process are provided. At least one task involved in a process is identified, where the task includes at least one human activity. The human activity is described using at least one verb. A human error potentially resulting from the human activity is automatically identified; the error is related to the verb used to describe the task. The likelihood of occurrence, detection, and correction of the human error is identified, along with the severity of its effect. From these, the likelihood and severity of the risk of potential harm are identified. The risk of potential harm is compared with a risk threshold to identify the appropriateness of corrective measures.
Common mode error in Antarctic GPS coordinate time series and its effect on bedrock-uplift estimates
NASA Astrophysics Data System (ADS)
Liu, Bin; King, Matt; Dai, Wujiao
2018-05-01
Spatially correlated common mode error always exists in regional or larger GPS networks. We applied independent component analysis (ICA) to GPS vertical coordinate time series in Antarctica from 2010 to 2014 and made a comparison with principal component analysis (PCA). Using PCA/ICA, the time series can be decomposed into a set of temporal components and their spatial responses. We assume the components with common spatial responses are common mode error (CME). An average reduction of ˜40% in the RMS values was achieved with both PCA and ICA filtering. However, the common mode components obtained from the two approaches have different spatial and temporal features. ICA time series present interesting correlations with modeled atmospheric and non-tidal ocean loading displacements. A white noise (WN) plus power law noise (PL) model was adopted in the GPS velocity estimation using maximum likelihood estimation (MLE) analysis, with ˜55% reduction of the velocity uncertainties after filtering using ICA. Meanwhile, spatiotemporal filtering reduces the amplitude of PL and periodic terms in the GPS time series. Finally, we compare the GPS uplift velocities, after correction for elastic effects, with recent models of glacial isostatic adjustment (GIA). The agreement between the GPS-observed velocities and four GIA models is generally improved after the spatiotemporal filtering, with a mean reduction of ˜0.9 mm/yr in the WRMS values, possibly allowing for more confident separation of various GIA model predictions.
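A minimal sketch of PCA-based spatiotemporal filtering on synthetic data: stack the vertical residual series (epochs by stations), take the leading principal component as the common mode, and subtract its reconstruction. The ICA variant and the WN+PL noise-model fitting are not shown.

```python
import numpy as np

# PCA-based common mode filtering of synthetic station time series:
# leading principal component = common mode error (CME); subtract its reconstruction.
rng = np.random.default_rng(8)
epochs, stations = 1000, 12
cme = np.cumsum(rng.normal(0, 0.2, epochs))              # a common random-walk signal
data = cme[:, None] + rng.normal(0, 1.0, (epochs, stations))

X = data - data.mean(axis=0)
U, s, Vt = np.linalg.svd(X, full_matrices=False)
common_mode = np.outer(U[:, 0] * s[0], Vt[0])            # first PC and its spatial response
filtered = data - common_mode
print(np.std(data, axis=0).mean(), np.std(filtered, axis=0).mean())   # RMS before/after
```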
ERIC Educational Resources Information Center
Mahmud, Jumailiyah; Sutikno, Muzayanah; Naga, Dali S.
2016-01-01
The aim of this study is to determine the difference in variance between maximum likelihood and expected a posteriori estimation methods as a function of the number of items in an aptitude test. The variance reflects the accuracy of both the maximum likelihood and Bayes estimation methods. The test consists of three subtests, each with 40 multiple-choice…
A Comparison of a Bayesian and a Maximum Likelihood Tailored Testing Procedure.
ERIC Educational Resources Information Center
McKinley, Robert L.; Reckase, Mark D.
A study was conducted to compare tailored testing procedures based on a Bayesian ability estimation technique and on a maximum likelihood ability estimation technique. The Bayesian tailored testing procedure selected items so as to minimize the posterior variance of the ability estimate distribution, while the maximum likelihood tailored testing…
On non-parametric maximum likelihood estimation of the bivariate survivor function.
Prentice, R L
The likelihood function for the bivariate survivor function F, under independent censorship, is maximized to obtain a non-parametric maximum likelihood estimator F̂. F̂ may or may not be unique depending on the configuration of singly- and doubly-censored pairs. The likelihood function can be maximized by placing all mass on the grid formed by the uncensored failure times, or half lines beyond the failure time grid, or in the upper right quadrant beyond the grid. By accumulating the mass along lines (or regions) where the likelihood is flat, one obtains a partially maximized likelihood as a function of parameters that can be uniquely estimated. The score equations corresponding to these point mass parameters are derived, using a Lagrange multiplier technique to ensure unit total mass, and a modified Newton procedure is used to calculate the parameter estimates in some limited simulation studies. Some considerations for the further development of non-parametric bivariate survivor function estimators are briefly described.
Bernhardt, Paul W; Wang, Huixia Judy; Zhang, Daowen
2014-01-01
Models for survival data generally assume that covariates are fully observed. However, in medical studies it is not uncommon for biomarkers to be censored at known detection limits. A computationally-efficient multiple imputation procedure for modeling survival data with covariates subject to detection limits is proposed. This procedure is developed in the context of an accelerated failure time model with a flexible seminonparametric error distribution. The consistency and asymptotic normality of the multiple imputation estimator are established and a consistent variance estimator is provided. An iterative version of the proposed multiple imputation algorithm that approximates the EM algorithm for maximum likelihood is also suggested. Simulation studies demonstrate that the proposed multiple imputation methods work well while alternative methods lead to estimates that are either biased or more variable. The proposed methods are applied to analyze the dataset from a recently-conducted GenIMS study.
Iterative Code-Aided ML Phase Estimation and Phase Ambiguity Resolution
NASA Astrophysics Data System (ADS)
Wymeersch, Henk; Moeneclaey, Marc
2005-12-01
As many coded systems operate at very low signal-to-noise ratios, synchronization becomes a very difficult task. In many cases, conventional algorithms will either require long training sequences or result in large BER degradations. By exploiting code properties, these problems can be avoided. In this contribution, we present several iterative maximum-likelihood (ML) algorithms for joint carrier phase estimation and ambiguity resolution. These algorithms operate on coded signals by accepting soft information from the MAP decoder. Issues of convergence and initialization are addressed in detail. Simulation results are presented for turbo codes, and are compared to performance results of conventional algorithms. Performance comparisons are carried out in terms of BER performance and mean square estimation error (MSEE). We show that the proposed algorithm reduces the MSEE and, more importantly, the BER degradation. Additionally, phase ambiguity resolution can be performed without resorting to a pilot sequence, thus improving the spectral efficiency.
Autonomous mechanism of internal choice estimate underlies decision inertia.
Akaishi, Rei; Umeda, Kazumasa; Nagase, Asako; Sakai, Katsuyuki
2014-01-08
Our choice is influenced by choices we made in the past, but the mechanism responsible for the choice bias remains elusive. Here we show that the history-dependent choice bias can be explained by an autonomous learning rule whereby an estimate of the likelihood of a choice to be made is updated in each trial by comparing between the actual and expected choices. We found that in perceptual decision making without performance feedback, a decision on an ambiguous stimulus is repeated on the subsequent trial more often than a decision on a salient stimulus. This inertia of decision was not accounted for by biases in motor response, sensory processing, or attention. The posterior cingulate cortex and frontal eye field represent choice prediction error and choice estimate in the learning algorithm, respectively. Interactions between the two regions during the intertrial interval are associated with decision inertia on a subsequent trial. Copyright © 2014 Elsevier Inc. All rights reserved.
Spatial hydrological drought characteristics in Karkheh River basin, southwest Iran using copulas
NASA Astrophysics Data System (ADS)
Dodangeh, Esmaeel; Shahedi, Kaka; Shiau, Jenq-Tzong; MirAkbari, Maryam
2017-08-01
Investigation of drought characteristics such as severity, duration, and frequency is crucial for water resources planning and management in a river basin. While the methodology for multivariate drought frequency analysis using copulas is well established, the effects of different parameter estimation methods on the obtained results have not yet been investigated. This research conducts a comparative analysis between the parametric maximum likelihood method and the non-parametric Kendall τ method for copula parameter estimation. The methods were employed to study joint severity-duration probability and recurrence intervals in the Karkheh River basin (southwest Iran), which is facing severe water-deficit problems. Daily streamflow data at three hydrological gauging stations (Tang Sazbon, Huleilan and Polchehr) near the Karkheh dam were used to draw flow duration curves (FDC) for these three stations. The Q75 index extracted from the FDCs was set as the threshold level to extract drought characteristics such as drought duration and severity on the basis of run theory. Drought duration and severity were separately modeled using univariate probability distributions, and gamma-GEV, LN2-exponential, and LN2-gamma were selected as the best paired drought severity-duration inputs for the copulas according to the Akaike Information Criterion (AIC), Kolmogorov-Smirnov and chi-square tests. The Archimedean Clayton, Frank, and extreme-value Gumbel copulas were employed to construct joint cumulative distribution functions (JCDF) of droughts for each station. The Frank copula at Tang Sazbon and the Gumbel copula at Huleilan and Polchehr were identified as the best copulas based on performance evaluation criteria including AIC, BIC, log-likelihood and root mean square error (RMSE) values. Based on the RMSE values, the nonparametric Kendall τ method is preferred to the parametric maximum likelihood method. The results showed greater drought return periods for the parametric ML method than for the nonparametric Kendall τ method. The results also showed that the stations located on tributaries (Huleilan and Polchehr) have similar return periods, while the station on the main river (Tang Sazbon) has smaller return periods for drought events with identical duration and severity.
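A small sketch contrasting the two estimation routes compared above, showing only the non-parametric side: Kendall's τ is computed from paired drought duration and severity samples and inverted to copula parameters using the closed-form relations for the Clayton and Gumbel families (the Frank copula needs a numerical inversion of its Debye-function relation and is omitted). The synthetic data are illustrative.

```python
# Kendall-tau (non-parametric) copula parameter estimation for Clayton and Gumbel.
import numpy as np
from scipy.stats import kendalltau

def copula_params_from_tau(x, y):
    tau, _ = kendalltau(x, y)
    theta_clayton = 2.0 * tau / (1.0 - tau)   # inverts tau = theta / (theta + 2)
    theta_gumbel = 1.0 / (1.0 - tau)          # inverts tau = 1 - 1/theta
    return tau, theta_clayton, theta_gumbel

rng = np.random.default_rng(0)
duration = rng.gamma(2.0, 10.0, 200)                 # synthetic drought durations
severity = duration * rng.lognormal(0.0, 0.3, 200)   # positively dependent severities
print(copula_params_from_tau(duration, severity))
```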
Hansson, Lisbeth; Khamis, Harry J
2008-12-01
Simulated data sets are used to evaluate conditional and unconditional maximum likelihood estimation in an individual case-control design with continuous covariates when there are different rates of excluded cases and different levels of other design parameters. The effectiveness of the estimation procedures is measured by method bias, variance of the estimators, root mean square error (RMSE) for logistic regression and the percentage of explained variation. Conditional estimation leads to higher RMSE than unconditional estimation in the presence of missing observations, especially for 1:1 matching. The RMSE is higher for the smaller stratum size, especially for the 1:1 matching. The percentage of explained variation appears to be insensitive to missing data, but is generally higher for the conditional estimation than for the unconditional estimation. It is particularly good for the 1:2 matching design. For minimizing RMSE, a high matching ratio is recommended; in this case, conditional and unconditional logistic regression models yield comparable levels of effectiveness. For maximizing the percentage of explained variation, the 1:2 matching design with the conditional logistic regression model is recommended.
NASA Astrophysics Data System (ADS)
Pachhai, S.; Masters, G.; Laske, G.
2017-12-01
Earth's normal-mode spectra are crucial to studying the long-wavelength structure of the Earth. Such observations have been used extensively to estimate "splitting coefficients" which, in turn, can be used to determine the three-dimensional velocity and density structure. Most past studies apply a non-linear iterative inversion to estimate the splitting coefficients, which requires that the earthquake source is known. However, it is challenging to know the source details, particularly for the big events used in normal-mode analyses. Additionally, the final solution of the non-linear inversion can depend on the choice of damping parameter and starting model. To circumvent the need to know the source, a two-step linear inversion has been developed and successfully applied to many mantle- and core-sensitive modes. The first step takes combinations of the data from a single event to produce spectra known as "receiver strips". The autoregressive nature of the receiver strips can then be used to estimate the structure coefficients without the need to know the source. Based on this approach, we recently employed a neighborhood algorithm to measure the splitting coefficients for an isolated inner-core-sensitive mode (13S2). This approach explores the parameter space efficiently without any need for regularization and finds the structure coefficients that best fit the observed strips. Here, we implement a Bayesian approach using data collected for earthquakes from 2000 onward. This approach combines the data (through the likelihood) and prior information to provide rigorous parameter values and their uncertainties for both isolated and coupled modes. The likelihood function is derived from the inferred errors of the receiver strips, which allows us to retrieve proper uncertainties. Finally, we apply model selection criteria that balance the trade-offs between fit (likelihood) and model complexity to investigate the degree and type of structure (elastic and anelastic) required to explain the data.
Fully probabilistic seismic source inversion - Part 2: Modelling errors and station covariances
NASA Astrophysics Data System (ADS)
Stähler, Simon C.; Sigloch, Karin
2016-11-01
Seismic source inversion, a central task in seismology, is concerned with the estimation of earthquake source parameters and their uncertainties. Estimating uncertainties is particularly challenging because source inversion is a non-linear problem. In a companion paper, Stähler and Sigloch (2014) developed a method of fully Bayesian inference for source parameters, based on measurements of waveform cross-correlation between broadband, teleseismic body-wave observations and their modelled counterparts. This approach yields not only depth and moment tensor estimates but also source time functions. A prerequisite for Bayesian inference is the proper characterisation of the noise afflicting the measurements, a problem we address here. We show that, for realistic broadband body-wave seismograms, the systematic error due to an incomplete physical model affects waveform misfits more strongly than random, ambient background noise. In this situation, the waveform cross-correlation coefficient CC, or rather its decorrelation D = 1 - CC, performs more robustly as a misfit criterion than ℓp norms, which measure misfit sample by sample as distances between individual time samples. From a set of over 900 user-supervised, deterministic earthquake source solutions treated as a quality-controlled reference, we derive the noise distribution on the signal decorrelation D = 1 - CC of the broadband seismogram fits between observed and modelled waveforms. The noise on D is found to approximately follow a log-normal distribution, a fortunate fact that readily accommodates the formulation of an empirical likelihood function for D for our multivariate problem. The first and second moments of this multivariate distribution are shown to depend mostly on the signal-to-noise ratio (SNR) of the CC measurements and on the back-azimuthal distances of seismic stations. By identifying and quantifying this likelihood function, we make D, and thus waveform cross-correlation measurements, usable for fully probabilistic sampling strategies in source inversion and related applications such as seismic tomography.
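A hedged sketch of the decorrelation misfit D = 1 - CC and of fitting a log-normal noise model to a collection of D values; the zero-lag normalized correlation and the synthetic D values are illustrative simplifications, not the authors' processing.

```python
# Decorrelation misfit and a log-normal fit to its distribution (illustrative).
import numpy as np

def decorrelation(observed, modelled):
    """Zero-lag normalized cross-correlation turned into a misfit D = 1 - CC."""
    cc = np.dot(observed, modelled) / (np.linalg.norm(observed) * np.linalg.norm(modelled))
    return 1.0 - cc

# If D is approximately log-normal, log(D) is Gaussian; estimate its two moments.
D = np.random.default_rng(2).lognormal(mean=-2.0, sigma=0.5, size=500)  # synthetic D values
mu, sigma = np.log(D).mean(), np.log(D).std(ddof=1)
print(f"log-normal parameters: mu={mu:.2f}, sigma={sigma:.2f}")
```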
The recursive maximum likelihood proportion estimator: User's guide and test results
NASA Technical Reports Server (NTRS)
Vanrooy, D. L.
1976-01-01
Implementation of the recursive maximum likelihood proportion estimator is described. A user's guide to programs as they currently exist on the IBM 360/67 at LARS, Purdue is included, and test results on LANDSAT data are described. On Hill County data, the algorithm yields results comparable to the standard maximum likelihood proportion estimator.
Xie, Lin; Cui, Xiaowei; Zhao, Sihao; Lu, Mingquan
2017-01-01
It is well known that the multipath effect remains a dominant error source affecting the positioning accuracy of Global Navigation Satellite System (GNSS) receivers. Significant efforts have been made by researchers and receiver manufacturers to mitigate multipath error in the past decades. Recently, multipath mitigation using dual-polarization antennas has become a research hotspot because it provides another degree of freedom to distinguish the line-of-sight (LOS) signal from the LOS and multipath composite signal without greatly increasing the complexity of the receiver. A number of multipath mitigation techniques using dual-polarization antennas have been proposed, and all of them report performance improvements over single-polarization methods. However, due to the unpredictability of multipath, techniques based on dual polarization are not always effective, yet few studies discuss the conditions under which multipath mitigation using a dual-polarization antenna can outperform that using a single-polarization antenna, which is a fundamental question for dual-polarization multipath mitigation (DPMM) and the design of multipath mitigation algorithms. In this paper we analyze the characteristics of the signal received by a dual-polarization antenna and use maximum likelihood estimation (MLE) to assess the theoretical performance of DPMM in different received-signal cases. Based on this assessment we answer this fundamental question and identify the dual-polarization antenna's capability in mitigating short-delay multipath, the most challenging type of multipath for the majority of mitigation techniques. Considering these effective conditions, we propose a dual-polarization sequential iterative maximum likelihood estimation (DP-SIMLE) algorithm for DPMM. The simulation results verify our theory and show superior performance of the proposed DP-SIMLE algorithm over the traditional one using only an RHCP antenna. PMID:28208832
Bit Error Probability for Maximum Likelihood Decoding of Linear Block Codes
NASA Technical Reports Server (NTRS)
Lin, Shu; Fossorier, Marc P. C.; Rhee, Dojun
1996-01-01
In this paper, the bit error probability P_b for maximum likelihood decoding of binary linear codes is investigated. The contribution of each information bit to P_b is considered. For randomly generated codes, it is shown that the conventional high-SNR approximation P_b ≈ (d_H/N)P_s, where P_s represents the block error probability, holds for systematic encoding only. Systematic encoding also provides the minimum P_b when the inverse mapping corresponding to the generator matrix of the code is used to retrieve the information sequence. The bit error performances corresponding to other generator matrix forms are also evaluated. Although derived for codes with randomly generated generator matrices, these results are shown to provide good approximations for codes used in practice. Finally, for decoding methods which require a generator matrix with a particular structure, such as trellis decoding or algebraic-based soft-decision decoding, equivalent schemes that reduce the bit error probability are discussed.
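A small worked instance of the high-SNR approximation quoted above; the numbers are invented for illustration and are not taken from the paper.

```python
# Illustrative values only: d_H/N and P_s below are assumptions, not results from the paper.
d_H, N = 8, 128          # distance term and code length
P_s = 1e-4               # assumed block error probability at high SNR
P_b = (d_H / N) * P_s    # conventional approximation under systematic encoding
print(P_b)               # 6.25e-06
```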
Accounting for misclassification error in retrospective smoking data.
Kenkel, Donald S; Lillard, Dean R; Mathios, Alan D
2004-10-01
Recent waves of major longitudinal surveys in the US and other countries include retrospective questions about the timing of smoking initiation and cessation, creating a potentially important but under-utilized source of information on smoking behavior over the life course. In this paper, we explore the extent of, consequences of, and possible solutions to misclassification errors in models of smoking participation that use data generated from retrospective reports. In our empirical work, we exploit the fact that the National Longitudinal Survey of Youth 1979 provides both contemporaneous and retrospective information about smoking status in certain years. We compare the results from four sets of models of smoking participation. The first set of results is from baseline probit models of smoking participation based on contemporaneously reported information. The second set of results is from models that are identical except that the dependent variable is based on retrospective information. The last two sets of results are from models that take a parametric approach to account for a simple form of misclassification error. Our preliminary results suggest that accounting for misclassification error is important. However, the adjusted maximum likelihood estimation approach to accounting for misclassification does not always perform as expected. Copyright 2004 John Wiley & Sons, Ltd.
Demodulation Algorithms for the Ofdm Signals in the Time- and Frequency-Scattering Channels
NASA Astrophysics Data System (ADS)
Bochkov, G. N.; Gorokhov, K. V.; Kolobkov, A. V.
2016-06-01
We consider a method based on the generalized maximum-likelihood rule for solving the problem of reception of signals with orthogonal frequency division multiplexing of their harmonic components (OFDM signals) in time- and frequency-scattering channels. Coherent and incoherent demodulators that effectively use the time scattering due to fast fading of the signal are developed. Using computer simulation, we performed a comparative analysis of the proposed algorithms and well-known signal-reception algorithms with equalizers. The proposed symbol-by-symbol detector with decision feedback and restriction of the number of searched variants is shown to have the best bit-error-rate performance. It is shown that, under conditions of limited accuracy of estimating the communication-channel parameters, the incoherent OFDM-signal detectors with differential phase-shift keying can ensure a better bit-error-rate performance than the coherent OFDM-signal detectors with absolute phase-shift keying.
NASA Technical Reports Server (NTRS)
Van Buren, Dave
1986-01-01
Equivalent width data from Copernicus and IUE appear to have an exponential, rather than a Gaussian distribution of errors. This is probably because there is one dominant source of error: the assignment of the background continuum shape. The maximum likelihood method of parameter estimation is presented for the case of exponential statistics, in enough generality for application to many problems. The method is applied to global fitting of Si II, Fe II, and Mn II oscillator strengths and interstellar gas parameters along many lines of sight. The new values agree in general with previous determinations but are usually much more tightly constrained. Finally, it is shown that care must be taken in deriving acceptable regions of parameter space because the probability contours are not generally ellipses whose axes are parallel to the coordinate axes.
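A minimal sketch of maximum likelihood fitting under a two-sided exponential (Laplace) error model, which is one natural reading of the exponential error statistics described above; under that assumption the ML fit reduces to minimizing the sum of absolute residuals instead of the sum of squares. The linear model, the scale parameterization, and the synthetic data are illustrative assumptions.

```python
# ML fit of a straight line under Laplace (two-sided exponential) errors.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(3)
x = np.linspace(0, 1, 50)
y = 2.0 + 3.0 * x + rng.laplace(scale=0.2, size=x.size)   # synthetic data

def neg_log_likelihood(params):
    a, b, scale = params
    resid = y - (a + b * x)
    # Laplace negative log-likelihood: n*log(2*scale) + sum(|resid|)/scale
    return x.size * np.log(2 * scale) + np.abs(resid).sum() / scale

fit = minimize(neg_log_likelihood, x0=[0.0, 1.0, 1.0],
               bounds=[(None, None), (None, None), (1e-6, None)])
print(fit.x)   # estimates of intercept, slope and error scale
```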
A goodness-of-fit test for capture-recapture model M(t) under closure
Stanley, T.R.; Burnham, K.P.
1999-01-01
A new, fully efficient goodness-of-fit test for the time-specific closed-population capture-recapture model M(t) is presented. This test is based on the residual distribution of the capture history data given the maximum likelihood parameter estimates under model M(t), is partitioned into informative components, and is based on chi-square statistics. Comparison of this test with Leslie's test (Leslie, 1958, Journal of Animal Ecology 27, 84-86) for model M(t), using Monte Carlo simulations, shows the new test generally outperforms Leslie's test. The new test is frequently computable when Leslie's test is not, has Type I error rates that are closer to nominal error rates than Leslie's test, and is sensitive to behavioral variation and heterogeneity in capture probabilities. Leslie's test is not sensitive to behavioral variation in capture probabilities but, when computable, has greater power to detect heterogeneity than the new test.
Yang, Lei; Qin, Guoyou; Zhao, Naiqing; Wang, Chunfang; Song, Guixiang
2012-10-30
Generalized Additive Model (GAM) provides a flexible and effective technique for modelling nonlinear time-series in studies of the health effects of environmental factors. However, GAM assumes that errors are mutually independent, while time series can be correlated in adjacent time points. Here, a GAM with Autoregressive terms (GAMAR) is introduced to fill this gap. Parameters in GAMAR are estimated by maximum partial likelihood using modified Newton's method, and the difference between GAM and GAMAR is demonstrated using two simulation studies and a real data example. GAMM is also compared to GAMAR in simulation study 1. In the simulation studies, the bias of the mean estimates from GAM and GAMAR are similar but GAMAR has better coverage and smaller relative error. While the results from GAMM are similar to GAMAR, the estimation procedure of GAMM is much slower than GAMAR. In the case study, the Pearson residuals from the GAM are correlated, while those from GAMAR are quite close to white noise. In addition, the estimates of the temperature effects are different between GAM and GAMAR. GAMAR incorporates both explanatory variables and AR terms so it can quantify the nonlinear impact of environmental factors on health outcome as well as the serial correlation between the observations. It can be a useful tool in environmental epidemiological studies.
NASA Technical Reports Server (NTRS)
Walker, H. F.
1976-01-01
Likelihood equations determined by the two types of samples, which are necessary conditions for a maximum-likelihood estimate, are considered. These equations suggest certain successive-approximation iterative procedures for obtaining maximum-likelihood estimates; these are generalized steepest ascent (deflected gradient) procedures. It is shown that, with probability 1 as N_0 approaches infinity (regardless of the relative sizes of N_0 and N_i, i = 1,...,m), these procedures converge locally to the strongly consistent maximum-likelihood estimates whenever the step size is between 0 and 2. Furthermore, the value of the step size which yields optimal local convergence rates is bounded from below by a number which always lies between 1 and 2.
Chen, Rui; Hyrien, Ollivier
2011-01-01
This article deals with quasi- and pseudo-likelihood estimation in a class of continuous-time multi-type Markov branching processes observed at discrete points in time. “Conventional” and conditional estimation are discussed for both approaches. We compare their properties and identify situations where they lead to asymptotically equivalent estimators. Both approaches possess robustness properties, and coincide with maximum likelihood estimation in some cases. Quasi-likelihood functions involving only linear combinations of the data may be unable to estimate all model parameters. Remedial measures exist, including resorting either to non-linear functions of the data or to conditioning the moments on appropriate sigma-algebras. The method of pseudo-likelihood may also resolve this issue. We investigate the properties of these approaches in three examples: the pure birth process, the linear birth-and-death process, and a two-type process that generalizes the previous two examples. Simulation studies are conducted to evaluate performance in finite samples. PMID:21552356
NASA Astrophysics Data System (ADS)
Bovy, Jo; Hogg, David W.; Roweis, Sam T.
2011-06-01
We generalize the well-known mixtures of Gaussians approach to density estimation and the accompanying Expectation-Maximization technique for finding the maximum likelihood parameters of the mixture to the case where each data point carries an individual d-dimensional uncertainty covariance and has unique missing data properties. This algorithm reconstructs the error-deconvolved or "underlying" distribution function common to all samples, even when the individual data points are samples from different distributions, obtained by convolving the underlying distribution with the heteroskedastic uncertainty distribution of the data point and projecting out the missing data directions. We show how this basic algorithm can be extended with conjugate priors on all of the model parameters and a "split-and-merge" procedure designed to avoid local maxima of the likelihood. We demonstrate the full method by applying it to the problem of inferring the three-dimensional velocity distribution of stars near the Sun from noisy two-dimensional, transverse velocity measurements from the Hipparcos satellite.
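A sketch of evaluating the likelihood that this kind of algorithm maximizes: each observed point is modelled as a draw from the underlying mixture convolved with that point's own uncertainty covariance, so component k contributes N(x_i | m_k, V_k + S_i). The EM updates themselves are omitted, and the toy data are illustrative.

```python
# Log-likelihood of a Gaussian mixture with per-point (heteroskedastic) noise covariances.
import numpy as np
from scipy.stats import multivariate_normal

def heteroskedastic_mixture_loglike(X, S, weights, means, covs):
    """X: (n, d) data; S: (n, d, d) per-point noise covariances; mixture parameters."""
    loglike = 0.0
    for i in range(X.shape[0]):
        p = sum(w * multivariate_normal.pdf(X[i], mean=m, cov=V + S[i])
                for w, m, V in zip(weights, means, covs))
        loglike += np.log(p)
    return loglike

# Tiny example: two 2-D components, each data point with its own noise covariance.
rng = np.random.default_rng(4)
X = rng.normal(size=(5, 2))
S = np.array([np.diag(rng.uniform(0.1, 0.5, 2)) for _ in range(5)])
params = ([0.5, 0.5], [np.zeros(2), np.ones(2)], [np.eye(2), 0.5 * np.eye(2)])
print(heteroskedastic_mixture_loglike(X, S, *params))
```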
ESTIMATION OF THE NUMBER OF INFECTIOUS BACTERIAL OR VIRAL PARTICLES BY THE DILUTION METHOD
Seligman, Stephen J.; Mickey, M. Ray
1964-01-01
Seligman, Stephen J. (University of California, Los Angeles), and M. Ray Mickey. Estimation of the number of infectious bacterial or viral particles by the dilution method. J. Bacteriol. 88:31–36. 1964.—For viral or bacterial systems in which discrete foci of infection are not obtainable, it is possible to obtain an estimate of the number of infectious particles by use of the quantal response if the assay system is such that one infectious particle can elicit the response. Unfortunately, the maximum likelihood estimate is difficult to calculate, but, by the use of a modification of Haldane's approximation, it is possible to construct a table which facilitates calculation of both the average number of infectious particles and its relative error. Additional advantages of the method are that the number of test units per dilution can be varied, the dilutions need not bear any fixed relation to each other, and the one-particle hypothesis can be readily tested. PMID:14197902
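A hedged sketch of the maximum likelihood estimate discussed above under the one-particle (single-hit Poisson) assumption, in which a test unit at relative dilution d responds with probability 1 - exp(-λd); the dilution scheme and counts below are invented for illustration, and the table-based approximation of the paper is not reproduced.

```python
# ML estimate of the mean number of infectious particles from quantal dilution data.
import numpy as np
from scipy.optimize import minimize_scalar

dilutions = np.array([1.0, 0.1, 0.01])   # relative concentration of inoculum (illustrative)
n_units   = np.array([8, 8, 8])          # test units per dilution
n_pos     = np.array([8, 5, 1])          # units showing infection

def neg_log_likelihood(lam):
    p = 1.0 - np.exp(-lam * dilutions)   # single-hit response probability
    p = np.clip(p, 1e-12, 1 - 1e-12)
    return -np.sum(n_pos * np.log(p) + (n_units - n_pos) * np.log(1.0 - p))

res = minimize_scalar(neg_log_likelihood, bounds=(1e-6, 100.0), method="bounded")
print("ML estimate of mean infectious particles per undiluted test unit:", res.x)
```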
Logarithmic Laplacian Prior Based Bayesian Inverse Synthetic Aperture Radar Imaging.
Zhang, Shuanghui; Liu, Yongxiang; Li, Xiang; Bi, Guoan
2016-04-28
This paper presents a novel inverse synthetic aperture radar (ISAR) imaging algorithm based on a new sparse prior, known as the logarithmic Laplacian prior. The newly proposed logarithmic Laplacian prior has a narrower main lobe with higher tail values than the Laplacian prior, which helps to achieve performance improvement in sparse representation. The logarithmic Laplacian prior is used for ISAR imaging within the Bayesian framework to achieve a better focused radar image. In the proposed method of ISAR imaging, the phase errors are jointly estimated based on the minimum entropy criterion to accomplish autofocusing. The maximum a posteriori (MAP) estimation and the maximum likelihood estimation (MLE) are utilized to estimate the model parameters to avoid a manual tuning process. Additionally, the fast Fourier transform (FFT) and the Hadamard product are used to reduce the required computation. Experimental results based on both simulated and measured data validate that the proposed algorithm outperforms traditional sparse ISAR imaging algorithms in terms of resolution improvement and noise suppression.
Stochastic differential equation (SDE) model of opening gold share price of bursa saham malaysia
NASA Astrophysics Data System (ADS)
Hussin, F. N.; Rahman, H. A.; Bahar, A.
2017-09-01
The Black and Scholes option pricing model is one of the most recognized stochastic differential equation models in mathematical finance. Two parameter estimation methods have been utilized for the geometric Brownian motion (GBM) model: the historical and the discrete method. The historical method is a statistical method which uses the independence and normality of logarithmic returns, giving the simplest parameter estimates. The discrete method, by contrast, uses the transition density function of the log-normal diffusion process, with the estimates derived by the maximum likelihood method. These two methods are used to obtain parameter estimates for samples of Malaysian gold share price data, namely the Financial Times and Stock Exchange (FTSE) Bursa Malaysia Emas and the FTSE Bursa Malaysia Emas Shariah indices. Modelling of the gold share price is essential since fluctuations in gold affect the worldwide economy, including Malaysia. It is found that the discrete method gives better parameter estimates than the historical method due to the smaller Root Mean Square Error (RMSE) value.
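A minimal sketch of the historical estimation method for geometric Brownian motion, using the independence and normality of log returns; the synthetic price path and time step are illustrative, and the paper's discrete (transition-density) method is not reproduced here.

```python
# Historical estimation of GBM parameters (dS = mu*S*dt + sigma*S*dW) from prices.
import numpy as np

def gbm_params_from_prices(prices, dt=1.0):
    r = np.diff(np.log(prices))               # log returns
    sigma = r.std(ddof=1) / np.sqrt(dt)
    mu = r.mean() / dt + 0.5 * sigma ** 2     # drift of S, not of log S
    return mu, sigma

rng = np.random.default_rng(5)
true_mu, true_sigma, dt = 0.05, 0.2, 1.0 / 252   # illustrative daily data
steps = rng.normal((true_mu - 0.5 * true_sigma**2) * dt,
                   true_sigma * np.sqrt(dt), size=2000)
prices = 100.0 * np.exp(np.cumsum(steps))         # synthetic price path
print(gbm_params_from_prices(prices, dt))
```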
Bias Correction for the Maximum Likelihood Estimate of Ability. Research Report. ETS RR-05-15
ERIC Educational Resources Information Center
Zhang, Jinming
2005-01-01
Lord's bias function and the weighted likelihood estimation method are effective in reducing the bias of the maximum likelihood estimate of an examinee's ability under the assumption that the true item parameters are known. This paper presents simulation studies to determine the effectiveness of these two methods in reducing the bias when the item…
Cha, Kenny H.; Hadjiiski, Lubomir; Samala, Ravi K.; Chan, Heang-Ping; Caoili, Elaine M.; Cohan, Richard H.
2016-01-01
Purpose: The authors are developing a computerized system for bladder segmentation in CT urography (CTU) as a critical component for computer-aided detection of bladder cancer. Methods: A deep-learning convolutional neural network (DL-CNN) was trained to distinguish between the inside and the outside of the bladder using 160 000 regions of interest (ROI) from CTU images. The trained DL-CNN was used to estimate the likelihood of an ROI being inside the bladder for ROIs centered at each voxel in a CTU case, resulting in a likelihood map. Thresholding and hole-filling were applied to the map to generate the initial contour for the bladder, which was then refined by 3D and 2D level sets. The segmentation performance was evaluated using 173 cases: 81 cases in the training set (42 lesions, 21 wall thickenings, and 18 normal bladders) and 92 cases in the test set (43 lesions, 36 wall thickenings, and 13 normal bladders). The computerized segmentation accuracy using the DL likelihood map was compared to that using a likelihood map generated by Haar features and a random forest classifier, and that using our previous conjoint level set analysis and segmentation system (CLASS) without using a likelihood map. All methods were evaluated relative to the 3D hand-segmented reference contours. Results: With DL-CNN-based likelihood map and level sets, the average volume intersection ratio, average percent volume error, average absolute volume error, average minimum distance, and the Jaccard index for the test set were 81.9% ± 12.1%, 10.2% ± 16.2%, 14.0% ± 13.0%, 3.6 ± 2.0 mm, and 76.2% ± 11.8%, respectively. With the Haar-feature-based likelihood map and level sets, the corresponding values were 74.3% ± 12.7%, 13.0% ± 22.3%, 20.5% ± 15.7%, 5.7 ± 2.6 mm, and 66.7% ± 12.6%, respectively. With our previous CLASS with local contour refinement (LCR) method, the corresponding values were 78.0% ± 14.7%, 16.5% ± 16.8%, 18.2% ± 15.0%, 3.8 ± 2.3 mm, and 73.9% ± 13.5%, respectively. Conclusions: The authors demonstrated that the DL-CNN can overcome the strong boundary between two regions that have large difference in gray levels and provides a seamless mask to guide level set segmentation, which has been a problem for many gradient-based segmentation methods. Compared to our previous CLASS with LCR method, which required two user inputs to initialize the segmentation, DL-CNN with level sets achieved better segmentation performance while using a single user input. Compared to the Haar-feature-based likelihood map, the DL-CNN-based likelihood map could guide the level sets to achieve better segmentation. The results demonstrate the feasibility of our new approach of using DL-CNN in combination with level sets for segmentation of the bladder. PMID:27036584
The GEOS Ozone Data Assimilation System: Specification of Error Statistics
NASA Technical Reports Server (NTRS)
Stajner, Ivanka; Riishojgaard, Lars Peter; Rood, Richard B.
2000-01-01
A global three-dimensional ozone data assimilation system has been developed at the Data Assimilation Office of the NASA/Goddard Space Flight Center. The Total Ozone Mapping Spectrometer (TOMS) total ozone and the Solar Backscatter Ultraviolet (SBUV or SBUV/2) partial ozone profile observations are assimilated. The assimilation, into an off-line ozone transport model, is done using the global Physical-space Statistical Analysis Scheme (PSAS). This system became operational in December 1999. A detailed description of the statistical analysis scheme, and in particular the forecast and observation error covariance models, is given. A new global anisotropic horizontal forecast error correlation model accounts for a varying distribution of observations with latitude. Correlations are largest in the zonal direction in the tropics, where data are sparse. The forecast error variance model is proportional to the ozone field. The forecast error covariance parameters were determined by maximum likelihood estimation. The error covariance models are validated using χ² statistics. The analyzed ozone fields in the winter of 1992 are validated against independent observations from ozone sondes and HALOE. There is better than 10% agreement between mean Halogen Occultation Experiment (HALOE) and analysis fields between 70 and 0.2 hPa. The global root-mean-square (RMS) difference between TOMS observed and forecast values is less than 4%. The global RMS difference between SBUV observed and analyzed ozone between 50 and 3 hPa is less than 15%.
Schwartz, Rachel S; Mueller, Rachel L
2010-01-11
Estimates of divergence dates between species improve our understanding of processes ranging from nucleotide substitution to speciation. Such estimates are frequently based on molecular genetic differences between species; therefore, they rely on accurate estimates of the number of such differences (i.e. substitutions per site, measured as branch length on phylogenies). We used simulations to determine the effects of dataset size, branch length heterogeneity, branch depth, and analytical framework on branch length estimation across a range of branch lengths. We then reanalyzed an empirical dataset for plethodontid salamanders to determine how inaccurate branch length estimation can affect estimates of divergence dates. The accuracy of branch length estimation varied with branch length, dataset size (both number of taxa and sites), branch length heterogeneity, branch depth, dataset complexity, and analytical framework. For simple phylogenies analyzed in a Bayesian framework, branches were increasingly underestimated as branch length increased; in a maximum likelihood framework, longer branch lengths were somewhat overestimated. Longer datasets improved estimates in both frameworks; however, when the number of taxa was increased, estimation accuracy for deeper branches was less than for tip branches. Increasing the complexity of the dataset produced more misestimated branches in a Bayesian framework; however, in an ML framework, more branches were estimated more accurately. Using ML branch length estimates to re-estimate plethodontid salamander divergence dates generally resulted in an increase in the estimated age of older nodes and a decrease in the estimated age of younger nodes. Branch lengths are misestimated in both statistical frameworks for simulations of simple datasets. However, for complex datasets, length estimates are quite accurate in ML (even for short datasets), whereas few branches are estimated accurately in a Bayesian framework. Our reanalysis of empirical data demonstrates the magnitude of effects of Bayesian branch length misestimation on divergence date estimates. Because the length of branches for empirical datasets can be estimated most reliably in an ML framework when branches are <1 substitution/site and datasets are ≥1 kb, we suggest that divergence date estimates using datasets, branch lengths, and/or analytical techniques that fall outside of these parameters should be interpreted with caution.
Discriminative confidence estimation for probabilistic multi-atlas label fusion.
Benkarim, Oualid M; Piella, Gemma; González Ballester, Miguel Angel; Sanroma, Gerard
2017-12-01
Quantitative neuroimaging analyses often rely on the accurate segmentation of anatomical brain structures. In contrast to manual segmentation, automatic methods offer reproducible outputs and provide scalability to study large databases. Among existing approaches, multi-atlas segmentation has recently shown to yield state-of-the-art performance in automatic segmentation of brain images. It consists in propagating the labelmaps from a set of atlases to the anatomy of a target image using image registration, and then fusing these multiple warped labelmaps into a consensus segmentation on the target image. Accurately estimating the contribution of each atlas labelmap to the final segmentation is a critical step for the success of multi-atlas segmentation. Common approaches to label fusion either rely on local patch similarity, probabilistic statistical frameworks or a combination of both. In this work, we propose a probabilistic label fusion framework based on atlas label confidences computed at each voxel of the structure of interest. Maximum likelihood atlas confidences are estimated using a supervised approach, explicitly modeling the relationship between local image appearances and segmentation errors produced by each of the atlases. We evaluate different spatial pooling strategies for modeling local segmentation errors. We also present a novel type of label-dependent appearance features based on atlas labelmaps that are used during confidence estimation to increase the accuracy of our label fusion. Our approach is evaluated on the segmentation of seven subcortical brain structures from the MICCAI 2013 SATA Challenge dataset and the hippocampi from the ADNI dataset. Overall, our results indicate that the proposed label fusion framework achieves superior performance to state-of-the-art approaches in the majority of the evaluated brain structures and shows more robustness to registration errors. Copyright © 2017 Elsevier B.V. All rights reserved.
Dawson, Ree; Lavori, Philip W
2012-01-01
Clinical demand for individualized "adaptive" treatment policies in diverse fields has spawned development of clinical trial methodology for their experimental evaluation via multistage designs, building upon methods intended for the analysis of naturalistically observed strategies. Because often there is no need to parametrically smooth multistage trial data (in contrast to observational data for adaptive strategies), it is possible to establish direct connections among different methodological approaches. We show by algebraic proof that the maximum likelihood (ML) and optimal semiparametric (SP) estimators of the population mean of the outcome of a treatment policy and its standard error are equal under certain experimental conditions. This result is used to develop a unified and efficient approach to design and inference for multistage trials of policies that adapt treatment according to discrete responses. We derive a sample size formula expressed in terms of a parametric version of the optimal SP population variance. Nonparametric (sample-based) ML estimation performed well in simulation studies, in terms of achieved power, for scenarios most likely to occur in real studies, even though sample sizes were based on the parametric formula. ML outperformed the SP estimator; differences in achieved power predominately reflected differences in their estimates of the population mean (rather than estimated standard errors). Neither methodology could mitigate the potential for overestimated sample sizes when strong nonlinearity was purposely simulated for certain discrete outcomes; however, such departures from linearity may not be an issue for many clinical contexts that make evaluation of competitive treatment policies meaningful.
Berniker, Max; Kording, Konrad P.
2011-01-01
Recent studies suggest that motor adaptation is the result of multiple, perhaps linear, processes each with distinct time scales. While these models are consistent with some motor phenomena, they can explain neither the relatively fast re-adaptation after a long washout period nor savings on a subsequent day. Here we examined whether these effects can be explained by assuming that the CNS stores and retrieves movement parameters based on their possible relevance. We formalize this idea with a model that infers not only the sources of potential motor errors, but also their relevance to the current motor circumstances. In our model, adaptation is the process of re-estimating parameters that represent the body and the world. The likelihood of a world parameter being relevant is then based on the mismatch between an observed movement and that predicted when not compensating for the estimated world disturbance. As such, adapting to large motor errors in a laboratory setting should alert subjects that disturbances are being imposed on them, even after motor performance has returned to baseline. Estimates of this external disturbance should be relevant both now and in future laboratory settings. Estimated properties of our bodies, on the other hand, should always be relevant. Our model demonstrates savings, interference, spontaneous rebound and differences between adaptation to sudden and gradual disturbances. We suggest that many issues concerning savings and interference can be understood when adaptation is conditioned on the relevance of parameters. PMID:21998574
Burgués, Javier; Marco, Santiago
2018-08-17
Metal oxide semiconductor (MOX) sensors are usually temperature-modulated and calibrated with multivariate models such as partial least squares (PLS) to increase the inherently low selectivity of this technology. The multivariate sensor response patterns exhibit heteroscedastic and correlated noise, which suggests that maximum likelihood methods should outperform PLS. One contribution of this paper is the comparison between PLS and maximum likelihood principal components regression (MLPCR) for MOX sensors. PLS is often criticized for its lack of interpretability when the model complexity increases beyond the chemical rank of the problem. This happens in MOX sensors due to cross-sensitivities to interferences, such as temperature or humidity, and to non-linearity. Additionally, the estimation of fundamental figures of merit, such as the limit of detection (LOD), is still not standardized for multivariate models. Orthogonalization methods, such as orthogonal projection to latent structures (O-PLS), have been successfully applied in other fields to reduce the complexity of PLS models. In this work, we propose a LOD estimation method based on applying the well-accepted univariate LOD formulas to the scores of the first component of an orthogonal PLS model. The resulting LOD is compared to the multivariate LOD range derived from error propagation. The methodology is applied to data extracted from temperature-modulated MOX sensors (FIS SB-500-12 and Figaro TGS 3870-A04), aiming at the detection of low concentrations of carbon monoxide in the presence of uncontrolled humidity (chemical noise). We found that PLS models were simpler and more accurate than MLPCR models. Average LOD values of 0.79 ppm (FIS) and 1.06 ppm (Figaro) were found using the approach described in this paper. These values were contained within the LOD ranges obtained with the error-propagation approach. The mean LOD increased to 1.13 ppm (FIS) and 1.59 ppm (Figaro) when considering validation samples collected two weeks after calibration, which represents a 43% and 46% degradation, respectively. The orthogonal score plot was a very convenient tool to visualize MOX sensor data and to validate the LOD estimates. Copyright © 2018 Elsevier B.V. All rights reserved.
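A hedged sketch of applying a univariate LOD formula to the scores of the first component of an orthogonalized PLS model, as proposed above; here the scores are regressed against concentration and the common 3.3·σ/slope rule is applied, which may differ from the exact univariate formula the authors used. The concentrations and scores are synthetic.

```python
# Univariate LOD estimate from first-component scores of an orthogonalized PLS model.
import numpy as np

def lod_from_scores(scores, concentration):
    slope, intercept = np.polyfit(concentration, scores, 1)
    residuals = scores - (slope * concentration + intercept)
    sigma = residuals.std(ddof=2)          # residual standard deviation of the calibration line
    return 3.3 * sigma / abs(slope)        # common 3.3*sigma/slope rule

rng = np.random.default_rng(6)
conc = np.repeat([0.0, 0.5, 1.0, 2.0, 4.0], 5)                  # ppm CO, illustrative design
scores = 1.8 * conc + rng.normal(scale=0.6, size=conc.size)     # synthetic scores
print("estimated LOD (ppm):", lod_from_scores(scores, conc))
```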
Testing the non-unity of rate ratio under inverse sampling.
Tang, Man-Lai; Liao, Yi Jie; Ng, Hong Keung Tony; Chan, Ping Shing
2007-08-01
Inverse sampling is considered to be a more appropriate sampling scheme than the usual binomial sampling scheme when subjects arrive sequentially, when the underlying response of interest is acute, and when maximum likelihood estimators of some epidemiologic indices are undefined. In this article, we study various statistics for testing non-unity rate ratios in case-control studies under inverse sampling. These include the Wald, unconditional score, likelihood ratio and conditional score statistics. Three methods (the asymptotic, conditional exact, and mid-P methods) are adopted for P-value calculation. We evaluate the performance of different combinations of test statistics and P-value calculation methods in terms of their empirical sizes and powers via Monte Carlo simulation. In general, the asymptotic score and conditional score tests are preferable because their actual type I error rates are well controlled around the pre-chosen nominal level and their powers are comparatively the largest. The exact version of the Wald test is recommended if one wants to control the actual type I error rate at or below the pre-chosen nominal level. If larger power is desired and fluctuations of the size around the pre-chosen nominal level are acceptable, then the mid-P version of the Wald test is a desirable alternative. We illustrate the methodologies with a real example from a heart disease study. (c) 2007 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim
Robust geostatistical analysis of spatial data
NASA Astrophysics Data System (ADS)
Papritz, A.; Künsch, H. R.; Schwierz, C.; Stahel, W. A.
2012-04-01
Most geostatistical software tools rely on non-robust algorithms. This is unfortunate, because outlying observations are rather the rule than the exception, in particular in environmental data sets. Outlying observations may result from errors (e.g. in data transcription) or from local perturbations in the processes that are responsible for a given pattern of spatial variation. As an example, the spatial distribution of some trace metal in the soils of a region may be distorted by emissions from local anthropogenic sources. Outliers affect the modelling of the large-scale spatial variation, the so-called external drift or trend, the estimation of the spatial dependence of the residual variation and the predictions by kriging. Identifying outliers manually is cumbersome and requires expertise because one needs parameter estimates to decide which observation is a potential outlier. Moreover, inference after the rejection of some observations is problematic. A better approach is to use robust algorithms that automatically prevent outlying observations from having undue influence. Earlier studies on robust geostatistics focused on robust estimation of the sample variogram and ordinary kriging without external drift. Furthermore, Richardson and Welsh (1995) [2] proposed a robustified version of (restricted) maximum likelihood ([RE]ML) estimation for the variance components of a linear mixed model, which was later used by Marchant and Lark (2007) [1] for robust REML estimation of the variogram. We propose here a novel method for robust REML estimation of the variogram of a Gaussian random field that is possibly contaminated by independent errors from a long-tailed distribution. It is based on robustification of the estimating equations for Gaussian REML estimation. Besides robust estimates of the parameters of the external drift and of the variogram, the method also provides standard errors for the estimated parameters, robustified kriging predictions at both sampled and unsampled locations, and kriging variances. The method has been implemented in an R package. Apart from presenting our modelling framework, we shall present selected simulation results by which we explored the properties of the new method. This will be complemented by an analysis of the Tarrawarra soil moisture data set [3].
Human factors evaluation of remote afterloading brachytherapy. Volume 2, Function and task analysis
DOE Office of Scientific and Technical Information (OSTI.GOV)
Callan, J.R.; Gwynne, J.W. III; Kelly, T.T.
1995-05-01
A human factors project on the use of nuclear by-product material to treat cancer using remotely operated afterloaders was undertaken by the Nuclear Regulatory Commission. The purpose of the project was to identify factors that contribute to human error in the system for remote afterloading brachytherapy (RAB). This report documents the findings from the first phase of the project, which involved an extensive function and task analysis of RAB. This analysis identified the functions and tasks in RAB, made preliminary estimates of the likelihood of human error in each task, and determined the skills needed to perform each RAB task. The findings of the function and task analysis served as the foundation for the remainder of the project, which evaluated four major aspects of the RAB system linked to human error: human-system interfaces; procedures and practices; training and qualifications of RAB staff; and organizational practices and policies. At its completion, the project identified and prioritized areas for recommended NRC and industry attention based on all of the evaluations and analyses.
Revision of laser-induced damage threshold evaluation from damage probability data
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bataviciute, Gintare; Grigas, Povilas; Smalakys, Linas
2013-04-15
In this study, the applicability of the commonly used Damage Frequency Method (DFM) is addressed in the context of Laser-Induced Damage Threshold (LIDT) testing with pulsed lasers. A simplified computer model representing the statistical interaction between laser irradiation and randomly distributed damage precursors is applied for Monte Carlo experiments. The reproducibility of the LIDT predicted from DFM is examined under both idealized and realistic laser irradiation conditions by performing numerical 1-on-1 tests. The widely accepted linear fitting resulted in systematic errors when estimating the LIDT and its error bars. For the same purpose, a Bayesian approach was proposed. A novel concept of parametric regression based on a varying kernel and a maximum likelihood fitting technique is introduced and studied. Such an approach exhibited clear advantages over conventional linear fitting and led to more reproducible LIDT evaluation. Furthermore, LIDT error bars are obtained as a natural outcome of parametric fitting and exhibit realistic values. The proposed technique has been validated on two conventionally polished fused silica samples (355 nm, 5.7 ns).
An alternative method to measure the likelihood of a financial crisis in an emerging market
NASA Astrophysics Data System (ADS)
Özlale, Ümit; Metin-Özcan, Kıvılcım
2007-07-01
This paper utilizes an early warning system in order to measure the likelihood of a financial crisis in an emerging market economy. We introduce a methodology, where we can both obtain a likelihood series and analyze the time-varying effects of several macroeconomic variables on this likelihood. Since the issue is analyzed in a non-linear state space framework, the extended Kalman filter emerges as the optimal estimation algorithm. Taking the Turkish economy as our laboratory, the results indicate that both the derived likelihood measure and the estimated time-varying parameters are meaningful and can successfully explain the path that the Turkish economy had followed between 2000 and 2006. The estimated parameters also suggest that overvalued domestic currency, current account deficit and the increase in the default risk increase the likelihood of having an economic crisis in the economy. Overall, the findings in this paper suggest that the estimation methodology introduced in this paper can also be applied to other emerging market economies as well.
Maximum likelihood estimation of signal-to-noise ratio and combiner weight
NASA Technical Reports Server (NTRS)
Kalson, S.; Dolinar, S. J.
1986-01-01
An algorithm for estimating signal to noise ratio and combiner weight parameters for a discrete time series is presented. The algorithm is based upon the joint maximum likelihood estimate of the signal and noise power. The discrete-time series are the sufficient statistics obtained after matched filtering of a biphase modulated signal in additive white Gaussian noise, before maximum likelihood decoding is performed.
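A minimal sketch of joint maximum likelihood estimation of signal amplitude and noise power from matched-filter outputs of a biphase (BPSK) signal, assuming the data symbols are known; the referenced algorithm handles the more general case, and the synthetic data here are illustrative.

```python
# Joint ML estimation of amplitude and noise power, then SNR, for known BPSK symbols.
import numpy as np

def ml_snr(y, d):
    """y: matched-filter outputs; d: +/-1 data symbols of the same length."""
    a = np.mean(y * d)                     # ML amplitude estimate
    noise_var = np.mean((y - a * d) ** 2)  # ML noise power estimate
    return a ** 2 / noise_var              # SNR estimate

rng = np.random.default_rng(7)
d = rng.choice([-1.0, 1.0], size=5000)
y = 1.0 * d + rng.normal(scale=0.7, size=d.size)   # true SNR = 1/0.49 ≈ 2.04
print("estimated SNR:", ml_snr(y, d))
```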
Changren Weng; Thomas L. Kubisiak; C. Dana Nelson; James P. Geaghan; Michael Stine
1999-01-01
Single marker regression and single marker maximum likelihood estimation were used to detect quantitative trait loci (QTLs) controlling the early height growth of longleaf pine and slash pine using a ((longleaf pine x slash pine) x slash pine) BC1 population consisting of 83 progeny. Maximum likelihood estimation was found to be more powerful than regression and could...
NASA Astrophysics Data System (ADS)
Lu, Dan; Ricciuto, Daniel; Walker, Anthony; Safta, Cosmin; Munger, William
2017-09-01
Calibration of terrestrial ecosystem models is important but challenging. Bayesian inference implemented by Markov chain Monte Carlo (MCMC) sampling provides a comprehensive framework to estimate model parameters and associated uncertainties using their posterior distributions. The effectiveness and efficiency of the method strongly depend on the MCMC algorithm used. In this work, a differential evolution adaptive Metropolis (DREAM) algorithm is used to estimate posterior distributions of 21 parameters for the data assimilation linked ecosystem carbon (DALEC) model using 14 years of daily net ecosystem exchange data collected at the Harvard Forest Environmental Measurement Site eddy-flux tower. The calibration with DREAM results in a better model fit and predictive performance compared to the popular adaptive Metropolis (AM) scheme. Moreover, DREAM indicates that two parameters controlling autumn phenology have multiple modes in their posterior distributions, while AM only identifies one mode. The application suggests that DREAM is well suited to calibrating complex terrestrial ecosystem models, where the number of uncertain parameters is usually large and the existence of local optima is always a concern. In addition, this effort justifies the assumptions of the error model used in the Bayesian calibration according to the residual analysis. The results indicate that a heteroscedastic, correlated, Gaussian error model is appropriate for the problem, and the likelihood function constructed from it can alleviate the underestimation of parameter uncertainty that is usually caused by using uncorrelated error models.
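A hedged sketch of a heteroscedastic, AR(1)-correlated Gaussian log-likelihood of the kind the residual analysis above supports; the linear dependence of the error standard deviation on the prediction and the AR(1) correlation structure are illustrative assumptions, not the authors' exact error model.

```python
# Heteroscedastic, AR(1)-correlated Gaussian log-likelihood (illustrative form).
import numpy as np

def log_likelihood(obs, pred, sigma0, sigma1, phi):
    sigma = sigma0 + sigma1 * np.abs(pred)     # assumed heteroscedastic std dev
    e = (obs - pred) / sigma                   # standardized residuals
    n = e.size
    ll = -0.5 * n * np.log(2 * np.pi) - np.sum(np.log(sigma))
    ll -= 0.5 * e[0] ** 2                      # first residual, marginal N(0, 1)
    ll -= 0.5 * (n - 1) * np.log(1 - phi ** 2)
    ll -= 0.5 * np.sum((e[1:] - phi * e[:-1]) ** 2) / (1 - phi ** 2)
    return ll

rng = np.random.default_rng(8)
pred = 5.0 + np.sin(np.linspace(0, 12, 300))                    # toy model predictions
obs = pred + 0.1 * np.abs(pred) * rng.normal(size=pred.size)    # toy observations
print(log_likelihood(obs, pred, sigma0=0.0, sigma1=0.1, phi=0.3))
```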
Testing for genetic association taking into account phenotypic information of relatives.
Uh, Hae-Won; Wijk, Henk Jan van der; Houwing-Duistermaat, Jeanine J
2009-12-15
We investigated efficient case-control association analysis using family data. The outcome of interest was coronary heart disease. We employed existing and new methods that take into account the correlations among related individuals to obtain the proper type I error rates. The methods considered for autosomal single-nucleotide polymorphisms were: 1) generalized estimating equations-based methods, 2) variance-modified Cochran-Armitage (MCA) trend test incorporating kinship coefficients, and 3) genotypic modified quasi-likelihood score test. Additionally, for X-linked single-nucleotide polymorphisms we proposed a two-degrees-of-freedom test. Performance of these methods was tested using Framingham Heart Study 500 k array data.
NASA Astrophysics Data System (ADS)
Lange, J.; O'Shaughnessy, R.; Boyle, M.; Calderón Bustillo, J.; Campanelli, M.; Chu, T.; Clark, J. A.; Demos, N.; Fong, H.; Healy, J.; Hemberger, D. A.; Hinder, I.; Jani, K.; Khamesra, B.; Kidder, L. E.; Kumar, P.; Laguna, P.; Lousto, C. O.; Lovelace, G.; Ossokine, S.; Pfeiffer, H.; Scheel, M. A.; Shoemaker, D. M.; Szilagyi, B.; Teukolsky, S.; Zlochower, Y.
2017-11-01
We present and assess a Bayesian method to interpret gravitational wave signals from binary black holes. Our method directly compares gravitational wave data to numerical relativity (NR) simulations. In this study, we present a detailed investigation of the systematic and statistical parameter estimation errors of this method. This procedure bypasses approximations used in semianalytical models for compact binary coalescence. In this work, we use the full posterior parameter distribution for only generic nonprecessing binaries, drawing inferences away from the set of NR simulations used, via interpolation of a single scalar quantity (the marginalized log likelihood, ln L ) evaluated by comparing data to nonprecessing binary black hole simulations. We also compare the data to generic simulations, and discuss the effectiveness of this procedure for generic sources. We specifically assess the impact of higher order modes, repeating our interpretation with both l ≤2 as well as l ≤3 harmonic modes. Using the l ≤3 higher modes, we gain more information from the signal and can better constrain the parameters of the gravitational wave signal. We assess and quantify several sources of systematic error that our procedure could introduce, including simulation resolution and duration; most are negligible. We show through examples that our method can recover the parameters for equal mass, zero spin, GW150914-like, and unequal mass, precessing spin sources. Our study of this new parameter estimation method demonstrates that we can quantify and understand the systematic and statistical error. This method allows us to use higher order modes from numerical relativity simulations to better constrain the black hole binary parameters.
Bayesian inference of Calibration curves: application to archaeomagnetism
NASA Astrophysics Data System (ADS)
Lanos, P.
2003-04-01
The range of errors that occur at different stages of the archaeomagnetic calibration process are modelled using a Bayesian hierarchical model. The archaeomagnetic data obtained from archaeological structures such as hearths, kilns or sets of bricks and tiles, exhibit considerable experimental errors and are typically more or less well dated by archaeological context, history or chronometric methods (14C, TL, dendrochronology, etc.). They can also be associated with stratigraphic observations which provide prior relative chronological information. The modelling we describe in this paper allows all these observations, on materials from a given period, to be linked together, and the use of penalized maximum likelihood for smoothing univariate, spherical or three-dimensional time series data allows representation of the secular variation of the geomagnetic field over time. The smooth curve we obtain (which takes the form of a penalized natural cubic spline) provides an adaptation to the effects of variability in the density of reference points over time. Since our model takes account of all the known errors in the archaeomagnetic calibration process, we are able to obtain a functional highest-posterior-density envelope on the new curve. With this new posterior estimate of the curve available to us, the Bayesian statistical framework then allows us to estimate the calendar dates of undated archaeological features (such as kilns) based on one, two or three geomagnetic parameters (inclination, declination and/or intensity). Date estimates are presented in much the same way as those that arise from radiocarbon dating. In order to illustrate the model and inference methods used, we will present results based on German archaeomagnetic data recently published by a German team.
Haem, Elham; Harling, Kajsa; Ayatollahi, Seyyed Mohammad Taghi; Zare, Najaf; Karlsson, Mats O
2017-02-01
One important aim in population pharmacokinetics (PK) and pharmacodynamics is identification and quantification of the relationships between the parameters and covariates. Lasso has been suggested as a technique for simultaneous estimation and covariate selection. In linear regression, it has been shown that Lasso does not possess the oracle property, i.e., it does not asymptotically perform as though the true underlying model had been given in advance. Adaptive Lasso (ALasso) with appropriate initial weights is claimed to possess the oracle property; however, it can lead to poor predictive performance when there is multicollinearity between covariates. This simulation study implemented a new version of ALasso, called adjusted ALasso (AALasso), which uses the ratio of the standard error of the maximum likelihood (ML) estimator to the ML coefficient as the initial weight in ALasso, to deal with multicollinearity in non-linear mixed-effect models. The performance of AALasso was compared with that of ALasso and Lasso. PK data were simulated in four set-ups from a one-compartment bolus input model. Covariates were created by sampling from a multivariate standard normal distribution with no, low (0.2), moderate (0.5) or high (0.7) correlation. The true covariates influenced only clearance, at different magnitudes. AALasso, ALasso and Lasso were compared in terms of mean absolute prediction error and error of the estimated covariate coefficients. The results show that AALasso performed better in small data sets, even in those in which a high correlation existed between covariates. This makes AALasso a promising method for covariate selection in nonlinear mixed-effect models.
Maximum likelihood estimation of finite mixture model for economic data
NASA Astrophysics Data System (ADS)
Phoong, Seuk-Yen; Ismail, Mohd Tahir
2014-06-01
A finite mixture model is a mixture model with a finite number of components. These models provide a natural representation of heterogeneity across a finite number of latent classes and are also known as latent class models or unsupervised learning models. Recently, maximum likelihood estimation of finite mixture models has drawn considerable attention from statisticians, mainly because maximum likelihood estimation is a powerful statistical method that yields consistent estimates as the sample size increases to infinity. In the present paper, maximum likelihood estimation is used to fit a finite mixture model in order to explore the relationship between nonlinear economic data. Specifically, a two-component normal mixture model is fitted by maximum likelihood estimation to investigate the relationship between stock market prices and rubber prices for the sampled countries. The results indicate a negative relationship between rubber prices and stock market prices for Malaysia, Thailand, the Philippines and Indonesia.
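The following is a minimal Python sketch of how a two-component normal mixture can be fitted by maximum likelihood via the EM algorithm; the initialization and fixed iteration count are illustrative assumptions, not the authors' procedure:

```python
import numpy as np
from scipy.stats import norm

def em_two_normal(x, n_iter=200):
    """EM algorithm for ML fitting of a two-component normal mixture
    (a minimal sketch of the model class described in the abstract)."""
    x = np.asarray(x, float)
    w, mu, sd = 0.5, np.array([x.min(), x.max()]), np.array([x.std(), x.std()])
    for _ in range(n_iter):
        # E-step: posterior probability that each point belongs to component 1
        p1 = w * norm.pdf(x, mu[0], sd[0])
        p2 = (1 - w) * norm.pdf(x, mu[1], sd[1])
        r = p1 / (p1 + p2)
        # M-step: update mixing weight, means and standard deviations
        w = r.mean()
        mu = np.array([np.average(x, weights=r), np.average(x, weights=1 - r)])
        sd = np.array([np.sqrt(np.average((x - mu[0])**2, weights=r)),
                       np.sqrt(np.average((x - mu[1])**2, weights=1 - r))])
    return w, mu, sd
```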
A Solution to Separation and Multicollinearity in Multiple Logistic Regression
Shen, Jianzhao; Gao, Sujuan
2010-01-01
In dementia screening tests, item selection for shortening an existing screening test can be achieved using multiple logistic regression. However, maximum likelihood estimates for such logistic regression models often experience serious bias or even non-existence because of separation and multicollinearity problems resulting from a large number of highly correlated items. Firth (1993, Biometrika, 80(1), 27–38) proposed a penalized likelihood estimator for generalized linear models and it was shown to reduce bias and the non-existence problems. The ridge regression has been used in logistic regression to stabilize the estimates in cases of multicollinearity. However, neither solves the problems for each other. In this paper, we propose a double penalized maximum likelihood estimator combining Firth’s penalized likelihood equation with a ridge parameter. We present a simulation study evaluating the empirical performance of the double penalized likelihood estimator in small to moderate sample sizes. We demonstrate the proposed approach using a current screening data from a community-based dementia study. PMID:20376286
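A rough Python sketch of the kind of double-penalized objective described above is given below: the Bernoulli log-likelihood plus a Firth-type Jeffreys penalty (half the log-determinant of the Fisher information) minus a ridge term. The exact penalty scaling, intercept handling, and optimizer are assumptions, not the authors' estimator:

```python
import numpy as np
from scipy.optimize import minimize

def double_penalized_logistic(X, y, lam=1.0):
    """Sketch of a double-penalized logistic fit combining a Firth-type
    Jeffreys penalty, 0.5*log|X'WX|, with a ridge penalty on the coefficients."""
    def neg_obj(beta):
        eta = X @ beta
        p = 1.0 / (1.0 + np.exp(-eta))
        loglik = np.sum(y * eta - np.log1p(np.exp(eta)))   # Bernoulli log-likelihood
        W = p * (1 - p)                                    # Fisher information weights
        _, logdet = np.linalg.slogdet(X.T @ (W[:, None] * X))
        return -(loglik + 0.5 * logdet - 0.5 * lam * beta @ beta)
    beta0 = np.zeros(X.shape[1])
    return minimize(neg_obj, beta0, method="BFGS").x
```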
Maximum Likelihood Estimation with Emphasis on Aircraft Flight Data
NASA Technical Reports Server (NTRS)
Iliff, K. W.; Maine, R. E.
1985-01-01
Accurate modeling of flexible space structures is an important field that is currently under investigation. Parameter estimation, using methods such as maximum likelihood, is one of the ways that the model can be improved. The maximum likelihood estimator has been used to extract stability and control derivatives from flight data for many years. Most of the literature on aircraft estimation concentrates on new developments and applications, assuming familiarity with basic estimation concepts. Some of these basic concepts are presented. The maximum likelihood estimator and the aircraft equations of motion that the estimator uses are briefly discussed. The basic concepts of minimization and estimation are examined for a simple computed aircraft example. The cost functions that are to be minimized during estimation are defined and discussed. Graphic representations of the cost functions are given to help illustrate the minimization process. Finally, the basic concepts are generalized, and estimation from flight data is discussed. Specific examples of estimation of structural dynamics are included. Some of the major conclusions for the computed example are also developed for the analysis of flight data.
NASA Technical Reports Server (NTRS)
Peters, B. C., Jr.; Walker, H. F.
1975-01-01
New results and insights concerning a previously published iterative procedure for obtaining maximum-likelihood estimates of the parameters for a mixture of normal distributions were discussed. It was shown that the procedure converges locally to the consistent maximum likelihood estimate as long as a specified parameter is bounded between two limits. Bound values were given to yield optimal local convergence.
NASA Technical Reports Server (NTRS)
Peters, B. C., Jr.; Walker, H. F.
1975-01-01
A general iterative procedure is given for determining the consistent maximum likelihood estimates of mixtures of normal distributions. In addition, local maxima of the log-likelihood function, Newton's method, a method of scoring, and modifications of these procedures are discussed.
Polarimetric image reconstruction algorithms
NASA Astrophysics Data System (ADS)
Valenzuela, John R.
In the field of imaging polarimetry Stokes parameters are sought and must be inferred from noisy and blurred intensity measurements. Using a penalized-likelihood estimation framework we investigate reconstruction quality when estimating intensity images and then transforming to Stokes parameters (traditional estimator), and when estimating Stokes parameters directly (Stokes estimator). We define our cost function for reconstruction by a weighted least squares data fit term and a regularization penalty. It is shown that under quadratic regularization, the traditional and Stokes estimators can be made equal by appropriate choice of regularization parameters. It is empirically shown that, when using edge preserving regularization, estimating the Stokes parameters directly leads to lower RMS error in reconstruction. Also, the addition of a cross channel regularization term further lowers the RMS error for both methods especially in the case of low SNR. The technique of phase diversity has been used in traditional incoherent imaging systems to jointly estimate an object and optical system aberrations. We extend the technique of phase diversity to polarimetric imaging systems. Specifically, we describe penalized-likelihood methods for jointly estimating Stokes images and optical system aberrations from measurements that contain phase diversity. Jointly estimating Stokes images and optical system aberrations involves a large parameter space. A closed-form expression for the estimate of the Stokes images in terms of the aberration parameters is derived and used in a formulation that reduces the dimensionality of the search space to the number of aberration parameters only. We compare the performance of the joint estimator under both quadratic and edge-preserving regularization. The joint estimator with edge-preserving regularization yields higher fidelity polarization estimates than with quadratic regularization. Under quadratic regularization, using the reduced-parameter search strategy, accurate aberration estimates can be obtained without recourse to regularization "tuning". Phase-diverse wavefront sensing is emerging as a viable candidate wavefront sensor for adaptive-optics systems. In a quadratically penalized weighted least squares estimation framework a closed form expression for the object being imaged in terms of the aberrations in the system is available. This expression offers a dramatic reduction of the dimensionality of the estimation problem and thus is of great interest for practical applications. We have derived an expression for an approximate joint covariance matrix for object and aberrations in the phase diversity context. Our expression for the approximate joint covariance is compared with the "known-object" Cramer-Rao lower bound that is typically used for system parameter optimization. Estimates of the optimal amount of defocus in a phase-diverse wavefront sensor derived from the joint-covariance matrix, the known-object Cramer-Rao bound, and Monte Carlo simulations are compared for an extended scene and a point object. It is found that our variance approximation, that incorporates the uncertainty of the object, leads to an improvement in predicting the optimal amount of defocus to use in a phase-diverse wavefront sensor.
NASA Astrophysics Data System (ADS)
Olurotimi, E. O.; Sokoya, O.; Ojo, J. S.; Owolawi, P. A.
2018-03-01
Rain height is one of the significant parameters for prediction of rain attenuation on Earth-space telecommunication links, especially those operating at frequencies above 10 GHz. This study examines the three-parameter Dagum distribution of rain height over Durban, South Africa. Five years of data were used to study the monthly, seasonal, and annual variations using parameters estimated by maximum likelihood. The performance of the distribution was assessed using statistical goodness-of-fit measures. The three-parameter Dagum distribution is shown to be appropriate for modeling rain height over Durban, with a root mean square error of 0.26. Also, the shape and scale parameters of the distribution show wide variation. The probability of exceedance for 0.01% of the time indicates a high probability of rain attenuation at higher frequencies.
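As a hedged illustration of the maximum-likelihood fit mentioned above, the Python sketch below minimizes the negative log-likelihood of the standard three-parameter Dagum (Type I) density; the starting values and optimizer choice are assumptions, not the authors' settings:

```python
import numpy as np
from scipy.optimize import minimize

def dagum_neg_loglik(params, x):
    """Negative log-likelihood of the three-parameter Dagum (Type I)
    density with shape a, scale b and shape p."""
    a, b, p = params
    if min(a, b, p) <= 0:
        return np.inf
    z = x / b
    logpdf = (np.log(a) + np.log(p) - np.log(x)
              + a * p * np.log(z) - (p + 1) * np.log1p(z ** a))
    return -np.sum(logpdf)

def fit_dagum(x, a0=1.0, p0=1.0):
    """Maximum likelihood fit of the Dagum distribution (e.g. to rain-height
    data); starting values and the Nelder-Mead optimizer are illustrative."""
    x = np.asarray(x, float)
    res = minimize(dagum_neg_loglik, x0=[a0, np.median(x), p0],
                   args=(x,), method="Nelder-Mead")
    return res.x   # (a_hat, b_hat, p_hat)
```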
Time-of-flight PET time calibration using data consistency
NASA Astrophysics Data System (ADS)
Defrise, Michel; Rezaei, Ahmadreza; Nuyts, Johan
2018-05-01
This paper presents new data driven methods for the time of flight (TOF) calibration of positron emission tomography (PET) scanners. These methods are derived from the consistency condition for TOF PET; they can be applied to data measured with an arbitrary tracer distribution and are numerically efficient because they do not require a preliminary image reconstruction from the non-TOF data. Two-dimensional simulations are presented for one of the methods, which only involves the first two moments of the data with respect to the TOF variable. The numerical results show that this method estimates the detector timing offsets with errors that are larger than those obtained via an initial non-TOF reconstruction, but remain smaller than the TOF resolution and thereby have a limited impact on the quantitative accuracy of the activity image estimated with standard maximum likelihood reconstruction algorithms.
Efficient Bit-to-Symbol Likelihood Mappings
NASA Technical Reports Server (NTRS)
Moision, Bruce E.; Nakashima, Michael A.
2010-01-01
This innovation is an efficient algorithm designed to perform bit-to-symbol and symbol-to-bit likelihood mappings, which represent a significant portion of the complexity of an error-correction code decoder for high-order constellations. A recent implementation of the algorithm in hardware has yielded an 8-percent reduction in overall area relative to the prior design.
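For context, a straightforward (unoptimized) version of such mappings can be written in a few lines of Python: symbol likelihoods are products of per-bit probabilities, and bit LLRs are marginalized from symbol likelihoods. This brute-force form scales as O(n_bits * 2^n_bits); the cited innovation's efficiency gains are not reproduced here, and the bit labeling is an assumption:

```python
import numpy as np

def bits_of(symbol, n_bits):
    """Bit label of a symbol index, MSB first (labeling is an assumption)."""
    return [(symbol >> (n_bits - 1 - i)) & 1 for i in range(n_bits)]

def bit_to_symbol_loglik(bit_logp):
    """Map per-bit log-probabilities (shape: n_bits x 2) to symbol
    log-likelihoods by summing the log-probabilities of each symbol's bits."""
    n_bits = bit_logp.shape[0]
    return np.array([sum(bit_logp[i, b] for i, b in enumerate(bits_of(s, n_bits)))
                     for s in range(2 ** n_bits)])

def symbol_to_bit_llr(sym_loglik, n_bits):
    """Map symbol log-likelihoods back to bit LLRs, log P(bit=0)/P(bit=1)."""
    llr = np.empty(n_bits)
    for i in range(n_bits):
        num = [sym_loglik[s] for s in range(2 ** n_bits) if bits_of(s, n_bits)[i] == 0]
        den = [sym_loglik[s] for s in range(2 ** n_bits) if bits_of(s, n_bits)[i] == 1]
        llr[i] = np.logaddexp.reduce(num) - np.logaddexp.reduce(den)
    return llr
```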
Comparison of IRT Likelihood Ratio Test and Logistic Regression DIF Detection Procedures
ERIC Educational Resources Information Center
Atar, Burcu; Kamata, Akihito
2011-01-01
The Type I error rates and the power of IRT likelihood ratio test and cumulative logit ordinal logistic regression procedures in detecting differential item functioning (DIF) for polytomously scored items were investigated in this Monte Carlo simulation study. For this purpose, 54 simulation conditions (combinations of 3 sample sizes, 2 sample…
Zhao, Junbo; Wang, Shaobu; Mili, Lamine; ...
2018-01-08
Here, this paper develops a robust power system state estimation framework with the consideration of measurement correlations and imperfect synchronization. In the framework, correlations of SCADA and phasor measurement unit (PMU) measurements are calculated separately through an unscented transformation and a vector auto-regression (VAR) model. In particular, PMU measurements during the waiting period between two SCADA measurement scans are buffered to develop the VAR model, with parameters estimated robustly using a projection statistics approach. The latter takes into account the temporal and spatial correlations of PMU measurements and provides redundant measurements to suppress bad data and mitigate imperfect synchronization. In cases where the SCADA and PMU measurements are not time synchronized, either the forecasted PMU measurements or the prior SCADA measurements from the last estimation run are leveraged to restore system observability. Then, a robust generalized maximum-likelihood (GM) estimator is extended to integrate measurement error correlations and to handle outliers in the SCADA and PMU measurements. Simulation results that stem from a comprehensive comparison with other alternatives under various conditions demonstrate the benefits of the proposed framework.
Analyzing animal movements using Brownian bridges.
Horne, Jon S; Garton, Edward O; Krone, Stephen M; Lewis, Jesse S
2007-09-01
By studying animal movements, researchers can gain insight into many of the ecological characteristics and processes important for understanding population-level dynamics. We developed a Brownian bridge movement model (BBMM) for estimating the expected movement path of an animal, using discrete location data obtained at relatively short time intervals. The BBMM is based on the properties of a conditional random walk between successive pairs of locations, dependent on the time between locations, the distance between locations, and the Brownian motion variance that is related to the animal's mobility. We describe two critical developments that enable widespread use of the BBMM, including a derivation of the model when location data are measured with error and a maximum likelihood approach for estimating the Brownian motion variance. After the BBMM is fitted to location data, an estimate of the animal's probability of occurrence can be generated for an area during the time of observation. To illustrate potential applications, we provide three examples: estimating animal home ranges, estimating animal migration routes, and evaluating the influence of fine-scale resource selection on animal movement patterns.
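A minimal Python sketch of the core Brownian-bridge density used in such models is given below: the position at an intermediate time is normal around the interpolated location, with a variance combining the Brownian motion variance and the two location-error variances. Grid handling and the isotropic-error assumption are simplifications for illustration:

```python
import numpy as np

def brownian_bridge_density(grid, z_a, z_b, T, t, sigma2_m, err_a=0.0, err_b=0.0):
    """Occurrence density on a 2-D grid for one Brownian bridge between
    locations z_a (time 0) and z_b (time T), with Brownian motion variance
    sigma2_m and location-error variances err_a, err_b."""
    a = t / T
    mean = np.asarray(z_a) + a * (np.asarray(z_b) - np.asarray(z_a))
    var = T * a * (1 - a) * sigma2_m + (1 - a) ** 2 * err_a + a ** 2 * err_b
    d2 = np.sum((grid - mean) ** 2, axis=-1)
    return np.exp(-d2 / (2 * var)) / (2 * np.pi * var)
```

In the full BBMM, this density is effectively averaged over time and over all successive location pairs to yield an occurrence distribution, and the Brownian motion variance is itself estimated by maximum likelihood.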
Early Teen Marriage and Future Poverty
DAHL, GORDON B.
2010-01-01
Both early teen marriage and dropping out of high school have historically been associated with a variety of negative outcomes, including higher poverty rates throughout life. Are these negative outcomes due to preexisting differences, or do they represent the causal effect of marriage and schooling choices? To better understand the true personal and societal consequences, in this article, I use an instrumental variables (IV) approach that takes advantage of variation in state laws regulating the age at which individuals are allowed to marry, drop out of school, and begin work. The baseline IV estimate indicates that a woman who marries young is 31 percentage points more likely to live in poverty when she is older. Similarly, a woman who drops out of school is 11 percentage points more likely to be poor. The results are robust to a variety of alternative specifications and estimation methods, including limited information maximum likelihood (LIML) estimation and a control function approach. While grouped ordinary least squares (OLS) estimates for the early teen marriage variable are also large, OLS estimates based on individual-level data are small, consistent with a large amount of measurement error. PMID:20879684
Cham, Heining; West, Stephen G.; Ma, Yue; Aiken, Leona S.
2012-01-01
A Monte Carlo simulation was conducted to investigate the robustness of four latent variable interaction modeling approaches (Constrained Product Indicator [CPI], Generalized Appended Product Indicator [GAPI], Unconstrained Product Indicator [UPI], and Latent Moderated Structural Equations [LMS]) under high degrees of non-normality of the observed exogenous variables. Results showed that the CPI and LMS approaches yielded biased estimates of the interaction effect when the exogenous variables were highly non-normal. When the violation of non-normality was not severe (normal; symmetric with excess kurtosis < 1), the LMS approach yielded the most efficient estimates of the latent interaction effect with the highest statistical power. In highly non-normal conditions, the GAPI and UPI approaches with ML estimation yielded unbiased latent interaction effect estimates, with acceptable actual Type-I error rates for both the Wald and likelihood ratio tests of interaction effect at N ≥ 500. An empirical example illustrated the use of the four approaches in testing a latent variable interaction between academic self-efficacy and positive family role models in the prediction of academic performance. PMID:23457417
NASA Technical Reports Server (NTRS)
da Silva, Arlindo
2010-01-01
A challenge common to many constituent data assimilation applications is the fact that one observes a much smaller fraction of the phase space than one wishes to estimate. For example, remotely sensed estimates of column average concentrations are available, while one is faced with the problem of estimating 3D concentrations for initializing a prognostic model. This problem is exacerbated in the case of aerosols because the observable Aerosol Optical Depth (AOD) is not only a column integrated quantity, but it also sums over a large number of species (dust, sea salt, carbonaceous and sulfate aerosols). An aerosol transport model, when driven by high-resolution, state-of-the-art analyses of meteorological fields and realistic emissions, can produce skillful forecasts even when no aerosol data are assimilated. The main task of aerosol data assimilation is to address the bias arising from inaccurate emissions and the Lagrangian misplacement of plumes induced by errors in the driving meteorological fields. As long as one decouples the meteorological and aerosol assimilation, as we do here, the classic baroclinic growth of error is no longer the main order of business. We will describe an aerosol data assimilation scheme in which the analysis update step is conducted in observation space, using an adaptive maximum-likelihood scheme for estimating background errors in AOD space. This scheme includes explicit sequential bias estimation as in Dee and da Silva. Unlike existing aerosol data assimilation schemes, we do not obtain analysis increments of the 3D concentrations by scaling the background profiles. Instead, we exploit the Lagrangian characteristics of the problem to generate local displacement ensembles. These high-resolution, state-dependent ensembles are then used to parameterize the background errors and generate 3D aerosol increments. The algorithm is computationally affordable, running at a resolution of 1/4 degree globally. We will present results of assimilating AOD retrievals from MODIS (on both the Aqua and Terra satellites), using AERONET observations for validation. The impact on GEOS-5 aerosol forecasting will be fully documented.
ERIC Educational Resources Information Center
Magis, David; Raiche, Gilles
2012-01-01
This paper focuses on two estimators of ability with logistic item response theory models: the Bayesian modal (BM) estimator and the weighted likelihood (WL) estimator. For the BM estimator, Jeffreys' prior distribution is considered, and the corresponding estimator is referred to as the Jeffreys modal (JM) estimator. It is established that under…
NASA Technical Reports Server (NTRS)
Diorio, Kimberly A.; Voska, Ned (Technical Monitor)
2002-01-01
This viewgraph presentation provides information on Human Factors Process Failure Modes and Effects Analysis (HF PFMEA). HF PFMEA includes the following 10 steps: describe the mission; define the system; identify human-machine interfaces; list human actions; identify potential errors; identify factors that affect error; determine the likelihood of error; determine the potential effects of errors; evaluate risk; and generate solutions (manage error). The presentation also describes how this analysis was applied to a liquid oxygen pump acceptance test.
Estimators for longitudinal latent exposure models: examining measurement model assumptions.
Sánchez, Brisa N; Kim, Sehee; Sammel, Mary D
2017-06-15
Latent variable (LV) models are increasingly being used in environmental epidemiology as a way to summarize multiple environmental exposures and thus minimize statistical concerns that arise in multiple regression. LV models may be especially useful when multivariate exposures are collected repeatedly over time. LV models can accommodate a variety of assumptions but, at the same time, present the user with many choices for model specification particularly in the case of exposure data collected repeatedly over time. For instance, the user could assume conditional independence of observed exposure biomarkers given the latent exposure and, in the case of longitudinal latent exposure variables, time invariance of the measurement model. Choosing which assumptions to relax is not always straightforward. We were motivated by a study of prenatal lead exposure and mental development, where assumptions of the measurement model for the time-changing longitudinal exposure have appreciable impact on (maximum-likelihood) inferences about the health effects of lead exposure. Although we were not particularly interested in characterizing the change of the LV itself, imposing a longitudinal LV structure on the repeated multivariate exposure measures could result in high efficiency gains for the exposure-disease association. We examine the biases of maximum likelihood estimators when assumptions about the measurement model for the longitudinal latent exposure variable are violated. We adapt existing instrumental variable estimators to the case of longitudinal exposures and propose them as an alternative to estimate the health effects of a time-changing latent predictor. We show that instrumental variable estimators remain unbiased for a wide range of data generating models and have advantages in terms of mean squared error. Copyright © 2017 John Wiley & Sons, Ltd.
Multiple-hit parameter estimation in monolithic detectors.
Hunter, William C J; Barrett, Harrison H; Lewellen, Tom K; Miyaoka, Robert S
2013-02-01
We examine a maximum-a-posteriori method for estimating the primary interaction position of gamma rays with multiple interaction sites (hits) in a monolithic detector. In assessing the performance of a multiple-hit estimator over that of a conventional one-hit estimator, we consider a few different detector and readout configurations of a 50-mm-wide square cerium-doped lutetium oxyorthosilicate block. For this study, we use simulated data from SCOUT, a Monte-Carlo tool for photon tracking and modeling scintillation-camera output. With this tool, we determine estimate bias and variance for a multiple-hit estimator and compare these with similar metrics for a one-hit maximum-likelihood estimator, which assumes full energy deposition in one hit. We also examine the effect of event filtering on these metrics; for this purpose, we use a likelihood threshold to reject signals that are not likely to have been produced under the assumed likelihood model. Depending on detector design, we observe a 1%-12% improvement of intrinsic resolution for a 1-or-2-hit estimator as compared with a 1-hit estimator. We also observe improved differentiation of photopeak events using a 1-or-2-hit estimator as compared with the 1-hit estimator; more than 6% of photopeak events that were rejected by likelihood filtering for the 1-hit estimator were accurately identified as photopeak events and positioned without loss of resolution by a 1-or-2-hit estimator; for PET, this equates to at least a 12% improvement in coincidence-detection efficiency with likelihood filtering applied.
An evaluation of percentile and maximum likelihood estimators of Weibull parameters
Stanley J. Zarnoch; Tommy R. Dell
1985-01-01
Two methods of estimating the three-parameter Weibull distribution were evaluated by computer simulation and field data comparison. Maximum likelihood estimators (MLE) with bias correction were calculated with the computer routine FITTER (Bailey 1974); percentile estimators (PCT) were those proposed by Zanakis (1979). The MLE estimators had smaller bias and...
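In modern practice, a three-parameter Weibull maximum-likelihood fit can be obtained numerically, for example with SciPy, as in the Python sketch below. Note this is a plain numerical MLE without the bias correction mentioned above, and the synthetic data are purely illustrative:

```python
import numpy as np
from scipy.stats import weibull_min

# weibull_min.fit returns (shape, location, scale), i.e. the
# three-parameter Weibull maximum likelihood estimates.
data = weibull_min.rvs(c=2.2, loc=5.0, scale=10.0, size=500, random_state=1)
shape, loc, scale = weibull_min.fit(data)
print(f"MLE: shape={shape:.2f}, location={loc:.2f}, scale={scale:.2f}")
```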
The Equivalence of Two Methods of Parameter Estimation for the Rasch Model.
ERIC Educational Resources Information Center
Blackwood, Larry G.; Bradley, Edwin L.
1989-01-01
Two methods of estimating parameters in the Rasch model are compared. The equivalence of likelihood estimations from the model of G. J. Mellenbergh and P. Vijn (1981) and from usual unconditional maximum likelihood (UML) estimation is demonstrated. Mellenbergh and Vijn's model is a convenient method of calculating UML estimates. (SLD)
ERIC Educational Resources Information Center
Yang, Xiangdong; Poggio, John C.; Glasnapp, Douglas R.
2006-01-01
The effects of five ability estimators, that is, maximum likelihood estimator, weighted likelihood estimator, maximum a posteriori, expected a posteriori, and Owen's sequential estimator, on the performances of the item response theory-based adaptive classification procedure on multiple categories were studied via simulations. The following…
Harrell-Williams, Leigh; Wolfe, Edward W
2014-01-01
Previous research has investigated the influence of sample size, model misspecification, test length, ability distribution offset, and generating model on the likelihood ratio difference test in applications of item response models. This study extended that research to the evaluation of dimensionality using the multidimensional random coefficients multinomial logit model (MRCMLM). Logistic regression analysis of simulated data reveals that sample size and test length have a large effect on the capacity of the LR difference test to correctly identify unidimensionality, with shorter tests and smaller sample sizes leading to smaller Type I error rates. Higher levels of simulated misfit resulted in fewer incorrect decisions than data with no or little misfit. However, Type I error rates indicate that the likelihood ratio difference test is not suitable under any of the simulated conditions for evaluating dimensionality in applications of the MRCMLM.
Analysis of Point Based Image Registration Errors With Applications in Single Molecule Microscopy
Cohen, E. A. K.; Ober, R. J.
2014-01-01
We present an asymptotic treatment of errors involved in point-based image registration where control point (CP) localization is subject to heteroscedastic noise; a suitable model for image registration in fluorescence microscopy. Assuming an affine transform, CPs are used to solve a multivariate regression problem. With measurement errors existing for both sets of CPs, this is an errors-in-variables problem and linear least squares is inappropriate, the correct method being generalized least squares. To allow for point-dependent errors, the equivalence of a generalized maximum likelihood and a heteroscedastic generalized least squares model is established, allowing previously published asymptotic results to be extended to image registration. For a particularly useful model of heteroscedastic noise where covariance matrices are scalar multiples of a known matrix (including the case where covariance matrices are multiples of the identity) we provide closed-form solutions to estimators and derive their distributions. We consider the target registration error (TRE) and define a new measure called the localization registration error (LRE) believed to be useful, especially in microscopy registration experiments. Assuming Gaussianity of the CP localization errors, it is shown that the asymptotic distributions for the TRE and LRE are themselves Gaussian and the parameterized distributions are derived. Results are successfully applied to registration in single molecule microscopy to derive the key dependence of the TRE and LRE variance on the number of CPs and their associated photon counts. Simulations show asymptotic results are robust for low CP numbers and non-Gaussianity. The method presented here is shown to outperform GLS on real imaging data. PMID:24634573
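As a simplified Python sketch of the weighted (generalized) least squares step discussed above, the function below estimates a 2-D affine transform from control-point pairs, weighting each destination point by the inverse of its localization-error covariance. Ignoring the error on the source points is a simplification relative to the paper's full errors-in-variables treatment:

```python
import numpy as np

def gls_affine_2d(src, dst, dst_cov):
    """Generalized least squares estimate of a 2-D affine transform from
    control-point pairs src -> dst.  Each destination point is weighted by
    the inverse of its 2x2 localization-error covariance dst_cov[i]."""
    A = np.zeros((6, 6))
    b = np.zeros(6)
    for (x1, x2), y, cov in zip(src, dst, dst_cov):
        # design rows for beta = [a11, a12, t1, a21, a22, t2]
        M = np.array([[x1, x2, 1, 0, 0, 0],
                      [0, 0, 0, x1, x2, 1]], float)
        W = np.linalg.inv(cov)
        A += M.T @ W @ M
        b += M.T @ W @ np.asarray(y, float)
    beta = np.linalg.solve(A, b)
    affine = np.array([[beta[0], beta[1]], [beta[3], beta[4]]])
    translation = np.array([beta[2], beta[5]])
    return affine, translation
```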
Somarathna, P D S N; Minasny, Budiman; Malone, Brendan P; Stockmann, Uta; McBratney, Alex B
2018-08-01
Spatial modelling of environmental data commonly only considers spatial variability as the single source of uncertainty. In reality however, the measurement errors should also be accounted for. In recent years, infrared spectroscopy has been shown to offer low cost, yet invaluable information needed for digital soil mapping at meaningful spatial scales for land management. However, spectrally inferred soil carbon data are known to be less accurate compared to laboratory analysed measurements. This study establishes a methodology to filter out the measurement error variability by incorporating the measurement error variance in the spatial covariance structure of the model. The study was carried out in the Lower Hunter Valley, New South Wales, Australia where a combination of laboratory measured, and vis-NIR and MIR inferred topsoil and subsoil soil carbon data are available. We investigated the applicability of residual maximum likelihood (REML) and Markov Chain Monte Carlo (MCMC) simulation methods to generate parameters of the Matérn covariance function directly from the data in the presence of measurement error. The results revealed that the measurement error can be effectively filtered-out through the proposed technique. When the measurement error was filtered from the data, the prediction variance almost halved, which ultimately yielded a greater certainty in spatial predictions of soil carbon. Further, the MCMC technique was successfully used to define the posterior distribution of measurement error. This is an important outcome, as the MCMC technique can be used to estimate the measurement error if it is not explicitly quantified. Although this study dealt with soil carbon data, this method is amenable for filtering the measurement error of any kind of continuous spatial environmental data. Copyright © 2018 Elsevier B.V. All rights reserved.
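A hedged sketch of the key modeling idea, incorporating a known measurement-error variance into the spatial covariance structure, is given below in Python as a plain Gaussian (maximum) likelihood with a Matérn covariance plus a nugget and per-point error variances on the diagonal. The paper itself uses REML and MCMC, and the constant-mean assumption here is illustrative:

```python
import numpy as np
from scipy.spatial.distance import cdist
from scipy.special import kv, gamma

def matern_cov(dists, sill, rho, nu):
    """Matérn covariance as a function of distance (partial sill, range rho,
    smoothness nu)."""
    d = np.where(dists == 0, 1e-12, dists)
    s = np.sqrt(2 * nu) * d / rho
    c = sill * (2 ** (1 - nu) / gamma(nu)) * (s ** nu) * kv(nu, s)
    return np.where(dists == 0, sill, c)

def neg_loglik(theta, coords, y, err_var):
    """Gaussian negative log-likelihood (up to a constant) with a Matérn
    spatial covariance plus a nugget and known per-point measurement-error
    variances err_var added on the diagonal."""
    sill, rho, nu, nugget = np.exp(theta)          # log-parameterized
    C = matern_cov(cdist(coords, coords), sill, rho, nu) + np.diag(nugget + err_var)
    _, logdet = np.linalg.slogdet(C)
    r = y - y.mean()                               # crude constant-mean assumption
    return 0.5 * (logdet + r @ np.linalg.solve(C, r))

# e.g. scipy.optimize.minimize(neg_loglik, np.log([1.0, 100.0, 0.5, 0.1]),
#                              args=(coords, y, err_var), method="Nelder-Mead")
```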
Patterson, Mark E; Pace, Heather A; Fincham, Jack E
2013-09-01
Although error-reporting systems enable hospitals to accurately track safety climate through the identification of adverse events, these systems may be underused within a work climate of poor communication. The objective of this analysis is to identify the extent to which perceived communication climate among hospital pharmacists impacts medical error reporting rates. This cross-sectional study used survey responses from more than 5000 pharmacists responding to the 2010 Hospital Survey on Patient Safety Culture (HSOPSC). Two composite scores were constructed for "communication openness" and "feedback and communication about error," respectively. Error reporting frequency was defined from the survey question, "In the past 12 months, how many event reports have you filled out and submitted?" Multivariable logistic regressions were used to estimate the likelihood of medical error reporting conditional upon communication openness or feedback levels, controlling for pharmacist years of experience, hospital geographic region, and ownership status. Pharmacists with higher communication openness scores compared with lower scores were 40% more likely to have filed or submitted a medical error report in the past 12 months (OR, 1.4; 95% CI, 1.1-1.7; P = 0.004). In contrast, pharmacists with higher communication feedback scores were not any more likely than those with lower scores to have filed or submitted a medical error report in the past 12 months (OR, 1.0; 95% CI, 0.8-1.3; P = 0.97). Hospital work climates that encourage pharmacists to freely communicate about problems related to patient safety are conducive to medical error reporting. The presence of feedback infrastructures about error may not be sufficient to induce error-reporting behavior.
SMURC: High-Dimension Small-Sample Multivariate Regression With Covariance Estimation.
Bayar, Belhassen; Bouaynaya, Nidhal; Shterenberg, Roman
2017-03-01
We consider a high-dimension low sample-size multivariate regression problem that accounts for correlation of the response variables. The system is underdetermined as there are more parameters than samples. We show that the maximum likelihood approach with covariance estimation is senseless because the likelihood diverges. We subsequently propose a normalization of the likelihood function that guarantees convergence. We call this method small-sample multivariate regression with covariance (SMURC) estimation. We derive an optimization problem and its convex approximation to compute SMURC. Simulation results show that the proposed algorithm outperforms the regularized likelihood estimator with known covariance matrix and the sparse conditional Gaussian graphical model. We also apply SMURC to the inference of the wing-muscle gene network of the Drosophila melanogaster (fruit fly).
Maximum likelihood solution for inclination-only data in paleomagnetism
NASA Astrophysics Data System (ADS)
Arason, P.; Levi, S.
2010-08-01
We have developed a new robust maximum likelihood method for estimating the unbiased mean inclination from inclination-only data. In paleomagnetic analysis, the arithmetic mean of inclination-only data is known to introduce a shallowing bias. Several methods have been introduced to estimate the unbiased mean inclination of inclination-only data together with measures of the dispersion. Some inclination-only methods were designed to maximize the likelihood function of the marginal Fisher distribution. However, the exact analytical form of the maximum likelihood function is fairly complicated, and all the methods require various assumptions and approximations that are often inappropriate. For some steep and dispersed data sets, these methods provide estimates that are significantly displaced from the peak of the likelihood function to systematically shallower inclination. The problem of locating the maximum of the likelihood function is partly due to difficulties in accurately evaluating the function for all values of interest, because some elements of the likelihood function increase exponentially as precision parameters increase, leading to numerical instabilities. In this study, we succeeded in analytically cancelling exponential elements from the log-likelihood function, and we are now able to calculate its value anywhere in the parameter space and for any inclination-only data set. Furthermore, we can now calculate the partial derivatives of the log-likelihood function with desired accuracy, and locate the maximum likelihood without the assumptions required by previous methods. To assess the reliability and accuracy of our method, we generated large numbers of random Fisher-distributed data sets, for which we calculated mean inclinations and precision parameters. The comparisons show that our new robust Arason-Levi maximum likelihood method is the most reliable, and the mean inclination estimates are the least biased towards shallow values.
Estimating parameter of Rayleigh distribution by using Maximum Likelihood method and Bayes method
NASA Astrophysics Data System (ADS)
Ardianti, Fitri; Sutarman
2018-01-01
In this paper, we use maximum likelihood estimation and the Bayes method under several loss functions to estimate the parameter of the Rayleigh distribution, in order to determine which method performs best. The prior used in the Bayes method is Jeffreys' non-informative prior. Maximum likelihood estimation and the Bayes method under the precautionary loss function, the entropy loss function, and the L1 loss function are compared. We compare these methods in terms of bias and MSE values using the R program, and the results are displayed in tables to facilitate comparison.
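For concreteness, the Python sketch below gives the closed-form ML estimate of the Rayleigh scale parameter sigma^2 and, as one illustrative Bayes case, the posterior mean under Jeffreys' prior with squared-error loss; the precautionary, entropy and L1-loss estimators compared in the paper are not reproduced here:

```python
import numpy as np

def rayleigh_sigma2_mle(x):
    """ML estimate of the Rayleigh scale parameter sigma^2:
    sigma2_hat = sum(x_i^2) / (2 n)."""
    x = np.asarray(x, float)
    return np.sum(x ** 2) / (2 * x.size)

def rayleigh_sigma2_bayes_jeffreys(x):
    """Posterior mean of sigma^2 under Jeffreys' prior (illustrative
    squared-error-loss case).  With S = sum(x_i^2), the posterior of sigma^2
    is Inverse-Gamma(n, S/2), so the posterior mean is (S/2)/(n - 1)."""
    x = np.asarray(x, float)
    n, S = x.size, np.sum(x ** 2)
    return (S / 2) / (n - 1)
```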
Mortensen, Stig B; Klim, Søren; Dammann, Bernd; Kristensen, Niels R; Madsen, Henrik; Overgaard, Rune V
2007-10-01
The non-linear mixed-effects model based on stochastic differential equations (SDEs) provides an attractive residual error model, that is able to handle serially correlated residuals typically arising from structural mis-specification of the true underlying model. The use of SDEs also opens up for new tools for model development and easily allows for tracking of unknown inputs and parameters over time. An algorithm for maximum likelihood estimation of the model has earlier been proposed, and the present paper presents the first general implementation of this algorithm. The implementation is done in Matlab and also demonstrates the use of parallel computing for improved estimation times. The use of the implementation is illustrated by two examples of application which focus on the ability of the model to estimate unknown inputs facilitated by the extension to SDEs. The first application is a deconvolution-type estimation of the insulin secretion rate based on a linear two-compartment model for C-peptide measurements. In the second application the model is extended to also give an estimate of the time varying liver extraction based on both C-peptide and insulin measurements.
Consistency of Rasch Model Parameter Estimation: A Simulation Study.
ERIC Educational Resources Information Center
van den Wollenberg, Arnold L.; And Others
1988-01-01
The unconditional--simultaneous--maximum likelihood (UML) estimation procedure for the one-parameter logistic model produces biased estimators. The UML method is inconsistent and is not a good alternative to conditional maximum likelihood method, at least with small numbers of items. The minimum Chi-square estimation procedure produces unbiased…
Maximum Likelihood Estimations and EM Algorithms with Length-biased Data
Qin, Jing; Ning, Jing; Liu, Hao; Shen, Yu
2012-01-01
SUMMARY Length-biased sampling has been well recognized in economics, industrial reliability, etiology applications, epidemiological, genetic and cancer screening studies. Length-biased right-censored data have a unique data structure different from traditional survival data. The nonparametric and semiparametric estimations and inference methods for traditional survival data are not directly applicable for length-biased right-censored data. We propose new expectation-maximization algorithms for estimations based on full likelihoods involving infinite dimensional parameters under three settings for length-biased data: estimating nonparametric distribution function, estimating nonparametric hazard function under an increasing failure rate constraint, and jointly estimating baseline hazards function and the covariate coefficients under the Cox proportional hazards model. Extensive empirical simulation studies show that the maximum likelihood estimators perform well with moderate sample sizes and lead to more efficient estimators compared to the estimating equation approaches. The proposed estimates are also more robust to various right-censoring mechanisms. We prove the strong consistency properties of the estimators, and establish the asymptotic normality of the semi-parametric maximum likelihood estimators under the Cox model using modern empirical processes theory. We apply the proposed methods to a prevalent cohort medical study. Supplemental materials are available online. PMID:22323840
Regression estimators for generic health-related quality of life and quality-adjusted life years.
Basu, Anirban; Manca, Andrea
2012-01-01
To develop regression models for outcomes with truncated supports, such as health-related quality of life (HRQoL) data, and account for features typical of such data such as a skewed distribution, spikes at 1 or 0, and heteroskedasticity. Regression estimators based on features of the Beta distribution. First, both a single equation and a 2-part model are presented, along with estimation algorithms based on maximum-likelihood, quasi-likelihood, and Bayesian Markov-chain Monte Carlo methods. A novel Bayesian quasi-likelihood estimator is proposed. Second, a simulation exercise is presented to assess the performance of the proposed estimators against ordinary least squares (OLS) regression for a variety of HRQoL distributions that are encountered in practice. Finally, the performance of the proposed estimators is assessed by using them to quantify the treatment effect on QALYs in the EVALUATE hysterectomy trial. Overall model fit is studied using several goodness-of-fit tests such as Pearson's correlation test, link and reset tests, and a modified Hosmer-Lemeshow test. The simulation results indicate that the proposed methods are more robust in estimating covariate effects than OLS, especially when the effects are large or the HRQoL distribution has a large spike at 1. Quasi-likelihood techniques are more robust than maximum likelihood estimators. When applied to the EVALUATE trial, all but the maximum likelihood estimators produce unbiased estimates of the treatment effect. One and 2-part Beta regression models provide flexible approaches to regress the outcomes with truncated supports, such as HRQoL, on covariates, after accounting for many idiosyncratic features of the outcomes distribution. This work will provide applied researchers with a practical set of tools to model outcomes in cost-effectiveness analysis.
Estimation of the probability of success in petroleum exploration
Davis, J.C.
1977-01-01
A probabilistic model for oil exploration can be developed by assessing the conditional relationship between perceived geologic variables and the subsequent discovery of petroleum. Such a model includes two probabilistic components, the first reflecting the association between a geologic condition (structural closure, for example) and the occurrence of oil, and the second reflecting the uncertainty associated with the estimation of geologic variables in areas of limited control. Estimates of the conditional relationship between geologic variables and subsequent production can be found by analyzing the exploration history of a "training area" judged to be geologically similar to the exploration area. The geologic variables are assessed over the training area using an historical subset of the available data, whose density corresponds to the present control density in the exploration area. The success or failure of wells drilled in the training area subsequent to the time corresponding to the historical subset provides empirical estimates of the probability of success conditional upon geology. Uncertainty in perception of geological conditions may be estimated from the distribution of errors made in geologic assessment using the historical subset of control wells. These errors may be expressed as a linear function of distance from available control. Alternatively, the uncertainty may be found by calculating the semivariogram of the geologic variables used in the analysis: the two procedures will yield approximately equivalent results. The empirical probability functions may then be transferred to the exploration area and used to estimate the likelihood of success of specific exploration plays. These estimates will reflect both the conditional relationship between the geological variables used to guide exploration and the uncertainty resulting from lack of control. The technique is illustrated with case histories from the mid-Continent area of the U.S.A. ?? 1977 Plenum Publishing Corp.
Estimation of Uncertainties in Stage-Discharge Curve for an Experimental Himalayan Watershed
NASA Astrophysics Data System (ADS)
Kumar, V.; Sen, S.
2016-12-01
Various water resource projects developed on rivers originating from the Himalayan region, the "Water Tower of Asia", play an important role in downstream development. Flow measurements at the desired river site are very critical for river engineers and hydrologists for water resources planning and management, flood forecasting, reservoir operation and flood inundation studies. However, accurate discharge assessment of these mountainous rivers is costly, tedious and frequently dangerous to operators during flood events. Currently, in India, discharge estimation is linked to the stage-discharge relationship known as the rating curve. This relationship can be affected by a high degree of uncertainty. Estimating the uncertainty of the rating curve remains a challenge because it is not easy to parameterize. The main sources of rating-curve uncertainty are errors due to incorrect discharge measurement, variation in hydraulic conditions, and depth measurement. In this study, our objective is to obtain the rating-curve parameters that best fit the limited record of observations and to estimate the uncertainty of the rating curve at different depths. The parameters of the standard power-law rating curve are estimated by maximum likelihood for three streams of the Aglar watershed, located in the Lesser Himalayas. Quantification of uncertainties in the developed rating curves is obtained from the estimated variances and covariances of the rating curve parameters. Results showed that the uncertainties varied with catchment behavior, with errors ranging between 0.006 and 1.831 m3/s. Discharge uncertainty in the Aglar watershed streams significantly depends on the extent of extrapolation outside the range of observed water levels. Extrapolation analysis confirmed that extrapolating more than 15% beyond the maximum and 5% below the minimum observed discharges is not recommended for these mountainous gauging sites.
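A minimal Python sketch of fitting the standard power-law rating curve and extracting the parameter variances/covariances is shown below; the Gaussian error assumption (under which nonlinear least squares coincides with maximum likelihood) and the starting values are assumptions for illustration:

```python
import numpy as np
from scipy.optimize import curve_fit

def rating_curve(h, a, b, h0):
    """Standard power-law rating curve Q = a * (h - h0)**b."""
    return a * np.clip(h - h0, 1e-9, None) ** b

def fit_rating_curve(stage, discharge, p0=(1.0, 2.0, 0.0)):
    """Fit the power-law rating curve to stage-discharge pairs by nonlinear
    least squares (the ML solution under i.i.d. Gaussian errors).  The
    returned covariance matrix of (a, b, h0) quantifies rating-curve
    parameter uncertainty."""
    popt, pcov = curve_fit(rating_curve, np.asarray(stage, float),
                           np.asarray(discharge, float), p0=p0, maxfev=20000)
    return popt, pcov
```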
Aulenbach, Brent T.
2013-01-01
A regression-model based approach is a commonly used, efficient method for estimating streamwater constituent load when there is a relationship between streamwater constituent concentration and continuous variables such as streamwater discharge, season and time. A subsetting experiment using a 30-year dataset of daily suspended sediment observations from the Mississippi River at Thebes, Illinois, was performed to determine optimal sampling frequency, model calibration period length, and regression model methodology, as well as to determine the effect of serial correlation of model residuals on load estimate precision. Two regression-based methods were used to estimate streamwater loads, the Adjusted Maximum Likelihood Estimator (AMLE), and the composite method, a hybrid load estimation approach. While both methods accurately and precisely estimated loads at the model’s calibration period time scale, precisions were progressively worse at shorter reporting periods, from annually to monthly. Serial correlation in model residuals resulted in observed AMLE precision to be significantly worse than the model calculated standard errors of prediction. The composite method effectively improved upon AMLE loads for shorter reporting periods, but required a sampling interval of at least 15-days or shorter, when the serial correlations in the observed load residuals were greater than 0.15. AMLE precision was better at shorter sampling intervals and when using the shortest model calibration periods, such that the regression models better fit the temporal changes in the concentration–discharge relationship. The models with the largest errors typically had poor high flow sampling coverage resulting in unrepresentative models. Increasing sampling frequency and/or targeted high flow sampling are more efficient approaches to ensure sufficient sampling and to avoid poorly performing models, than increasing calibration period length.
2010-06-01
GMKPF represents a better and more flexible alternative to the Gaussian Maximum Likelihood (GML) and Exponential Maximum Likelihood (EML) estimators for clock offset estimation in non-Gaussian or non-exponential settings, providing more accurate results relative to GML and EML when the network delays are modeled in terms of a single non-Gaussian/non-exponential distribution or as a …
Nevo, Daniel; Zucker, David M.; Tamimi, Rulla M.; Wang, Molin
2017-01-01
A common paradigm in dealing with heterogeneity across tumors in cancer analysis is to cluster the tumors into subtypes using marker data on the tumor, and then to analyze each of the clusters separately. A more specific target is to investigate the association between risk factors and specific subtypes and to use the results for personalized preventive treatment. This task is usually carried out in two steps–clustering and risk factor assessment. However, two sources of measurement error arise in these problems. The first is the measurement error in the biomarker values. The second is the misclassification error when assigning observations to clusters. We consider the case with a specified set of relevant markers and propose a unified single-likelihood approach for normally distributed biomarkers. As an alternative, we consider a two-step procedure with the tumor type misclassification error taken into account in the second-step risk factor analysis. We describe our method for binary data and also for survival analysis data using a modified version of the Cox model. We present asymptotic theory for the proposed estimators. Simulation results indicate that our methods significantly lower the bias with a small price being paid in terms of variance. We present an analysis of breast cancer data from the Nurses’ Health Study to demonstrate the utility of our method. PMID:27558651
Joint Source-Channel Coding by Means of an Oversampled Filter Bank Code
NASA Astrophysics Data System (ADS)
Marinkovic, Slavica; Guillemot, Christine
2006-12-01
Quantized frame expansions based on block transforms and oversampled filter banks (OFBs) have been considered recently as joint source-channel codes (JSCCs) for erasure and error-resilient signal transmission over noisy channels. In this paper, we consider a coding chain involving an OFB-based signal decomposition followed by scalar quantization and a variable-length code (VLC) or a fixed-length code (FLC). This paper first examines the problem of channel error localization and correction in quantized OFB signal expansions. The error localization problem is treated as a multiple-hypothesis testing problem. The likelihood values are derived from the joint pdf of the syndrome vectors under various hypotheses of impulse noise positions, and in a number of consecutive windows of the received samples. The error amplitudes are then estimated by solving the syndrome equations in the least-squares sense. The message signal is reconstructed from the corrected received signal by a pseudoinverse receiver. We then improve the error localization procedure by introducing per-symbol reliability information in the hypothesis testing procedure of the OFB syndrome decoder. The per-symbol reliability information is produced by the soft-input soft-output (SISO) VLC/FLC decoders. This leads to the design of an iterative algorithm for joint decoding of an FLC and an OFB code. The performance of the algorithms developed is evaluated in a wavelet-based image coding system.
Maximum-likelihood estimation of parameterized wavefronts from multifocal data
Sakamoto, Julia A.; Barrett, Harrison H.
2012-01-01
A method for determining the pupil phase distribution of an optical system is demonstrated. Coefficients in a wavefront expansion were estimated using likelihood methods, where the data consisted of multiple irradiance patterns near focus. Proof-of-principle results were obtained in both simulation and experiment. Large-aberration wavefronts were handled in the numerical study. Experimentally, we discuss the handling of nuisance parameters. Fisher information matrices, Cramér-Rao bounds, and likelihood surfaces are examined. ML estimates were obtained by simulated annealing to deal with numerous local extrema in the likelihood function. Rapid processing techniques were employed to reduce the computational time. PMID:22772282
NASA Astrophysics Data System (ADS)
Langbein, J. O.
2016-12-01
Most time series of geophysical phenomena are contaminated with temporally correlated errors that limit the precision of any derived parameters. Ignoring temporal correlations will result in biased and unrealistic estimates of velocity and its error estimated from geodetic position measurements. Obtaining better estimates of uncertainties is limited by several factors, including selection of the correct model for the background noise and the computational requirements to estimate the parameters of the selected noise model when there are numerous observations. Here, I address the second problem of computational efficiency using maximum likelihood estimates (MLE). Most geophysical time series have background noise processes that can be represented as a combination of white and power-law noise, 1/f^n, with frequency f. Time domain techniques involving construction and inversion of large data covariance matrices are employed. Bos et al. [2012] demonstrate one technique that substantially increases the efficiency of the MLE methods, but it provides only an approximate solution for power-law indices greater than 1.0. That restriction can be removed by simply forming a data filter that adds noise processes rather than combining them in quadrature. Consequently, the inversion of the data covariance matrix is simplified and it provides robust results for a wide range of power-law indices. With the new formulation, the efficiency is typically improved by about a factor of 8 over previous MLE algorithms [Langbein, 2004]. The new algorithm can be downloaded at http://earthquake.usgs.gov/research/software/#est_noise. The main program provides a number of basic functions that can be used to model the time-dependent part of time series and a variety of models that describe the temporal covariance of the data. In addition, the program is packaged with a few companion programs and scripts that can help with data analysis and with interpretation of the noise modeling.
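The following sketch, written under my own simplifying assumptions rather than taken from the est_noise package, shows the core time-domain computation: a power-law covariance built from the fractional-integration filter with d = n/2, combined additively with white noise, and a Gaussian log-likelihood evaluated through a Cholesky factorization. Scaling conventions (for example, dependence on the sampling interval) are omitted.

```python
# Generic sketch of time-domain MLE for white plus power-law (1/f^n) noise.
import numpy as np
from scipy.linalg import cho_factor, cho_solve

def powerlaw_cov(N, spectral_index):
    # covariance of unit-amplitude power-law noise from the fractional-integration
    # filter psi_k (Hosking/Kasdin recursion) with d = n/2
    d = spectral_index / 2.0
    psi = np.empty(N)
    psi[0] = 1.0
    for k in range(1, N):
        psi[k] = psi[k - 1] * (k - 1 + d) / k
    T = np.zeros((N, N))
    for k in range(N):                      # lower-triangular Toeplitz filter matrix
        T[k:, k] = psi[:N - k]
    return T @ T.T

def neg_log_likelihood(resid, sigma_white, sigma_pl, spectral_index):
    N = resid.size
    C = sigma_white**2 * np.eye(N) + sigma_pl**2 * powerlaw_cov(N, spectral_index)
    cf = cho_factor(C, lower=True)
    logdet = 2.0 * np.sum(np.log(np.diag(cf[0])))
    quad = resid @ cho_solve(cf, resid)
    return 0.5 * (N * np.log(2.0 * np.pi) + logdet + quad)

# toy usage on synthetic residuals; in practice resid are detrended positions
rng = np.random.default_rng(0)
resid = rng.standard_normal(200)
print(neg_log_likelihood(resid, sigma_white=1.0, sigma_pl=0.5, spectral_index=1.0))
```

Maximizing this function over the two amplitudes and the spectral index (with any standard optimizer) gives the noise-model MLE whose cost the abstract is concerned with reducing.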
Detecting Growth Shape Misspecifications in Latent Growth Models: An Evaluation of Fit Indexes
ERIC Educational Resources Information Center
Leite, Walter L.; Stapleton, Laura M.
2011-01-01
In this study, the authors compared the likelihood ratio test and fit indexes for detection of misspecifications of growth shape in latent growth models through a simulation study and a graphical analysis. They found that the likelihood ratio test, MFI, and root mean square error of approximation performed best for detecting model misspecification…
NASA Technical Reports Server (NTRS)
Lei, Ning; Chiang, Kwo-Fu; Oudrari, Hassan; Xiong, Xiaoxiong
2011-01-01
Optical sensors aboard Earth orbiting satellites such as the next generation Visible/Infrared Imager/Radiometer Suite (VIIRS) assume that the sensor's radiometric response in the Reflective Solar Bands (RSB) is described by a quadratic polynomial relating the aperture spectral radiance to the sensor Digital Number (DN) readout. For VIIRS Flight Unit 1, the coefficients are to be determined before launch by an attenuation method, although the linear coefficient will be further determined on-orbit through observing the Solar Diffuser. In determining the quadratic polynomial coefficients by the attenuation method, a Maximum Likelihood approach is applied in carrying out the least-squares procedure. Crucial to the Maximum Likelihood least-squares procedure is the computation of the weight. The weight not only has a contribution from the noise of the sensor's digital count, with an important contribution from digitization error, but is also affected heavily by the mathematical expression used to predict the value of the dependent variable, because both the independent and the dependent variables contain random noise. In addition, model errors have a major impact on the uncertainties of the coefficients. The Maximum Likelihood approach demonstrates the inadequacy of the attenuation method model with a quadratic polynomial for the retrieved spectral radiance. We show that using the inadequate model dramatically increases the uncertainties of the coefficients. We compute the coefficient values and their uncertainties, considering both measurement and model errors.
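A minimal sketch of the kind of weighting described above, assuming an "effective variance" formulation in which each point's weight combines the radiance noise with the DN noise mapped through the local slope of the fitted quadratic; the data, noise levels and function name are hypothetical and this is not the VIIRS calibration code.

```python
# Iteratively reweighted quadratic fit L = c0 + c1*dn + c2*dn^2 when both
# the DN readout and the radiance carry noise (effective-variance weights).
import numpy as np

def weighted_quadratic_fit(dn, L, sigma_dn, sigma_L, n_iter=5):
    coeffs = np.polyfit(dn, L, 2)[::-1]              # start from an unweighted fit
    for _ in range(n_iter):
        slope = coeffs[1] + 2.0 * coeffs[2] * dn     # dL/d(dn) from the current fit
        w = 1.0 / (sigma_L**2 + (slope * sigma_dn)**2)
        X = np.vstack([np.ones_like(dn), dn, dn**2]).T
        WX = X * w[:, None]
        cov = np.linalg.inv(X.T @ WX)                # coefficient covariance (w = 1/var)
        coeffs = cov @ (WX.T @ L)
    return coeffs, cov

rng = np.random.default_rng(1)
dn_true = np.linspace(100, 4000, 30)
L_true = 0.5 + 0.02 * dn_true + 1e-6 * dn_true**2
dn = dn_true + rng.normal(0, 2.0, dn_true.size)      # DN noise incl. digitization
L = L_true + rng.normal(0, 0.5, dn_true.size)
coeffs, cov = weighted_quadratic_fit(dn, L, sigma_dn=2.0, sigma_L=0.5)
print(coeffs, np.sqrt(np.diag(cov)))
```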
Profile-Likelihood Approach for Estimating Generalized Linear Mixed Models with Factor Structures
ERIC Educational Resources Information Center
Jeon, Minjeong; Rabe-Hesketh, Sophia
2012-01-01
In this article, the authors suggest a profile-likelihood approach for estimating complex models by maximum likelihood (ML) using standard software and minimal programming. The method works whenever setting some of the parameters of the model to known constants turns the model into a standard model. An important class of models that can be…
Estimation of proportions in mixed pixels through their region characterization
NASA Technical Reports Server (NTRS)
Chittineni, C. B. (Principal Investigator)
1981-01-01
A region of mixed pixels can be characterized through the probability density function of proportions of classes in the pixels. Using information from the spectral vectors of a given set of pixels from the mixed pixel region, expressions are developed for obtaining the maximum likelihood estimates of the parameters of probability density functions of proportions. The proportions of classes in the mixed pixels can then be estimated. If the mixed pixels contain objects of two classes, the computation can be reduced by transforming the spectral vectors using a transformation matrix that simultaneously diagonalizes the covariance matrices of the two classes. If the proportions of the classes of a set of mixed pixels from the region are given, then expressions are developed for obtaining the estimates of the parameters of the probability density function of the proportions of mixed pixels. Development of these expressions is based on the criterion of the minimum sum of squares of errors. Experimental results from the processing of remotely sensed agricultural multispectral imagery data are presented.
Bayesian inference for dynamic transcriptional regulation; the Hes1 system as a case study.
Heron, Elizabeth A; Finkenstädt, Bärbel; Rand, David A
2007-10-01
In this study, we address the problem of estimating the parameters of regulatory networks and provide the first application of Markov chain Monte Carlo (MCMC) methods to experimental data. As a case study, we consider a stochastic model of the Hes1 system expressed in terms of stochastic differential equations (SDEs) to which rigorous likelihood methods of inference can be applied. When fitting continuous-time stochastic models to discretely observed time series the lengths of the sampling intervals are important, and much of our study addresses the problem when the data are sparse. We estimate the parameters of an autoregulatory network providing results both for simulated and real experimental data from the Hes1 system. We develop an estimation algorithm using MCMC techniques which are flexible enough to allow for the imputation of latent data on a finer time scale and the presence of prior information about parameters which may be informed from other experiments as well as additional measurement error.
Multiple-Hit Parameter Estimation in Monolithic Detectors
Barrett, Harrison H.; Lewellen, Tom K.; Miyaoka, Robert S.
2014-01-01
We examine a maximum-a-posteriori method for estimating the primary interaction position of gamma rays with multiple interaction sites (hits) in a monolithic detector. In assessing the performance of a multiple-hit estimator over that of a conventional one-hit estimator, we consider a few different detector and readout configurations of a 50-mm-wide square cerium-doped lutetium oxyorthosilicate block. For this study, we use simulated data from SCOUT, a Monte-Carlo tool for photon tracking and modeling scintillation-camera output. With this tool, we determine estimate bias and variance for a multiple-hit estimator and compare these with similar metrics for a one-hit maximum-likelihood estimator, which assumes full energy deposition in one hit. We also examine the effect of event filtering on these metrics; for this purpose, we use a likelihood threshold to reject signals that are not likely to have been produced under the assumed likelihood model. Depending on detector design, we observe a 1%–12% improvement of intrinsic resolution for a 1-or-2-hit estimator as compared with a 1-hit estimator. We also observe improved differentiation of photopeak events using a 1-or-2-hit estimator as compared with the 1-hit estimator; more than 6% of photopeak events that were rejected by likelihood filtering for the 1-hit estimator were accurately identified as photopeak events and positioned without loss of resolution by a 1-or-2-hit estimator; for PET, this equates to at least a 12% improvement in coincidence-detection efficiency with likelihood filtering applied. PMID:23193231
Unified framework to evaluate panmixia and migration direction among multiple sampling locations.
Beerli, Peter; Palczewski, Michal
2010-05-01
For many biological investigations, groups of individuals are genetically sampled from several geographic locations. These sampling locations often do not reflect the genetic population structure. We describe a framework using marginal likelihoods to compare and order structured population models, such as testing whether the sampling locations belong to the same randomly mating population or comparing unidirectional and multidirectional gene flow models. In the context of inferences employing Markov chain Monte Carlo methods, the accuracy of the marginal likelihoods depends heavily on the approximation method used to calculate the marginal likelihood. Two methods, modified thermodynamic integration and a stabilized harmonic mean estimator, are compared. With finite Markov chain Monte Carlo run lengths, the harmonic mean estimator may not be consistent. Thermodynamic integration, in contrast, delivers considerably better estimates of the marginal likelihood. The choice of prior distributions does not influence the order and choice of the better models when the marginal likelihood is estimated using thermodynamic integration, whereas with the harmonic mean estimator the influence of the prior is pronounced and the order of the models changes. The approximation of marginal likelihood using thermodynamic integration in MIGRATE allows the evaluation of complex population genetic models, not only of whether sampling locations belong to a single panmictic population, but also of competing complex structured population models.
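To make the contrast concrete, the toy example below compares thermodynamic integration with the harmonic-mean estimator on a conjugate normal-mean model where the power posterior at each temperature can be sampled exactly and the true marginal likelihood is known in closed form; in MIGRATE the same identities are evaluated with MCMC samples from structured population models, so this is only a hedged sketch of the estimators, not the program's machinery.

```python
# Thermodynamic integration vs. harmonic-mean estimate of log marginal likelihood
# for y_i ~ N(mu, 1) with prior mu ~ N(0, 1), where the answer is known exactly.
import numpy as np
from scipy.integrate import trapezoid
from scipy.special import logsumexp
from scipy.stats import multivariate_normal

rng = np.random.default_rng(2)
y = rng.normal(0.5, 1.0, size=20)
n, s = y.size, y.sum()

def log_lik(mu):
    return -0.5 * n * np.log(2 * np.pi) - 0.5 * np.sum((y[:, None] - mu) ** 2, axis=0)

betas = np.linspace(0.0, 1.0, 21)                 # temperature ladder
expected_loglik = []
for b in betas:
    prec = 1.0 + b * n                            # power posterior is normal here
    mu_samp = rng.normal(b * s / prec, np.sqrt(1.0 / prec), size=5000)
    expected_loglik.append(log_lik(mu_samp).mean())
log_z_ti = trapezoid(expected_loglik, betas)      # thermodynamic integration

post = rng.normal(s / (1.0 + n), np.sqrt(1.0 / (1.0 + n)), size=5000)
ll = log_lik(post)
log_z_hm = -(logsumexp(-ll) - np.log(ll.size))    # harmonic-mean estimator

exact = multivariate_normal.logpdf(y, mean=np.zeros(n), cov=np.eye(n) + np.ones((n, n)))
print(log_z_ti, log_z_hm, exact)
```

Even in this easy setting the harmonic-mean value is noticeably more variable across reruns than the thermodynamic-integration value, which mirrors the abstract's finding.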
Shih, Weichung Joe; Li, Gang; Wang, Yining
2016-03-01
Sample size plays a crucial role in clinical trials. Flexible sample-size designs, as part of the more general category of adaptive designs that utilize interim data, have been a popular topic in recent years. In this paper, we give a comparative review of four related methods for such a design. The likelihood method uses the likelihood ratio test with an adjusted critical value. The weighted method adjusts the test statistic with given weights rather than the critical value. The dual test method requires both the likelihood ratio statistic and the weighted statistic to be greater than the unadjusted critical value. The promising zone approach uses the likelihood ratio statistic with the unadjusted value and other constraints. All four methods preserve the type-I error rate. In this paper we explore their properties and compare their relationships and merits. We show that the sample size rules for the dual test are in conflict with the rules of the promising zone approach. We delineate what is necessary to specify in the study protocol to ensure the validity of the statistical procedure and what can be kept implicit in the protocol so that more flexibility can be attained for confirmatory phase III trials in meeting regulatory requirements. We also prove that under mild conditions, the likelihood ratio test still preserves the type-I error rate when the actual sample size is larger than the re-calculated one. Copyright © 2015 Elsevier Inc. All rights reserved.
NASA Technical Reports Server (NTRS)
Wang, Shugong; Liang, Xu
2013-01-01
A new approach is presented in this paper to effectively obtain parameter estimates for the Multiscale Kalman Smoother (MKS) algorithm. This new approach has demonstrated promising potential in deriving better data products based on data of different spatial scales and precisions. Our new approach employs a multi-objective (MO) parameter estimation scheme (called the MO scheme hereafter), rather than the conventional maximum likelihood scheme (called the ML scheme), to estimate the MKS parameters. Unlike the ML scheme, the MO scheme is not simply built on strict statistical assumptions related to prediction errors and observation errors; rather, it directly associates the fused data of multiple scales with multiple objective functions in searching for the best parameter estimates for MKS through optimization. In the MO scheme, objective functions are defined to facilitate consistency among the fused data at multiple scales and the input data at their original scales in terms of spatial patterns and magnitudes. The new approach is evaluated through a Monte Carlo experiment and a series of comparison analyses using synthetic precipitation data. Our results show that the MKS fused precipitation performs better using the MO scheme than using the ML scheme. In particular, improvements over the ML scheme are significant for the fused precipitation associated with fine spatial resolutions. This is mainly due to having more criteria and constraints involved in the MO scheme than in the ML scheme. The weakness of the original ML scheme, which blindly puts more weight onto the data associated with finer resolutions, is overcome in our new approach.
Assessing Interval Estimation Methods for Hill Model ...
The Hill model of concentration-response is ubiquitous in toxicology, perhaps because its parameters directly relate to biologically significant metrics of toxicity such as efficacy and potency. Point estimates of these parameters obtained through least squares regression or maximum likelihood are commonly used in high-throughput risk assessment, but such estimates typically fail to include reliable information concerning confidence in (or precision of) the estimates. To address this issue, we examined methods for assessing uncertainty in Hill model parameter estimates derived from concentration-response data. In particular, using a sample of ToxCast concentration-response data sets, we applied four methods for obtaining interval estimates that are based on asymptotic theory, bootstrapping (two varieties), and Bayesian parameter estimation, and then compared the results. These interval estimation methods generally did not agree, so we devised a simulation study to assess their relative performance. We generated simulated data by constructing four statistical error models capable of producing concentration-response data sets comparable to those observed in ToxCast. We then applied the four interval estimation methods to the simulated data and compared the actual coverage of the interval estimates to the nominal coverage (e.g., 95%) in order to quantify performance of each of the methods in a variety of cases (i.e., different values of the true Hill model paramet
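A hedged sketch of one of the simpler interval methods mentioned above, a case-resampling bootstrap around a least-squares Hill fit; the concentration-response values are invented and this is not the ToxCast analysis code.

```python
# Hill model fit by nonlinear least squares with bootstrap percentile intervals
# for efficacy (top), potency (AC50) and the Hill coefficient.
import numpy as np
from scipy.optimize import curve_fit

def hill(conc, top, ac50, n):
    return top * conc**n / (ac50**n + conc**n)

conc = np.array([0.01, 0.03, 0.1, 0.3, 1.0, 3.0, 10.0, 30.0])   # hypothetical data
resp = np.array([0.02, 0.05, 0.12, 0.35, 0.61, 0.82, 0.95, 0.97])

popt, _ = curve_fit(hill, conc, resp, p0=[1.0, 1.0, 1.0], maxfev=10000)

rng = np.random.default_rng(3)
boot = []
for _ in range(1000):                              # case-resampling bootstrap
    idx = rng.integers(0, conc.size, conc.size)
    try:
        pb, _ = curve_fit(hill, conc[idx], resp[idx], p0=popt, maxfev=10000)
        boot.append(pb)
    except RuntimeError:                           # skip occasional non-convergence
        continue
boot = np.array(boot)
ci = np.percentile(boot, [2.5, 97.5], axis=0)      # 95% percentile intervals
print("point estimates:", popt)
print("95% CI (top, AC50, n):\n", ci.T)
```

Comparing the nominal 95% coverage of such intervals against their actual coverage on simulated data is exactly the kind of performance check the abstract describes.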
NASA Technical Reports Server (NTRS)
Peters, B. C., Jr.; Walker, H. F.
1978-01-01
This paper addresses the problem of obtaining numerically maximum-likelihood estimates of the parameters for a mixture of normal distributions. In recent literature, a certain successive-approximations procedure, based on the likelihood equations, was shown empirically to be effective in numerically approximating such maximum-likelihood estimates; however, the reliability of this procedure was not established theoretically. Here, we introduce a general iterative procedure, of the generalized steepest-ascent (deflected-gradient) type, which is just the procedure known in the literature when the step-size is taken to be 1. We show that, with probability 1 as the sample size grows large, this procedure converges locally to the strongly consistent maximum-likelihood estimate whenever the step-size lies between 0 and 2. We also show that the step-size which yields optimal local convergence rates for large samples is determined in a sense by the 'separation' of the component normal densities and is bounded below by a number between 1 and 2.
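A small illustration of the iteration discussed in this and the following abstract, written for a two-component univariate normal mixture: theta_new = theta_old + omega*(M(theta_old) - theta_old), where M is the familiar fixed-point map derived from the likelihood equations and omega = 1 recovers the plain successive-approximations procedure. Data and starting values are synthetic, and this is my own sketch rather than the authors' code.

```python
# Relaxed fixed-point (generalized steepest-ascent) iteration for a 2-component
# univariate normal mixture; omega in (0, 2), omega = 1 is the usual procedure.
import numpy as np

def em_map(x, pi1, mu1, mu2, s1, s2):
    """One application of the fixed-point map M derived from the likelihood equations."""
    def phi(x, mu, s):
        return np.exp(-0.5 * ((x - mu) / s) ** 2) / (s * np.sqrt(2 * np.pi))
    w1 = pi1 * phi(x, mu1, s1)
    w2 = (1 - pi1) * phi(x, mu2, s2)
    r = w1 / (w1 + w2)                               # posterior responsibilities
    pi1_new = r.mean()
    mu1_new = np.sum(r * x) / np.sum(r)
    mu2_new = np.sum((1 - r) * x) / np.sum(1 - r)
    s1_new = np.sqrt(np.sum(r * (x - mu1_new) ** 2) / np.sum(r))
    s2_new = np.sqrt(np.sum((1 - r) * (x - mu2_new) ** 2) / np.sum(1 - r))
    return np.array([pi1_new, mu1_new, mu2_new, s1_new, s2_new])

rng = np.random.default_rng(4)
x = np.concatenate([rng.normal(-2, 1, 300), rng.normal(2, 1, 700)])

omega = 1.2                                           # step size in (0, 2)
theta = np.array([0.5, -1.0, 1.0, 1.0, 1.0])          # (pi1, mu1, mu2, sigma1, sigma2)
for _ in range(200):
    theta = theta + omega * (em_map(x, *theta) - theta)
print(theta)
```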
NASA Technical Reports Server (NTRS)
Peters, B. C., Jr.; Walker, H. F.
1976-01-01
The problem of obtaining numerically maximum likelihood estimates of the parameters for a mixture of normal distributions is addressed. In recent literature, a certain successive approximations procedure, based on the likelihood equations, is shown empirically to be effective in numerically approximating such maximum-likelihood estimates; however, the reliability of this procedure was not established theoretically. Here, a general iterative procedure is introduced, of the generalized steepest-ascent (deflected-gradient) type, which is just the procedure known in the literature when the step-size is taken to be 1. With probability 1 as the sample size grows large, it is shown that this procedure converges locally to the strongly consistent maximum-likelihood estimate whenever the step-size lies between 0 and 2. The step-size which yields optimal local convergence rates for large samples is determined in a sense by the separation of the component normal densities and is bounded below by a number between 1 and 2.
NASA Astrophysics Data System (ADS)
Sutawanir
2015-12-01
Mortality tables play an important role in actuarial studies such as life annuities, premium determination, premium reserves, valuation of pension plans and pension funding. Some known mortality tables are the CSO mortality table, the Indonesian Mortality Table, the Bowers mortality table and the Japan mortality table. For actuarial applications, some tables are constructed under different environments such as single decrement, double decrement and multiple decrement. There are two approaches to mortality table construction: a mathematical approach and a statistical approach. Distribution models and estimation theory are the statistical concepts used in mortality table construction. This article aims to discuss the statistical approach to mortality table construction. The distributional assumptions are the uniform distribution of deaths (UDD) and constant force (exponential). Moment estimation and maximum likelihood are used to estimate the mortality parameter. Moment estimation methods are easier to manipulate than maximum likelihood estimation (MLE); however, the complete mortality data are not used in the moment estimation method, whereas maximum likelihood exploits all available information in mortality estimation. Some MLE equations are complicated and are solved using numerical methods. The article focuses on single-decrement estimation using moment and maximum likelihood estimation; some extensions to double decrement are introduced. A simple dataset is used to illustrate the mortality estimation and the mortality table.
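As a hedged numerical illustration of the two estimation routes discussed, the snippet below computes a single-decrement mortality rate q_x from an invented death count and central exposure, once by the constant-force (exponential) maximum-likelihood estimator and once by a moment-type actuarial estimator with a UDD-style exposure adjustment.

```python
# Single-decrement estimation of q_x: constant-force MLE vs. moment/actuarial estimator.
import numpy as np

deaths = 18                    # deaths observed in the age interval [x, x+1)
central_exposure = 950.0       # total person-years lived in the interval (hypothetical)

# constant-force MLE: mu_hat = D / central exposure, q_hat = 1 - exp(-mu_hat)
mu_hat = deaths / central_exposure
q_mle = 1.0 - np.exp(-mu_hat)

# moment / actuarial estimator: approximate initial exposure by adding, on average,
# half a year of unexpired exposure for each death (UDD-style adjustment)
q_moment = deaths / (central_exposure + 0.5 * deaths)

# large-sample standard error of the constant-force estimate via the delta method
se_mu = np.sqrt(deaths) / central_exposure
se_q = np.exp(-mu_hat) * se_mu

print(f"q (constant-force MLE) = {q_mle:.5f} +/- {se_q:.5f}")
print(f"q (moment/UDD)         = {q_moment:.5f}")
```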
Improving and Evaluating Nested Sampling Algorithm for Marginal Likelihood Estimation
NASA Astrophysics Data System (ADS)
Ye, M.; Zeng, X.; Wu, J.; Wang, D.; Liu, J.
2016-12-01
With the growing impacts of climate change and human activities on the water resources cycle, an increasing number of studies focus on the quantification of modeling uncertainty. Bayesian model averaging (BMA) provides a popular framework for quantifying conceptual model and parameter uncertainty. The ensemble prediction is generated by combining each plausible model's prediction, and each model is assigned a weight determined by the model's prior weight and marginal likelihood. Thus, the estimation of a model's marginal likelihood is crucial for reliable and accurate BMA prediction. The nested sampling estimator (NSE) is a newly proposed method for marginal likelihood estimation. NSE searches the parameter space gradually from low-likelihood to high-likelihood regions, and this evolution is carried out iteratively via a local sampling procedure. Thus, the efficiency of NSE is dominated by the strength of the local sampling procedure. Currently, the Metropolis-Hastings (M-H) algorithm is often used for local sampling. However, M-H is not an efficient sampling algorithm for high-dimensional or complicated parameter spaces. To improve the efficiency of NSE, it is appealing to incorporate the robust and efficient sampling algorithm DREAMzs into the local sampling of NSE. The comparison results demonstrated that the improved NSE could improve the efficiency of marginal likelihood estimation significantly. However, both the improved and original NSEs suffer from heavy instability. In addition, the heavy computational cost of the huge number of model executions is overcome by using adaptive sparse grid surrogates.
Simulation: learning from mistakes while building communication and teamwork.
Kuehster, Christina R; Hall, Carla D
2010-01-01
Medical errors are one of the leading causes of death annually in the United States. Many of these errors are related to poor communication and/or lack of teamwork. Using simulation as a teaching modality plays a dual role in helping to reduce these errors. Thorough integration of clinical practice with teamwork and communication in a safe environment increases the likelihood of reducing error rates in medicine. By allowing practitioners to make potential errors in a safe environment, such as simulation, these valuable lessons improve retention, and such errors are rarely repeated.
Carnes, Debra; Kilpatrick, Sue; Iedema, Rick
2015-12-01
This study aims to determine the likelihood that rural nurses perceive a hypothetical medication error would be reported in their workplace. It employs a cross-sectional survey using hypothetical error scenarios with varying levels of harm, set in clinical settings in rural Tasmania. Participants were registered and enrolled nurses, with 116 eligible surveys received. The outcome was the frequency of responses indicating that a severe, moderate or near-miss (no harm) scenario would 'always' be reported or disclosed. Eighty per cent of nurses viewed that a severe error would 'always' be reported, 64.8% a moderate error and 45.7% a near-miss error. With regard to disclosure, 54.7% felt this was 'always' likely to occur for a severe error, 44.8% for a moderate error and 26.4% for a near miss. Across all levels of severity, aged-care nurses were more likely than nurses in other settings to view error to 'always' be reported (ranging from 72-96%, P = 0.010 to 0.042) and disclosed (68-88%, P = 0.000). Those in a management role were more likely to view error to 'always' be disclosed compared to those in a clinical role (50-77.3%, P = 0.008-0.024). Further research in rural clinical settings is needed to improve the understanding of error management and disclosure. © 2015 The Authors. Australian Journal of Rural Health published by Wiley Publishing Asia Pty Ltd on behalf of National Rural Health Alliance.
NASA Technical Reports Server (NTRS)
Thadani, S. G.
1977-01-01
The Maximum Likelihood Estimation of Signature Transformation (MLEST) algorithm is used to obtain maximum likelihood estimates (MLE) of affine transformation. The algorithm has been evaluated for three sets of data: simulated (training and recognition segment pairs), consecutive-day (data gathered from Landsat images), and geographical-extension (large-area crop inventory experiment) data sets. For each set, MLEST signature extension runs were made to determine MLE values and the affine-transformed training segment signatures were used to classify the recognition segments. The classification results were used to estimate wheat proportions at 0 and 1% threshold values.
Real-time antenna fault diagnosis experiments at DSS 13
NASA Technical Reports Server (NTRS)
Mellstrom, J.; Pierson, C.; Smyth, P.
1992-01-01
Experimental results obtained when a previously described fault diagnosis system was run online in real time at the 34-m beam waveguide antenna at Deep Space Station (DSS) 13 are described. Experimental conditions and the quality of results are described. A neural network model and a maximum-likelihood Gaussian classifier are compared with and without a Markov component to model temporal context. At the rate of a state update every 6.4 seconds, over a period of roughly 1 hour, the neural-Markov system had zero errors (incorrect state estimates) while monitoring both faulty and normal operations. The overall results indicate that the neural-Markov combination is the most accurate model and has significant practical potential.
Statistical methods for astronomical data with upper limits. I - Univariate distributions
NASA Technical Reports Server (NTRS)
Feigelson, E. D.; Nelson, P. I.
1985-01-01
The statistical treatment of univariate censored data is discussed. A heuristic derivation of the Kaplan-Meier maximum-likelihood estimator from first principles is presented which results in an expression amenable to analytic error analysis. Methods for comparing two or more censored samples are given along with simple computational examples, stressing the fact that most astronomical problems involve upper limits while the standard mathematical methods require lower limits. The application of univariate survival analysis to six data sets in the recent astrophysical literature is described, and various aspects of the use of survival analysis in astronomy, such as the limitations of various two-sample tests and the role of parametric modelling, are discussed.
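A compact, self-contained sketch of the Kaplan-Meier product-limit estimator for censored data; for astronomical upper limits one would first negate the measurements so that upper limits become right-censoring points, as the survival-analysis convention assumes. The values and detection flags below are illustrative, and ties are not handled.

```python
# Kaplan-Meier product-limit estimator for right-censored (or sign-flipped
# upper-limit) data, assuming distinct values for simplicity.
import numpy as np

def kaplan_meier(values, observed):
    """Return the survival curve S(v) evaluated at each uncensored value."""
    order = np.argsort(values)
    values, observed = values[order], observed[order]
    n = values.size
    surv, out_v, out_s = 1.0, [], []
    for i, (v, d) in enumerate(zip(values, observed)):
        at_risk = n - i
        if d:                                   # detection (uncensored point)
            surv *= 1.0 - 1.0 / at_risk
            out_v.append(v)
            out_s.append(surv)
    return np.array(out_v), np.array(out_s)

x = np.array([2.1, 3.4, 3.9, 5.0, 6.2, 7.7, 9.0])
det = np.array([1, 0, 1, 1, 0, 1, 1], dtype=bool)   # 0 = censored (limit)
v, s = kaplan_meier(x, det)
print(np.c_[v, s])
```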
Kirby, James B.; Bollen, Kenneth A.
2009-01-01
Structural Equation Modeling with latent variables (SEM) is a powerful tool for social and behavioral scientists, combining many of the strengths of psychometrics and econometrics into a single framework. The most common estimator for SEM is the full-information maximum likelihood estimator (ML), but there is continuing interest in limited information estimators because of their distributional robustness and their greater resistance to structural specification errors. However, the literature discussing model fit for limited information estimators for latent variable models is sparse compared to that for full information estimators. We address this shortcoming by providing several specification tests based on the 2SLS estimator for latent variable structural equation models developed by Bollen (1996). We explain how these tests can be used to not only identify a misspecified model, but to help diagnose the source of misspecification within a model. We present and discuss results from a Monte Carlo experiment designed to evaluate the finite sample properties of these tests. Our findings suggest that the 2SLS tests successfully identify most misspecified models, even those with modest misspecification, and that they provide researchers with information that can help diagnose the source of misspecification. PMID:20419054
Estimation of stochastic volatility with long memory for index prices of FTSE Bursa Malaysia KLCI
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chen, Kho Chia; Kane, Ibrahim Lawal; Rahman, Haliza Abd
In recent years, modeling in long memory properties or fractionally integrated processes in stochastic volatility has been applied in the financial time series. A time series with structural breaks can generate a strong persistence in the autocorrelation function, which is an observed behaviour of a long memory process. This paper considers the structural break of data in order to determine true long memory time series data. Unlike usual short memory models for log volatility, the fractional Ornstein-Uhlenbeck process is neither a Markovian process nor can it be easily transformed into a Markovian process. This makes the likelihood evaluation and parameter estimation for the long memory stochastic volatility (LMSV) model challenging tasks. The drift and volatility parameters of the fractional Ornstein-Unlenbeck model are estimated separately using the least square estimator (lse) and quadratic generalized variations (qgv) method respectively. Finally, the empirical distribution of unobserved volatility is estimated using the particle filtering with sequential important sampling-resampling (SIR) method. The mean square error (MSE) between the estimated and empirical volatility indicates that the performance of the model towards the index prices of FTSE Bursa Malaysia KLCI is fairly well.
Estimation of stochastic volatility with long memory for index prices of FTSE Bursa Malaysia KLCI
NASA Astrophysics Data System (ADS)
Chen, Kho Chia; Bahar, Arifah; Kane, Ibrahim Lawal; Ting, Chee-Ming; Rahman, Haliza Abd
2015-02-01
In recent years, modeling in long memory properties or fractionally integrated processes in stochastic volatility has been applied in the financial time series. A time series with structural breaks can generate a strong persistence in the autocorrelation function, which is an observed behaviour of a long memory process. This paper considers the structural break of data in order to determine true long memory time series data. Unlike usual short memory models for log volatility, the fractional Ornstein-Uhlenbeck process is neither a Markovian process nor can it be easily transformed into a Markovian process. This makes the likelihood evaluation and parameter estimation for the long memory stochastic volatility (LMSV) model challenging tasks. The drift and volatility parameters of the fractional Ornstein-Unlenbeck model are estimated separately using the least square estimator (lse) and quadratic generalized variations (qgv) method respectively. Finally, the empirical distribution of unobserved volatility is estimated using the particle filtering with sequential important sampling-resampling (SIR) method. The mean square error (MSE) between the estimated and empirical volatility indicates that the performance of the model towards the index prices of FTSE Bursa Malaysia KLCI is fairly well.
A mixed-effects regression model for longitudinal multivariate ordinal data.
Liu, Li C; Hedeker, Donald
2006-03-01
A mixed-effects item response theory model that allows for three-level multivariate ordinal outcomes and accommodates multiple random subject effects is proposed for analysis of multivariate ordinal outcomes in longitudinal studies. This model allows for the estimation of different item factor loadings (item discrimination parameters) for the multiple outcomes. The covariates in the model do not have to follow the proportional odds assumption and can be at any level. Assuming either a probit or logistic response function, maximum marginal likelihood estimation is proposed utilizing multidimensional Gauss-Hermite quadrature for integration of the random effects. An iterative Fisher scoring solution, which provides standard errors for all model parameters, is used. An analysis of a longitudinal substance use data set, where four items of substance use behavior (cigarette use, alcohol use, marijuana use, and getting drunk or high) are repeatedly measured over time, is used to illustrate application of the proposed model.
NASA Astrophysics Data System (ADS)
Silberman, L.; Dekel, A.; Eldar, A.; Zehavi, I.
2001-08-01
We allow for nonlinear effects in the likelihood analysis of galaxy peculiar velocities and obtain ~35% lower values for the cosmological density parameter Ωm and for the amplitude of mass density fluctuations σ8Ωm^0.6. This result is obtained under the assumption that the power spectrum in the linear regime is of the flat ΛCDM model (h=0.65, n=1, COBE normalized) with only Ωm as a free parameter. Since the likelihood is driven by the nonlinear regime, we "break" the power spectrum at k_b ~ 0.2 (h^-1 Mpc)^-1 and fit a power law at k > k_b. This allows for independent matching of the nonlinear behavior and an unbiased fit in the linear regime. The analysis assumes Gaussian fluctuations and errors and a linear relation between velocity and density. Tests using mock catalogs that properly simulate nonlinear effects demonstrate that this procedure results in a reduced bias and a better fit. We find for the Mark III and SFI data Ωm = 0.32 ± 0.06 and 0.37 ± 0.09, respectively, with σ8Ωm^0.6 = 0.49 ± 0.06 and 0.63 ± 0.08, in agreement with constraints from other data. The quoted 90% errors include distance errors and cosmic variance, for fixed values of the other parameters. The improvement in the likelihood due to the nonlinear correction is very significant for Mark III and moderately significant for SFI. When allowing deviations from ΛCDM, we find an indication for a wiggle in the power spectrum: an excess near k ~ 0.05 (h^-1 Mpc)^-1 and a deficiency at k ~ 0.1 (h^-1 Mpc)^-1, or a "cold flow." This may be related to the wiggle seen in the power spectrum from redshift surveys and the second peak in the cosmic microwave background (CMB) anisotropy. A χ2 test applied to modes of a principal component analysis (PCA) shows that the nonlinear procedure improves the goodness of fit and reduces a spatial gradient that was of concern in the purely linear analysis. The PCA allows us to address spatial features of the data and to evaluate and fine-tune the theoretical and error models. It demonstrates in particular that the models used are appropriate for the cosmological parameter estimation performed. We address the potential for optimal data compression using PCA.
Heggeseth, Brianna C; Jewell, Nicholas P
2013-07-20
Multivariate Gaussian mixtures are a class of models that provide a flexible parametric approach for the representation of heterogeneous multivariate outcomes. When the outcome is a vector of repeated measurements taken on the same subject, there is often inherent dependence between observations. However, a common covariance assumption is conditional independence-that is, given the mixture component label, the outcomes for subjects are independent. In this paper, we study, through asymptotic bias calculations and simulation, the impact of covariance misspecification in multivariate Gaussian mixtures. Although maximum likelihood estimators of regression and mixing probability parameters are not consistent under misspecification, they have little asymptotic bias when mixture components are well separated or if the assumed correlation is close to the truth even when the covariance is misspecified. We also present a robust standard error estimator and show that it outperforms conventional estimators in simulations and can indicate that the model is misspecified. Body mass index data from a national longitudinal study are used to demonstrate the effects of misspecification on potential inferences made in practice. Copyright © 2013 John Wiley & Sons, Ltd.
Causal Methods for Observational Research: A Primer.
Almasi-Hashiani, Amir; Nedjat, Saharnaz; Mansournia, Mohammad Ali
2018-04-01
The goal of many observational studies is to estimate the causal effect of an exposure on an outcome after adjustment for confounders, but there are still serious errors in adjusting for confounders in clinical journals. Standard regression modeling (e.g., ordinary logistic regression) fails to estimate the average effect of exposure in the total population in the presence of interaction between exposure and covariates, and also cannot appropriately adjust for time-varying confounding. Moreover, stepwise algorithms for the selection of confounders based on P values may miss important confounders and lead to bias in effect estimates. Causal methods overcome these limitations. We illustrate three causal methods: inverse-probability-of-treatment weighting (IPTW), the parametric g-formula and, with emphasis, targeted maximum likelihood estimation (TMLE), a clever combination of the first two that enjoys a double-robustness property against bias. © 2018 The Author(s). This is an open-access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
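A hedged mini-example of the first of these methods, inverse-probability-of-treatment weighting, on simulated data (assuming scikit-learn is available for the propensity model); it is not drawn from the cited primer and simply shows how weighting recovers the average exposure effect in the total population that the naive contrast misses.

```python
# IPTW on simulated data: propensity model for the exposure, weights 1/P(A=a|L),
# then a weighted contrast of outcome means.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(5)
n = 5000
L = rng.normal(size=n)                                   # confounder
A = rng.binomial(1, 1 / (1 + np.exp(-0.8 * L)))          # exposure depends on L
Y = 1.0 * A + 1.5 * L + rng.normal(size=n)               # true effect of A is 1.0

ps = LogisticRegression().fit(L.reshape(-1, 1), A).predict_proba(L.reshape(-1, 1))[:, 1]
w = np.where(A == 1, 1.0 / ps, 1.0 / (1.0 - ps))         # IPT weights

naive = Y[A == 1].mean() - Y[A == 0].mean()              # confounded contrast
iptw = (np.sum(w * A * Y) / np.sum(w * A)
        - np.sum(w * (1 - A) * Y) / np.sum(w * (1 - A)))
print(f"naive difference: {naive:.3f}, IPTW estimate: {iptw:.3f}")
```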
Extended Kalman Doppler tracking and model determination for multi-sensor short-range radar
NASA Astrophysics Data System (ADS)
Mittermaier, Thomas J.; Siart, Uwe; Eibert, Thomas F.; Bonerz, Stefan
2016-09-01
A tracking solution for collision avoidance in industrial machine tools based on short-range millimeter-wave radar Doppler observations is presented. At the core of the tracking algorithm there is an Extended Kalman Filter (EKF) that provides dynamic estimation and localization in real-time. The underlying sensor platform consists of several homodyne continuous wave (CW) radar modules. Based on In-phase-Quadrature (IQ) processing and down-conversion, they provide only Doppler shift information about the observed target. Localization with Doppler shift estimates is a nonlinear problem that needs to be linearized before the linear KF can be applied. The accuracy of state estimation depends highly on the introduced linearization errors, the initialization and the models that represent the true physics as well as the stochastic properties. The important issue of filter consistency is addressed and an initialization procedure based on data fitting and maximum likelihood estimation is suggested. Models for both, measurement and process noise are developed. Tracking results from typical three-dimensional courses of movement at short distances in front of a multi-sensor radar platform are presented.
ERIC Educational Resources Information Center
Jones, Douglas H.
The progress of modern mental test theory depends very much on the techniques of maximum likelihood estimation, and many popular applications make use of likelihoods induced by logistic item response models. While, in reality, item responses are nonreplicate within a single examinee and the logistic models are only ideal, practitioners make…
Diagnostic decision-making and strategies to improve diagnosis.
Thammasitboon, Satid; Cutrer, William B
2013-10-01
A significant portion of diagnostic errors arises through cognitive errors resulting from inadequate knowledge, faulty data gathering, and/or faulty verification. Experts estimate that 75% of diagnostic failures can be attributed to failures in clinician diagnostic thinking. The cognitive processes that underlie the diagnostic thinking of clinicians are complex and intriguing, and it is imperative that clinicians acquire an explicit appreciation and application of different cognitive approaches to make better decisions. A dual-process model that unifies many theories of decision-making has emerged as a promising template for understanding how clinicians think and judge efficiently in a diagnostic reasoning process. The identification and implementation of strategies for decreasing or preventing such diagnostic errors has become a growing area of interest and research. Suggested strategies to decrease diagnostic error incidence include increasing clinicians' clinical expertise and avoiding inherent cognitive errors. Implementing interventions focused solely on avoiding errors may work effectively for patient safety issues such as medication errors. Addressing cognitive errors, however, requires equal effort on expanding the individual clinician's expertise. Providing cognitive support to clinicians for robust diagnostic decision-making serves as the final strategic target for decreasing diagnostic errors. Clinical guidelines and algorithms offer another method for streamlining decision-making and decreasing the likelihood of cognitive diagnostic errors. Addressing cognitive processing errors is undeniably the most challenging task in reducing diagnostic errors. While many suggested approaches exist, they are mostly based on theories and sciences in cognitive psychology, decision-making, and education. The proposed interventions are primarily suggestions and very few of them have been tested in actual practice settings. Collaborative research effort is required to effectively address cognitive processing errors. Researchers in various areas, including patient safety/quality improvement, decision-making, and problem solving, must work together to make medical diagnosis more reliable. © 2013 Mosby, Inc. All rights reserved.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lu, Dan; Ricciuto, Daniel M.; Walker, Anthony P.
Calibration of terrestrial ecosystem models is important but challenging. Bayesian inference implemented by Markov chain Monte Carlo (MCMC) sampling provides a comprehensive framework to estimate model parameters and associated uncertainties using their posterior distributions. The effectiveness and efficiency of the method strongly depend on the MCMC algorithm used. In this work, a differential evolution adaptive Metropolis (DREAM) algorithm is used to estimate posterior distributions of 21 parameters for the data assimilation linked ecosystem carbon (DALEC) model using 14 years of daily net ecosystem exchange data collected at the Harvard Forest Environmental Measurement Site eddy-flux tower. The calibration of DREAM results in a better model fit and predictive performance compared to the popular adaptive Metropolis (AM) scheme. Moreover, DREAM indicates that two parameters controlling autumn phenology have multiple modes in their posterior distributions while AM only identifies one mode. The application suggests that DREAM is very suitable to calibrate complex terrestrial ecosystem models, where the uncertain parameter size is usually large and existence of local optima is always a concern. In addition, this effort justifies the assumptions of the error model used in Bayesian calibration according to the residual analysis. Here, the result indicates that a heteroscedastic, correlated, Gaussian error model is appropriate for the problem, and the consequent constructed likelihood function can alleviate the underestimation of parameter uncertainty that is usually caused by using uncorrelated error models.
Lu, Dan; Ricciuto, Daniel M.; Walker, Anthony P.; ...
2017-09-27
Calibration of terrestrial ecosystem models is important but challenging. Bayesian inference implemented by Markov chain Monte Carlo (MCMC) sampling provides a comprehensive framework to estimate model parameters and associated uncertainties using their posterior distributions. The effectiveness and efficiency of the method strongly depend on the MCMC algorithm used. In this work, a differential evolution adaptive Metropolis (DREAM) algorithm is used to estimate posterior distributions of 21 parameters for the data assimilation linked ecosystem carbon (DALEC) model using 14 years of daily net ecosystem exchange data collected at the Harvard Forest Environmental Measurement Site eddy-flux tower. The calibration of DREAM results in a better model fit and predictive performance compared to the popular adaptive Metropolis (AM) scheme. Moreover, DREAM indicates that two parameters controlling autumn phenology have multiple modes in their posterior distributions while AM only identifies one mode. The application suggests that DREAM is very suitable to calibrate complex terrestrial ecosystem models, where the uncertain parameter size is usually large and existence of local optima is always a concern. In addition, this effort justifies the assumptions of the error model used in Bayesian calibration according to the residual analysis. Here, the result indicates that a heteroscedastic, correlated, Gaussian error model is appropriate for the problem, and the consequent constructed likelihood function can alleviate the underestimation of parameter uncertainty that is usually caused by using uncorrelated error models.
Lobach, Iryna; Fan, Ruzong; Carroll, Raymond J.
2011-01-01
With the advent of dense single nucleotide polymorphism genotyping, population-based association studies have become the major tools for identifying human disease genes and for fine gene mapping of complex traits. We develop a genotype-based approach for association analysis of case-control studies of gene-environment interactions in the case when environmental factors are measured with error and genotype data are available on multiple genetic markers. To directly use the observed genotype data, we propose two genotype-based models: genotype effect and additive effect models. Our approach offers several advantages. First, the proposed risk functions can directly incorporate the observed genotype data while modeling the linkage disequilibrium information in the regression coefficients, thus eliminating the need to infer haplotype phase. Compared with the haplotype-based approach, an estimating procedure based on the proposed methods can be much simpler and significantly faster. In addition, there is no potential risk due to haplotype phase estimation. Further, by fitting the proposed models, it is possible to analyze the risk alleles/variants of complex diseases, including their dominant or additive effects. To model measurement error, we adopt the pseudo-likelihood method by Lobach et al. [2008]. Performance of the proposed method is examined using simulation experiments. An application of our method is illustrated using a population-based case-control study of association between calcium intake with the risk of colorectal adenoma development. PMID:21031455
Verveer, P. J; Gemkow, M. J; Jovin, T. M
1999-01-01
We have compared different image restoration approaches for fluorescence microscopy. The most widely used algorithms were classified with a Bayesian theory according to the assumed noise model and the type of regularization imposed. We considered both Gaussian and Poisson models for the noise in combination with Tikhonov regularization, entropy regularization, Good's roughness and without regularization (maximum likelihood estimation). Simulations of fluorescence confocal imaging were used to examine the different noise models and regularization approaches using the mean squared error criterion. The assumption of a Gaussian noise model yielded only slightly higher errors than the Poisson model. Good's roughness was the best choice for the regularization. Furthermore, we compared simulated confocal and wide-field data. In general, restored confocal data are superior to restored wide-field data, but given sufficient higher signal level for the wide-field data the restoration result may rival confocal data in quality. Finally, a visual comparison of experimental confocal and wide-field data is presented.
Sreenivas, K; Sekhar, N Seshadri; Saxena, Manoj; Paliwal, R; Pathak, S; Porwal, M C; Fyzee, M A; Rao, S V C Kameswara; Wadodkar, M; Anasuya, T; Murthy, M S R; Ravisankar, T; Dadhwal, V K
2015-09-15
The present study aims at the analysis of spatial and temporal variability in agricultural land cover during 2005-06 and 2011-12 from an ongoing program of annual land use mapping using multidate Advanced Wide Field Sensor (AWiFS) data aboard Resourcesat-1 and 2. About 640-690 multi-temporal AWiFS quadrant data products per year (depending on cloud cover) were co-registered and radiometrically normalized to prepare state (administrative unit) mosaics. An 18-fold classification was adopted in this project. Rule-based techniques along with a maximum-likelihood algorithm were employed to derive land cover information as well as changes within agricultural land cover classes. The agricultural land cover classes include kharif (June-October), rabi (November-April), zaid (April-June), area sown more than once, fallow lands and plantation crops. Mean kappa accuracy of these estimates varied from 0.87 to 0.96 for the various classes. The standard error of estimate was computed for each class annually and the area estimates were corrected using the standard error of estimate. The corrected estimates range between 99 and 116 Mha for kharif and 77-91 Mha for rabi. The kharif, rabi and net sown area were aggregated at a 10 km × 10 km grid on an annual basis for the whole of India and the CV was computed at each grid cell using the temporal spatially-aggregated area as input. This spatial variability of agricultural land cover classes was analyzed across meteorological zones, irrigated command areas and administrative boundaries. The results indicate that, of the various states/meteorological zones, Punjab was consistently cropped during both kharif and rabi seasons. Of all irrigated commands, the Tawa irrigated command was consistently cropped during the rabi season. Copyright © 2014 Elsevier Ltd. All rights reserved.
Lloyd, Andrew; Kerr, Cicely; Breheny, Katie; Brazier, John; Ortiz, Aurora; Borg, Emma
2014-03-01
Condition-specific preference-based measures can offer utility data where they would not otherwise be available or where generic measures may lack sensitivity, although they lack comparability across conditions. This study aimed to develop an algorithm for estimating utilities from the short bowel syndrome health-related quality of life scale (SBS-QoL™). SBS-QoL™ items were selected based on factor and item performance analysis of a European SBS-QoL™ dataset and consultation with 3 SBS clinical experts. Six-dimension health states were developed using 8 SBS-QoL™ items (2 dimensions combined 2 SBS-QoL™ items). SBS health states were valued by a UK general population sample (N = 250) using the lead-time time trade-off method. Preference weights or 'utility decrements' for each severity level of each dimension were estimated by regression models and used to develop the scoring algorithm. Mean utilities for the SBS health states ranged from -0.46 (worst health state, very much affected on all dimensions) to 0.92 (best health state, not at all affected on all dimensions). The random effects model with maximum likelihood estimation regression had the best predictive ability and lowest root mean squared error and mean absolute error, and was used to develop the scoring algorithm. The preference-weighted scoring algorithm for the SBS-QoL™ developed is able to estimate a wide range of utility values from patient-level SBS-QoL™ data. This allows estimation of SBS HRQL impact for the purpose of economic evaluation of SBS treatment benefits.
Vigan, Marie; Stirnemann, Jérôme; Mentré, France
2014-05-01
Analysis of repeated time-to-event data is increasingly performed in pharmacometrics using parametric frailty models. The aims of this simulation study were (1) to assess the estimation performance of the Stochastic Approximation Expectation Maximization (SAEM) algorithm in MONOLIX and the Adaptive Gaussian Quadrature (AGQ) and Laplace algorithms in PROC NLMIXED of SAS and (2) to evaluate the properties of tests of a dichotomous covariate on the occurrence of events. The simulation setting is inspired by an analysis of the occurrence of bone events after the initiation of treatment by imiglucerase in patients with Gaucher Disease (GD). We simulated repeated events with an exponential model and various dropout rates: no, low, or high. Several values of the baseline hazard model, variability, number of subjects, and effect of the covariate were studied. For each scenario, 100 datasets were simulated for estimation performance and 500 for test performance. We evaluated estimation performance through relative bias and relative root mean square error (RRMSE). We studied the properties of the Wald and likelihood ratio tests (LRT). We used these methods to analyze the occurrence of bone events in patients with GD after starting an enzyme replacement therapy. SAEM with three chains and AGQ algorithms provided good estimates of the parameters, much better than SAEM with one chain and Laplace, which often provided poor estimates. Despite a small number of repeated events, SAEM with three chains and AGQ gave small biases and RRMSE. Type I errors were close to 5%, and power varied as expected for SAEM with three chains and AGQ. The probability of having at least one event under treatment was 19.1%.
Computation of nonparametric convex hazard estimators via profile methods.
Jankowski, Hanna K; Wellner, Jon A
2009-05-01
This paper proposes a profile likelihood algorithm to compute the nonparametric maximum likelihood estimator of a convex hazard function. The maximisation is performed in two steps: First the support reduction algorithm is used to maximise the likelihood over all hazard functions with a given point of minimum (or antimode). Then it is shown that the profile (or partially maximised) likelihood is quasi-concave as a function of the antimode, so that a bisection algorithm can be applied to find the maximum of the profile likelihood, and hence also the global maximum. The new algorithm is illustrated using both artificial and real data, including lifetime data for Canadian males and females.
ERIC Educational Resources Information Center
Penfield, Randall D.; Bergeron, Jennifer M.
2005-01-01
This article applies a weighted maximum likelihood (WML) latent trait estimator to the generalized partial credit model (GPCM). The relevant equations required to obtain the WML estimator using the Newton-Raphson algorithm are presented, and a simulation study is described that compared the properties of the WML estimator to those of the maximum…
Rate of convergence of k-step Newton estimators to efficient likelihood estimators
Steve Verrill
2007-01-01
We make use of Cramer conditions together with the well-known local quadratic convergence of Newton's method to establish the asymptotic closeness of k-step Newton estimators to efficient likelihood estimators. In Verrill and Johnson [2007. Confidence bounds and hypothesis tests for normal distribution coefficients of variation. USDA Forest Products Laboratory Research...
Schwartzkopf, Wade C; Bovik, Alan C; Evans, Brian L
2005-12-01
Traditional chromosome imaging has been limited to grayscale images, but recently a 5-fluorophore combinatorial labeling technique (M-FISH) was developed wherein each class of chromosomes binds with a different combination of fluorophores. This results in a multispectral image, where each class of chromosomes has distinct spectral components. In this paper, we develop new methods for automatic chromosome identification by exploiting the multispectral information in M-FISH chromosome images and by jointly performing chromosome segmentation and classification. We (1) develop a maximum-likelihood hypothesis test that uses multispectral information, together with conventional criteria, to select the best segmentation possibility; (2) use this likelihood function to combine chromosome segmentation and classification into a robust chromosome identification system; and (3) show that the proposed likelihood function can also be used as a reliable indicator of errors in segmentation, errors in classification, and chromosome anomalies, which can be indicators of radiation damage, cancer, and a wide variety of inherited diseases. We show that the proposed multispectral joint segmentation-classification method outperforms past grayscale segmentation methods when decomposing touching chromosomes. We also show that it outperforms past M-FISH classification techniques that do not use segmentation information.
Accuracy of maximum likelihood estimates of a two-state model in single-molecule FRET
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gopich, Irina V.
2015-01-21
Photon sequences from single-molecule Förster resonance energy transfer (FRET) experiments can be analyzed using a maximum likelihood method. Parameters of the underlying kinetic model (FRET efficiencies of the states and transition rates between conformational states) are obtained by maximizing the appropriate likelihood function. In addition, the errors (uncertainties) of the extracted parameters can be obtained from the curvature of the likelihood function at the maximum. We study the standard deviations of the parameters of a two-state model obtained from photon sequences with recorded colors and arrival times. The standard deviations can be obtained analytically in a special case when the FRET efficiencies of the states are 0 and 1 and in the limiting cases of fast and slow conformational dynamics. These results are compared with the results of numerical simulations. The accuracy and, therefore, the ability to predict model parameters depend on how fast the transition rates are compared to the photon count rate. In the limit of slow transitions, the key parameters that determine the accuracy are the number of transitions between the states and the number of independent photon sequences. In the fast transition limit, the accuracy is determined by the small fraction of photons that are correlated with their neighbors. The relative standard deviation of the relaxation rate has a “chevron” shape as a function of the transition rate in the log-log scale. The location of the minimum of this function dramatically depends on how well the FRET efficiencies of the states are separated.
Schwappach, David L. B.; Gehring, Katrin
2014-01-01
Purpose To investigate the likelihood of speaking up about patient safety in oncology and to clarify the effect of clinical and situational context factors on the likelihood of voicing concerns. Patients and Methods 1013 nurses and doctors in oncology rated four clinical vignettes describing coworkers’ errors and rule violations in a self-administered factorial survey (65% response rate). Multiple regression analysis was used to model the likelihood of speaking up as an outcome of vignette attributes, responders’ evaluations of the situation and personal characteristics. Results Respondents reported a high likelihood of speaking up about patient safety, but the variation between and within types of errors and rule violations was substantial. Staff without a managerial function reported significantly higher levels of decision difficulty and discomfort about speaking up. Based on the information presented in the vignettes, 74%−96% would speak up towards a supervisor failing to check a prescription, 45%−81% would point a coworker to a missed hand disinfection, 82%−94% would speak up towards nurses who violate a safety rule in medication preparation, and 59%−92% would question a doctor violating a safety rule in lumbar puncture. Several vignette attributes predicted the likelihood of speaking up. Perceived potential harm, anticipated discomfort, and decision difficulty were significant predictors of the likelihood of speaking up. Conclusions Clinicians’ willingness to speak up about patient safety is considerably affected by contextual factors. Physicians and nurses without managerial function report substantial discomfort with speaking up. Oncology departments should provide staff with clear guidance and training on when and how to voice safety concerns. PMID:25116338
Won, Jongsung; Cheng, Jack C P; Lee, Ghang
2016-03-01
Waste generated in construction and demolition processes comprised around 50% of the solid waste in South Korea in 2013. Many cases show that design validation based on building information modeling (BIM) is an effective means to reduce the amount of construction waste since construction waste is mainly generated due to improper design and unexpected changes in the design and construction phases. However, the amount of construction waste that could be avoided by adopting BIM-based design validation has been unknown. This paper aims to estimate the amount of construction waste prevented by a BIM-based design validation process based on the amount of construction waste that might be generated due to design errors. Two project cases in South Korea were studied in this paper, with 381 and 136 design errors detected, respectively during the BIM-based design validation. Each design error was categorized according to its cause and the likelihood of detection before construction. The case studies show that BIM-based design validation could prevent 4.3-15.2% of construction waste that might have been generated without using BIM. Copyright © 2015 Elsevier Ltd. All rights reserved.
ERIC Educational Resources Information Center
Casabianca, Jodi M.; Lewis, Charles
2015-01-01
Loglinear smoothing (LLS) estimates the latent trait distribution while making fewer assumptions about its form and maintaining parsimony, thus leading to more precise item response theory (IRT) item parameter estimates than standard marginal maximum likelihood (MML). This article provides the expectation-maximization algorithm for MML estimation…
NASA Astrophysics Data System (ADS)
Heavens, A. F.; Seikel, M.; Nord, B. D.; Aich, M.; Bouffanais, Y.; Bassett, B. A.; Hobson, M. P.
2014-12-01
The Fisher Information Matrix formalism (Fisher 1935) is extended to cases where the data are divided into two parts (X, Y), where the expectation value of Y depends on X according to some theoretical model, and X and Y both have errors with arbitrary covariance. In the simplest case, (X, Y) are data pairs of abscissa and ordinate, so the analysis covers data pairs with errors in both coordinates, but X can be any measured quantities on which Y depends. The analysis applies for arbitrary covariance, provided all errors are Gaussian, and provided the errors in X are small, both in comparison with the scale over which the expected signal Y changes and with the width of the prior distribution. This generalizes the Fisher Matrix approach, which normally only considers errors in the 'ordinate' Y. In this work, we include errors in X by marginalizing over latent variables, effectively employing a Bayesian hierarchical model, and deriving the Fisher Matrix for this more general case. The methods here also extend to likelihood surfaces which are not Gaussian in the parameter space, so techniques such as DALI (Derivative Approximation for Likelihoods) can be generalized straightforwardly to include arbitrary Gaussian data error covariances. For simple mock data and theoretical models, we compare to Markov Chain Monte Carlo experiments, illustrating the method with cosmological supernova data. We also include the new method in the FISHER4CAST software.
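A toy numerical sketch of the small-x-error limit of the idea above: scatter in the abscissa is propagated through the model slope into an inflated ("effective") ordinate variance before the usual Fisher sum is taken. The illustrative linear model, the numerical derivatives, and the omission of prior-width and covariance-derivative terms are simplifying assumptions, not the full formalism of the paper.

```python
# Fisher matrix for (x, y) data with Gaussian errors in both coordinates,
# using the effective-variance approximation valid for small x errors.
import numpy as np

def model(x, theta):
    a, b = theta                            # illustrative 2-parameter model y = a + b*x
    return a + b * x

def fisher_matrix(x, theta, sig_x, sig_y, eps=1e-6):
    dydx = (model(x + eps, theta) - model(x - eps, theta)) / (2 * eps)
    var_eff = sig_y**2 + (dydx * sig_x)**2  # effective variance per data point
    grads = []                              # numerical d(mean model)/d(theta_i)
    for i in range(len(theta)):
        tp, tm = np.array(theta, float), np.array(theta, float)
        tp[i] += eps
        tm[i] -= eps
        grads.append((model(x, tp) - model(x, tm)) / (2 * eps))
    F = np.empty((len(theta), len(theta)))
    for i in range(len(theta)):
        for j in range(len(theta)):
            F[i, j] = np.sum(grads[i] * grads[j] / var_eff)
    return F

x = np.linspace(0.0, 10.0, 50)
F = fisher_matrix(x, theta=(1.0, 2.0), sig_x=0.3, sig_y=0.5)
print("Forecast 1-sigma parameter errors:", np.sqrt(np.diag(np.linalg.inv(F))))
```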
An Improved Nested Sampling Algorithm for Model Selection and Assessment
NASA Astrophysics Data System (ADS)
Zeng, X.; Ye, M.; Wu, J.; WANG, D.
2017-12-01
A multimodel strategy is a general approach for treating model structure uncertainty in recent research. The unknown groundwater system is represented by several plausible conceptual models, each assigned a weight that represents its plausibility. In a Bayesian framework, the posterior model weight is computed as the product of the model prior weight and the marginal likelihood (also termed the model evidence). As a result, estimating marginal likelihoods is crucial for reliable model selection and assessment in multimodel analysis. The nested sampling estimator (NSE) is a recently proposed algorithm for marginal likelihood estimation. The implementation of NSE comprises searching the parameter space gradually from low-likelihood to high-likelihood regions, and this evolution is carried out iteratively via a local sampling procedure. Thus, the efficiency of NSE is dominated by the strength of the local sampling procedure. Currently, the Metropolis-Hastings (M-H) algorithm and its variants are often used for local sampling in NSE. However, M-H is not an efficient sampling algorithm for high-dimensional or complex likelihood functions. To improve the performance of NSE, it is feasible to integrate a more efficient and elaborate sampling algorithm, DREAMzs, into the local sampling. In addition, to overcome the computational burden of the large number of repeated model executions required for marginal likelihood estimation, an adaptive sparse grid stochastic collocation method is used to build surrogates for the original groundwater model.
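A minimal sketch of the nested sampling estimator on a toy two-parameter problem, to make the "search from low- to high-likelihood regions via local sampling" explicit. The local step here is a plain constrained Metropolis walk, not the M-H variants or DREAMzs sampler discussed in the abstract, and the toy Gaussian likelihood with a unit-square uniform prior is an assumption for illustration only (its true log evidence is close to 0).

```python
# Minimal nested sampling for marginal likelihood (evidence) estimation.
import numpy as np

rng = np.random.default_rng(2)

def log_like(theta):                        # toy likelihood: isotropic Gaussian
    return -0.5 * np.sum((theta - 0.5) ** 2) / 0.05**2 - np.log(2 * np.pi * 0.05**2)

def sample_prior(n):                        # uniform prior on the unit square
    return rng.uniform(0, 1, size=(n, 2))

def constrained_step(theta, l_min, n_steps=20, scale=0.05):
    """Metropolis walk restricted to the region log L > l_min (hard constraint)."""
    for _ in range(n_steps):
        prop = theta + rng.normal(0, scale, size=theta.shape)
        if np.all((prop >= 0) & (prop <= 1)) and log_like(prop) > l_min:
            theta = prop
    return theta

def nested_sampling(n_live=200, n_iter=2000):
    live = sample_prior(n_live)
    live_ll = np.array([log_like(t) for t in live])
    log_z = -np.inf
    log_w = np.log(1 - np.exp(-1.0 / n_live))              # first prior-shell width
    for _ in range(n_iter):
        worst = np.argmin(live_ll)
        log_z = np.logaddexp(log_z, log_w + live_ll[worst])  # accumulate evidence
        others = np.delete(np.arange(n_live), worst)
        seed = live[rng.choice(others)].copy()
        live[worst] = constrained_step(seed, live_ll[worst])
        live_ll[worst] = log_like(live[worst])
        log_w -= 1.0 / n_live                                # shrink prior volume
    log_l_max = live_ll.max()                                # remaining live points
    log_z = np.logaddexp(log_z, log_l_max + np.log(np.mean(np.exp(live_ll - log_l_max)))
                         - n_iter / n_live)
    return log_z

print("log evidence ~", nested_sampling())
```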
The small low SNR target tracking using sparse representation information
NASA Astrophysics Data System (ADS)
Yin, Lifan; Zhang, Yiqun; Wang, Shuo; Sun, Chenggang
2017-11-01
Tracking small targets, such as missile warheads, from a remote distance is a difficult task since the targets are “points” which are similar to the sensor's noise points. As a result, traditional tracking algorithms only use the information contained in point measurements, such as position and intensity, as characteristics to identify targets from noise points. In fact, because of photon diffusion, a small target is not a point in the focal plane array: it occupies an area larger than one sensor cell. So, if the geometry characteristic is taken into account as a new dimension of information, it will be helpful in distinguishing targets from noise points. In this paper, we use a method named sparse representation (SR) to describe the geometry information of the target intensity and define it as the SR information of the target. By modeling the intensity spread and solving for its SR coefficients, the SR information is represented through its likelihood function. Further, the SR information likelihood is incorporated into the conventional Probability Hypothesis Density (PHD) filter algorithm with point measurements. To illustrate the performance of the algorithm with and without the SR information, the detection capability and estimation error have been compared through simulation. Results demonstrate that the proposed method has higher estimation accuracy and a higher probability of detecting the target than the conventional algorithm without the SR information.
A comparative review of methods for comparing means using partially paired data.
Guo, Beibei; Yuan, Ying
2017-06-01
In medical experiments with the objective of testing the equality of two means, data are often partially paired by design or because of missing data. The partially paired data represent a combination of paired and unpaired observations. In this article, we review and compare nine methods for analyzing partially paired data, including the two-sample t-test, paired t-test, corrected z-test, weighted t-test, pooled t-test, optimal pooled t-test, multiple imputation method, mixed model approach, and the test based on a modified maximum likelihood estimate. We compare the performance of these methods through extensive simulation studies that cover a wide range of scenarios with different effect sizes, sample sizes, and correlations between the paired variables, as well as true underlying distributions. The simulation results suggest that when the sample size is moderate, the test based on the modified maximum likelihood estimator is generally superior to the other approaches when the data is normally distributed and the optimal pooled t-test performs the best when the data is not normally distributed, with well-controlled type I error rates and high statistical power; when the sample size is small, the optimal pooled t-test is to be recommended when both variables have missing data and the paired t-test is to be recommended when only one variable has missing data.
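A small sketch contrasting two of the simpler analyses of partially paired data reviewed above: the paired t-test, which uses only the complete pairs but exploits their correlation, and the two-sample (Welch) t-test, which uses all observations but ignores the pairing. The synthetic data and effect size are illustrative; the optimal pooled t-test and the modified-MLE test recommended in the abstract require additional machinery not shown here.

```python
# Paired vs. two-sample t-test on partially paired data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)

n_pairs, n_only_x, n_only_y = 30, 15, 15
effect = 0.5
x_paired = rng.normal(0.0, 1.0, n_pairs)
y_paired = 0.6 * x_paired + rng.normal(effect, 0.8, n_pairs)   # correlated with x
x_only = rng.normal(0.0, 1.0, n_only_x)                        # y missing
y_only = rng.normal(effect, 1.0, n_only_y)                     # x missing

# 1) Paired t-test on complete pairs only.
t_paired, p_paired = stats.ttest_rel(y_paired, x_paired)

# 2) Two-sample Welch t-test on all available observations, ignoring pairing.
t_unpaired, p_unpaired = stats.ttest_ind(
    np.concatenate([y_paired, y_only]),
    np.concatenate([x_paired, x_only]),
    equal_var=False,
)

print(f"paired t-test:     t = {t_paired:.2f}, p = {p_paired:.3f}")
print(f"two-sample t-test: t = {t_unpaired:.2f}, p = {p_unpaired:.3f}")
```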
Are volcanic seismic b-values high, and if so when?
NASA Astrophysics Data System (ADS)
Roberts, Nick S.; Bell, Andrew F.; Main, Ian G.
2015-12-01
The Gutenberg-Richter exponent b is a measure of the relative proportion of large and small earthquakes. It is commonly used to infer material properties such as heterogeneity, or mechanical properties such as the state of stress from earthquake populations. It is 'well known' that the b-value tends to be high or very high for volcanic earthquake populations relative to b = 1 for those of tectonic earthquakes, and that b varies significantly with time during periods of unrest. We first review the supporting evidence from 34 case studies, and identify weaknesses in this argument due predominantly to small sample size, the narrow bandwidth of magnitude scales available, variability in the methods used to assess the minimum or cutoff magnitude Mc, and to infer b. Informed by this, we use synthetic realisations to quantify the effect of choice of the cutoff magnitude on maximum likelihood estimates of b, and suggest a new work flow for this choice. We present the first quantitative estimate of the error in b introduced by uncertainties in estimating Mc, as a function of the number of events and the b-value itself. This error can significantly exceed the commonly-quoted statistical error in the estimated b-value, especially for the case that the underlying b-value is high. We apply the new methods to data sets from recent periods of unrest in El Hierro and Mount Etna. For El Hierro we confirm significantly high b-values of 1.5-2.5 prior to the 10 October 2011 eruption. For Mount Etna the b-values are indistinguishable from b = 1 within error, except during the flank eruptions at Mount Etna in 2001-2003, when 1.5 < b < 2.0. For the time period analysed, they are rarely lower than b = 1. Our results confirm that these volcano-tectonic earthquake populations can have systematically high b-values, especially when associated with eruptions. At other times they can be indistinguishable from those of tectonic earthquakes within the total error. The results have significant implications for operational forecasting informed by b-value variability, in particular in assessing the significance of b-value variations identified by sample sizes with fewer than 200 events above the completeness threshold.
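A sketch of the conventional maximum-likelihood (Aki-Utsu) b-value estimate above a chosen cutoff magnitude Mc, with the commonly quoted statistical error b/sqrt(N). The synthetic catalogue, binning, and cutoff values are illustrative; the paper's point is precisely that uncertainty in Mc itself adds error beyond this standard formula, so treat this as the baseline calculation only.

```python
# Maximum-likelihood b-value with half-bin correction and Aki statistical error.
import numpy as np

rng = np.random.default_rng(4)

def ml_b_value(mags, mc, dm=0.1):
    """Aki-Utsu estimator for binned magnitudes above cutoff mc."""
    m = np.asarray(mags)
    m = m[m >= mc]
    b = np.log10(np.e) / (m.mean() - (mc - dm / 2.0))
    sigma_b = b / np.sqrt(len(m))            # conventional statistical error
    return b, sigma_b, len(m)

# Synthetic Gutenberg-Richter catalogue with a high ("volcanic") true b = 1.5,
# complete above magnitude 1.0 and binned to 0.1 magnitude units.
true_b, m_complete = 1.5, 1.0
mags = m_complete + rng.exponential(np.log10(np.e) / true_b, size=1000)
mags = np.round(mags, 1)

for mc in (1.0, 1.2, 1.5):                   # sensitivity of b to the chosen cutoff
    b, sb, n = ml_b_value(mags, mc)
    print(f"Mc = {mc:.1f}: b = {b:.2f} +/- {sb:.2f} (N = {n})")
```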
Multiple robustness in factorized likelihood models.
Molina, J; Rotnitzky, A; Sued, M; Robins, J M
2017-09-01
We consider inference under a nonparametric or semiparametric model with likelihood that factorizes as the product of two or more variation-independent factors. We are interested in a finite-dimensional parameter that depends on only one of the likelihood factors and whose estimation requires the auxiliary estimation of one or several nuisance functions. We investigate general structures conducive to the construction of so-called multiply robust estimating functions, whose computation requires postulating several dimension-reducing models but which have mean zero at the true parameter value provided one of these models is correct.
ERIC Educational Resources Information Center
Wollack, James A.; Bolt, Daniel M.; Cohen, Allan S.; Lee, Young-Sun
2002-01-01
Compared the quality of item parameter estimates for marginal maximum likelihood (MML) and Markov Chain Monte Carlo (MCMC) with the nominal response model using simulation. The quality of item parameter recovery was nearly identical for MML and MCMC, and both methods tended to produce good estimates. (SLD)
Perceived barriers to medical-error reporting: an exploratory investigation.
Uribe, Claudia L; Schweikhart, Sharon B; Pathak, Dev S; Dow, Merrell; Marsh, Gail B
2002-01-01
Medical-error reporting is an essential component for patient safety enhancement. Unfortunately, medical errors are largely underreported across healthcare institutions. This problem can be attributed to different factors and barriers present at organizational and individual levels that ultimately prevent individuals from generating the report. This study explored the factors that affect medical-error reporting among physicians and nurses at a large academic medical center located in the midwest United States. A nominal group session was conducted to identify the most relevant factors that act as barriers for error reporting. These factors were then used to design a questionnaire that explored the likelihood of the factors to act as barriers and their likelihood to be modified. Using these two parameters, the results were analyzed and combined into a Factor Relevance Matrix. The matrix identifies the factors for which immediate actions should be undertaken to improve medical-error reporting (immediate action factors). It also identifies factors that require long-term strategies (long-term strategy factors) as well as factors that the organization should be aware of but that are of lower priority (awareness factors). The strategies outlined in this study may assist healthcare organizations in improving medical-error reporting, as part of the efforts toward patient-safety enhancement. Although factors affecting medical-error reporting may vary between different organizations, the process used in identifying the factors and the Factor Relevance Matrix developed in this study are easily adaptable to any organizational setting.
NASA Astrophysics Data System (ADS)
Mazzetti, S.; Giannini, V.; Russo, F.; Regge, D.
2018-05-01
Computer-aided diagnosis (CAD) systems are increasingly being used in clinical settings to report multi-parametric magnetic resonance imaging (mp-MRI) of the prostate. Usually, CAD systems automatically highlight cancer-suspicious regions to the radiologist, reducing reader variability and interpretation errors. Nevertheless, implementing this software requires the selection of which mp-MRI parameters can best discriminate between malignant and non-malignant regions. To exploit functional information, some parameters are derived from dynamic contrast-enhanced (DCE) acquisitions. In particular, much CAD software employs pharmacokinetic features, such as K trans and k ep, derived from the Tofts model, to estimate a likelihood map of malignancy. However, non-pharmacokinetic models can also be used to describe DCE-MRI curves, without any requirement for prior knowledge or measurement of the arterial input function, which could potentially lead to large errors in parameter estimation. In this work, we implemented an empirical function derived from the phenomenological universalities (PUN) class to fit DCE-MRI curves. The parameters of the PUN model are used in combination with T2-weighted and diffusion-weighted acquisitions to feed a support vector machine classifier that produces a voxel-wise malignancy likelihood map of the prostate. The results were compared with those of a CAD system that uses Tofts pharmacokinetic features to describe DCE-MRI curves, in terms of several aspects of segmentation quality and of the number and size of false-positive (FP) candidate regions. This study included 61 patients with 70 biopsy-proven prostate cancers (PCa). The segmentation-quality metrics were not statistically different between the two CAD systems, although the PUN-based CAD reported fewer and smaller FP regions than the Tofts-based CAD. In conclusion, CAD software based on PUN parameters is a feasible means of detecting PCa without affecting segmentation quality, and hence it could be successfully applied in clinical settings, improving the automated diagnosis process and reducing computational complexity.
Understanding seasonal variability of uncertainty in hydrological prediction
NASA Astrophysics Data System (ADS)
Li, M.; Wang, Q. J.
2012-04-01
Understanding uncertainty in hydrological prediction can be highly valuable for improving the reliability of streamflow prediction. In this study, a monthly water balance model, WAPABA, is combined with a Bayesian joint probability approach and alternative error models to investigate the seasonal dependence of the prediction error structure. A seasonally invariant error model, analogous to traditional time series analysis, uses constant parameters for the model error and accounts for no seasonal variation. In contrast, a seasonally variant error model uses a different set of parameters for bias, variance and autocorrelation for each calendar month. Potential connections among model parameters from similar months are not considered within the seasonally variant model, which could result in over-fitting and over-parameterization. A hierarchical error model further applies distributional restrictions on the model parameters within a Bayesian hierarchical framework. An iterative algorithm is implemented to expedite the maximum a posteriori (MAP) estimation of the hierarchical error model. The three error models are applied to forecasting streamflow at a catchment in southeast Australia in a cross-validation analysis. This study also presents a number of statistical measures and graphical tools to compare the predictive skills of the different error models. From probability integral transform histograms and other diagnostic graphs, the hierarchical error model conforms better to reliability than the seasonally invariant error model. The hierarchical error model also generally provides the most accurate mean prediction in terms of the Nash-Sutcliffe model efficiency coefficient and the best probabilistic prediction in terms of the continuous ranked probability score (CRPS). The parameters of the seasonally variant error model are very sensitive to each cross-validation fold, while the hierarchical error model produces much more robust and reliable parameters. Furthermore, the results of the hierarchical error model show that most model parameters are not seasonally variant except for the error bias. The seasonally variant error model is likely to use more parameters than necessary to maximize the posterior likelihood. The flexibility and robustness of the model indicate that the hierarchical error model has great potential for future streamflow predictions.
NASA Astrophysics Data System (ADS)
Martinsson, J.
2013-03-01
We propose methods for robust Bayesian inference of the hypocentre in presence of poor, inconsistent and insufficient phase arrival times. The objectives are to increase the robustness, the accuracy and the precision by introducing heavy-tailed distributions and an informative prior distribution of the seismicity. The effects of the proposed distributions are studied under real measurement conditions in two underground mine networks and validated using 53 blasts with known hypocentres. To increase the robustness against poor, inconsistent or insufficient arrivals, a Gaussian Mixture Model is used as a hypocentre prior distribution to describe the seismically active areas, where the parameters are estimated based on previously located events in the region. The prior is truncated to constrain the solution to valid geometries, for example below the ground surface, excluding known cavities, voids and fractured zones. To reduce the sensitivity to outliers, different heavy-tailed distributions are evaluated to model the likelihood distribution of the arrivals given the hypocentre and the origin time. Among these distributions, the multivariate t-distribution is shown to produce the overall best performance, where the tail-mass adapts to the observed data. Hypocentre and uncertainty region estimates are based on simulations from the posterior distribution using Markov Chain Monte Carlo techniques. Velocity graphs (equivalent to traveltime graphs) are estimated using blasts from known locations, and applied to reduce the main uncertainties and thereby the final estimation error. To focus on the behaviour and the performance of the proposed distributions, a basic single-event Bayesian procedure is considered in this study for clarity. Estimation results are shown with different distributions, with and without prior distribution of seismicity, with wrong prior distribution, with and without error compensation, with and without error description, with insufficient arrival times and in presence of significant outliers. A particular focus is on visual results and comparisons to give a better understanding of the Bayesian advantage and to show the effects of heavy-tailed distributions and informative prior information on real data.
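A sketch of the log-posterior ingredients described above for a single event: a heavy-tailed (Student-t) likelihood for arrival-time residuals, a Gaussian-mixture prior over seismically active regions, and a truncation to valid (underground) geometries. The station geometry, homogeneous velocity, mixture parameters, and all numerical values are illustrative assumptions, not the mine networks or calibrated velocity graphs of the study; in practice this target would be explored with MCMC.

```python
# Log-posterior for a hypocentre with a Student-t arrival-time likelihood
# and a Gaussian-mixture prior on location.
import numpy as np
from scipy import stats

stations = np.array([[0.0, 0.0, 0.0], [500.0, 0.0, -100.0],
                     [0.0, 500.0, -200.0], [500.0, 500.0, -300.0]])  # metres
v_p = 5000.0                          # assumed homogeneous P velocity, m/s

# Gaussian-mixture prior, conceptually fitted to previously located events.
prior_means = np.array([[250.0, 250.0, -150.0], [100.0, 400.0, -250.0]])
prior_weights = np.array([0.7, 0.3])
prior_sigma = 80.0                    # isotropic component std, for simplicity

def log_prior(x):
    comps = [np.log(w) + stats.multivariate_normal.logpdf(x, m, prior_sigma**2 * np.eye(3))
             for w, m in zip(prior_weights, prior_means)]
    return np.logaddexp.reduce(comps)

def log_likelihood(x, t0, t_obs, df=4.0, scale=0.005):
    """Student-t likelihood of arrival-time residuals (seconds); heavy tails
    reduce the influence of poor or inconsistent picks."""
    t_pred = t0 + np.linalg.norm(stations - x, axis=1) / v_p
    return stats.t.logpdf(t_obs - t_pred, df=df, scale=scale).sum()

def log_posterior(params, t_obs):
    x, t0 = params[:3], params[3]
    if x[2] > 0.0:                    # truncated prior: solution must be underground
        return -np.inf
    return log_prior(x) + log_likelihood(x, t0, t_obs)

t_obs = np.array([0.062, 0.071, 0.068, 0.080])
print(log_posterior(np.array([240.0, 260.0, -140.0, 0.0]), t_obs))
```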
Eickhoff, Simon B; Nichols, Thomas E; Laird, Angela R; Hoffstaedter, Felix; Amunts, Katrin; Fox, Peter T; Bzdok, Danilo; Eickhoff, Claudia R
2016-08-15
Given the increasing number of neuroimaging publications, automated knowledge extraction on brain-behavior associations by quantitative meta-analyses has become a highly important and rapidly growing field of research. Among several methods to perform coordinate-based neuroimaging meta-analyses, Activation Likelihood Estimation (ALE) has been widely adopted. In this paper, we addressed two pressing questions related to ALE meta-analysis: i) Which thresholding method is most appropriate to perform statistical inference? ii) Which sample size, i.e., number of experiments, is needed to perform robust meta-analyses? We provided quantitative answers to these questions by simulating more than 120,000 meta-analysis datasets using empirical parameters (i.e., number of subjects, number of reported foci, distribution of activation foci) derived from the BrainMap database. This allowed us to characterize the behavior of ALE analyses, to derive first power estimates for neuroimaging meta-analyses, and thus to formulate recommendations for future ALE studies. We show, as a first consequence, that cluster-level family-wise error (FWE) correction represents the most appropriate method for statistical inference, while voxel-level FWE correction is valid but more conservative. In contrast, uncorrected inference and false-discovery rate correction should be avoided. As a second consequence, researchers should aim to include at least 20 experiments in an ALE meta-analysis to achieve sufficient power for moderate effects. We would like to note, though, that these calculations and recommendations are specific to ALE and may not be extrapolated to other approaches for (neuroimaging) meta-analysis. Copyright © 2016 Elsevier Inc. All rights reserved.
NASA Astrophysics Data System (ADS)
Pankow, C.; Brady, P.; Ochsner, E.; O'Shaughnessy, R.
2015-07-01
We introduce a highly parallelizable architecture for estimating the parameters of compact binary coalescences using gravitational-wave data and waveform models. Using a spherical harmonic mode decomposition, the waveform is expressed as a sum over modes that depend on the intrinsic parameters (e.g., masses) with coefficients that depend on the observer-dependent extrinsic parameters (e.g., distance, sky position). The data is then prefiltered against those modes, at fixed intrinsic parameters, enabling efficient evaluation of the likelihood for generic source positions and orientations, independent of waveform length or generation time. We efficiently parallelize our intrinsic-space calculation by integrating over all extrinsic parameters using a Monte Carlo integration strategy. Since the waveform generation and prefiltering happen only once, the cost of integration dominates the procedure. We also operate hierarchically, using information from existing gravitational-wave searches to identify the regions of parameter space to emphasize in our sampling. As proof of concept and verification of the result, we have implemented this algorithm using standard time-domain waveforms, processing each event in less than one hour on recent computing hardware. For most events we evaluate the marginalized likelihood (evidence) with statistical errors of ≲5 %, and even smaller in many cases. With a bounded runtime independent of the waveform model's starting frequency, a nearly unchanged strategy could estimate neutron star (NS)-NS parameters in the 2018 advanced LIGO era. Our algorithm is usable with any noise curve and any existing time-domain model at any mass, including some waveforms which are computationally costly to evolve.
Log-normal frailty models fitted as Poisson generalized linear mixed models.
Hirsch, Katharina; Wienke, Andreas; Kuss, Oliver
2016-12-01
The equivalence of a survival model with a piecewise constant baseline hazard function and a Poisson regression model has been known since decades. As shown in recent studies, this equivalence carries over to clustered survival data: A frailty model with a log-normal frailty term can be interpreted and estimated as a generalized linear mixed model with a binary response, a Poisson likelihood, and a specific offset. Proceeding this way, statistical theory and software for generalized linear mixed models are readily available for fitting frailty models. This gain in flexibility comes at the small price of (1) having to fix the number of pieces for the baseline hazard in advance and (2) having to "explode" the data set by the number of pieces. In this paper we extend the simulations of former studies by using a more realistic baseline hazard (Gompertz) and by comparing the model under consideration with competing models. Furthermore, the SAS macro %PCFrailty is introduced to apply the Poisson generalized linear mixed approach to frailty models. The simulations show good results for the shared frailty model. Our new %PCFrailty macro provides proper estimates, especially in case of 4 events per piece. The suggested Poisson generalized linear mixed approach for log-normal frailty models based on the %PCFrailty macro provides several advantages in the analysis of clustered survival data with respect to more flexible modelling of fixed and random effects, exact (in the sense of non-approximate) maximum likelihood estimation, and standard errors and different types of confidence intervals for all variance parameters. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
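A small sketch of the piecewise-exponential/Poisson equivalence described above: each subject's follow-up is "exploded" into one row per baseline-hazard piece, and a Poisson GLM with a log-exposure offset and a binary event indicator recovers the piecewise hazard and covariate effect. A frailty term would additionally require a subject-level random intercept (e.g. a mixed Poisson model, as the %PCFrailty macro does in SAS); the fixed-effects fit below only illustrates the data layout and the offset construction, with illustrative parameter values.

```python
# Piecewise-exponential survival model fitted as a Poisson GLM on exploded data.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(5)

# Simulate exponential survival times with one binary covariate (log HR = 0.5).
n = 500
x = rng.integers(0, 2, n)
time = rng.exponential(1.0 / (0.3 * np.exp(0.5 * x)))
event = (time < 5.0).astype(int)
time = np.minimum(time, 5.0)                        # administrative censoring at t = 5

cuts = np.array([0.0, 1.0, 2.0, 3.0, 5.0])           # piece boundaries, fixed in advance
rows = []
for i in range(n):
    for j in range(len(cuts) - 1):
        if time[i] <= cuts[j]:
            break                                    # subject already failed/censored
        exposure = min(time[i], cuts[j + 1]) - cuts[j]
        died_here = int(event[i] == 1 and time[i] <= cuts[j + 1])
        rows.append({"piece": j, "x": x[i], "exposure": exposure, "event": died_here})
long = pd.DataFrame(rows)

# Poisson GLM: binary event indicator per row, log-exposure offset, piece dummies.
X = pd.get_dummies(long["piece"], prefix="piece").astype(float)
X["x"] = long["x"].astype(float)
fit = sm.GLM(long["event"], X, family=sm.families.Poisson(),
             offset=np.log(long["exposure"])).fit()
print(fit.params)   # piece coefficients ~ log piecewise hazard; x ~ log hazard ratio
```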
Data analysis in emission tomography using emission-count posteriors
NASA Astrophysics Data System (ADS)
Sitek, Arkadiusz
2012-11-01
A novel approach to the analysis of emission tomography data using the posterior probability of the number of emissions per voxel (emission count) conditioned on acquired tomographic data is explored. The posterior is derived from the prior and the Poisson likelihood of the emission-count data by marginalizing voxel activities. Based on emission-count posteriors, examples of Bayesian analysis including estimation and classification tasks in emission tomography are provided. The application of the method to computer simulations of 2D tomography is demonstrated. In particular, the minimum-mean-square-error point estimator of the emission count is demonstrated. The process of finding this estimator can be considered as a tomographic image reconstruction technique since the estimates of the number of emissions per voxel divided by voxel sensitivities and acquisition time are the estimates of the voxel activities. As an example of a classification task, a hypothesis stating that some region of interest (ROI) emitted at least or at most r-times the number of events in some other ROI is tested. The ROIs are specified by the user. The analysis described in this work provides new quantitative statistical measures that can be used in decision making in diagnostic imaging using emission tomography.
Coherent lidar design and performance verification
NASA Technical Reports Server (NTRS)
Frehlich, Rod
1993-01-01
The verification of LAWS beam alignment in space can be achieved by a measurement of heterodyne efficiency using the surface return. The crucial element is a direct detection signal that can be identified for each surface return. This should be satisfied for LAWS but will not be satisfied for the descoped LAWS. The performance of algorithms for velocity estimation can be described with two basic parameters: the number of coherently detected photo-electrons per estimate and the number of independent signal samples per estimate. The average error of spectral-domain velocity estimation algorithms is bounded by a new periodogram Cramer-Rao Bound. Comparison of the periodogram CRB with the exact CRB indicates that a factor of two improvement in velocity accuracy is possible using non-spectral-domain estimators. This improvement has been demonstrated with a maximum-likelihood estimator. The comparison of velocity estimation algorithms for 2 and 10 micron coherent lidar was performed by assuming all the system design parameters are fixed and the signal statistics are dominated by a 1 m/s rms wind fluctuation over the range gate. The beam alignment requirements for 2 micron are much more severe than for a 10 micron lidar. The effect of the random backscattered field on estimating the alignment error is a major problem for space-based lidar operation, especially if the heterodyne efficiency cannot be estimated. For LAWS, the biggest science payoff would result from a short transmitted pulse, on the order of 0.5 microseconds instead of 3 microseconds. The numerical errors for simulation of laser propagation in the atmosphere have been determined as a joint project with the University of California, San Diego. Useful scaling laws were obtained for Kolmogorov atmospheric refractive turbulence and for atmospheric refractive turbulence characterized by an inner scale. This permits verification of the simulation procedure, which is essential for the evaluation of the effects of refractive turbulence on coherent Doppler lidar systems. The analysis of 2 micron Doppler lidar data from Coherent Technologies, Inc. (CTI) has demonstrated many of the advantages of Doppler lidar measurements of boundary layer winds. The effects of wind shear and wind turbulence over the pulse volume are probably the dominant source of the reduced performance. The effects of wind shear and wind turbulence on the statistical description of Doppler lidar data have been derived and calculated.
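A toy sketch of spectral-domain Doppler velocity estimation of the kind bounded by the periodogram CRB above: the Doppler frequency of the range-gate signal is taken from the periodogram peak and converted to a line-of-sight velocity via v = lambda * f / 2. The signal parameters (sample rate, SNR, gate length) are illustrative and this is not the maximum-likelihood estimator or the CTI processing chain discussed in the report.

```python
# Periodogram-peak Doppler velocity estimate for a simulated coherent lidar gate.
import numpy as np

rng = np.random.default_rng(6)

wavelength = 2.0e-6          # 2-micron lidar
fs = 50.0e6                  # complex sample rate, Hz
n_samples = 256              # samples in the range gate
v_true = 7.5                 # line-of-sight wind speed, m/s
f_doppler = 2.0 * v_true / wavelength
snr = 0.5                    # per-sample power SNR (weak-signal regime)

t = np.arange(n_samples) / fs
signal = np.sqrt(snr) * np.exp(2j * np.pi * f_doppler * t)
noise = (rng.normal(size=n_samples) + 1j * rng.normal(size=n_samples)) / np.sqrt(2)
data = signal + noise

# Periodogram (zero-padded for a finer frequency grid) and peak pick.
n_fft = 4 * n_samples
spectrum = np.abs(np.fft.fft(data, n_fft)) ** 2
freqs = np.fft.fftfreq(n_fft, d=1.0 / fs)
f_hat = freqs[np.argmax(spectrum)]
v_hat = wavelength * f_hat / 2.0
print(f"true velocity {v_true:.2f} m/s, estimated {v_hat:.2f} m/s")
```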
NASA Astrophysics Data System (ADS)
Sykes, J. F.; Kang, M.; Thomson, N. R.
2007-12-01
The TCE release from The Lockformer Company in Lisle, Illinois resulted in a plume in a confined aquifer that is more than 4 km long and has impacted more than 300 residential wells. Many of the wells are on the fringe of the plume and have concentrations that did not exceed 5 ppb. The settlement for the Chapter 11 bankruptcy protection of Lockformer involved the establishment of a trust fund that compensates individuals with cancers, with payments based on cancer type, estimated TCE concentration in the well and the duration of exposure to TCE. The estimation of early arrival times, and hence low-likelihood events, is critical in determining the eligibility of an individual for compensation. Thus, an emphasis must be placed on the accuracy of the leading tail region in the likelihood distribution of possible arrival times at a well. The estimation of TCE arrival time, using a three-dimensional analytical solution, involved parameter estimation and uncertainty analysis. Parameters in the model included TCE source parameters, groundwater velocities, dispersivities and the TCE decay coefficient for both the confining layer and the bedrock aquifer. Numerous objective functions, which include the well-known L2-estimator, robust estimators (L1-estimators and M-estimators), penalty functions, and dead zones, were incorporated in the parameter estimation process to treat insufficiencies in both the model and the observational data due to errors, biases, and limitations. The concept of equifinality was adopted, and multiple maximum likelihood parameter sets were accepted if pre-defined physical criteria were met. The criteria ensured that a valid solution predicted TCE concentrations for all TCE-impacted areas. Monte Carlo sampling was found to be inadequate for uncertainty analysis of this case study because of its inability to find parameter sets that meet the predefined physical criteria. Successful results were achieved using a Dynamically Dimensioned Search sampling methodology that inherently accounts for parameter correlations and does not require assumptions regarding parameter distributions. For uncertainty analysis, multiple parameter sets were obtained using a modified Cauchy M-estimator. Penalty functions had to be incorporated into the objective function definitions to generate a sufficient number of acceptable parameter sets. The combined effect of optimization and the application of the physical criteria performs the function of behavioral thresholds by reducing anomalies and by removing parameter sets with high objective function values. The factors that are important to the creation of an uncertainty envelope for TCE arrival at wells are outlined in the work. In general, greater uncertainty appears to be present at the tails of the distribution. For a refinement of the uncertainty envelopes, the application of additional physical criteria or behavioral thresholds is recommended.
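A minimal sketch of the contrast between a classical least-squares (L2) fit and a robust Cauchy M-estimator of the kind mentioned above, using SciPy's built-in robust loss functions. The toy rising "arrival" curve, the injected gross errors, and the tuning constant are illustrative; the study itself applies a modified Cauchy M-estimator with penalty functions to a three-dimensional analytical transport model, which is not reproduced here.

```python
# Robust (Cauchy-loss) fit versus ordinary least squares on data with outliers.
import numpy as np
from scipy.optimize import least_squares

rng = np.random.default_rng(7)

def model(params, t):
    c0, k = params
    return c0 * (1.0 - np.exp(-k * t))      # illustrative rising "arrival" curve

t = np.linspace(0.0, 20.0, 40)
true = (10.0, 0.25)
y = model(true, t) + rng.normal(0.0, 0.3, t.size)
y[[5, 12, 30]] += np.array([4.0, -5.0, 6.0])   # gross errors / biased observations

def residuals(params):
    return model(params, t) - y

fit_l2 = least_squares(residuals, x0=[5.0, 0.1])                 # classical L2
fit_cauchy = least_squares(residuals, x0=[5.0, 0.1],
                           loss="cauchy", f_scale=0.3)           # robust M-estimator
print("L2 estimate:    ", fit_l2.x)
print("Cauchy estimate:", fit_cauchy.x)
```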
PRECISE TULLY-FISHER RELATIONS WITHOUT GALAXY INCLINATIONS
DOE Office of Scientific and Technical Information (OSTI.GOV)
Obreschkow, D.; Meyer, M.
2013-11-10
Power-law relations between tracers of baryonic mass and rotational velocities of disk galaxies, so-called Tully-Fisher relations (TFRs), offer a wealth of applications in galaxy evolution and cosmology. However, measurements of rotational velocities require galaxy inclinations, which are difficult to measure, thus limiting the range of TFR studies. This work introduces a maximum likelihood estimation (MLE) method for recovering the TFR in galaxy samples with limited or no information on inclinations. The robustness and accuracy of this method is demonstrated using virtual and real galaxy samples. Intriguingly, the MLE reliably recovers the TFR of all test samples, even without using any inclination measurements—that is, assuming a random sin i-distribution for galaxy inclinations. Explicitly, this 'inclination-free MLE' recovers the three TFR parameters (zero-point, slope, scatter) with statistical errors only about 1.5 times larger than the best estimates based on perfectly known galaxy inclinations with zero uncertainty. Thus, given realistic uncertainties, the inclination-free MLE is highly competitive. If inclination measurements have mean errors larger than 10°, it is better not to use any inclinations than to consider the inclination measurements to be exact. The inclination-free MLE opens interesting perspectives for future H I surveys by the Square Kilometer Array and its pathfinders.
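A toy sketch of the inclination-free idea: the likelihood of each observed (projected) velocity is obtained by numerically marginalizing over a random-orientation inclination distribution, p(i) = sin(i), and the three TFR parameters are found by maximum likelihood. The synthetic data, the log-linear TFR form with Gaussian intrinsic scatter, and the neglect of measurement noise are illustrative assumptions, not the estimator's published details.

```python
# Inclination-free maximum-likelihood fit of a toy Tully-Fisher relation.
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(8)

# Synthetic sample: true TFR log10 V = a + b * (log10 M - 10), random inclinations.
a_true, b_true, scatter_true, n_gal = 2.2, 0.30, 0.05, 400
log_mass = rng.uniform(9.0, 11.0, n_gal)
log_v = a_true + b_true * (log_mass - 10.0) + rng.normal(0.0, scatter_true, n_gal)
cos_i = rng.uniform(0.0, 1.0, n_gal)                  # random orientations
log_w_obs = log_v + 0.5 * np.log10(1.0 - cos_i**2)    # projected widths (no noise)

inc_grid = np.linspace(1e-3, np.pi / 2, 200)          # marginalization grid
log_sin = np.log10(np.sin(inc_grid))
weights = np.sin(inc_grid)                            # random-orientation prior p(i)
di = inc_grid[1] - inc_grid[0]

def neg_log_like(params):
    a, b, log_sigma = params
    sigma = np.exp(log_sigma)
    mu = a + b * (log_mass - 10.0)                    # intrinsic log10 V per galaxy
    # p(log w | M) = integral of N(log w; mu + log10 sin i, sigma^2) * sin(i) di
    dens = norm.pdf(log_w_obs[:, None], mu[:, None] + log_sin[None, :], sigma)
    like = (dens * weights[None, :]).sum(axis=1) * di
    return -np.sum(np.log(like + 1e-300))

fit = minimize(neg_log_like, x0=[2.0, 0.2, np.log(0.1)], method="Nelder-Mead")
a_hat, b_hat, sigma_hat = fit.x[0], fit.x[1], np.exp(fit.x[2])
print(f"zero-point {a_hat:.3f}, slope {b_hat:.3f}, scatter {sigma_hat:.3f}")
```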