NASA Astrophysics Data System (ADS)
Zeng, X.
2015-12-01
Bayesian model averaging (BMA) requires a large number of model executions to obtain the predictions of alternative conceptual models and their posterior probabilities. The posterior probability of each model is estimated from its marginal likelihood and prior probability. This heavy computational burden hinders the implementation of BMA prediction, especially when an elaborate marginal likelihood estimator is used. To overcome the computational burden of BMA, an adaptive sparse grid (SG) stochastic collocation method is used to build surrogates for the alternative conceptual models in a numerical experiment with a synthetic groundwater model. Because BMA predictions depend on the posterior model weights (and hence on the marginal likelihoods), this study also evaluated four marginal likelihood estimators: the arithmetic mean estimator (AME), the harmonic mean estimator (HME), the stabilized harmonic mean estimator (SHME), and the thermodynamic integration estimator (TIE). The results demonstrate that TIE estimates the conceptual models' marginal likelihoods accurately and that BMA-TIE outperforms the other BMA predictions. TIE is also highly stable: repeated TIE estimates of a conceptual model's marginal likelihood show significantly less variability than those obtained with the other estimators. In addition, the SG surrogates facilitate BMA predictions efficiently, especially for BMA-TIE. The number of model executions needed to build the surrogates is 4.13%, 6.89%, 3.44%, and 0.43% of the executions required by BMA-AME, BMA-HME, BMA-SHME, and BMA-TIE, respectively.
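To make the weighting step concrete, here is a minimal sketch (not from the paper; the marginal likelihoods, priors, and predictions are hypothetical numbers) of how posterior model weights and a BMA prediction follow from log marginal likelihoods and prior model probabilities, computed in log space for numerical stability:

```python
import numpy as np

# Hypothetical log marginal likelihoods for four alternative conceptual models
log_ml = np.array([-1520.3, -1518.7, -1525.1, -1516.9])
prior = np.array([0.25, 0.25, 0.25, 0.25])          # equal prior model probabilities

# Posterior model weight: p(M_k | D) is proportional to p(D | M_k) * p(M_k)
log_post = log_ml + np.log(prior)
log_post -= log_post.max()                          # guard against underflow
weights = np.exp(log_post) / np.exp(log_post).sum()

# BMA prediction: weighted average of each model's prediction (hypothetical values)
model_predictions = np.array([12.1, 11.8, 13.0, 11.5])
print(weights, weights @ model_predictions)
```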
Unified framework to evaluate panmixia and migration direction among multiple sampling locations.
Beerli, Peter; Palczewski, Michal
2010-05-01
For many biological investigations, groups of individuals are genetically sampled from several geographic locations. These sampling locations often do not reflect the genetic population structure. We describe a framework using marginal likelihoods to compare and order structured population models, such as testing whether the sampling locations belong to the same randomly mating population or comparing unidirectional and multidirectional gene flow models. For inferences employing Markov chain Monte Carlo methods, the accuracy of the marginal likelihood depends heavily on the approximation method used to calculate it. Two methods, modified thermodynamic integration and a stabilized harmonic mean estimator, are compared. With finite Markov chain Monte Carlo run lengths, the harmonic mean estimator may not be consistent. Thermodynamic integration, in contrast, delivers considerably better estimates of the marginal likelihood. The choice of prior distributions does not influence the ordering and choice of the better models when the marginal likelihood is estimated using thermodynamic integration, whereas with the harmonic mean estimator the influence of the prior is pronounced and the order of the models changes. The approximation of the marginal likelihood using thermodynamic integration in MIGRATE allows the evaluation of complex population genetic models, not only of whether sampling locations belong to a single panmictic population, but also of competing complex structured population models.
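To make the thermodynamic integration idea concrete, a minimal sketch (not the MIGRATE implementation; the per-temperature expectations below are placeholder numbers) that approximates the log marginal likelihood by integrating the expected log-likelihood over an inverse-temperature ladder:

```python
import numpy as np

def log_marginal_thermodynamic(expected_loglik, betas):
    """Trapezoidal approximation of log p(D) = integral over beta in [0, 1] of
    E_beta[log p(D | theta)], where E_beta is taken under the power posterior
    p(theta | D, beta) proportional to p(D | theta)**beta * p(theta)."""
    y, x = np.asarray(expected_loglik), np.asarray(betas)
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

# Hypothetical ladder and expectations, each obtained from a separate heated MCMC run
betas = np.linspace(0.0, 1.0, 21)
expected_loglik = -300.0 + 40.0 * np.sqrt(betas)    # placeholder values for illustration
print(log_marginal_thermodynamic(expected_loglik, betas))
```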
Genealogical Working Distributions for Bayesian Model Testing with Phylogenetic Uncertainty
Baele, Guy; Lemey, Philippe; Suchard, Marc A.
2016-01-01
Marginal likelihood estimates to compare models using Bayes factors frequently accompany Bayesian phylogenetic inference. Approaches to estimate marginal likelihoods have garnered increased attention over the past decade. In particular, the introduction of path sampling (PS) and stepping-stone sampling (SS) into Bayesian phylogenetics has tremendously improved the accuracy of model selection. These sampling techniques are now used to evaluate complex evolutionary and population genetic models on empirical data sets, but considerable computational demands hamper their widespread adoption. Further, when very diffuse, but proper priors are specified for model parameters, numerical issues complicate the exploration of the priors, a necessary step in marginal likelihood estimation using PS or SS. To avoid such instabilities, generalized SS (GSS) has recently been proposed, introducing the concept of “working distributions” to facilitate—or shorten—the integration process that underlies marginal likelihood estimation. However, the need to fix the tree topology currently limits GSS in a coalescent-based framework. Here, we extend GSS by relaxing the fixed underlying tree topology assumption. To this purpose, we introduce a “working” distribution on the space of genealogies, which enables estimating marginal likelihoods while accommodating phylogenetic uncertainty. We propose two different “working” distributions that help GSS to outperform PS and SS in terms of accuracy when comparing demographic and evolutionary models applied to synthetic data and real-world examples. Further, we show that the use of very diffuse priors can lead to a considerable overestimation in marginal likelihood when using PS and SS, while still retrieving the correct marginal likelihood using both GSS approaches. The methods used in this article are available in BEAST, a powerful user-friendly software package to perform Bayesian evolutionary analyses. PMID:26526428
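A minimal sketch of the plain stepping-stone estimator (not the BEAST GSS implementation with working distributions; the inputs are assumed to be log-likelihood values recorded at each rung of a power-posterior ladder):

```python
import numpy as np

def log_marginal_stepping_stone(loglik_samples, betas):
    """Stepping-stone estimate of log p(D). loglik_samples[k] holds log p(D | theta)
    for MCMC draws from the power posterior at betas[k]; each ratio of adjacent
    normalizing constants is estimated by importance sampling from the colder rung."""
    log_ml = 0.0
    for k in range(len(betas) - 1):
        delta = betas[k + 1] - betas[k]
        ll = np.asarray(loglik_samples[k])
        m = ll.max()
        # log of the mean of exp(delta * loglik), computed stably
        log_ml += delta * m + np.log(np.mean(np.exp(delta * (ll - m))))
    return log_ml
```

As described in the abstract, generalized SS instead anchors the path at tractable working distributions, here including one over genealogies, so that the sampler never has to explore a very diffuse prior directly.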
Improving and Evaluating Nested Sampling Algorithm for Marginal Likelihood Estimation
NASA Astrophysics Data System (ADS)
Ye, M.; Zeng, X.; Wu, J.; Wang, D.; Liu, J.
2016-12-01
With the growing impacts of climate change and human activities on the water cycle, an increasing number of studies focus on quantifying modeling uncertainty. Bayesian model averaging (BMA) provides a popular framework for quantifying conceptual model and parameter uncertainty. The ensemble prediction is generated by combining the predictions of the plausible models, and each model is assigned a weight determined by its prior weight and marginal likelihood. Thus, the estimation of a model's marginal likelihood is crucial for reliable and accurate BMA prediction. The nested sampling estimator (NSE) is a newly proposed method for marginal likelihood estimation. NSE searches the parameter space gradually from low-likelihood to high-likelihood regions, and this evolution proceeds iteratively via a local sampling procedure, so the efficiency of NSE is dominated by the strength of the local sampling procedure. Currently, the Metropolis-Hastings (M-H) algorithm is often used for local sampling. However, M-H is not an efficient sampler for high-dimensional or complicated parameter spaces. To improve the efficiency of NSE, the robust and efficient DREAMzs sampling algorithm is incorporated into the local sampling step of NSE. The comparison results demonstrate that the improved NSE increases the efficiency of marginal likelihood estimation significantly. However, both the improved and the original NSE suffer from considerable instability. In addition, the heavy computational cost of the large number of model executions is reduced by using adaptive sparse grid surrogates.
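A generic nested sampling sketch (not the improved NSE with DREAMzs; sample_prior, loglik, and sample_constrained are assumed user-supplied routines, the last one drawing from the prior restricted to likelihoods above a threshold):

```python
import numpy as np

def nested_sampling_log_evidence(sample_prior, loglik, sample_constrained,
                                 n_live=100, n_iter=1000):
    """Generic nested sampling: shrink the prior volume geometrically while replacing
    the worst live point by a draw from the likelihood-constrained prior."""
    live = [sample_prior() for _ in range(n_live)]
    live_ll = np.array([loglik(p) for p in live])
    log_x_prev, log_z = 0.0, -np.inf
    for i in range(1, n_iter + 1):
        worst = int(np.argmin(live_ll))
        log_x = -i / n_live                                   # expected remaining prior volume
        log_w = np.log(np.exp(log_x_prev) - np.exp(log_x))    # width of the discarded shell
        log_z = np.logaddexp(log_z, live_ll[worst] + log_w)
        live[worst] = sample_constrained(live_ll[worst])      # local sampling step
        live_ll[worst] = loglik(live[worst])
        log_x_prev = log_x
    # contribution of the remaining live points
    m = live_ll.max()
    log_z = np.logaddexp(log_z, log_x_prev + m + np.log(np.mean(np.exp(live_ll - m))))
    return log_z
```

The efficiency bottleneck discussed above sits entirely in sample_constrained, which is where an M-H step or a DREAMzs-style proposal would be plugged in.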
ERIC Educational Resources Information Center
Casabianca, Jodi M.; Lewis, Charles
2015-01-01
Loglinear smoothing (LLS) estimates the latent trait distribution while making fewer assumptions about its form and maintaining parsimony, thus leading to more precise item response theory (IRT) item parameter estimates than standard marginal maximum likelihood (MML). This article provides the expectation-maximization algorithm for MML estimation…
Models and analysis for multivariate failure time data
NASA Astrophysics Data System (ADS)
Shih, Joanna Huang
The goal of this research is to develop and investigate models and analytic methods for multivariate failure time data. We compare models in terms of direct modeling of the margins, flexibility of dependency structure, local vs. global measures of association, and ease of implementation. In particular, we study copula models, and models produced by right neutral cumulative hazard functions and right neutral hazard functions. We examine the changes of association over time for families of bivariate distributions induced from these models by displaying their density contour plots, conditional density plots, correlation curves of Doksum et al., and local cross ratios of Oakes. We know that bivariate distributions with the same margins might exhibit quite different dependency structures. In addition to modeling, we study estimation procedures. For copula models, we investigate three estimation procedures. The first procedure is full maximum likelihood. The second procedure is two-stage maximum likelihood: at stage 1, we estimate the parameters in the margins by maximizing the marginal likelihood; at stage 2, we estimate the dependency structure with the margins fixed at their estimates. The third procedure is two-stage partially parametric maximum likelihood. It is similar to the second procedure, but we estimate the margins by the Kaplan-Meier estimate. We derive asymptotic properties for these three estimation procedures and compare their efficiency by Monte Carlo simulations and direct computations. For models produced by right neutral cumulative hazards and right neutral hazards, we derive the likelihood and investigate the properties of the maximum likelihood estimates. Finally, we develop goodness-of-fit tests for the dependency structure in the copula models. We derive a test statistic and its asymptotic properties based on the test of homogeneity of Zelterman and Chen (1988), and a graphical diagnostic procedure based on the empirical Bayes approach. We study the performance of these two methods using actual and computer-generated data.
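To illustrate the two-stage (margins first, then dependence) maximum likelihood idea in the simplest parametric setting, a sketch using exponential margins and a Clayton copula; censoring is ignored here for brevity, and all data and names are illustrative rather than the author's implementation:

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import expon

def fit_two_stage_clayton(x, y):
    """Stage 1: fit each margin by maximum likelihood; Stage 2: fix the fitted margins
    and maximize the Clayton copula log-likelihood over its dependence parameter."""
    # Stage 1: for exponential margins the MLE of the scale is the sample mean
    u = expon.cdf(x, scale=x.mean())
    v = expon.cdf(y, scale=y.mean())

    def neg_copula_loglik(theta):
        # Clayton copula log-density log c(u, v; theta), theta > 0
        logc = (np.log(theta + 1.0)
                - (theta + 1.0) * (np.log(u) + np.log(v))
                - (2.0 + 1.0 / theta) * np.log(u ** (-theta) + v ** (-theta) - 1.0))
        return -np.sum(logc)

    return minimize_scalar(neg_copula_loglik, bounds=(1e-3, 20.0), method="bounded").x

rng = np.random.default_rng(0)
x = rng.exponential(scale=2.0, size=500)
y = x + rng.exponential(scale=1.0, size=500)     # induce positive dependence
print(fit_two_stage_clayton(x, y))
```

Replacing the parametric margins in stage 1 with Kaplan-Meier estimates gives the two-stage partially parametric variant described above.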
ERIC Educational Resources Information Center
Wollack, James A.; Bolt, Daniel M.; Cohen, Allan S.; Lee, Young-Sun
2002-01-01
Compared the quality of item parameter estimates for marginal maximum likelihood (MML) and Markov Chain Monte Carlo (MCMC) with the nominal response model using simulation. The quality of item parameter recovery was nearly identical for MML and MCMC, and both methods tended to produce good estimates. (SLD)
An Improved Nested Sampling Algorithm for Model Selection and Assessment
NASA Astrophysics Data System (ADS)
Zeng, X.; Ye, M.; Wu, J.; WANG, D.
2017-12-01
A multimodel strategy is a general approach for treating model structure uncertainty in recent research. The unknown groundwater system is represented by several plausible conceptual models, and each alternative conceptual model is assigned a weight that represents its plausibility. In the Bayesian framework, the posterior model weight is proportional to the product of the model's prior weight and its marginal likelihood (also termed the model evidence). As a result, estimating marginal likelihoods is crucial for reliable model selection and assessment in multimodel analysis. The nested sampling estimator (NSE) is a newly proposed algorithm for marginal likelihood estimation. NSE searches the parameter space gradually from low-likelihood to high-likelihood regions, and this evolution proceeds iteratively via a local sampling procedure, so the efficiency of NSE is dominated by the strength of that procedure. Currently, the Metropolis-Hastings (M-H) algorithm and its variants are often used for local sampling in NSE. However, M-H is not an efficient sampler for high-dimensional or complex likelihood functions. To improve the performance of NSE, the more efficient and elaborate DREAMzs sampling algorithm is integrated into the local sampling step. In addition, to overcome the computational burden of the large number of repeated model executions required for marginal likelihood estimation, an adaptive sparse grid stochastic collocation method is used to build surrogates for the original groundwater model.
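As a drastically simplified, non-adaptive, one-dimensional stand-in for the sparse grid stochastic collocation idea, the sketch below evaluates an expensive model only at a handful of collocation nodes, fits a polynomial surrogate, and then queries the surrogate cheaply; the model function and parameter range are hypothetical:

```python
import numpy as np
from numpy.polynomial import chebyshev as C

def expensive_model(k):
    """Stand-in for a costly groundwater model run (hypothetical response surface)."""
    return np.exp(-0.5 * k) * np.sin(3.0 * k)

deg = 8
# Collocation nodes: Chebyshev points mapped to the parameter range k in [0, 4]
nodes = 0.5 * (np.cos(np.pi * np.arange(deg + 1) / deg) + 1.0) * 4.0
values = np.array([expensive_model(k) for k in nodes])      # deg + 1 model runs in total

coef = C.chebfit(nodes, values, deg)                        # build the surrogate once
k_test = np.linspace(0.0, 4.0, 1000)
max_err = np.max(np.abs(C.chebval(k_test, coef) - expensive_model(k_test)))
print(f"max surrogate error over the test grid: {max_err:.2e}")
```

In the marginal likelihood setting, the many likelihood evaluations demanded by NSE would then call the surrogate instead of the original model.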
ERIC Educational Resources Information Center
Kieftenbeld, Vincent; Natesan, Prathiba
2012-01-01
Markov chain Monte Carlo (MCMC) methods enable a fully Bayesian approach to parameter estimation of item response models. In this simulation study, the authors compared the recovery of graded response model parameters using marginal maximum likelihood (MML) and Gibbs sampling (MCMC) under various latent trait distributions, test lengths, and…
Yiu, Sean; Tom, Brian Dm
2017-01-01
Several researchers have described two-part models with patient-specific stochastic processes for analysing longitudinal semicontinuous data. In theory, such models can offer greater flexibility than the standard two-part model with patient-specific random effects. However, in practice, the high dimensional integrations involved in the marginal likelihood (i.e. integrated over the stochastic processes) significantly complicate model fitting. Thus, only non-standard, computationally intensive procedures based on simulating the marginal likelihood have been proposed so far. In this paper, we describe an efficient method of implementation by demonstrating how the high dimensional integrations involved in the marginal likelihood can be computed efficiently. Specifically, by using a property of the multivariate normal distribution and the standard marginal cumulative distribution function identity, we transform the marginal likelihood so that the high dimensional integrations are contained in the cumulative distribution function of a multivariate normal distribution, which can then be efficiently evaluated. Hence, maximum likelihood estimation can be used to obtain parameter estimates and asymptotic standard errors (from the observed information matrix) of model parameters. We describe our proposed efficient implementation procedure for the standard two-part model parameterisation and when it is of interest to directly model the overall marginal mean. The methodology is applied to a psoriatic arthritis data set concerning functional disability.
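A toy sketch of the computational device described (evaluating the awkward integral as a multivariate normal CDF rather than by simulation); the dimensions, thresholds, and exchangeable correlation are illustrative only, not the paper's model:

```python
import numpy as np
from scipy.stats import multivariate_normal

# Suppose one patient's likelihood contribution reduces to
# P(Z_1 <= a_1, ..., Z_T <= a_T) for a correlated Gaussian vector Z (illustrative).
T = 5
rho = 0.6
cov = rho * np.ones((T, T)) + (1.0 - rho) * np.eye(T)   # exchangeable correlation
upper = np.array([0.2, -0.1, 0.4, 0.0, 0.3])            # hypothetical thresholds

# The T-dimensional integral is evaluated directly as a multivariate normal CDF,
# avoiding Monte Carlo simulation over the latent process.
print(multivariate_normal(mean=np.zeros(T), cov=cov).cdf(upper))
```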
ERIC Educational Resources Information Center
Paek, Insu; Wilson, Mark
2011-01-01
This study elaborates the Rasch differential item functioning (DIF) model formulation under the marginal maximum likelihood estimation context. Also, the Rasch DIF model performance was examined and compared with the Mantel-Haenszel (MH) procedure in small sample and short test length conditions through simulations. The theoretically known…
Baele, Guy; Lemey, Philippe; Vansteelandt, Stijn
2013-03-06
Accurate model comparison requires extensive computation times, especially for parameter-rich models of sequence evolution. In the Bayesian framework, model selection is typically performed through the evaluation of a Bayes factor, the ratio of two marginal likelihoods (one for each model). Recently introduced techniques to estimate (log) marginal likelihoods, such as path sampling and stepping-stone sampling, offer increased accuracy over the traditional harmonic mean estimator at an increased computational cost. Most often, each model's marginal likelihood will be estimated individually, which leads the resulting Bayes factor to suffer from errors associated with each of these independent estimation processes. We here assess the original 'model-switch' path sampling approach for direct Bayes factor estimation in phylogenetics, as well as an extension that uses more samples, to construct a direct path between two competing models, thereby eliminating the need to calculate each model's marginal likelihood independently. Further, we provide a competing Bayes factor estimator using an adaptation of the recently introduced stepping-stone sampling algorithm and set out to determine appropriate settings for accurately calculating such Bayes factors, with context-dependent evolutionary models as an example. While we show that modest efforts are required to roughly identify the increase in model fit, only drastically increased computation times ensure the accuracy needed to detect more subtle details of the evolutionary process. We show that our adaptation of stepping-stone sampling for direct Bayes factor calculation outperforms the original path sampling approach as well as an extension that exploits more samples. Our proposed approach for Bayes factor estimation also has preferable statistical properties over the use of individual marginal likelihood estimates for both models under comparison. Assuming a sigmoid function to determine the path between two competing models, we provide evidence that a single well-chosen sigmoid shape value requires less computational efforts in order to approximate the true value of the (log) Bayes factor compared to the original approach. We show that the (log) Bayes factors calculated using path sampling and stepping-stone sampling differ drastically from those estimated using either of the harmonic mean estimators, supporting earlier claims that the latter systematically overestimate the performance of high-dimensional models, which we show can lead to erroneous conclusions. Based on our results, we argue that highly accurate estimation of differences in model fit for high-dimensional models requires much more computational effort than suggested in recent studies on marginal likelihood estimation.
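A schematic sketch of direct (model-switch) path sampling for a log Bayes factor, assuming the two models share a parameter space so that the path densities are geometric mixtures of the two unnormalized posteriors; the sigmoid schedule and the per-step expectations are illustrative choices, not the authors' exact settings:

```python
import numpy as np

def sigmoid_beta_ladder(n, shape=4.0):
    """Illustrative sigmoid-shaped schedule of path steps between 0 and 1,
    concentrating steps near the two end points (shape controls the steepness)."""
    t = np.linspace(-1.0, 1.0, n)
    s = 1.0 / (1.0 + np.exp(-shape * t))
    return (s - s[0]) / (s[-1] - s[0])

def log_bayes_factor_path(expected_llr, betas):
    """Direct path-sampling estimate of log BF_10: trapezoidal integration of
    E_beta[log p(D | theta, M1) - log p(D | theta, M0)] along the path, where each
    expectation comes from an MCMC run targeting the corresponding path density."""
    y, x = np.asarray(expected_llr), np.asarray(betas)
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

betas = sigmoid_beta_ladder(33, shape=4.0)
expected_llr = 5.0 - 12.0 * (1.0 - betas) ** 2        # hypothetical expectations
print(log_bayes_factor_path(expected_llr, betas))
```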
Marginal Maximum A Posteriori Item Parameter Estimation for the Generalized Graded Unfolding Model
ERIC Educational Resources Information Center
Roberts, James S.; Thompson, Vanessa M.
2011-01-01
A marginal maximum a posteriori (MMAP) procedure was implemented to estimate item parameters in the generalized graded unfolding model (GGUM). Estimates from the MMAP method were compared with those derived from marginal maximum likelihood (MML) and Markov chain Monte Carlo (MCMC) procedures in a recovery simulation that varied sample size,…
Grummer, Jared A; Bryson, Robert W; Reeder, Tod W
2014-03-01
Current molecular methods of species delimitation are limited by the types of species delimitation models and scenarios that can be tested. Bayes factors allow for more flexibility in testing non-nested species delimitation models and hypotheses of individual assignment to alternative lineages. Here, we examined the efficacy of Bayes factors in delimiting species through simulations and empirical data from the Sceloporus scalaris species group. Marginal-likelihood scores of competing species delimitation models, from which Bayes factor values were compared, were estimated with four different methods: harmonic mean estimation (HME), smoothed harmonic mean estimation (sHME), path-sampling/thermodynamic integration (PS), and stepping-stone (SS) analysis. We also performed model selection using a posterior simulation-based analog of the Akaike information criterion through Markov chain Monte Carlo analysis (AICM). Bayes factor species delimitation results from the empirical data were then compared with results from the reversible-jump MCMC (rjMCMC) coalescent-based species delimitation method Bayesian Phylogenetics and Phylogeography (BP&P). Simulation results show that HME and sHME perform poorly compared with PS and SS marginal-likelihood estimators when identifying the true species delimitation model. Furthermore, Bayes factor delimitation (BFD) of species showed improved performance when species limits are tested by reassigning individuals between species, as opposed to either lumping or splitting lineages. In the empirical data, BFD through PS and SS analyses, as well as the rjMCMC method, each provide support for the recognition of all scalaris group taxa as independent evolutionary lineages. Bayes factor species delimitation and BP&P also support the recognition of three previously undescribed lineages. In both simulated and empirical data sets, harmonic and smoothed harmonic mean marginal-likelihood estimators provided much higher marginal-likelihood estimates than PS and SS estimators. The AICM displayed poor repeatability in both simulated and empirical data sets, and produced inconsistent model rankings across replicate runs with the empirical data. Our results suggest that species delimitation through the use of Bayes factors with marginal-likelihood estimates via PS or SS analyses provide a useful and complementary alternative to existing species delimitation methods.
A unifying framework for marginalized random intercept models of correlated binary outcomes
Swihart, Bruce J.; Caffo, Brian S.; Crainiceanu, Ciprian M.
2013-01-01
We demonstrate that many current approaches for marginal modeling of correlated binary outcomes produce likelihoods that are equivalent to the copula-based models herein. These general copula models of underlying latent threshold random variables yield likelihood-based models for marginal fixed effects estimation and interpretation in the analysis of correlated binary data with exchangeable correlation structures. Moreover, we propose a nomenclature and set of model relationships that substantially elucidates the complex area of marginalized random intercept models for binary data. A diverse collection of didactic mathematical and numerical examples are given to illustrate concepts. PMID:25342871
A New Monte Carlo Method for Estimating Marginal Likelihoods.
Wang, Yu-Bo; Chen, Ming-Hui; Kuo, Lynn; Lewis, Paul O
2018-06-01
Evaluating the marginal likelihood in Bayesian analysis is essential for model selection. Estimators based on a single Markov chain Monte Carlo sample from the posterior distribution include the harmonic mean estimator and the inflated density ratio estimator. We propose a new class of Monte Carlo estimators based on this single Markov chain Monte Carlo sample. This class can be thought of as a generalization of the harmonic mean and inflated density ratio estimators using a partition weighted kernel (likelihood times prior). We show that our estimator is consistent and has better theoretical properties than the harmonic mean and inflated density ratio estimators. In addition, we provide guidelines on choosing optimal weights. Simulation studies were conducted to examine the empirical performance of the proposed estimator. We further demonstrate the desirable features of the proposed estimator with two real data sets: one is from a prostate cancer study using an ordinal probit regression model with latent variables; the other is for the power prior construction from two Eastern Cooperative Oncology Group phase III clinical trials using the cure rate survival model with similar objectives.
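For orientation, the simplest member of this single-sample family is the plain harmonic mean estimator sketched below (illustrative only; its well-known instability is exactly what motivates the partition weighted kernel estimator proposed here):

```python
import numpy as np

def log_marginal_harmonic_mean(loglik_posterior_samples):
    """Harmonic mean estimator from one posterior MCMC sample:
    1 / p(D) is approximated by the posterior average of 1 / p(D | theta)."""
    ll = np.asarray(loglik_posterior_samples)
    m = (-ll).max()
    log_mean_inv_lik = m + np.log(np.mean(np.exp(-ll - m)))   # stable log-mean-exp
    return -log_mean_inv_lik

# Hypothetical log-likelihood values recorded along a posterior MCMC run
rng = np.random.default_rng(1)
print(log_marginal_harmonic_mean(-250.0 + rng.normal(0.0, 3.0, size=5000)))
```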
Bayesian model selection: Evidence estimation based on DREAM simulation and bridge sampling
NASA Astrophysics Data System (ADS)
Volpi, Elena; Schoups, Gerrit; Firmani, Giovanni; Vrugt, Jasper A.
2017-04-01
Bayesian inference has found widespread application in Earth and Environmental Systems Modeling, providing an effective tool for prediction, data assimilation, parameter estimation, uncertainty analysis and hypothesis testing. Under multiple competing hypotheses, the Bayesian approach also provides an attractive alternative to traditional information criteria (e.g. AIC, BIC) for model selection. The key variable for Bayesian model selection is the evidence (or marginal likelihood), the normalizing constant in the denominator of Bayes theorem; while it is fundamental for model selection, the evidence is not required for Bayesian parameter inference. It is computed for each hypothesis (model) by averaging the likelihood function over the prior parameter distribution, rather than maximizing it as information criteria do; the larger a model's evidence, the more support it receives among a collection of hypotheses, as the simulated values assign relatively high probability density to the observed data. Hence, the evidence naturally acts as an Occam's razor, preferring simpler and more constrained models over the over-fitted ones favored by information criteria that incorporate only the likelihood maximum. Since the evidence is not particularly easy to estimate in practice, Bayesian model selection via the marginal likelihood has not yet found mainstream use. We illustrate here the properties of a new estimator of the Bayesian model evidence, which provides robust and unbiased estimates of the marginal likelihood; the method is coined Gaussian Mixture Importance Sampling (GMIS). GMIS uses multidimensional numerical integration of the posterior parameter distribution via bridge sampling (a generalization of importance sampling) of a mixture distribution fitted to samples of the posterior distribution derived from the DREAM algorithm (Vrugt et al., 2008; 2009). Some illustrative examples are presented to show the robustness and superiority of the GMIS estimator with respect to other commonly used approaches in the literature.
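GMIS itself relies on bridge sampling; the sketch below shows only plain importance sampling with a Gaussian mixture proposal fitted to posterior draws, as a simplified illustration of the same idea (the scikit-learn mixture, the three components, and log_post_unnorm are all assumptions made for the example):

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def log_evidence_gmm_importance(posterior_draws, log_post_unnorm, n_is=20000, seed=0):
    """Fit a Gaussian mixture to posterior draws and use it as an importance density q:
    p(D) is approximated by the average of p(D | theta) p(theta) / q(theta) over
    theta drawn from q. log_post_unnorm must return log[p(D | theta) p(theta)]
    for a batch of parameter vectors."""
    gmm = GaussianMixture(n_components=3, covariance_type="full",
                          random_state=seed).fit(posterior_draws)
    theta, _ = gmm.sample(n_is)
    log_w = log_post_unnorm(theta) - gmm.score_samples(theta)
    m = log_w.max()
    return m + np.log(np.mean(np.exp(log_w - m)))
```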
Estimation of the Nonlinear Random Coefficient Model when Some Random Effects Are Separable
ERIC Educational Resources Information Center
du Toit, Stephen H. C.; Cudeck, Robert
2009-01-01
A method is presented for marginal maximum likelihood estimation of the nonlinear random coefficient model when the response function has some linear parameters. This is done by writing the marginal distribution of the repeated measures as a conditional distribution of the response given the nonlinear random effects. The resulting distribution…
ERIC Educational Resources Information Center
Kelderman, Henk
1992-01-01
Describes algorithms used in the computer program LOGIMO for obtaining maximum likelihood estimates of the parameters in loglinear models. These algorithms are also useful for the analysis of loglinear item-response theory models. Presents modified versions of the iterative proportional fitting and Newton-Raphson algorithms. Simulated data…
Estimating Model Probabilities using Thermodynamic Markov Chain Monte Carlo Methods
NASA Astrophysics Data System (ADS)
Ye, M.; Liu, P.; Beerli, P.; Lu, D.; Hill, M. C.
2014-12-01
Markov chain Monte Carlo (MCMC) methods are widely used to evaluate model probabilities for quantifying model uncertainty. In the general procedure, MCMC simulations are first conducted for each individual model, and the MCMC parameter samples are then used to approximate the model's marginal likelihood by calculating the geometric mean of the joint likelihood of the model and its parameters. It has been found that this geometric-mean method suffers from a low convergence rate. A simple test case shows that even millions of MCMC samples are insufficient to yield an accurate estimate of the marginal likelihood. To resolve this problem, a thermodynamic method is used that carries out multiple MCMC runs with different values of a heating coefficient between zero and one. When the heating coefficient is zero, the MCMC run is equivalent to a random walk MC in the prior parameter space; when the heating coefficient is one, the MCMC run is the conventional one. For a simple case with an analytical form of the marginal likelihood, the thermodynamic method yields a more accurate estimate than the geometric-mean method. This is also demonstrated for a groundwater modeling case with four alternative models postulated on the basis of different conceptualizations of a confining layer. This groundwater example shows that model probabilities estimated using the thermodynamic method are more reasonable than those obtained using the geometric-mean method. The thermodynamic method is general and can be used for a wide range of environmental problems for model uncertainty quantification.
Wang, Wei; Griswold, Michael E
2016-11-30
The random effect Tobit model is a regression model that accommodates left- and/or right-censoring and within-cluster dependence of the outcome variable. Regression coefficients of random effect Tobit models have conditional interpretations on a constructed latent dependent variable and do not provide inference about overall exposure effects on the original outcome scale. The marginalized random effects model (MREM) permits likelihood-based estimation of marginal mean parameters for clustered data. For random effect Tobit models, we extend the MREM to marginalize over both the random effects and the normal-space and boundary components of the censored response, in order to estimate overall exposure effects at the population level. We also extend the 'Average Predicted Value' method to estimate the model-predicted marginal means for each person under different exposure status in a designated reference group by integrating over the random effects, and then use the calculated difference to assess the overall exposure effect. Maximum likelihood estimation is carried out with a quasi-Newton optimization algorithm, using Gauss-Hermite quadrature to approximate the integration over the random effects. We use these methods to carefully analyze two real datasets. Copyright © 2016 John Wiley & Sons, Ltd.
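To show the quadrature step in isolation, a sketch of a Gauss-Hermite marginal log-likelihood for a single cluster in a random-intercept logistic model; this illustrates only the technique of integrating out the random effect, not the Tobit MREM of the paper, and the data are invented:

```python
import numpy as np
from numpy.polynomial.hermite import hermgauss

def marginal_loglik_cluster(y, x, beta, sigma_b, n_nodes=15):
    """Marginal log-likelihood contribution of one cluster, with the random intercept
    integrated out by Gauss-Hermite quadrature against a N(0, sigma_b^2) density."""
    nodes, weights = hermgauss(n_nodes)
    b = np.sqrt(2.0) * sigma_b * nodes                      # transformed quadrature points
    eta = x @ beta                                          # fixed-effect linear predictor
    p = 1.0 / (1.0 + np.exp(-(eta[:, None] + b[None, :])))  # conditional probabilities
    cond_lik = np.prod(np.where(y[:, None] == 1, p, 1.0 - p), axis=0)
    return np.log(np.sum(weights * cond_lik) / np.sqrt(np.pi))

y = np.array([1, 0, 1, 1])
x = np.column_stack([np.ones(4), np.array([0.2, 1.5, -0.3, 0.8])])
print(marginal_loglik_cluster(y, x, beta=np.array([-0.5, 1.0]), sigma_b=0.8))
```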
MIXOR: a computer program for mixed-effects ordinal regression analysis.
Hedeker, D; Gibbons, R D
1996-03-01
MIXOR provides maximum marginal likelihood estimates for mixed-effects ordinal probit, logistic, and complementary log-log regression models. These models can be used for analysis of dichotomous and ordinal outcomes from either a clustered or longitudinal design. For clustered data, the mixed-effects model assumes that data within clusters are dependent. The degree of dependency is jointly estimated with the usual model parameters, thus adjusting for dependence resulting from clustering of the data. Similarly, for longitudinal data, the mixed-effects approach can allow for individual-varying intercepts and slopes across time, and can estimate the degree to which these time-related effects vary in the population of individuals. MIXOR uses marginal maximum likelihood estimation, utilizing a Fisher-scoring solution. For the scoring solution, the Cholesky factor of the random-effects variance-covariance matrix is estimated, along with the effects of model covariates. Examples illustrating usage and features of MIXOR are provided.
Falk, Carl F; Cai, Li
2016-06-01
We present a semi-parametric approach to estimating item response functions (IRF) useful when the true IRF does not strictly follow commonly used functions. Our approach replaces the linear predictor of the generalized partial credit model with a monotonic polynomial. The model includes the regular generalized partial credit model at the lowest order polynomial. Our approach extends Liang's (A semi-parametric approach to estimate IRFs, Unpublished doctoral dissertation, 2007) method for dichotomous item responses to the case of polytomous data. Furthermore, item parameter estimation is implemented with maximum marginal likelihood using the Bock-Aitkin EM algorithm, thereby facilitating multiple group analyses useful in operational settings. Our approach is demonstrated on both educational and psychological data. We present simulation results comparing our approach to more standard IRF estimation approaches and other non-parametric and semi-parametric alternatives.
Comparing Three Estimation Methods for the Three-Parameter Logistic IRT Model
ERIC Educational Resources Information Center
Lamsal, Sunil
2015-01-01
Different estimation procedures have been developed for the unidimensional three-parameter item response theory (IRT) model. These techniques include the marginal maximum likelihood estimation, the fully Bayesian estimation using Markov chain Monte Carlo simulation techniques, and the Metropolis-Hastings Robbin-Monro estimation. With each…
Schwab, Joshua; Gruber, Susan; Blaser, Nello; Schomaker, Michael; van der Laan, Mark
2015-01-01
This paper describes a targeted maximum likelihood estimator (TMLE) for the parameters of longitudinal static and dynamic marginal structural models. We consider a longitudinal data structure consisting of baseline covariates, time-dependent intervention nodes, intermediate time-dependent covariates, and a possibly time-dependent outcome. The intervention nodes at each time point can include a binary treatment as well as a right-censoring indicator. Given a class of dynamic or static interventions, a marginal structural model is used to model the mean of the intervention-specific counterfactual outcome as a function of the intervention, time point, and possibly a subset of baseline covariates. Because the true shape of this function is rarely known, the marginal structural model is used as a working model. The causal quantity of interest is defined as the projection of the true function onto this working model. Iterated conditional expectation double robust estimators for marginal structural model parameters were previously proposed by Robins (2000, 2002) and Bang and Robins (2005). Here we build on this work and present a pooled TMLE for the parameters of marginal structural working models. We compare this pooled estimator to a stratified TMLE (Schnitzer et al. 2014) that is based on estimating the intervention-specific mean separately for each intervention of interest. The performance of the pooled TMLE is compared to the performance of the stratified TMLE and the performance of inverse probability weighted (IPW) estimators using simulations. Concepts are illustrated using an example in which the aim is to estimate the causal effect of delayed switch following immunological failure of first line antiretroviral therapy among HIV-infected patients. Data from the International Epidemiological Databases to Evaluate AIDS, Southern Africa are analyzed to investigate this question using both TML and IPW estimators. Our results demonstrate practical advantages of the pooled TMLE over an IPW estimator for working marginal structural models for survival, as well as cases in which the pooled TMLE is superior to its stratified counterpart. PMID:25909047
Dahabreh, Issa J; Trikalinos, Thomas A; Lau, Joseph; Schmid, Christopher H
2017-03-01
To compare statistical methods for meta-analysis of sensitivity and specificity of medical tests (e.g., diagnostic or screening tests). We constructed a database of PubMed-indexed meta-analyses of test performance from which 2 × 2 tables for each included study could be extracted. We reanalyzed the data using univariate and bivariate random effects models fit with inverse variance and maximum likelihood methods. Analyses were performed using both normal and binomial likelihoods to describe within-study variability. The bivariate model using the binomial likelihood was also fit using a fully Bayesian approach. We use two worked examples (thoracic computerized tomography to detect aortic injury and rapid prescreening of Papanicolaou smears to detect cytological abnormalities) to highlight that different meta-analysis approaches can produce different results. We also present results from reanalysis of 308 meta-analyses of sensitivity and specificity. Models using the normal approximation produced sensitivity and specificity estimates closer to 50% and smaller standard errors compared to models using the binomial likelihood; absolute differences of 5% or greater were observed in 12% and 5% of meta-analyses for sensitivity and specificity, respectively. Results from univariate and bivariate random effects models were similar, regardless of estimation method. Maximum likelihood and Bayesian methods produced almost identical summary estimates under the bivariate model; however, Bayesian analyses indicated greater uncertainty around those estimates. Bivariate models produced imprecise estimates of the between-study correlation of sensitivity and specificity. Differences between methods were larger with increasing proportion of studies that were small or required a continuity correction. The binomial likelihood should be used to model within-study variability. Univariate and bivariate models give similar estimates of the marginal distributions for sensitivity and specificity. Bayesian methods fully quantify uncertainty and their ability to incorporate external evidence may be useful for imprecisely estimated parameters. Copyright © 2017 Elsevier Inc. All rights reserved.
A Composite Likelihood Inference in Latent Variable Models for Ordinal Longitudinal Responses
ERIC Educational Resources Information Center
Vasdekis, Vassilis G. S.; Cagnone, Silvia; Moustaki, Irini
2012-01-01
The paper proposes a composite likelihood estimation approach that uses bivariate instead of multivariate marginal probabilities for ordinal longitudinal responses using a latent variable model. The model considers time-dependent latent variables and item-specific random effects to be accountable for the interdependencies of the multivariate…
Estimation Methods for One-Parameter Testlet Models
ERIC Educational Resources Information Center
Jiao, Hong; Wang, Shudong; He, Wei
2013-01-01
This study demonstrated the equivalence between the Rasch testlet model and the three-level one-parameter testlet model and explored the Markov Chain Monte Carlo (MCMC) method for model parameter estimation in WINBUGS. The estimation accuracy from the MCMC method was compared with those from the marginalized maximum likelihood estimation (MMLE)…
ERIC Educational Resources Information Center
Kelderman, Henk
In this paper, algorithms are described for obtaining the maximum likelihood estimates of the parameters in log-linear models. Modified versions of the iterative proportional fitting and Newton-Raphson algorithms are described that work on the minimal sufficient statistics rather than on the usual counts in the full contingency table. This is…
High-Dimensional Exploratory Item Factor Analysis by a Metropolis-Hastings Robbins-Monro Algorithm
ERIC Educational Resources Information Center
Cai, Li
2010-01-01
A Metropolis-Hastings Robbins-Monro (MH-RM) algorithm for high-dimensional maximum marginal likelihood exploratory item factor analysis is proposed. The sequence of estimates from the MH-RM algorithm converges with probability one to the maximum likelihood solution. Details on the computer implementation of this algorithm are provided. The…
ERIC Educational Resources Information Center
Lee, Soo; Suh, Youngsuk
2018-01-01
Lord's Wald test for differential item functioning (DIF) has not been studied extensively in the context of the multidimensional item response theory (MIRT) framework. In this article, Lord's Wald test was implemented using two estimation approaches, marginal maximum likelihood estimation and Bayesian Markov chain Monte Carlo estimation, to detect…
Item Response Theory with Estimation of the Latent Density Using Davidian Curves
ERIC Educational Resources Information Center
Woods, Carol M.; Lin, Nan
2009-01-01
Davidian-curve item response theory (DC-IRT) is introduced, evaluated with simulations, and illustrated using data from the Schedule for Nonadaptive and Adaptive Personality Entitlement scale. DC-IRT is a method for fitting unidimensional IRT models with maximum marginal likelihood estimation, in which the latent density is estimated,…
Chen, Xiaohong; Fan, Yanqin; Pouzo, Demian; Ying, Zhiliang
2010-07-01
We study estimation and model selection of semiparametric models of multivariate survival functions for censored data, which are characterized by possibly misspecified parametric copulas and nonparametric marginal survivals. We obtain the consistency and root-n asymptotic normality of a two-step copula estimator to the pseudo-true copula parameter value according to KLIC, and provide a simple consistent estimator of its asymptotic variance, allowing for a first-step nonparametric estimation of the marginal survivals. We establish the asymptotic distribution of the penalized pseudo-likelihood ratio statistic for comparing multiple semiparametric multivariate survival functions subject to copula misspecification and general censorship. An empirical application is provided.
ERIC Educational Resources Information Center
Yang, Ji Seung; Cai, Li
2014-01-01
The main purpose of this study is to improve estimation efficiency in obtaining maximum marginal likelihood estimates of contextual effects in the framework of nonlinear multilevel latent variable model by adopting the Metropolis-Hastings Robbins-Monro algorithm (MH-RM). Results indicate that the MH-RM algorithm can produce estimates and standard…
ERIC Educational Resources Information Center
Monroe, Scott; Cai, Li
2013-01-01
In Ramsay curve item response theory (RC-IRT, Woods & Thissen, 2006) modeling, the shape of the latent trait distribution is estimated simultaneously with the item parameters. In its original implementation, RC-IRT is estimated via Bock and Aitkin's (1981) EM algorithm, which yields maximum marginal likelihood estimates. This method, however,…
ERIC Educational Resources Information Center
Monroe, Scott; Cai, Li
2014-01-01
In Ramsay curve item response theory (RC-IRT) modeling, the shape of the latent trait distribution is estimated simultaneously with the item parameters. In its original implementation, RC-IRT is estimated via Bock and Aitkin's EM algorithm, which yields maximum marginal likelihood estimates. This method, however, does not produce the…
ERIC Educational Resources Information Center
Martin-Fernandez, Manuel; Revuelta, Javier
2017-01-01
This study compares the performance of two estimation algorithms of new usage, the Metropolis-Hastings Robins-Monro (MHRM) and the Hamiltonian MCMC (HMC), with two consolidated algorithms in the psychometric literature, the marginal likelihood via EM algorithm (MML-EM) and the Markov chain Monte Carlo (MCMC), in the estimation of multidimensional…
Estimating the Parameters of the Beta-Binomial Distribution.
ERIC Educational Resources Information Center
Wilcox, Rand R.
1979-01-01
For some situations the beta-binomial distribution might be used to describe the marginal distribution of test scores for a particular population of examinees. Several different methods of approximating the maximum likelihood estimate were investigated, and it was found that the Newton-Raphson method should be used when it yields admissible…
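A small sketch of fitting the beta-binomial by maximum likelihood (using a general-purpose quasi-Newton optimizer from SciPy rather than the Newton-Raphson iteration studied in the paper; the scores are simulated):

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import betabinom

def fit_beta_binomial(scores, n_items):
    """Maximum likelihood estimates of the beta-binomial parameters (a, b) for
    observed number-correct scores out of n_items."""
    def negloglik(log_params):
        a, b = np.exp(log_params)                  # keep a and b positive
        return -np.sum(betabinom.logpmf(scores, n_items, a, b))
    res = minimize(negloglik, x0=np.log([1.0, 1.0]), method="BFGS")
    return np.exp(res.x)

rng = np.random.default_rng(2)
scores = rng.binomial(20, rng.beta(3.0, 2.0, size=300))    # simulated examinee scores
print(fit_beta_binomial(scores, n_items=20))
```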
NASA Astrophysics Data System (ADS)
De Santis, Alberto; Dellepiane, Umberto; Lucidi, Stefano
2012-11-01
In this paper we investigate the estimation problem for a model of commodity prices. This model is a stochastic state-space dynamical model, and the problem unknowns are the state variables and the system parameters. Data are represented by commodity spot prices, since time series of futures contracts are very seldom available for free. Both the joint likelihood function of the system (state variables and parameters) and its marginal likelihood function (in which the state variables are eliminated) are addressed.
Bivariate categorical data analysis using normal linear conditional multinomial probability model.
Sun, Bingrui; Sutradhar, Brajendra
2015-02-10
Bivariate multinomial data such as the left and right eyes retinopathy status data are analyzed either by using a joint bivariate probability model or by exploiting certain odds ratio-based association models. However, the joint bivariate probability model yields marginal probabilities, which are complicated functions of marginal and association parameters for both variables, and the odds ratio-based association model treats the odds ratios involved in the joint probabilities as 'working' parameters, which are consequently estimated through certain arbitrary 'working' regression models. Also, this latter odds ratio-based model does not provide any easy interpretations of the correlations between two categorical variables. On the basis of pre-specified marginal probabilities, in this paper, we develop a bivariate normal type linear conditional multinomial probability model to understand the correlations between two categorical variables. The parameters involved in the model are consistently estimated using the optimal likelihood and generalized quasi-likelihood approaches. The proposed model and the inferences are illustrated through an intensive simulation study as well as an analysis of the well-known Wisconsin Diabetic Retinopathy status data. Copyright © 2014 John Wiley & Sons, Ltd.
Adaptive Quadrature for Item Response Models. Research Report. ETS RR-06-29
ERIC Educational Resources Information Center
Haberman, Shelby J.
2006-01-01
Adaptive quadrature is applied to marginal maximum likelihood estimation for item response models with normal ability distributions. Even in one dimension, significant gains in speed and accuracy of computation may be achieved.
Marginal and Random Intercepts Models for Longitudinal Binary Data With Examples From Criminology.
Long, Jeffrey D; Loeber, Rolf; Farrington, David P
2009-01-01
Two models for the analysis of longitudinal binary data are discussed: the marginal model and the random intercepts model. In contrast to the linear mixed model (LMM), the two models for binary data are not subsumed under a single hierarchical model. The marginal model provides group-level information whereas the random intercepts model provides individual-level information including information about heterogeneity of growth. It is shown how a type of numerical averaging can be used with the random intercepts model to obtain group-level information, thus approximating individual and marginal aspects of the LMM. The types of inferences associated with each model are illustrated with longitudinal criminal offending data based on N = 506 males followed over a 22-year period. Violent offending indexed by official records and self-report were analyzed, with the marginal model estimated using generalized estimating equations and the random intercepts model estimated using maximum likelihood. The results show that the numerical averaging based on the random intercepts can produce prediction curves almost identical to those obtained directly from the marginal model parameter estimates. The results provide a basis for contrasting the models and the estimation procedures and key features are discussed to aid in selecting a method for empirical analysis.
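A minimal sketch of the numerical averaging described above (hypothetical linear-predictor values; the random-intercept standard deviation is an assumption): the subject-specific logistic curve is averaged over the random intercept distribution to recover a population-averaged probability comparable to the marginal model's prediction:

```python
import numpy as np

def marginal_probability(eta_fixed, sigma_b, n_draws=200000, seed=3):
    """Average the random-intercept logistic curve over N(0, sigma_b^2) intercepts
    to obtain the population-averaged (marginal) probability."""
    rng = np.random.default_rng(seed)
    b = rng.normal(0.0, sigma_b, size=n_draws)
    return np.mean(1.0 / (1.0 + np.exp(-(eta_fixed + b))))

# Hypothetical fixed-effect linear predictor values (e.g., at several ages)
for eta in (-2.0, -1.0, 0.0, 1.0):
    print(eta, round(marginal_probability(eta, sigma_b=1.2), 3))
```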
Ramsay-Curve Item Response Theory for the Three-Parameter Logistic Item Response Model
ERIC Educational Resources Information Center
Woods, Carol M.
2008-01-01
In Ramsay-curve item response theory (RC-IRT), the latent variable distribution is estimated simultaneously with the item parameters of a unidimensional item response model using marginal maximum likelihood estimation. This study evaluates RC-IRT for the three-parameter logistic (3PL) model with comparisons to the normal model and to the empirical…
Markov Chain Monte Carlo Estimation of Item Parameters for the Generalized Graded Unfolding Model
ERIC Educational Resources Information Center
de la Torre, Jimmy; Stark, Stephen; Chernyshenko, Oleksandr S.
2006-01-01
The authors present a Markov Chain Monte Carlo (MCMC) parameter estimation procedure for the generalized graded unfolding model (GGUM) and compare it to the marginal maximum likelihood (MML) approach implemented in the GGUM2000 computer program, using simulated and real personality data. In the simulation study, test length, number of response…
Cosmological parameters from a re-analysis of the WMAP 7 year low-resolution maps
NASA Astrophysics Data System (ADS)
Finelli, F.; De Rosa, A.; Gruppuso, A.; Paoletti, D.
2013-06-01
Cosmological parameters from Wilkinson Microwave Anisotropy Probe (WMAP) 7 year data are re-analysed by substituting a pixel-based likelihood estimator for the one delivered publicly by the WMAP team. Our pixel-based estimator handles intensity and polarization exactly and in a joint manner, allowing us to use low-resolution maps and noise covariance matrices in T, Q, U at the same resolution, which in this work is 3.6°. We describe the features and the performance of the code implementing our pixel-based likelihood estimator. We perform a battery of tests on the application of our pixel-based likelihood routine to the publicly available WMAP low-resolution foreground-cleaned products, in combination with the WMAP high-ℓ likelihood, reporting the differences in cosmological parameters with respect to those evaluated by the full public WMAP likelihood package. The differences are due not only to the treatment of polarization, but also to the marginalization over monopole and dipole uncertainties present in the WMAP pixel likelihood code for temperature. The credible central values of the cosmological parameters change by less than the 1σ level with respect to the evaluation by the full WMAP 7 year likelihood code, with the largest difference being a shift to smaller values of the scalar spectral index nS.
Maximum likelihood solution for inclination-only data in paleomagnetism
NASA Astrophysics Data System (ADS)
Arason, P.; Levi, S.
2010-08-01
We have developed a new robust maximum likelihood method for estimating the unbiased mean inclination from inclination-only data. In paleomagnetic analysis, the arithmetic mean of inclination-only data is known to introduce a shallowing bias. Several methods have been introduced to estimate the unbiased mean inclination of inclination-only data together with measures of the dispersion. Some inclination-only methods were designed to maximize the likelihood function of the marginal Fisher distribution. However, the exact analytical form of the maximum likelihood function is fairly complicated, and all the methods require various assumptions and approximations that are often inappropriate. For some steep and dispersed data sets, these methods provide estimates that are significantly displaced from the peak of the likelihood function to systematically shallower inclination. The problem locating the maximum of the likelihood function is partly due to difficulties in accurately evaluating the function for all values of interest, because some elements of the likelihood function increase exponentially as precision parameters increase, leading to numerical instabilities. In this study, we succeeded in analytically cancelling exponential elements from the log-likelihood function, and we are now able to calculate its value anywhere in the parameter space and for any inclination-only data set. Furthermore, we can now calculate the partial derivatives of the log-likelihood function with desired accuracy, and locate the maximum likelihood without the assumptions required by previous methods. To assess the reliability and accuracy of our method, we generated large numbers of random Fisher-distributed data sets, for which we calculated mean inclinations and precision parameters. The comparisons show that our new robust Arason-Levi maximum likelihood method is the most reliable, and the mean inclination estimates are the least biased towards shallow values.
Generalized Full-Information Item Bifactor Analysis
Cai, Li; Yang, Ji Seung; Hansen, Mark
2011-01-01
Full-information item bifactor analysis is an important statistical method in psychological and educational measurement. Current methods are limited to single group analysis and inflexible in the types of item response models supported. We propose a flexible multiple-group item bifactor analysis framework that supports a variety of multidimensional item response theory models for an arbitrary mixing of dichotomous, ordinal, and nominal items. The extended item bifactor model also enables the estimation of latent variable means and variances when data from more than one group are present. Generalized user-defined parameter restrictions are permitted within or across groups. We derive an efficient full-information maximum marginal likelihood estimator. Our estimation method achieves substantial computational savings by extending Gibbons and Hedeker’s (1992) bifactor dimension reduction method so that the optimization of the marginal log-likelihood only requires two-dimensional integration regardless of the dimensionality of the latent variables. We use simulation studies to demonstrate the flexibility and accuracy of the proposed methods. We apply the model to study cross-country differences, including differential item functioning, using data from a large international education survey on mathematics literacy. PMID:21534682
Marginal Structural Models with Counterfactual Effect Modifiers.
Zheng, Wenjing; Luo, Zhehui; van der Laan, Mark J
2018-06-08
In health and social sciences, research questions often involve systematic assessment of the modification of treatment causal effect by patient characteristics. In longitudinal settings, time-varying or post-intervention effect modifiers are also of interest. In this work, we investigate the robust and efficient estimation of the Counterfactual-History-Adjusted Marginal Structural Model (van der Laan MJ, Petersen M. Statistical learning of origin-specific statically optimal individualized treatment rules. Int J Biostat. 2007;3), which models the conditional intervention-specific mean outcome given a counterfactual modifier history in an ideal experiment. We establish the semiparametric efficiency theory for these models, and present a substitution-based, semiparametric efficient and doubly robust estimator using the targeted maximum likelihood estimation methodology (TMLE, e.g. van der Laan MJ, Rubin DB. Targeted maximum likelihood learning. Int J Biostat. 2006;2, van der Laan MJ, Rose S. Targeted learning: causal inference for observational and experimental data, 1st ed. Springer Series in Statistics. Springer, 2011). To facilitate implementation in applications where the effect modifier is high dimensional, our third contribution is a projected influence function (and the corresponding projected TMLE estimator), which retains most of the robustness of its efficient peer and can be easily implemented in applications where the use of the efficient influence function becomes taxing. We compare the projected TMLE estimator with an Inverse Probability of Treatment Weighted estimator (e.g. Robins JM. Marginal structural models. In: Proceedings of the American Statistical Association. Section on Bayesian Statistical Science, 1-10. 1997a, Hernan MA, Brumback B, Robins JM. Marginal structural models to estimate the causal effect of zidovudine on the survival of HIV-positive men. 2000;11:561-570), and a non-targeted G-computation estimator (Robins JM. A new approach to causal inference in mortality studies with sustained exposure periods - application to control of the healthy worker survivor effect. Math Modell. 1986;7:1393-1512.). The comparative performance of these estimators is assessed in a simulation study. The use of the projected TMLE estimator is illustrated in a secondary data analysis for the Sequenced Treatment Alternatives to Relieve Depression (STAR*D) trial where effect modifiers are subject to missing at random.
A quantum framework for likelihood ratios
NASA Astrophysics Data System (ADS)
Bond, Rachael L.; He, Yang-Hui; Ormerod, Thomas C.
The ability to calculate precise likelihood ratios is fundamental to science, from Quantum Information Theory through to Quantum State Estimation. However, there is no assumption-free statistical methodology to achieve this. For instance, in the absence of data relating to covariate overlap, the widely used Bayes’ theorem either defaults to the marginal probability driven “naive Bayes’ classifier”, or requires the use of compensatory expectation-maximization techniques. This paper takes an information-theoretic approach in developing a new statistical formula for the calculation of likelihood ratios based on the principles of quantum entanglement, and demonstrates that Bayes’ theorem is a special case of a more general quantum mechanical expression.
ERIC Educational Resources Information Center
Prevost, A. Toby; Mason, Dan; Griffin, Simon; Kinmonth, Ann-Louise; Sutton, Stephen; Spiegelhalter, David
2007-01-01
Practical meta-analysis of correlation matrices generally ignores covariances (and hence correlations) between correlation estimates. The authors consider various methods for allowing for covariances, including generalized least squares, maximum marginal likelihood, and Bayesian approaches, illustrated using a 6-dimensional response in a series of…
Cross-validation to select Bayesian hierarchical models in phylogenetics.
Duchêne, Sebastián; Duchêne, David A; Di Giallonardo, Francesca; Eden, John-Sebastian; Geoghegan, Jemma L; Holt, Kathryn E; Ho, Simon Y W; Holmes, Edward C
2016-05-26
Recent developments in Bayesian phylogenetic models have increased the range of inferences that can be drawn from molecular sequence data. Accordingly, model selection has become an important component of phylogenetic analysis. Methods of model selection generally consider the likelihood of the data under the model in question. In the context of Bayesian phylogenetics, the most common approach involves estimating the marginal likelihood, which is typically done by integrating the likelihood across model parameters, weighted by the prior. Although this method is accurate, it is sensitive to the presence of improper priors. We explored an alternative approach based on cross-validation that is widely used in evolutionary analysis. This involves comparing models according to their predictive performance. We analysed simulated data and a range of viral and bacterial data sets using a cross-validation approach to compare a variety of molecular clock and demographic models. Our results show that cross-validation can be effective in distinguishing between strict- and relaxed-clock models and in identifying demographic models that allow growth in population size over time. In most of our empirical data analyses, the model selected using cross-validation was able to match that selected using marginal-likelihood estimation. The accuracy of cross-validation appears to improve with longer sequence data, particularly when distinguishing between relaxed-clock models. Cross-validation is a useful method for Bayesian phylogenetic model selection. This method can be readily implemented even when considering complex models where selecting an appropriate prior for all parameters may be difficult.
Rosenblum, Michael; van der Laan, Mark J.
2010-01-01
Models, such as logistic regression and Poisson regression models, are often used to estimate treatment effects in randomized trials. These models leverage information in variables collected before randomization, in order to obtain more precise estimates of treatment effects. However, there is the danger that model misspecification will lead to bias. We show that certain easy to compute, model-based estimators are asymptotically unbiased even when the working model used is arbitrarily misspecified. Furthermore, these estimators are locally efficient. As a special case of our main result, we consider a simple Poisson working model containing only main terms; in this case, we prove the maximum likelihood estimate of the coefficient corresponding to the treatment variable is an asymptotically unbiased estimator of the marginal log rate ratio, even when the working model is arbitrarily misspecified. This is the log-linear analog of ANCOVA for linear models. Our results demonstrate one application of targeted maximum likelihood estimation. PMID:20628636
Essays in the California electricity reserves markets
NASA Astrophysics Data System (ADS)
Metaxoglou, Konstantinos
This dissertation examines inefficiencies in the California electricity reserves markets. In Chapter 1, I use the information released during the investigation of the state's electricity crisis of 2000 and 2001 by the Federal Energy Regulatory Commission to diagnose allocative inefficiencies. Building upon the work of Wolak (2000), I calculate a lower bound for the sellers' price-cost margins using the inverse elasticities of their residual demand curves. The downward bias in my estimates stems from the fact that I don't account for the hierarchical substitutability of the reserve types. The margins averaged at least 20 percent for the two highest quality types of reserves, regulation and spinning, generating millions of dollars in transfers to a handful of sellers. I provide evidence that the deviations from marginal cost pricing were due to the markets' high concentration and a principal-agent relationship that emerged from their design. In Chapter 2, I document systematic differences between the markets' day- and hour-ahead prices. I use a high-dimensional vector moving average model to estimate the premia and conduct correct inferences. To obtain exact maximum likelihood estimates of the model, I employ the EM algorithm that I develop in Chapter 3. I uncover significant day-ahead premia, which I attribute to market design characteristics too. On the demand side, the market design established a principal-agent relationship between the markets' buyers (principal) and their supervisory authority (agent). The agent had very limited incentives to shift reserve purchases to the lower priced hour-ahead markets. On the supply side, the market design raised substantial entry barriers by precluding purely speculative trading and by introducing a complicated code of conduct that induced uncertainty about which actions were subject to regulatory scrutiny. In Chapter 3, I introduce a state-space representation for vector autoregressive moving average models that enables exact maximum likelihood estimation using the EM algorithm. Moreover, my algorithm uses only analytical expressions; it requires the Kalman filter and a fixed-interval smoother in the E step and least squares-type regression in the M step. In contrast, existing maximum likelihood estimation methods require numerical differentiation, both for univariate and multivariate models.
Semiparametric Item Response Functions in the Context of Guessing
ERIC Educational Resources Information Center
Falk, Carl F.; Cai, Li
2016-01-01
We present a logistic function of a monotonic polynomial with a lower asymptote, allowing additional flexibility beyond the three-parameter logistic model. We develop a maximum marginal likelihood-based approach to estimate the item parameters. The new item response model is demonstrated on math assessment data from a state, and a computationally…
A survey of kernel-type estimators for copula and their applications
NASA Astrophysics Data System (ADS)
Sumarjaya, I. W.
2017-10-01
Copulas have been widely used to model nonlinear dependence structures. Their main applications include areas such as finance, insurance, hydrology, and rainfall, to name but a few. The flexibility of copulas allows researchers to model dependence structures beyond the Gaussian distribution. Basically, a copula is a function that couples a multivariate distribution function to its one-dimensional marginal distribution functions. In general, there are three approaches to copula estimation: parametric, nonparametric, and semiparametric. In this article we survey kernel-type estimators for copulas, such as the mirror reflection kernel, the beta kernel, the transformation method, and the local likelihood transformation method. We then apply these kernel methods to three stock indexes in Asia. The results of our analysis suggest that, despite variation in information criterion values, the local likelihood transformation method performs better than the other kernel methods.
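As an illustration of the transformation method mentioned in the survey, a minimal sketch is given below: pseudo-observations are mapped to the normal scale, a Gaussian kernel density is fitted there, and the estimate is mapped back to a copula density. The rank-based pseudo-observations, the default bandwidth of scipy's gaussian_kde, and the toy data are simplifying assumptions, not the exact estimators compared in the article.

```python
import numpy as np
from scipy.stats import norm, gaussian_kde

def transformation_kernel_copula(x, y):
    """Copula density estimator c(u, v) built with the transformation method.

    Pseudo-observations (rescaled ranks) are mapped to the normal scale with
    the probit transform, a Gaussian kernel density is fitted there, and the
    estimate is mapped back by dividing out the standard normal margins.
    """
    n = len(x)
    u = (np.argsort(np.argsort(x)) + 1) / (n + 1.0)   # pseudo-observations
    v = (np.argsort(np.argsort(y)) + 1) / (n + 1.0)
    kde = gaussian_kde(np.vstack([norm.ppf(u), norm.ppf(v)]))

    def c(uu, vv):
        s, t = norm.ppf(uu), norm.ppf(vv)
        return kde(np.vstack([s, t])) / (norm.pdf(s) * norm.pdf(t))

    return c

# Toy usage on two correlated series standing in for stock index returns.
rng = np.random.default_rng(1)
x = rng.normal(size=500)
y = 0.7 * x + 0.5 * rng.normal(size=500)
c_hat = transformation_kernel_copula(x, y)
print(c_hat(np.array([0.5]), np.array([0.5])))   # copula density at the centre
```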
Semi-Parametric Item Response Functions in the Context of Guessing. CRESST Report 844
ERIC Educational Resources Information Center
Falk, Carl F.; Cai, Li
2015-01-01
We present a logistic function of a monotonic polynomial with a lower asymptote, allowing additional flexibility beyond the three-parameter logistic model. We develop a maximum marginal likelihood based approach to estimate the item parameters. The new item response model is demonstrated on math assessment data from a state, and a computationally…
Evaluation of weighted regression and sample size in developing a taper model for loblolly pine
Kenneth L. Cormier; Robin M. Reich; Raymond L. Czaplewski; William A. Bechtold
1992-01-01
A stem profile model, fit using pseudo-likelihood weighted regression, was used to estimate merchantable volume of loblolly pine (Pinus taeda L.) in the southeast. The weighted regression increased model fit marginally, but did not substantially increase model performance. In all cases, the unweighted regression models performed as well as the...
Semiparametric time-to-event modeling in the presence of a latent progression event.
Rice, John D; Tsodikov, Alex
2017-06-01
In cancer research, interest frequently centers on factors influencing a latent event that must precede a terminal event. In practice it is often impossible to observe the latent event precisely, making inference about this process difficult. To address this problem, we propose a joint model for the unobserved time to the latent and terminal events, with the two events linked by the baseline hazard. Covariates enter the model parametrically as linear combinations that multiply, respectively, the hazard for the latent event and the hazard for the terminal event conditional on the latent one. We derive the partial likelihood estimators for this problem assuming the latent event is observed, and propose a profile likelihood-based method for estimation when the latent event is unobserved. The baseline hazard in this case is estimated nonparametrically using the EM algorithm, which allows for closed-form Breslow-type estimators at each iteration, bringing improved computational efficiency and stability compared with maximizing the marginal likelihood directly. We present simulation studies to illustrate the finite-sample properties of the method; its use in practice is demonstrated in the analysis of a prostate cancer data set. © 2016, The International Biometric Society.
Tests of Independence in Contingency Tables with Small Samples: A Comparison of Statistical Power.
ERIC Educational Resources Information Center
Parshall, Cynthia G.; Kromrey, Jeffrey D.
1996-01-01
Power and Type I error rates were estimated for contingency tables with small sample sizes for the following four types of tests: (1) Pearson's chi-square; (2) chi-square with Yates's continuity correction; (3) the likelihood ratio test; and (4) Fisher's Exact Test. Various marginal distributions, sample sizes, and effect sizes were examined. (SLD)
Marginalized zero-inflated negative binomial regression with application to dental caries
Preisser, John S.; Das, Kalyan; Long, D. Leann; Divaris, Kimon
2015-01-01
The zero-inflated negative binomial regression model (ZINB) is often employed in diverse fields such as dentistry, health care utilization, highway safety, and medicine to examine relationships between exposures of interest and overdispersed count outcomes exhibiting many zeros. The regression coefficients of ZINB have latent class interpretations for a susceptible subpopulation at risk for the disease/condition under study with counts generated from a negative binomial distribution and for a non-susceptible subpopulation that provides only zero counts. The ZINB parameters, however, are not well-suited for estimating overall exposure effects, specifically, in quantifying the effect of an explanatory variable in the overall mixture population. In this paper, a marginalized zero-inflated negative binomial regression (MZINB) model for independent responses is proposed to model the population marginal mean count directly, providing straightforward inference for overall exposure effects based on maximum likelihood estimation. Through simulation studies, the finite sample performance of MZINB is compared to marginalized zero-inflated Poisson, Poisson, and negative binomial regression. The MZINB model is applied in the evaluation of a school-based fluoride mouthrinse program on dental caries in 677 children. PMID:26568034
Valuing river characteristics using combined site choice and participation travel cost models.
Johnstone, C; Markandya, A
2006-08-01
This paper presents new welfare measures for marginal changes in river quality in selected English rivers. The river quality indicators used include chemical, biological and habitat-level attributes. Economic values for recreational use of three types of river (upland, lowland and chalk) are presented. A survey of anglers was carried out and, using these data, two travel cost models were estimated, one to predict the number of trips and the other to predict angling site choice. These models were then linked to estimate the welfare associated with marginal changes in river quality, using the participation levels estimated in the trip prediction model. The model results showed that higher flow rates, biological quality and nutrient pollution levels affect site choice and influence the likelihood of a fishing trip. Consumer surplus values per trip for a 10% change in river attributes range from £0.04 to £3.93 (2001 prices) depending on the attribute.
COSMIC MICROWAVE BACKGROUND LIKELIHOOD APPROXIMATION FOR BANDED PROBABILITY DISTRIBUTIONS
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gjerløw, E.; Mikkelsen, K.; Eriksen, H. K.
We investigate sets of random variables that can be arranged sequentially such that a given variable only depends conditionally on its immediate predecessor. For such sets, we show that the full joint probability distribution may be expressed exclusively in terms of uni- and bivariate marginals. Under the assumption that the cosmic microwave background (CMB) power spectrum likelihood only exhibits correlations within a banded multipole range, Δl_C, we apply this expression to two outstanding problems in CMB likelihood analysis. First, we derive a statistically well-defined hybrid likelihood estimator, merging two independent (e.g., low- and high-l) likelihoods into a single expression that properly accounts for correlations between the two. Applying this expression to the Wilkinson Microwave Anisotropy Probe (WMAP) likelihood, we verify that the effect of correlations on cosmological parameters in the transition region is negligible in terms of cosmological parameters for WMAP; the largest relative shift seen for any parameter is 0.06σ. However, because this may not hold for other experimental setups (e.g., for different instrumental noise properties or analysis masks), but must rather be verified on a case-by-case basis, we recommend our new hybridization scheme for future experiments for statistical self-consistency reasons. Second, we use the same expression to improve the convergence rate of the Blackwell-Rao likelihood estimator, reducing the required number of Monte Carlo samples by several orders of magnitude, and thereby extend it to high-l applications.
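A sketch of the factorization underlying this result (notation assumed): if each variable depends conditionally only on its immediate predecessor, the joint distribution reduces to uni- and bivariate marginals,

```latex
P(x_1,\dots,x_N) \;=\; P(x_1)\prod_{i=2}^{N} P(x_i \mid x_{i-1})
\;=\; \frac{\prod_{i=2}^{N} P(x_{i-1}, x_i)}{\prod_{i=2}^{N-1} P(x_i)} .
```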
Exact likelihood evaluations and foreground marginalization in low resolution WMAP data
NASA Astrophysics Data System (ADS)
Slosar, Anže; Seljak, Uroš; Makarov, Alexey
2004-06-01
The large scale anisotropies of Wilkinson Microwave Anisotropy Probe (WMAP) data have attracted a lot of attention and have been a source of controversy, with many favorite cosmological models being apparently disfavored by the power spectrum estimates at low l. All the existing analyses of theoretical models are based on approximations for the likelihood function, which are likely to be inaccurate on large scales. Here we present exact evaluations of the likelihood of the low multipoles by direct inversion of the theoretical covariance matrix for low resolution WMAP maps. We project out the unwanted galactic contaminants using the WMAP derived maps of these foregrounds. This improves over the template based foreground subtraction used in the original analysis, which can remove some of the cosmological signal and may lead to a suppression of power. As a result we find an increase in power at low multipoles. For the quadrupole the maximum likelihood values are rather uncertain and vary between 140 and 220 μK². On the other hand, the probability distribution away from the peak is robust and, assuming a uniform prior between 0 and 2000 μK², the probability of having the true value above 1200 μK² (as predicted by the simplest cold dark matter model with a cosmological constant) is 10%, a factor of 2.5 higher than predicted by the WMAP likelihood code. We do not find the correlation function to be unusual beyond the low quadrupole value. We develop a fast likelihood evaluation routine that can be used instead of WMAP routines for low l values. We apply it to the Markov chain Monte Carlo analysis to compare the cosmological parameters between the two cases. The new analysis of WMAP either alone or jointly with the Sloan Digital Sky Survey (SDSS) and the Very Small Array (VSA) data reduces the evidence for running to less than 1σ, giving α_s = -0.022 ± 0.033 for the combined case. The new analysis prefers about a 1σ lower value of Ω_m, a consequence of an increased integrated Sachs-Wolfe (ISW) effect contribution required by the increase in the spectrum at low l. These results suggest that the details of foreground removal and full likelihood analysis are important for parameter estimation from the WMAP data. They are robust in the sense that they do not change significantly with frequency, mask, or details of foreground template marginalization. The marginalization approach presented here is the most conservative method to remove the foregrounds and should be particularly useful in the analysis of polarization, where foreground contamination may be much more severe.
A Variance Distribution Model of Surface EMG Signals Based on Inverse Gamma Distribution.
Hayashi, Hideaki; Furui, Akira; Kurita, Yuichi; Tsuji, Toshio
2017-11-01
Objective: This paper describes the formulation of a surface electromyogram (EMG) model capable of representing the variance distribution of EMG signals. Methods: In the model, EMG signals are handled based on a Gaussian white noise process with a mean of zero for each variance value. EMG signal variance is taken as a random variable that follows inverse gamma distribution, allowing the representation of noise superimposed onto this variance. Variance distribution estimation based on marginal likelihood maximization is also outlined in this paper. The procedure can be approximated using rectified and smoothed EMG signals, thereby allowing the determination of distribution parameters in real time at low computational cost. Results: A simulation experiment was performed to evaluate the accuracy of distribution estimation using artificially generated EMG signals, with results demonstrating that the proposed model's accuracy is higher than that of maximum-likelihood-based estimation. Analysis of variance distribution using real EMG data also suggested a relationship between variance distribution and signal-dependent noise. Conclusion: The study reported here was conducted to examine the performance of a proposed surface EMG model capable of representing variance distribution and a related distribution parameter estimation method. Experiments using artificial and real EMG data demonstrated the validity of the model. Significance: Variance distribution estimated using the proposed model exhibits potential in the estimation of muscle force.
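A minimal sketch of the marginal-likelihood idea behind such a model, under the simplifying assumptions of a zero-mean signal and direct numerical optimization (the paper's real-time approximation via rectified and smoothed EMG is not reproduced): marginalizing a Gaussian over an inverse-gamma variance yields a Student-t, whose parameters can be fitted directly.

```python
import numpy as np
from scipy import stats, optimize

def fit_inverse_gamma_variance(x):
    """Fit (alpha, beta) of an inverse-gamma variance distribution by
    maximizing the marginal likelihood of a zero-mean signal x.

    Marginalizing N(0, s2) over s2 ~ InvGamma(alpha, beta) gives a Student-t
    with df = 2*alpha and scale = sqrt(beta/alpha).
    """
    def neg_log_marginal(params):
        a, b = np.exp(params)          # optimize on the log scale for positivity
        return -np.sum(stats.t.logpdf(x, df=2 * a, loc=0.0, scale=np.sqrt(b / a)))

    res = optimize.minimize(neg_log_marginal, x0=[0.0, 0.0], method="Nelder-Mead")
    return np.exp(res.x)

# Synthetic check: variances drawn from InvGamma(3, 2), then Gaussian samples.
rng = np.random.default_rng(0)
s2 = stats.invgamma.rvs(3, scale=2, size=20000, random_state=rng)
x = rng.normal(0.0, np.sqrt(s2))
print(fit_inverse_gamma_variance(x))   # roughly (3, 2)
```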
Estimation After a Group Sequential Trial.
Milanzi, Elasma; Molenberghs, Geert; Alonso, Ariel; Kenward, Michael G; Tsiatis, Anastasios A; Davidian, Marie; Verbeke, Geert
2015-10-01
Group sequential trials are one important instance of studies for which the sample size is not fixed a priori but rather takes one of a finite set of pre-specified values, dependent on the observed data. Much work has been devoted to the inferential consequences of this design feature. Molenberghs et al (2012) and Milanzi et al (2012) reviewed and extended the existing literature, focusing on a collection of seemingly disparate, but related, settings, namely completely random sample sizes, group sequential studies with deterministic and random stopping rules, incomplete data, and random cluster sizes. They showed that the ordinary sample average is a viable option for estimation following a group sequential trial, for a wide class of stopping rules and for random outcomes with a distribution in the exponential family. Their results are somewhat surprising in the sense that the sample average is not optimal, and further, there does not exist an optimal, or even unbiased, linear estimator. However, the sample average is asymptotically unbiased, both conditionally upon the observed sample size as well as marginalized over it. By exploiting ignorability they showed that the sample average is the conventional maximum likelihood estimator. They also showed that a conditional maximum likelihood estimator is finite sample unbiased, but is less efficient than the sample average and has a larger mean squared error. Asymptotically, the sample average and the conditional maximum likelihood estimator are equivalent. This previous work is restricted, however, to the situation in which the random sample size can take only two values, N = n or N = 2n. In this paper, we consider the more practically useful setting of sample sizes in a finite set {n_1, n_2, …, n_L}. It is shown that the sample average is then a justifiable estimator, in the sense that it follows from joint likelihood estimation, and it is consistent and asymptotically unbiased. We also show why simulations can give the false impression of bias in the sample average when considered conditional upon the sample size. The consequence is that no corrections need to be made to estimators following sequential trials. When small-sample bias is of concern, the conditional likelihood estimator provides a relatively straightforward modification to the sample average. Finally, it is shown that classical likelihood-based standard errors and confidence intervals can be applied, obviating the need for technical corrections.
A baseline-free procedure for transformation models under interval censorship.
Gu, Ming Gao; Sun, Liuquan; Zuo, Guoxin
2005-12-01
An important property of the Cox regression model is that the estimation of regression parameters using the partial likelihood procedure does not depend on its baseline survival function. We call such a procedure baseline-free. Using the marginal likelihood, we show that a baseline-free procedure can be derived for a class of general transformation models under the interval censoring framework. The baseline-free procedure yields a simplified and stable computational algorithm for some complicated and important semiparametric models, such as frailty models and heteroscedastic hazard/rank regression models, for which the estimation procedures available so far involve estimation of the infinite-dimensional baseline function. A detailed computational algorithm using Markov chain Monte Carlo stochastic approximation is presented. The proposed procedure is demonstrated through extensive simulation studies, confirming asymptotic consistency and normality. We also illustrate the procedure with a real data set from a study of breast cancer. A heuristic argument showing that the score function is a mean-zero martingale is provided.
A Maximum Likelihood Approach to Functional Mapping of Longitudinal Binary Traits
Wang, Chenguang; Li, Hongying; Wang, Zhong; Wang, Yaqun; Wang, Ningtao; Wang, Zuoheng; Wu, Rongling
2013-01-01
Despite their importance in biology and biomedicine, genetic mapping of binary traits that change over time has not been well explored. In this article, we develop a statistical model for mapping quantitative trait loci (QTLs) that govern longitudinal responses of binary traits. The model is constructed within the maximum likelihood framework by which the association between binary responses is modeled in terms of conditional log odds-ratios. With this parameterization, the maximum likelihood estimates (MLEs) of marginal mean parameters are robust to the misspecification of time dependence. We implement an iterative procedure to obtain the MLEs of QTL genotype-specific parameters that define longitudinal binary responses. The usefulness of the model was validated by analyzing a real example in rice. Simulation studies were performed to investigate the statistical properties of the model, showing that the model has power to identify and map specific QTLs responsible for the temporal pattern of binary traits. PMID:23183762
Using known map category marginal frequencies to improve estimates of thematic map accuracy
NASA Technical Reports Server (NTRS)
Card, D. H.
1982-01-01
By means of two simple sampling plans suggested in the accuracy-assessment literature, it is shown how one can use knowledge of map-category relative sizes to improve estimates of various probabilities. Because the maximum likelihood estimates of cell probabilities under simple random sampling and under map-category-stratified sampling are identical, a unified treatment of the contingency-table analysis is possible. A rigorous analysis of the effect of sampling independently within map categories is made possible by results for the stratified case. It is noted that such matters as optimal sample size selection for the achievement of a desired level of precision in various estimators are irrelevant, since the estimators derived are valid irrespective of how sample sizes are chosen.
NASA Astrophysics Data System (ADS)
Elshall, A. S.; Ye, M.; Niu, G. Y.; Barron-Gafford, G.
2015-12-01
Models in biogeoscience involve uncertainties in observation data, model inputs, model structure, model processes and modeling scenarios. To accommodate different sources of uncertainty, multimodel analyses such as model combination, model selection, model elimination and model discrimination are becoming more popular. To illustrate the theoretical and practical challenges of multimodel analysis, we use an example from microbial soil respiration modeling. Global soil respiration releases more than ten times more carbon dioxide to the atmosphere than all anthropogenic emissions, so improving our understanding of microbial soil respiration is essential for improving climate change models. This study focuses on a poorly understood phenomenon: soil microbial respiration pulses in response to episodic rainfall pulses (the "Birch effect"). We hypothesize that the "Birch effect" is generated by three mechanisms. To test our hypothesis, we developed and assessed five evolving microbial-enzyme models against field measurements from a semiarid savannah characterized by pulsed precipitation. These five models evolve step-wise such that the first model includes none of the three mechanisms, while the fifth model includes all three. The basic component of Bayesian multimodel analysis is the estimation of the marginal likelihood, which is used to rank the candidate models based on their overall likelihood with respect to the observation data. The first part of the study uses this Bayesian scheme to discriminate between the five candidate models. The second part discusses some theoretical and practical challenges, mainly the effects of likelihood function selection and of the marginal likelihood estimation method on both model ranking and Bayesian model averaging. The study shows that making valid inference from scientific data is not a trivial task, since we are uncertain not only about the candidate scientific models but also about the statistical methods used to discriminate between them.
Tissue preservation with mass spectroscopic analysis: Implications for cancer diagnostics.
Hall, O Morgan; Peer, Cody J; Figg, William D
2018-05-17
Surgical intervention is a common treatment modality for localized cancer. Post-operative analysis involves evaluation of surgical margins to assess whether all malignant tissue has been resected, because positive surgical margins lead to a greater likelihood of recurrence. Secondary treatments are utilized to minimize the negative effects of positive surgical margins. Recently, in Science Translational Medicine, Zhang et al describe a new mass spectroscopic technique that could potentially decrease the likelihood of positive surgical margins. Their nondestructive in vivo tissue sampling yields a rapid and highly accurate cancer diagnosis, discriminating precisely between healthy and malignant tissue. This new tool has the potential to improve surgical margins and accelerate cancer diagnostics by analyzing biomolecular signatures of various tissues and diseases.
The Atacama Cosmology Telescope: Likelihood for Small-Scale CMB Data
NASA Technical Reports Server (NTRS)
Dunkley, J.; Calabrese, E.; Sievers, J.; Addison, G. E.; Battaglia, N.; Battistelli, E. S.; Bond, J. R.; Das, S.; Devlin, M. J.; Dunner, R.;
2013-01-01
The Atacama Cosmology Telescope has measured the angular power spectra of microwave fluctuations to arcminute scales at frequencies of 148 and 218 GHz, from three seasons of data. At small scales the fluctuations in the primordial Cosmic Microwave Background (CMB) become increasingly obscured by extragalactic foregrounds and secondary CMB signals. We present results from a nine-parameter model describing these secondary effects, including the thermal and kinematic Sunyaev-Zel'dovich (tSZ and kSZ) power; the clustered and Poisson-like power from Cosmic Infrared Background (CIB) sources, and their frequency scaling; the tSZ-CIB correlation coefficient; the extragalactic radio source power; and thermal dust emission from Galactic cirrus in two different regions of the sky. In order to extract cosmological parameters, we describe a likelihood function for the ACT data, fitting this model to the multi-frequency spectra in the multipole range 500 < l < 10000. We extend the likelihood to include spectra from the South Pole Telescope at frequencies of 95, 150, and 220 GHz. Accounting for different radio source levels and Galactic cirrus emission, the same model provides an excellent fit to both datasets simultaneously, with χ²/dof = 675/697 for ACT, and 96/107 for SPT. We then use the multi-frequency likelihood to estimate the CMB power spectrum from ACT in bandpowers, marginalizing over the secondary parameters. This provides a simplified 'CMB-only' likelihood in the range 500 < l < 3500 for use in cosmological parameter estimation.
2009-01-01
Background Marginal posterior genotype probabilities need to be computed for genetic analyses such as genetic counseling in humans and selective breeding in animal and plant species. Methods In this paper, we describe a peeling-based, deterministic, exact algorithm to efficiently compute genotype probabilities for every member of a pedigree with loops without recourse to junction-tree methods from graph theory. The efficiency in computing the likelihood by peeling comes from storing intermediate results in multidimensional tables called cutsets. Computing marginal genotype probabilities for individual i requires recomputing the likelihood for each of the possible genotypes of individual i. This can be done efficiently by storing intermediate results in two types of cutsets called anterior and posterior cutsets and reusing these intermediate results to compute the likelihood. Examples A small example is used to illustrate the theoretical concepts discussed in this paper, and marginal genotype probabilities are computed at a monogenic disease locus for every member in a real cattle pedigree. PMID:19958551
Mackey, Dawn C.; Hubbard, Alan E.; Cawthon, Peggy M.; Cauley, Jane A.; Cummings, Steven R.; Tager, Ira B.
2011-01-01
Few studies have examined the relation between usual physical activity level and rate of hip fracture in older men or applied semiparametric methods from the causal inference literature that estimate associations without assuming a particular parametric model. Using the Physical Activity Scale for the Elderly, the authors measured usual physical activity level at baseline (2000–2002) in 5,682 US men ≥65 years of age who were enrolled in the Osteoporotic Fractures in Men Study. Physical activity levels were classified as low (bottom quartile of Physical Activity Scale for the Elderly score), moderate (middle quartiles), or high (top quartile). Hip fractures were confirmed by central review. Marginal associations between physical activity and hip fracture were estimated with 3 estimation methods: inverse probability-of-treatment weighting, G-computation, and doubly robust targeted maximum likelihood estimation. During 6.5 years of follow-up, 95 men (1.7%) experienced a hip fracture. The unadjusted risk of hip fracture was lower in men with a high physical activity level versus those with a low physical activity level (relative risk = 0.51, 95% confidence interval: 0.28, 0.92). In semiparametric analyses that controlled confounding, hip fracture risk was not lower with moderate (e.g., targeted maximum likelihood estimation relative risk = 0.92, 95% confidence interval: 0.62, 1.44) or high (e.g., targeted maximum likelihood estimation relative risk = 0.88, 95% confidence interval: 0.53, 2.03) physical activity relative to low. This study does not support a protective effect of usual physical activity on hip fracture in older men. PMID:21303805
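For reference, a minimal sketch of one of the three estimation strategies mentioned, G-computation, for a binary outcome is given below; the main-terms logistic working model, the variable names, and the synthetic data are illustrative assumptions, far simpler than the semiparametric specifications used in the study.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def g_computation_rr(A, W, Y):
    """G-computation estimate of the marginal risk ratio E[Y_1] / E[Y_0].

    Fits an outcome regression for E[Y | A, W], then averages predictions made
    with everyone set to exposed (A=1) and to unexposed (A=0).
    """
    outcome = LogisticRegression(max_iter=1000).fit(np.column_stack([A, W]), Y)
    X1 = np.column_stack([np.ones_like(A), W])    # counterfactual: all exposed
    X0 = np.column_stack([np.zeros_like(A), W])   # counterfactual: all unexposed
    r1 = outcome.predict_proba(X1)[:, 1].mean()
    r0 = outcome.predict_proba(X0)[:, 1].mean()
    return r1 / r0

# Synthetic usage with a confounded binary exposure and binary outcome.
rng = np.random.default_rng(1)
n = 4000
W = rng.normal(size=(n, 2))
A = rng.binomial(1, 1 / (1 + np.exp(-W[:, 0])))
Y = rng.binomial(1, 1 / (1 + np.exp(-(0.5 * A + W[:, 0]))))
print(g_computation_rr(A, W, Y))
```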
An 'unconditional-like' structure for the conditional estimator of odds ratio from 2 x 2 tables.
Hanley, James A; Miettinen, Olli S
2006-02-01
In the estimation of the odds ratio (OR), the conditional maximum-likelihood estimate (cMLE) is preferred to the more readily computed unconditional one (uMLE). However, the exact cMLE does not have a closed form to help divine it from the uMLE or to understand in what circumstances the difference between the two is appreciable. Here, the cMLE is shown to have the same 'ratio of cross-products' structure as its unconditional counterpart, but with two of the cell frequencies augmented, so as to shrink the unconditional estimator towards unity. The augmentation involves a factor, similar to the finite population correction, derived from the minimum of the marginal totals.
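For orientation (notation assumed): for a 2 × 2 table with cells a, b, c, d, the unconditional MLE is the cross-product ratio, while the conditional likelihood given both margins is noncentral hypergeometric,

```latex
\widehat{\mathrm{OR}}_{u} \;=\; \frac{ad}{bc},
\qquad
L_{c}(\psi) \;\propto\;
\frac{\binom{a+b}{a}\binom{c+d}{c}\,\psi^{a}}
     {\sum_{j} \binom{a+b}{j}\binom{c+d}{a+c-j}\,\psi^{j}} ,
```

where ψ is the odds ratio and the sum runs over all admissible values of j. The paper's contribution is to show that the maximizer of L_c can itself be written as a cross-product ratio in which two of the cell frequencies are augmented.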
The Maximum Likelihood Solution for Inclination-only Data
NASA Astrophysics Data System (ADS)
Arason, P.; Levi, S.
2006-12-01
The arithmetic means of inclination-only data are known to introduce a shallowing bias. Several methods have been proposed to estimate unbiased means of the inclination along with measures of the precision. Most of the inclination-only methods were designed to maximize the likelihood function of the marginal Fisher distribution. However, the exact analytical form of the maximum likelihood function is fairly complicated, and all these methods require various assumptions and approximations that are inappropriate for many data sets. For some steep and dispersed data sets, the estimates provided by these methods are significantly displaced from the peak of the likelihood function to systematically shallower inclinations. The problem in locating the maximum of the likelihood function is partly due to difficulties in accurately evaluating the function for all values of interest. This is because some elements of the log-likelihood function increase exponentially as precision parameters increase, leading to numerical instabilities. In this study we succeeded in analytically cancelling exponential elements from the likelihood function, and we are now able to calculate its value for any location in the parameter space and for any inclination-only data set, with full accuracy. Furthermore, we can now calculate the partial derivatives of the likelihood function with desired accuracy. Locating the maximum likelihood without the assumptions required by previous methods is now straightforward. The information to separate the mean inclination from the precision parameter will be lost for very steep and dispersed data sets. It is worth noting that the likelihood function always has a maximum value. However, for some dispersed and steep data sets with few samples, the likelihood function takes its highest value on the boundary of the parameter space, i.e. at inclinations of +/- 90 degrees, but with relatively well defined dispersion. Our simulations indicate that this occurs quite frequently for certain data sets, and relatively small perturbations in the data will drive the maxima to the boundary. We interpret this to indicate that, for such data sets, the information needed to separate the mean inclination and the precision parameter is permanently lost. To assess the reliability and accuracy of our method we generated a large number of random Fisher-distributed data sets and used seven methods to estimate the mean inclination and precision parameter. These comparisons are described by Levi and Arason at the 2006 AGU Fall meeting. The results of the various methods are very favourable to our new robust maximum likelihood method, which, on average, is the most reliable, and the mean inclination estimates are the least biased toward shallow values. Further information on our inclination-only analysis can be obtained from: http://www.vedur.is/~arason/paleomag
Gaussianization for fast and accurate inference from cosmological data
NASA Astrophysics Data System (ADS)
Schuhmann, Robert L.; Joachimi, Benjamin; Peiris, Hiranya V.
2016-06-01
We present a method to transform multivariate unimodal non-Gaussian posterior probability densities into approximately Gaussian ones via non-linear mappings, such as Box-Cox transformations and generalizations thereof. This permits an analytical reconstruction of the posterior from a point sample, like a Markov chain, and simplifies the subsequent joint analysis with other experiments. This way, a multivariate posterior density can be reported efficiently, by compressing the information contained in Markov Chain Monte Carlo samples. Further, the model evidence integral (i.e., the marginal likelihood) can be computed analytically. This method is analogous to the search for normal parameters in the cosmic microwave background, but is more general. The search for the optimally Gaussianizing transformation is performed computationally through a maximum-likelihood formalism; its quality can be judged by how well the credible regions of the posterior are reproduced. We demonstrate that our method outperforms kernel density estimates in this objective. Further, we select marginal posterior samples from Planck data with several distinct strongly non-Gaussian features, and verify the reproduction of the marginal contours. To demonstrate evidence computation, we Gaussianize the joint distribution of data from weak lensing and baryon acoustic oscillations, for different cosmological models, and find a preference for flat Λ cold dark matter. Comparing to values computed with the Savage-Dickey density ratio, and Population Monte Carlo, we find good agreement of our method within the spread of the other two.
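The sketch below illustrates only the one-dimensional building block of such an approach, a Box-Cox transform with its exponent chosen by maximum likelihood via scipy; the paper's multivariate generalizations, credible-region checks, and analytic evidence computation are not reproduced, and the gamma-distributed "posterior sample" is a stand-in.

```python
import numpy as np
from scipy import stats

# Gaussianize a skewed, positive "posterior" sample with a Box-Cox transform
# whose exponent lambda is chosen by maximum likelihood, then fit a Gaussian
# on the transformed scale and check how much skewness was removed.
rng = np.random.default_rng(0)
sample = rng.gamma(shape=2.0, scale=1.5, size=4000)

transformed, lam = stats.boxcox(sample)        # ML estimate of lambda
print("Box-Cox lambda:", lam)
print("skewness before/after:", stats.skew(sample), stats.skew(transformed))

mu, sigma = transformed.mean(), transformed.std(ddof=1)
print("Gaussian fit on transformed scale: mu=%.3f sigma=%.3f" % (mu, sigma))
```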
Candel, Math J J M; Van Breukelen, Gerard J P
2010-06-30
Adjustments of sample size formulas are given for varying cluster sizes in cluster randomized trials with a binary outcome when testing the treatment effect with mixed effects logistic regression using second-order penalized quasi-likelihood estimation (PQL). Starting from first-order marginal quasi-likelihood (MQL) estimation of the treatment effect, the asymptotic relative efficiency of unequal versus equal cluster sizes is derived. A Monte Carlo simulation study shows this asymptotic relative efficiency to be rather accurate for realistic sample sizes, when employing second-order PQL. An approximate, simpler formula is presented to estimate the efficiency loss due to varying cluster sizes when planning a trial. In many cases sampling 14 per cent more clusters is sufficient to repair the efficiency loss due to varying cluster sizes. Since current closed-form formulas for sample size calculation are based on first-order MQL, planning a trial also requires a conversion factor to obtain the variance of the second-order PQL estimator. In a second Monte Carlo study, this conversion factor turned out to be 1.25 at most. (c) 2010 John Wiley & Sons, Ltd.
Semiparametric Time-to-Event Modeling in the Presence of a Latent Progression Event
Rice, John D.; Tsodikov, Alex
2017-01-01
Summary In cancer research, interest frequently centers on factors influencing a latent event that must precede a terminal event. In practice it is often impossible to observe the latent event precisely, making inference about this process difficult. To address this problem, we propose a joint model for the unobserved time to the latent and terminal events, with the two events linked by the baseline hazard. Covariates enter the model parametrically as linear combinations that multiply, respectively, the hazard for the latent event and the hazard for the terminal event conditional on the latent one. We derive the partial likelihood estimators for this problem assuming the latent event is observed, and propose a profile likelihood–based method for estimation when the latent event is unobserved. The baseline hazard in this case is estimated nonparametrically using the EM algorithm, which allows for closed-form Breslow-type estimators at each iteration, bringing improved computational efficiency and stability compared with maximizing the marginal likelihood directly. We present simulation studies to illustrate the finite-sample properties of the method; its use in practice is demonstrated in the analysis of a prostate cancer data set. PMID:27556886
Efficient, adaptive estimation of two-dimensional firing rate surfaces via Gaussian process methods.
Rad, Kamiar Rahnama; Paninski, Liam
2010-01-01
Estimating two-dimensional firing rate maps is a common problem, arising in a number of contexts: the estimation of place fields in hippocampus, the analysis of temporally nonstationary tuning curves in sensory and motor areas, the estimation of firing rates following spike-triggered covariance analyses, etc. Here we introduce methods based on Gaussian process nonparametric Bayesian techniques for estimating these two-dimensional rate maps. These techniques offer a number of advantages: the estimates may be computed efficiently, come equipped with natural errorbars, adapt their smoothness automatically to the local density and informativeness of the observed data, and permit direct fitting of the model hyperparameters (e.g., the prior smoothness of the rate map) via maximum marginal likelihood. We illustrate the method's flexibility and performance on a variety of simulated and real data.
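A minimal sketch of the hyperparameter-fitting step described here, using scikit-learn's Gaussian process regressor, which sets kernel hyperparameters by maximizing the log marginal likelihood during fit(); treating Poisson spike counts with a Gaussian observation model, and the synthetic "place field", are simplifying assumptions relative to the methods in the paper.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

# Smooth a noisy 2D rate map with a GP; length scale and noise level are
# chosen automatically by maximizing the log marginal likelihood.
rng = np.random.default_rng(0)
XY = rng.uniform(0, 1, size=(400, 2))                           # visited positions
true_rate = 5 + 10 * np.exp(-((XY - 0.5) ** 2).sum(1) / 0.05)   # toy place field
counts = rng.poisson(true_rate)                                 # observed spike counts

kernel = 1.0 * RBF(length_scale=0.2) + WhiteKernel(noise_level=1.0)
gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(XY, counts)
print("optimized kernel:", gp.kernel_)
print("log marginal likelihood:", gp.log_marginal_likelihood_value_)

rate_map, rate_sd = gp.predict(XY, return_std=True)   # smoothed map with errorbars
```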
Learning multivariate distributions by competitive assembly of marginals.
Sánchez-Vega, Francisco; Younes, Laurent; Geman, Donald
2013-02-01
We present a new framework for learning high-dimensional multivariate probability distributions from estimated marginals. The approach is motivated by compositional models and Bayesian networks, and designed to adapt to small sample sizes. We start with a large, overlapping set of elementary statistical building blocks, or "primitives," which are low-dimensional marginal distributions learned from data. Each variable may appear in many primitives. Subsets of primitives are combined in a Lego-like fashion to construct a probabilistic graphical model; only a small fraction of the primitives will participate in any valid construction. Since primitives can be precomputed, parameter estimation and structure search are separated. Model complexity is controlled by strong biases; we adapt the primitives to the amount of training data and impose rules which restrict the merging of them into allowable compositions. The likelihood of the data decomposes into a sum of local gains, one for each primitive in the final structure. We focus on a specific subclass of networks which are binary forests. Structure optimization corresponds to an integer linear program and the maximizing composition can be computed for reasonably large numbers of variables. Performance is evaluated using both synthetic data and real datasets from natural language processing and computational biology.
Huang, Shi; MacKinnon, David P.; Perrino, Tatiana; Gallo, Carlos; Cruden, Gracelyn; Brown, C Hendricks
2016-01-01
Mediation analysis often requires larger sample sizes than main effect analysis to achieve the same statistical power. Combining results across similar trials may be the only practical option for increasing statistical power for mediation analysis in some situations. In this paper, we propose a method to estimate: 1) marginal means for mediation path a, the relation of the independent variable to the mediator; 2) marginal means for path b, the relation of the mediator to the outcome, across multiple trials; and 3) the between-trial level variance-covariance matrix based on a bivariate normal distribution. We present the statistical theory and an R computer program to combine regression coefficients from multiple trials to estimate a combined mediated effect and confidence interval under a random effects model. Values of coefficients a and b, along with their standard errors from each trial are the input for the method. This marginal likelihood based approach with Monte Carlo confidence intervals provides more accurate inference than the standard meta-analytic approach. We discuss computational issues, apply the method to two real-data examples and make recommendations for the use of the method in different settings. PMID:28239330
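A minimal sketch of the Monte Carlo confidence interval step for the combined mediated effect a·b; treating the two path estimates as independent normals is a simplification (the paper models a between-trial variance-covariance matrix), and all numbers below are illustrative.

```python
import numpy as np

def mediated_effect_mc_ci(a_hat, se_a, b_hat, se_b,
                          n_draws=100000, alpha=0.05, seed=0):
    """Monte Carlo confidence interval for the mediated effect a*b.

    a_hat, b_hat: combined path coefficients; se_a, se_b: their standard errors.
    Draws are taken independently here for simplicity; a between-path
    covariance could be added via a bivariate normal draw.
    """
    rng = np.random.default_rng(seed)
    ab = rng.normal(a_hat, se_a, n_draws) * rng.normal(b_hat, se_b, n_draws)
    lo, hi = np.quantile(ab, [alpha / 2, 1 - alpha / 2])
    return a_hat * b_hat, (lo, hi)

print(mediated_effect_mc_ci(0.40, 0.10, 0.25, 0.08))
```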
Likelihood of Tree Topologies with Fossils and Diversification Rate Estimation.
Didier, Gilles; Fau, Marine; Laurin, Michel
2017-11-01
Since the diversification process cannot be directly observed at the human scale, it has to be studied from the information available, namely the extant taxa and the fossil record. In this sense, phylogenetic trees including both extant taxa and fossils are the most complete representations of the diversification process that one can get. Such phylogenetic trees can be reconstructed from molecular and morphological data, to some extent. Among the temporal information of such phylogenetic trees, fossil ages are by far the most precisely known (divergence times are inferences calibrated mostly with fossils). We propose here a method to compute the likelihood of a phylogenetic tree with fossils in which the only considered time information is the fossil ages, and apply it to the estimation of the diversification rates from such data. Since it is required in our computation, we provide a method for determining the probability of a tree topology under the standard diversification model. Testing our approach on simulated data shows that the maximum likelihood rate estimates from the phylogenetic tree topology and the fossil dates are almost as accurate as those obtained by taking into account all the data, including the divergence times. Moreover, they are substantially more accurate than the estimates obtained only from the exact divergence times (without taking into account the fossil record). We also provide an empirical example composed of 50 Permo-Carboniferous eupelycosaur (early synapsid) taxa ranging in age from about 315 Ma (Late Carboniferous) to 270 Ma (shortly after the end of the Early Permian). Our analyses suggest a speciation (cladogenesis, or birth) rate of about 0.1 per lineage and per myr, a marginally lower extinction rate, and a considerable hidden paleobiodiversity of early synapsids. [Extinction rate; fossil ages; maximum likelihood estimation; speciation rate.]. © The Author(s) 2017. Published by Oxford University Press, on behalf of the Society of Systematic Biologists. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Efficient Maximum Likelihood Estimation for Pedigree Data with the Sum-Product Algorithm.
Engelhardt, Alexander; Rieger, Anna; Tresch, Achim; Mansmann, Ulrich
2016-01-01
We analyze data sets consisting of pedigrees with age at onset of colorectal cancer (CRC) as phenotype. The occurrence of familial clusters of CRC suggests the existence of a latent, inheritable risk factor. We aimed to compute the probability of a family possessing this risk factor as well as the hazard rate increase for these risk factor carriers. Due to the inheritability of this risk factor, the estimation necessitates a costly marginalization of the likelihood. We propose an improved EM algorithm by applying factor graphs and the sum-product algorithm in the E-step. This reduces the computational complexity from exponential to linear in the number of family members. Our algorithm is as precise as a direct likelihood maximization in a simulation study and a real family study on CRC risk. For 250 simulated families of size 19 and 21, the runtime of our algorithm is faster by a factor of 4 and 29, respectively. On the largest family (23 members) in the real data, our algorithm is 6 times faster. We introduce a flexible and runtime-efficient tool for statistical inference in biomedical event data with latent variables that opens the door for advanced analyses of pedigree data. © 2017 S. Karger AG, Basel.
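A toy illustration of why the sum-product step is decisive: for a chain-shaped "pedigree" with a latent binary risk factor, brute-force marginalization is exponential in the number of members, while a forward message pass gives the same likelihood in linear time. The transmission matrix and emission terms below are stand-ins for the genetic transmission and age-at-onset model of the paper.

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(0)
n = 12                                   # chain of n relatives
p0 = np.array([0.9, 0.1])                # founder prior over risk factor g in {0, 1}
T = np.array([[0.95, 0.05],              # transmission matrix T[g_parent, g_child]
              [0.20, 0.80]])
emit = rng.uniform(0.1, 1.0, size=(n, 2))  # observation likelihood emit[i, g_i]

# Brute force: sum the joint over all 2**n latent configurations (exponential).
brute = 0.0
for g in product([0, 1], repeat=n):
    p = p0[g[0]] * emit[0, g[0]]
    for i in range(1, n):
        p *= T[g[i - 1], g[i]] * emit[i, g[i]]
    brute += p

# Sum-product / forward pass: the same quantity in O(n) messages.
msg = p0 * emit[0]
for i in range(1, n):
    msg = (msg @ T) * emit[i]
forward = msg.sum()

print(brute, forward)   # identical up to floating point error
```

Real pedigrees branch rather than form a chain, which is where the anterior and posterior cutsets (or, equivalently, factor-graph messages) come in, but the computational principle is the same.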
Information matrix estimation procedures for cognitive diagnostic models.
Liu, Yanlou; Xin, Tao; Andersson, Björn; Tian, Wei
2018-03-06
Two new methods to estimate the asymptotic covariance matrix for marginal maximum likelihood estimation of cognitive diagnosis models (CDMs), the inverse of the observed information matrix and the sandwich-type estimator, are introduced. Unlike several previous covariance matrix estimators, the new methods take into account both the item and structural parameters. The relationships between the observed information matrix, the empirical cross-product information matrix, the sandwich-type covariance matrix and the two approaches proposed by de la Torre (2009, J. Educ. Behav. Stat., 34, 115) are discussed. Simulation results show that, for a correctly specified CDM and Q-matrix or with a slightly misspecified probability model, the observed information matrix and the sandwich-type covariance matrix exhibit good performance with respect to providing consistent standard errors of item parameter estimates. However, with substantial model misspecification only the sandwich-type covariance matrix exhibits robust performance. © 2018 The British Psychological Society.
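For reference, with ℓ_i(θ) denoting the marginal log-likelihood contribution of respondent i (notation assumed), the two estimators take the familiar forms

```latex
\widehat{\mathrm{Cov}}_{\mathrm{obs}}(\hat\theta)
  \;=\; \Bigl(-\sum_i \nabla^{2}_{\theta}\,\ell_i(\hat\theta)\Bigr)^{-1},
\qquad
\widehat{\mathrm{Cov}}_{\mathrm{sw}}(\hat\theta)
  \;=\; A^{-1} B\, A^{-1},
```

where A is the observed information and B = Σ_i ∇ℓ_i(θ̂) ∇ℓ_i(θ̂)ᵀ is the cross-product (outer-product-of-gradients) matrix.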
NASA Astrophysics Data System (ADS)
Gaál, Ladislav; Szolgay, Ján; Bacigál, Tomáš; Kohnová, Silvia
2010-05-01
Copula-based estimation methods of hydro-climatological extremes have increasingly been gaining the attention of researchers and practitioners in the last couple of years. Unlike the traditional estimation methods, which are based on bivariate cumulative distribution functions (CDFs), copulas are a relatively flexible tool of statistics that allow for modelling dependencies between two or more variables such as flood peaks and flood volumes without making strict assumptions on the marginal distributions. The dependence structure and the reliability of the joint estimates of hydro-climatological extremes, mainly in the right tail of the joint CDF, depend not only on the particular copula adopted but also on the data available for the estimation of the marginal distributions of the individual variables. Generally, data samples for frequency modelling have limited temporal extent, which is a considerable drawback of frequency analyses in practice. Therefore, it is advised to deal with statistical methods that improve any part of the process of copula construction and result in more reliable design values of hydrological variables. The scarcity of the data sample, mostly in the extreme tail of the joint CDF, can be bypassed, e.g., by using a considerably larger amount of simulated data from rainfall-runoff analysis or by including historical information on the variables under study. The latter approach of data extension is used here to make the quantile estimates of the individual marginals of the copula more reliable. In the presented paper it is proposed to use historical information in the frequency analysis of the marginal distributions in the framework of Bayesian Monte Carlo Markov Chain (MCMC) simulations. Generally, a Bayesian approach allows for a straightforward combination of different sources of information on floods (e.g. flood data from systematic measurements and historical flood records, respectively) in terms of a product of the corresponding likelihood functions. On the other hand, the MCMC algorithm is a numerical approach for sampling from the likelihood distributions. The Bayesian MCMC methods therefore provide an attractive way to estimate the uncertainty in parameters and quantile metrics of frequency distributions. The applicability of the method is demonstrated in a case study of the hydroelectric power station Orlík on the Vltava River. This site has a key role in the flood prevention of Prague, the capital city of the Czech Republic. The record length of the available flood data is 126 years from the period 1877-2002, while the flood event observed in 2002 that caused extensive damage and numerous casualties is treated as a historic one. To estimate the joint probabilities of flood peaks and volumes, different copulas are fitted and their goodness-of-fit is evaluated by bootstrap simulations. Finally, selected quantiles of flood volumes conditioned on given flood peaks are derived and compared with those obtained by the traditional method used in the practice of water management specialists of the Vltava River.
On the validity of cosmological Fisher matrix forecasts
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wolz, Laura; Kilbinger, Martin; Weller, Jochen
2012-09-01
We present a comparison of Fisher matrix forecasts for cosmological probes with Monte Carlo Markov Chain (MCMC) posterior likelihood estimation methods. We analyse the performance of future Dark Energy Task Force (DETF) stage-III and stage-IV dark-energy surveys using supernovae, baryon acoustic oscillations and weak lensing as probes. We concentrate in particular on the dark-energy equation of state parameters w_0 and w_a. For purely geometrical probes, and especially when marginalising over w_a, we find considerable disagreement between the two methods, since in this case the Fisher matrix cannot reproduce the highly non-elliptical shape of the likelihood function. More quantitatively, the Fisher method underestimates the marginalized errors for purely geometrical probes by 30%-70%. For cases including structure formation such as weak lensing, we find that the posterior probability contours from the Fisher matrix estimation are in good agreement with the MCMC contours, with the forecasted errors changing only at the 5% level. We then explore non-linear transformations resulting in physically-motivated parameters and investigate whether these parameterisations exhibit a Gaussian behaviour. We conclude that for the purely geometrical probes and, more generally, in cases where it is not known whether the likelihood is close to Gaussian, the Fisher matrix is not the appropriate tool to produce reliable forecasts.
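For context, the Fisher forecast quantities being compared are (notation assumed)

```latex
F_{ij} \;=\; -\Bigl\langle \frac{\partial^{2} \ln \mathcal{L}}{\partial \theta_i\, \partial \theta_j} \Bigr\rangle,
\qquad
\sigma(\theta_i) \;=\; \sqrt{\bigl(F^{-1}\bigr)_{ii}} ,
```

so the marginalized error is read off the inverse Fisher matrix under an implicit Gaussian (elliptical) approximation to the likelihood, which is exactly what breaks down for the geometrical probes discussed above.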
Hu, Meng; Clark, Kelsey L.; Gong, Xiajing; Noudoost, Behrad; Li, Mingyao; Moore, Tirin
2015-01-01
Inferotemporal (IT) neurons are known to exhibit persistent, stimulus-selective activity during the delay period of object-based working memory tasks. Frontal eye field (FEF) neurons show robust, spatially selective delay period activity during memory-guided saccade tasks. We present a copula regression paradigm to examine neural interaction of these two types of signals between areas IT and FEF of the monkey during a working memory task. This paradigm is based on copula models that can account for both the marginal distributions over the spiking activity of individual neurons within each area and the joint distribution over the ensemble activity of neurons between areas. Considering the popular GLMs as marginal models, we developed a general and flexible likelihood framework that uses the copula to integrate separate GLMs into a joint regression analysis. Such joint analysis essentially leads to a multivariate analog of the marginal GLM theory and hence efficient model estimation. In addition, we show that Granger causality between spike trains can be readily assessed via the likelihood ratio statistic. The performance of this method is validated by extensive simulations and compares favorably with the widely used GLMs. When applied to the spiking activity of simultaneously recorded FEF and IT neurons during a working memory task, we observed a significant Granger-causal influence from FEF to IT, but not in the opposite direction, suggesting a role for the FEF in the selection and retention of visual information during working memory. The copula model has the potential to provide unique neurophysiological insights about network properties of the brain. PMID:26063909
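The Granger-causality test mentioned above can be illustrated, in a much simplified form, with ordinary Poisson GLMs rather than the paper's copula-based joint model: fit the target neuron's spike counts with and without the other neuron's lagged activity and compare the maximized log-likelihoods. The simulated spike counts, the single-lag design, and the "FEF-to-IT" coupling strength below are all hypothetical.

```python
import numpy as np
from scipy import stats
import statsmodels.api as sm

# Simulated binned spike counts for two hypothetical neurons, with lagged "FEF"
# activity driving the "IT" rate; values are placeholders, not recorded data.
rng = np.random.default_rng(1)
T = 2000
fef = rng.poisson(2.0, size=T)
lag_fef = np.concatenate([[0], fef[:-1]])
it = rng.poisson(np.exp(-0.5 + 0.15 * lag_fef))
own_hist = np.concatenate([[0], it[:-1]])

def glm_loglik(y, X):
    """Maximized log-likelihood of a Poisson GLM with an intercept."""
    return sm.GLM(y, sm.add_constant(X), family=sm.families.Poisson()).fit().llf

# Full model: IT counts explained by own history plus lagged FEF; reduced: own history only.
ll_full = glm_loglik(it[1:], np.column_stack([own_hist, lag_fef])[1:])
ll_reduced = glm_loglik(it[1:], own_hist[1:].reshape(-1, 1))

# Likelihood-ratio statistic for a Granger-causal influence of FEF on IT (1 extra parameter).
lr = 2 * (ll_full - ll_reduced)
print("LR =", round(lr, 2), " p =", stats.chi2.sf(lr, df=1))
```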
Parameter Estimation for Thurstone Choice Models
DOE Office of Scientific and Technical Information (OSTI.GOV)
Vojnovic, Milan; Yun, Seyoung
We consider the estimation accuracy of individual strength parameters of a Thurstone choice model when each input observation consists of a choice of one item from a set of two or more items (so-called top-1 lists). This model accommodates well-known choice models such as the Luce choice model for comparison sets of two or more items and the Bradley-Terry model for pair comparisons. We provide a tight characterization of the mean squared error of the maximum likelihood parameter estimator. We also provide similar characterizations for parameter estimators defined by a rank-breaking method, which amounts to deducing one or more pair comparisons from a comparison of two or more items, assuming independence of these pair comparisons, and maximizing a likelihood function derived under these assumptions. We also consider a related binary classification problem where each individual parameter takes a value from a set of two possible values and the goal is to correctly classify all items within a prescribed classification error. The results of this paper shed light on how the parameter estimation accuracy depends on the given Thurstone choice model and the structure of the comparison sets. In particular, we found that for unbiased input comparison sets of a given cardinality, i.e., when in expectation each comparison set of that cardinality occurs the same number of times, the mean squared error for a broad class of Thurstone choice models decreases with the cardinality of the comparison sets, but only marginally, according to a diminishing-returns relation. On the other hand, we found that there exist Thurstone choice models for which the mean squared error of the maximum likelihood parameter estimator can decrease much faster with the cardinality of the comparison sets. We report an empirical evaluation of some claims and key parameters revealed by theory using both synthetic and real-world input data from popular sport competitions and online labor platforms.
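As a sketch of maximum likelihood estimation in this setting, the following fits Luce-model strength parameters from a handful of hypothetical top-1 observations (which reduces to Bradley-Terry when every comparison set has two items); the data and the identifiability convention of fixing the first strength to zero are assumptions made for illustration.

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical top-1 choice data: each observation is (chosen_item, comparison_set).
observations = [(0, (0, 1)), (1, (1, 2, 3)), (0, (0, 2)), (3, (1, 3)), (2, (0, 2, 3))]
n_items = 4

def neg_log_likelihood(theta):
    # Luce choice model: P(choose i from S) = exp(theta_i) / sum_{j in S} exp(theta_j).
    nll = 0.0
    for winner, cset in observations:
        scores = theta[list(cset)]
        nll -= theta[winner] - np.log(np.exp(scores).sum())
    return nll

# Fix theta_0 = 0 for identifiability; optimize the remaining strengths.
def nll_free(free):
    return neg_log_likelihood(np.concatenate([[0.0], free]))

res = minimize(nll_free, x0=np.zeros(n_items - 1), method="BFGS")
print("estimated strengths:", np.concatenate([[0.0], res.x]).round(3))
```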
Probabilistic treatment of the uncertainty from the finite size of weighted Monte Carlo data
NASA Astrophysics Data System (ADS)
Glüsenkamp, Thorsten
2018-06-01
Parameter estimation in HEP experiments often involves Monte Carlo simulation to model the experimental response function. Typical applications are forward-folding likelihood analyses with re-weighting, or time-consuming minimization schemes with a new simulation set for each parameter value. Problematically, the finite size of such Monte Carlo samples carries intrinsic uncertainty that can lead to a substantial bias in parameter estimation if it is neglected and the sample size is small. We introduce a probabilistic treatment of this problem by replacing the usual likelihood functions with novel generalized probability distributions that incorporate the finite statistics via suitable marginalization. These new PDFs are analytic and can be used to replace the Poisson, multinomial, and sample-based unbinned likelihoods, which covers many use cases in high-energy physics. In the limit of infinite statistics, they reduce to the respective standard probability distributions. In the general case of arbitrary Monte Carlo weights, the expressions involve the fourth Lauricella function F_D, for which we find a new finite-sum representation in a certain parameter setting. The result also represents an exact form for Carlson's Dirichlet average R_n with n > 0, and thereby an efficient way to calculate the probability generating function of the Dirichlet-multinomial distribution, the extended divided difference of a monomial, or arbitrary moments of univariate B-splines. We demonstrate the bias reduction of our approach with a typical toy Monte Carlo problem, estimating the normalization of a peak in a falling energy spectrum, and compare the results with previously published methods from the literature.
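A toy version of the idea, replacing a plug-in Poisson likelihood with one marginalized over the Monte Carlo statistical uncertainty of a single bin of unweighted events, might look as follows; the counts, the weight, and the flat prior on the MC expectation are illustrative assumptions, and this is not the generalized distribution derived in the paper.

```python
import numpy as np
from scipy import stats
from scipy.integrate import quad

# Toy setup: a bin with k_obs observed events, modelled by k_mc unweighted MC events,
# each carrying weight w (all names and values are hypothetical).
k_obs, k_mc, w = 5, 8, 0.7

# Standard approach: treat the MC prediction w * k_mc as exact.
ll_plugin = stats.poisson.logpmf(k_obs, w * k_mc)

# Marginalized approach: integrate over the unknown true MC expectation `lam`,
# whose posterior given k_mc counts (flat prior) is Gamma(k_mc + 1, 1).
def integrand(lam):
    return stats.poisson.pmf(k_obs, w * lam) * stats.gamma.pdf(lam, a=k_mc + 1)

ll_marginal = np.log(quad(integrand, 0.0, 200.0)[0])
print(f"plug-in logL = {ll_plugin:.3f}, marginalized logL = {ll_marginal:.3f}")
```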
Bayesian inference of nonlinear unsteady aerodynamics from aeroelastic limit cycle oscillations
NASA Astrophysics Data System (ADS)
Sandhu, Rimple; Poirel, Dominique; Pettit, Chris; Khalil, Mohammad; Sarkar, Abhijit
2016-07-01
A Bayesian model selection and parameter estimation algorithm is applied to investigate the influence of nonlinear and unsteady aerodynamic loads on the limit cycle oscillation (LCO) of a pitching airfoil in the transitional Reynolds number regime. At small angles of attack, laminar boundary layer trailing edge separation causes negative aerodynamic damping leading to the LCO. The fluid-structure interaction of the rigid, but elastically mounted, airfoil and nonlinear unsteady aerodynamics is represented by two coupled nonlinear stochastic ordinary differential equations containing uncertain parameters and model approximation errors. Several plausible aerodynamic models with increasing complexity are proposed to describe the aeroelastic system leading to LCO. The likelihood in the posterior parameter probability density function (pdf) is available semi-analytically using the extended Kalman filter for the state estimation of the coupled nonlinear structural and unsteady aerodynamic model. The posterior parameter pdf is sampled using a parallel and adaptive Markov Chain Monte Carlo (MCMC) algorithm. The posterior probability of each model is estimated using the Chib-Jeliazkov method that directly uses the posterior MCMC samples for evidence (marginal likelihood) computation. The Bayesian algorithm is validated through a numerical study and then applied to model the nonlinear unsteady aerodynamic loads using wind-tunnel test data at various Reynolds numbers.
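Once marginal likelihoods (evidences) are available, e.g. from the Chib-Jeliazkov method, converting them into posterior model probabilities is a one-line application of Bayes' rule; the log-evidence values for three candidate aerodynamic models below are placeholders, not results from the study.

```python
import numpy as np

# Convert marginal likelihoods (evidences) into posterior model probabilities.
log_evidence = np.array([-152.3, -148.9, -150.1])   # hypothetical values
log_prior = np.log(np.ones(3) / 3.0)                # equal prior model probabilities

log_post = log_prior + log_evidence
log_post -= log_post.max()                           # stabilize before exponentiating
post = np.exp(log_post) / np.exp(log_post).sum()
print("posterior model probabilities:", post.round(3))
```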
Estimating open population site occupancy from presence-absence data lacking the robust design.
Dail, D; Madsen, L
2013-03-01
Many animal monitoring studies seek to estimate the proportion of a study area occupied by a target population. The study area is divided into spatially distinct sites where the detected presence or absence of the population is recorded, and this is repeated in time for multiple seasons. However, when occupied sites are detected with probability p < 1, the lack of a detection does not imply lack of occupancy. MacKenzie et al. (2003, Ecology 84, 2200-2207) developed a multiseason model for estimating seasonal site occupancy (ψ_t) while accounting for unknown p. Their model performs well when observations are collected according to the robust design, where multiple sampling occasions occur during each season; the repeated sampling aids in the estimation of p. However, their model does not perform as well when the robust design is lacking. In this paper, we propose an alternative likelihood model that yields improved seasonal estimates of p and ψ_t in the absence of the robust design. We construct the marginal likelihood of the observed data by conditioning on, and summing out, the latent number of occupied sites during each season. A simulation study shows that in cases without the robust design, the proposed model estimates p with less bias than the MacKenzie et al. model and hence improves the estimates of ψ_t. We apply both models to a data set consisting of repeated presence-absence observations of American robins (Turdus migratorius) with yearly survey periods. The two models are compared to a third estimator available when the repeated counts (from the same study) are considered, with the proposed model yielding estimates of ψ_t closest to the estimates from the point count model. Copyright © 2013, The International Biometric Society.
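The construction of a marginal likelihood by summing out latent occupancy can be sketched for the simpler single-season case (not the multiseason model proposed in the paper): each site's detection history is either generated by an occupied site or, if no detections occurred, possibly by an unoccupied one. The detection histories below are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit

# Hypothetical detection histories: rows = sites, columns = repeat visits (1 = detected).
Y = np.array([[0, 1, 0], [0, 0, 0], [1, 1, 0], [0, 0, 0], [1, 0, 1]])

def neg_log_likelihood(params):
    # Marginal likelihood of each site's history, summing out latent occupancy.
    psi, p = expit(params)                               # occupancy and detection probabilities
    det = (p ** Y * (1 - p) ** (1 - Y)).prod(axis=1)     # P(history | occupied)
    never = (Y.sum(axis=1) == 0).astype(float)           # P(history | unoccupied)
    return -np.log(psi * det + (1 - psi) * never).sum()

res = minimize(neg_log_likelihood, x0=[0.0, 0.0], method="Nelder-Mead")
print("psi, p =", expit(res.x).round(3))
```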
Liu, Peigui; Elshall, Ahmed S.; Ye, Ming; ...
2016-02-05
Evaluating the marginal likelihood is the most critical and computationally expensive task when conducting Bayesian model averaging to quantify parametric and model uncertainties. The evaluation is commonly done by using Laplace approximations to evaluate semianalytical expressions of the marginal likelihood or by using Monte Carlo (MC) methods to evaluate the arithmetic or harmonic mean of a joint likelihood function. This study introduces a new MC method, i.e., thermodynamic integration, which has not been attempted in environmental modeling. Instead of using samples only from the prior parameter space (as in arithmetic mean evaluation) or the posterior parameter space (as in harmonic mean evaluation), the thermodynamic integration method uses samples generated gradually from the prior to the posterior parameter space. This is done through path sampling that conducts Markov chain Monte Carlo simulation with different power coefficient values applied to the joint likelihood function. The thermodynamic integration method is evaluated using three analytical functions by comparing the method with two variants of the Laplace approximation method and three MC methods, including the nested sampling method that was recently introduced into environmental modeling. The thermodynamic integration method outperforms the other methods in terms of accuracy, convergence, and consistency. The thermodynamic integration method is also applied to a synthetic case of groundwater modeling with four alternative models. The application shows that the model probabilities obtained using the thermodynamic integration method improve the predictive performance of Bayesian model averaging. Overall, the thermodynamic integration method is mathematically rigorous, and its MC implementation is computationally general for a wide range of environmental problems.
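A minimal sketch of thermodynamic integration on a toy Gaussian model is given below: power posteriors are sampled for a ladder of coefficients between 0 and 1, and the expected log-likelihood is integrated over the ladder to give the log evidence. The prior, step size, and ladder spacing are arbitrary choices made for illustration, not those used in the study.

```python
import numpy as np
from scipy import stats

# Toy model: Gaussian data with unknown mean, broad Gaussian prior on the mean.
rng = np.random.default_rng(2)
data = rng.normal(1.0, 1.0, size=50)

def log_lik(mu):
    return stats.norm.logpdf(data, mu, 1.0).sum()

def sample_power_posterior(beta, n=4000, step=0.5):
    """Random-walk Metropolis targeting prior(mu) * likelihood(mu)**beta."""
    mu, lp = 0.0, beta * log_lik(0.0) + stats.norm.logpdf(0.0, 0.0, 10.0)
    out = []
    for _ in range(n):
        prop = mu + rng.normal(scale=step)
        lp_prop = beta * log_lik(prop) + stats.norm.logpdf(prop, 0.0, 10.0)
        if np.log(rng.random()) < lp_prop - lp:
            mu, lp = prop, lp_prop
        out.append(mu)
    return np.array(out[1000:])

# Path sampling identity: log evidence = integral over beta of E_beta[log likelihood].
betas = np.linspace(0.0, 1.0, 21) ** 3     # ladder, denser near beta = 0
means = [np.mean([log_lik(m) for m in sample_power_posterior(b)]) for b in betas]
log_evidence = np.trapz(means, betas)
print("TI log evidence:", round(log_evidence, 2))
```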
Doubly Robust and Efficient Estimation of Marginal Structural Models for the Hazard Function
Zheng, Wenjing; Petersen, Maya; van der Laan, Mark
2016-01-01
In social and health sciences, many research questions involve understanding the causal effect of a longitudinal treatment on mortality (or time-to-event outcomes in general). Often, treatment status may change in response to past covariates that are risk factors for mortality, and in turn, treatment status may also affect such subsequent covariates. In these situations, Marginal Structural Models (MSMs), introduced by Robins (1997), are well-established and widely used tools to account for time-varying confounding. In particular, an MSM can be used to specify the intervention-specific counterfactual hazard function, i.e. the hazard for the outcome of a subject in an ideal experiment where he/she was assigned to follow a given intervention on their treatment variables. The parameters of this hazard MSM are traditionally estimated using Inverse Probability of Treatment Weighted estimation (IPTW; van der Laan and Petersen, 2007; Robins et al., 2000b; Robins, 1999; Robins et al., 2008). This estimator is easy to implement and admits Wald-type confidence intervals. However, its consistency hinges on the correct specification of the treatment allocation probabilities, and the estimates are generally sensitive to large treatment weights (especially in the presence of strong confounding), which are difficult to stabilize for dynamic treatment regimes. In this paper, we present a pooled targeted maximum likelihood estimator (TMLE; van der Laan and Rubin, 2006) for the MSM for the hazard function under longitudinal dynamic treatment regimes. The proposed estimator is semiparametric efficient and doubly robust, hence offering bias reduction and efficiency gains over the incumbent IPTW estimator. Moreover, the substitution principle rooted in the TMLE potentially mitigates the sensitivity to large treatment weights in IPTW. We compare the performance of the proposed estimator with the IPTW and a non-targeted substitution estimator in a simulation study. PMID:27227723
ARMA models for earthquake ground motions. Seismic safety margins research program
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chang, M. K.; Kwiatkowski, J. W.; Nau, R. F.
1981-02-01
Four major California earthquake records were analyzed by use of a class of discrete linear time-domain processes commonly referred to as ARMA (Autoregressive/Moving-Average) models. It was possible to analyze these different earthquakes, identify the order of the appropriate ARMA model(s), estimate parameters, and test the residuals generated by these models. It was also possible to show the connections, similarities, and differences between the traditional continuous models (with parameter estimates based on spectral analyses) and the discrete models with parameters estimated by various maximum-likelihood techniques applied to digitized acceleration data in the time domain. The methodology proposed is suitable for simulating earthquake ground motions in the time domain, and appears to be easily adapted to serve as inputs for nonlinear discrete time models of structural motions. 60 references, 19 figures, 9 tables.
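A modern equivalent of the workflow (pick a candidate ARMA order, estimate its parameters by maximum likelihood in the time domain, and test the residuals) might be sketched with statsmodels as below; the synthetic series stands in for a digitized accelerogram and the ARMA(2,1) order is an assumption of this illustration.

```python
import numpy as np
from statsmodels.tsa.arima_process import arma_generate_sample
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.stats.diagnostic import acorr_ljungbox

# Stand-in "ground acceleration" series generated from a known ARMA(2,1) process;
# a real analysis would use digitized accelerograms instead.
np.random.seed(0)
accel = arma_generate_sample(ar=[1, -0.6, 0.2], ma=[1, 0.4], nsample=2000)

# Maximum likelihood fit of a candidate ARMA(2,1) model in the time domain.
fit = ARIMA(accel, order=(2, 0, 1)).fit()
print(fit.params)

# Check whether the residuals are consistent with white noise.
print(acorr_ljungbox(fit.resid, lags=[10]))
```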
Essays on pricing dynamics, price dispersion, and nested logit modelling
NASA Astrophysics Data System (ADS)
Verlinda, Jeremy Alan
The body of this dissertation comprises three standalone essays, presented in three respective chapters. Chapter One explores the possibility that local market power contributes to the asymmetric relationship observed between wholesale costs and retail prices in gasoline markets. I exploit an original data set of weekly gas station prices in Southern California from September 2002 to May 2003, and take advantage of highly detailed station and local market-level characteristics to determine the extent to which spatial differentiation influences price-response asymmetry. I find that brand identity, proximity to rival stations, bundling and advertising, operation type, and local market features and demographics each influence a station's predicted asymmetric relationship between prices and wholesale costs. Chapter Two extends the existing literature on the effect of market structure on price dispersion in airline fares by modeling the effect at the disaggregate ticket level. Whereas past studies rely on aggregate measures of price dispersion such as the Gini coefficient or the standard deviation of fares, this paper estimates the entire empirical distribution of airline fares and documents how the shape of the distribution is determined by market structure. Specifically, I find that monopoly markets favor a wider distribution of fares with more mass in the tails while duopoly and competitive markets exhibit a tighter fare distribution. These findings indicate that the dispersion of airline fares may result from the efforts of airlines to practice second-degree price discrimination. Chapter Three adopts a Bayesian approach to the problem of tree structure specification in nested logit modelling, which entails a heavy computational burden in calculating marginal likelihoods. I compare two different techniques for estimating marginal likelihoods: (1) the Laplace approximation, and (2) reversible jump MCMC. I apply the techniques to both a simulated and a travel mode choice data set, and find that model selection is invariant to prior specification, while model derivatives like willingness-to-pay are notably sensitive to model choice. I also find that the Laplace approximation is remarkably accurate in spite of the potential for nested logit models to have irregular likelihood surfaces.
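The Laplace approximation referred to in Chapter Three can be sketched on a toy model (a small Bayesian logistic regression rather than a nested logit): the log marginal likelihood is approximated from the posterior mode and the curvature there. The simulated data, the Gaussian prior, and the use of the BFGS inverse-Hessian as a stand-in for the exact Hessian are all assumptions of this illustration.

```python
import numpy as np
from scipy import optimize, stats
from scipy.special import expit

# Toy Bayesian logistic regression; the Laplace approximation to the log marginal
# likelihood uses the posterior mode and the curvature (Hessian) at that mode.
rng = np.random.default_rng(3)
X = rng.normal(size=(200, 2))
y = (rng.random(200) < expit(X @ np.array([1.0, -0.5]))).astype(float)

def neg_log_post(beta):
    logits = X @ beta
    loglik = np.sum(y * logits - np.logaddexp(0.0, logits))    # Bernoulli log likelihood
    logprior = stats.multivariate_normal.logpdf(beta, mean=np.zeros(2), cov=4 * np.eye(2))
    return -(loglik + logprior)

res = optimize.minimize(neg_log_post, x0=np.zeros(2))           # BFGS by default
# Laplace: log m(y) ~ -neg_log_post(mode) + (d/2) log(2*pi) + 0.5 log|H^{-1}|,
# with the BFGS inverse-Hessian approximation used in place of the exact Hessian.
d = 2
log_marginal = -res.fun + 0.5 * d * np.log(2 * np.pi) + 0.5 * np.log(np.linalg.det(res.hess_inv))
print("Laplace log marginal likelihood:", round(log_marginal, 2))
```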
[Resection margins in conservative breast cancer surgery].
Medina Fernández, Francisco Javier; Ayllón Terán, María Dolores; Lombardo Galera, María Sagrario; Rioja Torres, Pilar; Bascuñana Estudillo, Guillermo; Rufián Peña, Sebastián
2013-01-01
Conservative breast cancer surgery is facing a new problem: the potential tumour involvement of resection margins. This eventuality has been closely and negatively associated with disease-free survival. Various factors may influence the likelihood of margins being affected, mostly related to the characteristics of the tumour, patient or surgical technique. In the last decade, many studies have attempted to find predictive factors for margin involvement. However, it is currently the new techniques used in the study of margins and tumour localisation that are significantly reducing reoperations in conservative breast cancer surgery. Copyright © 2012 AEC. Published by Elsevier Espana. All rights reserved.
Pillow, Jonathan W; Ahmadian, Yashar; Paninski, Liam
2011-01-01
One of the central problems in systems neuroscience is to understand how neural spike trains convey sensory information. Decoding methods, which provide an explicit means for reading out the information contained in neural spike responses, offer a powerful set of tools for studying the neural coding problem. Here we develop several decoding methods based on point-process neural encoding models, or forward models that predict spike responses to stimuli. These models have concave log-likelihood functions, which allow efficient maximum-likelihood model fitting and stimulus decoding. We present several applications of the encoding model framework to the problem of decoding stimulus information from population spike responses: (1) a tractable algorithm for computing the maximum a posteriori (MAP) estimate of the stimulus, the most probable stimulus to have generated an observed single- or multiple-neuron spike train response, given some prior distribution over the stimulus; (2) a Gaussian approximation to the posterior stimulus distribution that can be used to quantify the fidelity with which various stimulus features are encoded; (3) an efficient method for estimating the mutual information between the stimulus and the spike trains emitted by a neural population; and (4) a framework for the detection of change-point times (the time at which the stimulus undergoes a change in mean or variance) by marginalizing over the posterior stimulus distribution. We provide several examples illustrating the performance of these estimators with simulated and real neural data.
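A stripped-down version of MAP decoding with a Poisson encoding model is sketched below for a static stimulus: because the log posterior is concave, a generic optimizer recovers the MAP estimate reliably. The encoding filters, gain, and prior are invented for illustration, and the temporal structure of real spike trains is ignored.

```python
import numpy as np
from scipy.optimize import minimize

# Toy decoding problem: a population of exponential-nonlinearity Poisson neurons with
# known (assumed) linear filters W encodes a static stimulus vector.
rng = np.random.default_rng(4)
n_neurons, dim = 30, 5
W = rng.normal(size=(n_neurons, dim))
stim_true = rng.normal(size=dim)
counts = rng.poisson(np.exp(0.5 * (W @ stim_true)))

def neg_log_posterior(s):
    log_rate = 0.5 * (W @ s)
    loglik = np.sum(counts * log_rate - np.exp(log_rate))   # Poisson log likelihood (up to const.)
    logprior = -0.5 * s @ s                                  # standard normal prior on the stimulus
    return -(loglik + logprior)

map_est = minimize(neg_log_posterior, x0=np.zeros(dim)).x    # concave problem -> unique optimum
print("true:", stim_true.round(2))
print("MAP :", map_est.round(2))
```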
Estimating the variance for heterogeneity in arm-based network meta-analysis.
Piepho, Hans-Peter; Madden, Laurence V; Roger, James; Payne, Roger; Williams, Emlyn R
2018-04-19
Network meta-analysis can be implemented by using arm-based or contrast-based models. Here we focus on arm-based models and fit them using generalized linear mixed model procedures. Full maximum likelihood (ML) estimation leads to biased trial-by-treatment interaction variance estimates for heterogeneity. Thus, our objective is to investigate alternative approaches to variance estimation that reduce bias compared with full ML. Specifically, we use penalized quasi-likelihood/pseudo-likelihood and hierarchical (h) likelihood approaches. In addition, we consider a novel model modification that yields estimators akin to the residual maximum likelihood estimator for linear mixed models. The proposed methods are compared by simulation, and 2 real datasets are used for illustration. Simulations show that penalized quasi-likelihood/pseudo-likelihood and h-likelihood reduce bias and yield satisfactory coverage rates. Sum-to-zero restriction and baseline contrasts for random trial-by-treatment interaction effects, as well as a residual ML-like adjustment, also reduce bias compared with an unconstrained model when ML is used, but coverage rates are not quite as good. Penalized quasi-likelihood/pseudo-likelihood and h-likelihood are therefore recommended. Copyright © 2018 John Wiley & Sons, Ltd.
Meta-analysis of studies with bivariate binary outcomes: a marginal beta-binomial model approach
Chen, Yong; Hong, Chuan; Ning, Yang; Su, Xiao
2018-01-01
When conducting a meta-analysis of studies with bivariate binary outcomes, challenges arise when the within-study correlation and between-study heterogeneity should be taken into account. In this paper, we propose a marginal beta-binomial model for the meta-analysis of studies with binary outcomes. This model is based on the composite likelihood approach, and has several attractive features compared to the existing models such as bivariate generalized linear mixed model (Chu and Cole, 2006) and Sarmanov beta-binomial model (Chen et al., 2012). The advantages of the proposed marginal model include modeling the probabilities in the original scale, not requiring any transformation of probabilities or any link function, having closed-form expression of likelihood function, and no constraints on the correlation parameter. More importantly, since the marginal beta-binomial model is only based on the marginal distributions, it does not suffer from potential misspecification of the joint distribution of bivariate study-specific probabilities. Such misspecification is difficult to detect and can lead to biased inference using current methods. We compare the performance of the marginal beta-binomial model with the bivariate generalized linear mixed model and the Sarmanov beta-binomial model by simulation studies. Interestingly, the results show that the marginal beta-binomial model performs better than the Sarmanov beta-binomial model, whether or not the true model is Sarmanov beta-binomial, and the marginal beta-binomial model is more robust than the bivariate generalized linear mixed model under model misspecifications. Two meta-analyses of diagnostic accuracy studies and a meta-analysis of case-control studies are conducted for illustration. PMID:26303591
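The composite (marginal-only) likelihood idea can be sketched as follows: each outcome's study-level counts get their own beta-binomial marginal, and the two marginal log-likelihoods are simply added. The event counts are hypothetical, and the correlation parameter of the full model in the paper is omitted here.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import betabinom

# Hypothetical meta-analysis data: events and sample sizes for outcome 1 (e.g. sensitivity)
# and outcome 2 (e.g. specificity) in each of four studies.
events1, n1 = np.array([20, 35, 12, 40]), np.array([25, 40, 20, 50])
events2, n2 = np.array([30, 28, 45, 33]), np.array([35, 30, 50, 40])

def neg_composite_loglik(params):
    # Composite (independence) likelihood: product of the two marginal beta-binomials.
    a1, b1, a2, b2 = np.exp(params)          # positivity via log parameterization
    ll = betabinom.logpmf(events1, n1, a1, b1).sum()
    ll += betabinom.logpmf(events2, n2, a2, b2).sum()
    return -ll

res = minimize(neg_composite_loglik, x0=np.zeros(4), method="Nelder-Mead")
a1, b1, a2, b2 = np.exp(res.x)
print("marginal mean, outcome 1:", round(a1 / (a1 + b1), 3))
print("marginal mean, outcome 2:", round(a2 / (a2 + b2), 3))
```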
Determining the accuracy of maximum likelihood parameter estimates with colored residuals
NASA Technical Reports Server (NTRS)
Morelli, Eugene A.; Klein, Vladislav
1994-01-01
An important part of building high fidelity mathematical models based on measured data is calculating the accuracy associated with statistical estimates of the model parameters. Indeed, without some idea of the accuracy of parameter estimates, the estimates themselves have limited value. In this work, an expression based on theoretical analysis was developed to properly compute parameter accuracy measures for maximum likelihood estimates with colored residuals. This result is important because experience from the analysis of measured data reveals that the residuals from maximum likelihood estimation are almost always colored. The calculations involved can be appended to conventional maximum likelihood estimation algorithms. Simulated data runs were used to show that the parameter accuracy measures computed with this technique accurately reflect the quality of the parameter estimates from maximum likelihood estimation without the need for analysis of the output residuals in the frequency domain or heuristically determined multiplication factors. The result is general, although the application studied here is maximum likelihood estimation of aerodynamic model parameters from flight test data.
On the existence of maximum likelihood estimates for presence-only data
Hefley, Trevor J.; Hooten, Mevin B.
2015-01-01
It is important to identify conditions for which maximum likelihood estimates are unlikely to be identifiable from presence-only data. In data sets where the maximum likelihood estimates do not exist, penalized likelihood and Bayesian methods will produce coefficient estimates, but these are sensitive to the choice of estimation procedure and prior or penalty term. When sample size is small or it is thought that habitat preferences are strong, we propose a suite of estimation procedures researchers can consider using.
Evolution of marginal populations of an invasive vine increases the likelihood of future spread
Francis F. Kilkenny; Laura F. Galloway
2015-01-01
The prediction of invasion patterns may require an understanding of intraspecific differentiation in invasive species and its interaction with climate change. We compare Japanese honeysuckle (Lonicera japonica) plants from the core (100-150 yr old) and northern margin (< 65 yr old) of their North American invaded range to determine whether evolution...
Estimating Function Approaches for Spatial Point Processes
NASA Astrophysics Data System (ADS)
Deng, Chong
Spatial point pattern data consist of locations of events that are often of interest in biological and ecological studies. Such data are commonly viewed as a realization from a stochastic process called a spatial point process. To fit a parametric spatial point process model to such data, likelihood-based methods have been widely studied. However, while maximum likelihood estimation is often too computationally intensive for Cox and cluster processes, pairwise likelihood methods such as composite likelihood and Palm likelihood usually suffer from a loss of information because they ignore the correlation among pairs. For many types of correlated data other than spatial point processes, estimating functions have been widely used for model fitting when likelihood-based approaches are not desirable. In this dissertation, we explore estimating function approaches for fitting spatial point process models. These approaches, which are based on asymptotically optimal estimating function theory, can incorporate the correlation among data and yield more efficient estimators. We conducted a series of studies to demonstrate that these estimating function approaches are good alternatives that balance the trade-off between computational complexity and estimation efficiency. First, we propose a new estimating procedure that improves the efficiency of the pairwise composite likelihood method in estimating clustering parameters. Our approach combines estimating functions derived from pairwise composite likelihood estimation with estimating functions that account for correlations among the pairwise contributions. Our method can be used to fit a variety of parametric spatial point process models and can yield more efficient estimators of the clustering parameters than pairwise composite likelihood estimation. We demonstrate its efficacy through a simulation study and an application to the longleaf pine data. Second, we further explore the quasi-likelihood approach for fitting the second-order intensity function of spatial point processes. The original second-order quasi-likelihood is barely feasible, however, due to the intense computation and high memory requirement needed to solve a large linear system. Motivated by the existence of geometric regular patterns in stationary point processes, we find a lower-dimensional representation of the optimal weight function and propose a reduced second-order quasi-likelihood approach. Through a simulation study, we show that the proposed method not only demonstrates superior performance in fitting the clustering parameter but also relaxes the constraint on the tuning parameter H. Third, we study the quasi-likelihood-type estimating function that is optimal within a certain class of first-order estimating functions for estimating the regression parameter in spatial point process models. Then, by using a novel spectral representation, we construct an implementation that is computationally much more efficient and can be applied to a more general setup than the original quasi-likelihood method.
Maldonado, Alejandra; Collins, Timothy W.; Grineski, Sara E.; Chakraborty, Jayajit
2016-01-01
Although numerous studies have been conducted on the vulnerability of marginalized groups in the environmental justice (EJ) and hazards fields, analysts have tended to lump people together in broad racial/ethnic categories without regard for substantial within-group heterogeneity. This paper addresses that limitation by examining whether Hispanic immigrants are disproportionately exposed to risks from flood hazards relative to other racial/ethnic groups (including US-born Hispanics), adjusting for relevant covariates. Survey data were collected for 1283 adult householders in the Houston and Miami Metropolitan Statistical Areas (MSAs) and flood risk was estimated using their residential presence/absence within federally-designated 100-year flood zones. Generalized estimating equations (GEE) with binary logistic specifications that adjust for county-level clustering were used to analyze (separately) and compare the Houston (N = 546) and Miami (N = 560) MSAs in order to clarify determinants of household exposure to flood risk. GEE results in Houston indicate that Hispanic immigrants have the greatest likelihood, and non-Hispanic Whites the least likelihood, of residing in a 100-year flood zone. Miami GEE results contrastingly reveal that non-Hispanic Whites have a significantly greater likelihood of residing in a flood zone when compared to Hispanic immigrants. These divergent results suggest that human-flood hazard relationships have been structured differently between the two MSAs, possibly due to the contrasting role that water-based amenities have played in urbanization within the two study areas. Future EJ research and practice should differentiate between Hispanic subgroups based on nativity status and attend to contextual factors influencing environmental risk disparities. PMID:27490561
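A minimal sketch of the GEE setup described above (binary outcome, county-level clustering, exchangeable working correlation) is given below using statsmodels; the simulated data frame and variable names are placeholders, not the survey data analyzed in the study.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

# Placeholder data: binary flood-zone residence, a group indicator, a covariate,
# and a county identifier used as the clustering variable.
rng = np.random.default_rng(5)
n = 500
df = pd.DataFrame({
    "flood_zone": rng.integers(0, 2, n),          # 1 = residence in a 100-year flood zone
    "hispanic_immigrant": rng.integers(0, 2, n),
    "income": rng.normal(50, 15, n),
    "county": rng.integers(0, 8, n),              # clustering variable
})

# Binary-logistic GEE with an exchangeable working correlation within counties.
model = smf.gee("flood_zone ~ hispanic_immigrant + income",
                groups="county", data=df,
                family=sm.families.Binomial(),
                cov_struct=sm.cov_struct.Exchangeable())
result = model.fit()
print(result.summary())
print("odds ratios:", np.exp(result.params).round(2))
```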
MIXREG: a computer program for mixed-effects regression analysis with autocorrelated errors.
Hedeker, D; Gibbons, R D
1996-05-01
MIXREG is a program that provides estimates for a mixed-effects regression model (MRM) for normally-distributed response data including autocorrelated errors. This model can be used for analysis of unbalanced longitudinal data, where individuals may be measured at a different number of timepoints, or even at different timepoints. Autocorrelated errors of a general form or following an AR(1), MA(1), or ARMA(1,1) form are allowable. This model can also be used for analysis of clustered data, where the mixed-effects model assumes data within clusters are dependent. The degree of dependency is estimated jointly with estimates of the usual model parameters, thus adjusting for clustering. MIXREG uses maximum marginal likelihood estimation, utilizing both the EM algorithm and a Fisher-scoring solution. For the scoring solution, the covariance matrix of the random effects is expressed in its Gaussian decomposition, and the diagonal matrix reparameterized using the exponential transformation. Estimation of the individual random effects is accomplished using an empirical Bayes approach. Examples illustrating usage and features of MIXREG are provided.
Efficient parameter estimation in longitudinal data analysis using a hybrid GEE method.
Leung, Denis H Y; Wang, You-Gan; Zhu, Min
2009-07-01
The method of generalized estimating equations (GEEs) provides consistent estimates of the regression parameters in a marginal regression model for longitudinal data, even when the working correlation model is misspecified (Liang and Zeger, 1986). However, the efficiency of a GEE estimate can be seriously affected by the choice of the working correlation model. This study addresses this problem by proposing a hybrid method that combines multiple GEEs based on different working correlation models, using the empirical likelihood method (Qin and Lawless, 1994). Analyses show that this hybrid method is more efficient than a GEE using a misspecified working correlation model. Furthermore, if one of the working correlation structures correctly models the within-subject correlations, then this hybrid method provides the most efficient parameter estimates. In simulations, the hybrid method's finite-sample performance is superior to a GEE under any of the commonly used working correlation models and is almost fully efficient in all scenarios studied. The hybrid method is illustrated using data from a longitudinal study of the respiratory infection rates in 275 Indonesian children.
Finite mixture model: A maximum likelihood estimation approach on time series data
NASA Astrophysics Data System (ADS)
Yen, Phoong Seuk; Ismail, Mohd Tahir; Hamzah, Firdaus Mohamad
2014-09-01
Recently, statisticians have emphasized fitting finite mixture models by maximum likelihood estimation because of its asymptotic properties. In addition, the estimator is consistent as the sample size increases to infinity, so maximum likelihood estimation is asymptotically unbiased. Moreover, the parameter estimates obtained by maximum likelihood have the smallest variance compared with other statistical methods as the sample size increases. Thus, maximum likelihood estimation is adopted in this paper to fit a two-component mixture model in order to explore the relationship between rubber price and exchange rate for Malaysia, Thailand, the Philippines and Indonesia. The results show a negative relationship between rubber price and exchange rate for all selected countries.
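A two-component mixture fitted by maximum likelihood (via the EM algorithm) can be sketched in a few lines; the simulated series and the two regimes are invented for illustration and are unrelated to the rubber-price data of the study.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Hypothetical series with two regimes, standing in for the price/exchange-rate data.
rng = np.random.default_rng(6)
series = np.concatenate([rng.normal(0.0, 0.5, 300), rng.normal(1.5, 1.0, 200)])

# Maximum likelihood fit of a two-component Gaussian mixture via the EM algorithm.
gm = GaussianMixture(n_components=2, random_state=0).fit(series.reshape(-1, 1))
print("weights:", gm.weights_.round(3))
print("means:  ", gm.means_.ravel().round(3))
print("stds:   ", np.sqrt(gm.covariances_.ravel()).round(3))
```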
The Role of Margin in Link Design and Optimization
NASA Technical Reports Server (NTRS)
Cheung, K.
2015-01-01
Link analysis is a system engineering process in the design, development, and operation of communication systems and networks. Link models are mathematical abstractions of the useful signal power and the undesirable noise and attenuation effects (including weather effects if the signal path traverses the atmosphere); they are integrated into the link budget calculation, which provides estimates of the signal power and noise power at the receiver. A link margin is then applied to counteract the fluctuations of the signal and noise power and to ensure reliable data delivery from transmitter to receiver. (The link margin is dictated by the link margin policy or requirements.) A simple link budgeting approach assumes link parameters to be deterministic values and typically adopts a rule-of-thumb policy of a 3 dB link margin. This policy works for most S- and X-band links due to their insensitivity to weather effects. But for higher frequency links such as Ka-band, Ku-band, and optical communication links, it is unclear whether a 3 dB link margin would guarantee link closure. Statistical link analysis that adopts a 2-sigma or 3-sigma link margin incorporates link uncertainties in the sigma calculation. (The Deep Space Network (DSN) link margin policies are 2-sigma for downlink and 3-sigma for uplink.) The link reliability can therefore be quantified statistically even for higher frequency links. However, in the current statistical link analysis approach, link reliability is only expressed as the likelihood of exceeding the signal-to-noise ratio (SNR) threshold that corresponds to a given bit-error-rate (BER) or frame-error-rate (FER) requirement. The method does not provide the true BER or FER estimate of the link with margin, or the required signal-to-noise ratio (SNR) that would meet the BER or FER requirement in the statistical sense. In this paper, we perform an in-depth analysis of the relationship between the BER/FER requirement, the operating SNR, and the coding performance curve, in the case when the channel coherence time of link fluctuation is comparable to or larger than the time duration of a codeword. We compute the "true" SNR design point that would meet the BER/FER requirement by taking into account the fluctuation of signal power and noise power at the receiver, and the shape of the coding performance curve. This analysis yields a number of valuable insights into the design choices of coding scheme and link margin for the reliable data delivery of a communication system - space and ground. We illustrate the aforementioned analysis using a number of standard NASA error-correcting codes.
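The statistical notion of link closure discussed above can be sketched numerically: if the received SNR fluctuates around the design point with a known sigma, the probability of exceeding the decoder threshold follows directly. The threshold, sigma, and margin policies below are illustrative assumptions, not DSN values.

```python
import numpy as np
from scipy import stats

# Toy statistical link budget: the received SNR (in dB) fluctuates around the design
# point with a combined 1-sigma uncertainty; the link "closes" when the SNR exceeds
# the decoder threshold for the target FER.  All numbers are illustrative assumptions.
threshold_db = 3.0      # SNR required by a hypothetical coding performance curve
sigma_db = 1.2          # combined 1-sigma link uncertainty

for k_sigma in (0, 2, 3):                    # margin policies expressed in sigmas
    design_snr_db = threshold_db + k_sigma * sigma_db
    p_close = stats.norm.sf(threshold_db, loc=design_snr_db, scale=sigma_db)
    print(f"{k_sigma}-sigma margin: design SNR {design_snr_db:.1f} dB, P(link closure) {p_close:.4f}")
```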
Meta-analysis of studies with bivariate binary outcomes: a marginal beta-binomial model approach.
Chen, Yong; Hong, Chuan; Ning, Yang; Su, Xiao
2016-01-15
When conducting a meta-analysis of studies with bivariate binary outcomes, challenges arise when the within-study correlation and between-study heterogeneity should be taken into account. In this paper, we propose a marginal beta-binomial model for the meta-analysis of studies with binary outcomes. This model is based on the composite likelihood approach and has several attractive features compared with the existing models such as bivariate generalized linear mixed model (Chu and Cole, 2006) and Sarmanov beta-binomial model (Chen et al., 2012). The advantages of the proposed marginal model include modeling the probabilities in the original scale, not requiring any transformation of probabilities or any link function, having closed-form expression of likelihood function, and no constraints on the correlation parameter. More importantly, because the marginal beta-binomial model is only based on the marginal distributions, it does not suffer from potential misspecification of the joint distribution of bivariate study-specific probabilities. Such misspecification is difficult to detect and can lead to biased inference using current methods. We compare the performance of the marginal beta-binomial model with the bivariate generalized linear mixed model and the Sarmanov beta-binomial model by simulation studies. Interestingly, the results show that the marginal beta-binomial model performs better than the Sarmanov beta-binomial model, whether or not the true model is Sarmanov beta-binomial, and the marginal beta-binomial model is more robust than the bivariate generalized linear mixed model under model misspecifications. Two meta-analyses of diagnostic accuracy studies and a meta-analysis of case-control studies are conducted for illustration. Copyright © 2015 John Wiley & Sons, Ltd.
Tang, Hongxiu; Cai, Weibin; Wang, Hongjing; Zhang, Qing; Qian, Ling; Shell, Duane F; Newman, Ian M; Yin, Ping
2013-01-01
This study examines the association between cultural orientation and drinking behaviors among university students. Cultural orientation is a measure of how the cultural values of individuals living in their own society are influenced by cultural values introduced from the outside. In 2011, a cross-sectional survey collected data from 1279 university students from six universities in central China. Participants used a Likert scale to rank a series of statements reflecting cultural values from the previously validated Chinese Cultural Orientation Scale and answered questions about their drinking behaviors and socio-demographic characteristics. Statistically significant differences in cultural orientation were observed for gender, hometown and type of university attended. Traditional-oriented students were more likely to be occasional drinkers or nondrinkers, while marginal-oriented students, bicultural-oriented students and western-oriented students were more likely to be regular drinkers. Bicultural orientation (OR = 1.80, P<0.05) and marginal orientation (OR = 1.64, P<0.05) increased the likelihood of a student being a regular drinker, compared with students with a traditional orientation. Males (OR = 4.40, P<0.05) had a higher likelihood of regular drinking than females, graduate students (OR = 2.59, P<0.05) had a higher likelihood of regular drinking than undergraduates, students from urban areas (OR = 1.79, P<0.05) had a higher likelihood of regular drinking than those from towns/rural areas, and students attending key universities (OR = 0.48, P<0.05) had a lower likelihood of regular drinking than those attending general universities. Cultural orientation influences drinking behaviors. Traditional cultural orientation was associated with less drinking, while western cultural orientation, marginal cultural orientation and bicultural orientation were associated with more drinking. The role of gender, hometown and university attendance is partially moderated through the influence of cultural orientation. The relationship between a traditional cultural orientation and alcohol drinking suggests that traditional Chinese cultural values should be examined for their role in possibly reducing alcohol-related risks through education and policy initiatives.
Maximum entropy approach to statistical inference for an ocean acoustic waveguide.
Knobles, D P; Sagers, J D; Koch, R A
2012-02-01
A conditional probability distribution suitable for estimating the statistical properties of ocean seabed parameter values inferred from acoustic measurements is derived from a maximum entropy principle. The specification of the expectation value for an error function constrains the maximization of an entropy functional. This constraint determines the sensitivity factor (β) to the error function of the resulting probability distribution, which is a canonical form that provides a conservative estimate of the uncertainty of the parameter values. From the conditional distribution, marginal distributions for individual parameters can be determined from integration over the other parameters. The approach is an alternative to obtaining the posterior probability distribution without an intermediary determination of the likelihood function followed by an application of Bayes' rule. In this paper the expectation value that specifies the constraint is determined from the values of the error function for the model solutions obtained from a sparse number of data samples. The method is applied to ocean acoustic measurements taken on the New Jersey continental shelf. The marginal probability distribution for the values of the sound speed ratio at the surface of the seabed and the source levels of a towed source are examined for different geoacoustic model representations. © 2012 Acoustical Society of America
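A toy version of fixing the sensitivity factor from an expectation-value constraint is sketched below on a one-dimensional parameter grid: beta is found by bisection so that the canonical distribution exp(-beta*E) reproduces the assumed expected value of the error function. The grid, error function, and target expectation are all invented for illustration.

```python
import numpy as np

# Canonical maximum-entropy distribution over a parameter grid: p(theta) ~ exp(-beta * E(theta)),
# with beta fixed by the constraint <E> = E_target.
theta = np.linspace(1.0, 2.0, 200)        # e.g. a seabed sound-speed ratio grid (hypothetical)
E = (theta - 1.6) ** 2 / 0.01             # error function over the grid (illustrative)
E_target = 1.5                             # assumed expectation value from sparse data samples

def mean_E(beta):
    w = np.exp(-beta * (E - E.min()))
    return np.sum(w * E) / np.sum(w)

# Solve <E>_beta = E_target by bisection; mean_E decreases monotonically in beta.
lo, hi = 1e-6, 100.0
for _ in range(60):
    mid = 0.5 * (lo + hi)
    lo, hi = (mid, hi) if mean_E(mid) > E_target else (lo, mid)
beta = 0.5 * (lo + hi)

p = np.exp(-beta * (E - E.min()))
p /= p.sum()
print("beta =", round(beta, 3), " posterior mean theta =", round(float(np.sum(p * theta)), 3))
```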
Multisensor fusion for 3D target tracking using track-before-detect particle filter
NASA Astrophysics Data System (ADS)
Moshtagh, Nima; Romberg, Paul M.; Chan, Moses W.
2015-05-01
This work presents a novel fusion mechanism for estimating the three-dimensional trajectory of a moving target using images collected by multiple imaging sensors. The proposed projective particle filter avoids explicit target detection prior to fusion. In the projective particle filter, particles that represent the posterior density (of the target state in a high-dimensional space) are projected onto the lower-dimensional observation space. Measurements are generated directly in the observation space (image plane) and a marginal (sensor) likelihood is computed. The particle states and their weights are updated using the joint likelihood computed from all the sensors. The 3D state estimate of the target (the system track) is then generated from the states of the particles. This approach is similar to track-before-detect particle filters, which are known to perform well in tracking dim and stealthy targets in image collections. Our approach extends the track-before-detect approach to 3D tracking using the projective particle filter. The performance of this measurement-level fusion method is compared with that of a track-level fusion algorithm using the projective particle filter. In the track-level fusion algorithm, the 2D sensor tracks are generated separately and transmitted to a fusion center, where they are treated as measurements to the state estimator. The 2D sensor tracks are then fused to reconstruct the system track. A realistic synthetic scenario with a boosting target was generated and used to study the performance of the fusion mechanisms.
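A generic measurement-level fusion particle filter, with particles living in the 3D state space and weights formed from the joint likelihood of two projected 2D measurements, can be sketched as below. The orthographic "camera" model, noise levels, and constant-velocity motion model are placeholders; this is not the projective particle filter of the paper.

```python
import numpy as np

rng = np.random.default_rng(7)

def project(points, axis):
    """Toy sensor model: orthographic projection that drops one coordinate."""
    keep = [i for i in range(3) if i != axis]
    return points[:, keep]

n_particles, n_steps, dt = 2000, 20, 1.0
true_state = np.array([0.0, 0.0, 100.0, 1.0, 0.5, -0.2])   # position + velocity

# Particles over the 6D state; weights use the joint likelihood of both sensors.
particles = true_state + rng.normal(scale=[5, 5, 5, 0.5, 0.5, 0.5], size=(n_particles, 6))
estimates = []
for _ in range(n_steps):
    # Propagate truth and particles with a constant-velocity model plus process noise.
    true_state[:3] += dt * true_state[3:]
    particles[:, :3] += dt * particles[:, 3:]
    particles += rng.normal(scale=0.1, size=particles.shape)

    # Each sensor measures a noisy 2D projection of the target position.
    log_w = np.zeros(n_particles)
    for axis in (0, 1):                              # two viewing geometries
        z = project(true_state[:3][None, :], axis)[0] + rng.normal(scale=1.0, size=2)
        pred = project(particles[:, :3], axis)
        log_w += -0.5 * np.sum((pred - z) ** 2, axis=1)   # Gaussian sensor likelihood, sigma = 1

    w = np.exp(log_w - log_w.max())
    w /= w.sum()
    estimates.append(w @ particles[:, :3])           # posterior-mean position (system track point)

    # Multinomial resampling.
    idx = rng.choice(n_particles, size=n_particles, p=w)
    particles = particles[idx]

print("final estimate:", np.round(estimates[-1], 2), " truth:", np.round(true_state[:3], 2))
```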
A mixed-effects regression model for longitudinal multivariate ordinal data.
Liu, Li C; Hedeker, Donald
2006-03-01
A mixed-effects item response theory model that allows for three-level multivariate ordinal outcomes and accommodates multiple random subject effects is proposed for analysis of multivariate ordinal outcomes in longitudinal studies. This model allows for the estimation of different item factor loadings (item discrimination parameters) for the multiple outcomes. The covariates in the model do not have to follow the proportional odds assumption and can be at any level. Assuming either a probit or logistic response function, maximum marginal likelihood estimation is proposed utilizing multidimensional Gauss-Hermite quadrature for integration of the random effects. An iterative Fisher scoring solution, which provides standard errors for all model parameters, is used. An analysis of a longitudinal substance use data set, where four items of substance use behavior (cigarette use, alcohol use, marijuana use, and getting drunk or high) are repeatedly measured over time, is used to illustrate application of the proposed model.
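The Gauss-Hermite marginalization over a random subject effect can be sketched for a simplified binary-item, single-random-intercept case; the item intercepts, loading, and response pattern below are hypothetical, and the full ordinal, multi-factor structure of the proposed model is not reproduced.

```python
import numpy as np
from scipy.special import expit

# Gauss-Hermite marginalization of a random subject effect in a logistic item model:
# P(y_i) = integral over u of prod_j Bernoulli(y_ij | expit(a_j + sigma*u)) * N(u; 0, 1) du.
nodes, weights = np.polynomial.hermite.hermgauss(15)

def marginal_loglik(y, item_intercepts, sigma):
    u = np.sqrt(2.0) * nodes                                    # change of variables for N(0, 1)
    p = expit(item_intercepts[None, :] + sigma * u[:, None])    # shape (nodes, items)
    cond = np.prod(np.where(y[None, :] == 1, p, 1 - p), axis=1)
    return np.log(np.sum(weights * cond) / np.sqrt(np.pi))

# Hypothetical subject with four binary substance-use indicators.
y = np.array([1, 1, 0, 1])
print(marginal_loglik(y, item_intercepts=np.array([-0.2, 0.1, -1.0, 0.5]), sigma=1.2))
```

Summing such per-subject marginal log-likelihoods over the sample gives the objective maximized in marginal maximum likelihood estimation.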
NASA Technical Reports Server (NTRS)
Walker, H. F.
1976-01-01
Likelihood equations determined by the two types of samples were considered; these equations are necessary conditions for a maximum-likelihood estimate. They suggest certain successive-approximation iterative procedures for obtaining maximum-likelihood estimates. The procedures, which are generalized steepest ascent (deflected gradient) procedures, contain those of Hosmer as a special case.
NASA Astrophysics Data System (ADS)
Feeney, Stephen M.; Mortlock, Daniel J.; Dalmasso, Niccolò
2018-05-01
Estimates of the Hubble constant, H0, from the local distance ladder and from the cosmic microwave background (CMB) are discrepant at the ~3σ level, indicating a potential issue with the standard Λ cold dark matter (ΛCDM) cosmology. A probabilistic (i.e. Bayesian) interpretation of this tension requires a model comparison calculation, which in turn depends strongly on the tails of the H0 likelihoods. Evaluating the tails of the local H0 likelihood requires the use of non-Gaussian distributions to faithfully represent anchor likelihoods and outliers, and simultaneous fitting of the complete distance-ladder data set to ensure correct uncertainty propagation. We have hence developed a Bayesian hierarchical model of the full distance ladder that does not rely on Gaussian distributions and allows outliers to be modelled without arbitrary data cuts. Marginalizing over the full ~3000-parameter joint posterior distribution, we find H0 = (72.72 ± 1.67) km s^-1 Mpc^-1 when applied to the outlier-cleaned Riess et al. data, and (73.15 ± 1.78) km s^-1 Mpc^-1 with supernova outliers reintroduced (the pre-cut Cepheid data set is not available). Using our precise evaluation of the tails of the H0 likelihood, we apply Bayesian model comparison to assess the evidence for deviation from ΛCDM given the distance-ladder and CMB data. The odds against ΛCDM are at worst ~10:1 when considering the Planck 2015 XIII data, regardless of outlier treatment, considerably less dramatic than naïvely implied by the 2.8σ discrepancy. These odds become ~60:1 when an approximation to the more-discrepant Planck Intermediate XLVI likelihood is included.
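A heavily simplified, purely Gaussian version of the model-comparison logic (not the hierarchical model of the paper) is sketched below: the evidence for a single common H0 is compared with the evidence for two independent values. All means, widths, and the prior are illustrative assumptions.

```python
import numpy as np
from scipy import stats
from scipy.integrate import quad

# Toy Gaussian summaries of the two H0 measurements (values are illustrative only).
local = stats.norm(73.2, 1.7)     # distance-ladder H0 likelihood [km/s/Mpc]
cmb = stats.norm(67.3, 1.0)       # CMB-inferred H0 likelihood under LCDM
prior = stats.norm(70.0, 10.0)    # broad prior on the true H0

# M0: both experiments measure one common H0.  M1: they measure independent values
# (a crude stand-in for "something beyond LCDM shifts one of them").
z0 = quad(lambda h: prior.pdf(h) * local.pdf(h) * cmb.pdf(h), 40, 100)[0]
z1 = quad(lambda h: prior.pdf(h) * local.pdf(h), 40, 100)[0] * \
     quad(lambda h: prior.pdf(h) * cmb.pdf(h), 40, 100)[0]
print("odds against a common H0 (M1:M0):", round(z1 / z0, 1))
```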
Estimating the cost-effectiveness of 54 weeks of infliximab for rheumatoid arthritis.
Wong, John B; Singh, Gurkirpal; Kavanaugh, Arthur
2002-10-01
To estimate the cost-effectiveness of infliximab plus methotrexate for active, refractory rheumatoid arthritis. We projected the 54-week results from a randomized controlled trial of infliximab into lifetime economic and clinical outcomes using a Markov computer simulation model. Direct and indirect costs, quality of life, and disability estimates were based on trial results; Arthritis, Rheumatism, and Aging Medical Information System (ARAMIS) database outcomes; and published data. Results were discounted using the standard 3% rate. Because most well-accepted medical therapies have cost-effectiveness ratios below $50,000 to $100,000 per quality-adjusted life-year (QALY) gained, results below this range were considered to be "cost-effective." At 3 mg/kg, each infliximab infusion would cost $1393. When compared with methotrexate alone, 54 weeks of infliximab plus methotrexate decreased the likelihood of having advanced disability from 23% to 11% at the end of 54 weeks, which projected to a lifetime marginal cost-effectiveness ratio of $30,500 per discounted QALY gained, considering only direct medical costs. When applying a societal perspective and including indirect or productivity costs, the marginal cost-effectiveness ratio for infliximab was $9100 per discounted QALY gained. The results remained relatively unchanged with variation of model estimates over a broad range of values. Infliximab plus methotrexate for 54 weeks for rheumatoid arthritis should be cost-effective with its clinical benefit providing good value for the drug cost, especially when including productivity losses. Although infliximab beyond 54 weeks will likely be cost-effective, the economic and clinical benefit remains uncertain and will depend on long-term results of clinical trials.
Design of simplified maximum-likelihood receivers for multiuser CPM systems.
Bing, Li; Bai, Baoming
2014-01-01
A class of simplified maximum-likelihood receivers designed for continuous phase modulation based multiuser systems is proposed. The presented receiver is built upon a front end employing mismatched filters and a maximum-likelihood detector defined in a low-dimensional signal space. The performance of the proposed receivers is analyzed and compared to some existing receivers. Some schemes are designed to implement the proposed receivers and to reveal the roles of different system parameters. Analysis and numerical results show that the proposed receivers can approach the optimum multiuser receivers with significantly (even exponentially in some cases) reduced complexity and marginal performance degradation.
Li, Jun; Fu, Cuizhang; Lei, Guangchun
2011-01-01
Few studies have explored the role of Cenozoic tectonic evolution in shaping patterns and processes of extant animal distributions within East Asian margins. We select Hynobius salamanders (Amphibia: Hynobiidae) as a model to examine biogeographical consequences of Cenozoic tectonic events within East Asian margins. First, we use GenBank molecular data to reconstruct phylogenetic interrelationships of Hynobius by Bayesian and maximum likelihood analyses. Second, we estimate the divergence time using the Bayesian relaxed clock approach and infer dispersal/vicariance histories under the ‘dispersal–extinction–cladogenesis’ model. Finally, we test whether evolutionary history and biogeographical processes of Hynobius should coincide with the predictions of two major hypotheses (the ‘vicariance’/‘out of southwestern Japan’ hypothesis). The resulting phylogeny confirmed Hynobius as a monophyletic group, which could be divided into nine major clades associated with six geographical areas. Our results show that: (1) the most recent common ancestor of Hynobius was distributed in southwestern Japan and Hokkaido Island, (2) a sister taxon relationship between Hynobius retardatus and all remaining species was the results of a vicariance event between Hokkaido Island and southwestern Japan in the Middle Eocene, (3) ancestral Hynobius in southwestern Japan dispersed into the Taiwan Island, central China, ‘Korean Peninsula and northeastern China’ as well as northeastern Honshu during the Late Eocene–Late Miocene. Our findings suggest that Cenozoic tectonic evolution plays an important role in shaping disjunctive distributions of extant Hynobius within East Asian margins. PMID:21738684
Jeon, Jihyoun; Hsu, Li; Gorfine, Malka
2012-07-01
Frailty models are useful for measuring unobserved heterogeneity in risk of failures across clusters, providing cluster-specific risk prediction. In a frailty model, the latent frailties shared by members within a cluster are assumed to act multiplicatively on the hazard function. In order to obtain parameter and frailty variate estimates, we consider the hierarchical likelihood (H-likelihood) approach (Ha, Lee and Song, 2001. Hierarchical-likelihood approach for frailty models. Biometrika 88, 233-243) in which the latent frailties are treated as "parameters" and estimated jointly with other parameters of interest. We find that the H-likelihood estimators perform well when the censoring rate is low, however, they are substantially biased when the censoring rate is moderate to high. In this paper, we propose a simple and easy-to-implement bias correction method for the H-likelihood estimators under a shared frailty model. We also extend the method to a multivariate frailty model, which incorporates complex dependence structure within clusters. We conduct an extensive simulation study and show that the proposed approach performs very well for censoring rates as high as 80%. We also illustrate the method with a breast cancer data set. Since the H-likelihood is the same as the penalized likelihood function, the proposed bias correction method is also applicable to the penalized likelihood estimators.
Comoving Stars in Gaia DR1: An Abundance of Very Wide Separation Comoving Pairs
NASA Astrophysics Data System (ADS)
Oh, Semyeong; Price-Whelan, Adrian M.; Hogg, David W.; Morton, Timothy D.; Spergel, David N.
2017-06-01
The primary sample of the Gaia Data Release 1 is the Tycho-Gaia Astrometric Solution (TGAS): ≈2 million Tycho-2 sources with improved parallaxes and proper motions relative to the initial catalog. This increased astrometric precision presents an opportunity to find new binary stars and moving groups. We search for high-confidence comoving pairs of stars in TGAS by identifying pairs of stars consistent with having the same 3D velocity using a marginalized likelihood ratio test to discriminate candidate comoving pairs from the field population. Although we perform some visualizations using (bias-corrected) inverse parallax as a point estimate of distance, the likelihood ratio is computed with a probabilistic model that includes the covariances of parallax and proper motions and marginalizes the (unknown) true distances and 3D velocities of the stars. We find 13,085 comoving star pairs among 10,606 unique stars with separations as large as 10 pc (our search limit). Some of these pairs form larger groups through mutual comoving neighbors: many of these pair networks correspond to known open clusters and OB associations, but we also report the discovery of several new comoving groups. Most surprisingly, we find a large number of very wide (> 1 pc) separation comoving star pairs, the number of which increases with increasing separation and cannot be explained purely by false-positive contamination. Our key result is a catalog of high-confidence comoving pairs of stars in TGAS. We discuss the utility of this catalog for making dynamical inferences about the Galaxy, testing stellar atmosphere models, and validating chemical abundance measurements.
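To make the marginalized likelihood-ratio idea concrete, the sketch below (a simplification with hypothetical numbers, not the authors' TGAS pipeline) compares a "shared 3D velocity" hypothesis against an "independent field velocities" hypothesis for one pair of stars, marginalizing the unknown true velocities over an assumed isotropic Gaussian prior; both marginal likelihoods are then Gaussian and can be evaluated in closed form.

```python
import numpy as np
from scipy.stats import multivariate_normal

def log_likelihood_ratio(v1, v2, C1, C2, sigma_prior=30.0):
    """Marginalized log likelihood ratio for 'comoving' vs 'field' hypotheses.

    v1, v2      : measured 3D velocities (km/s) of the two stars
    C1, C2      : 3x3 measurement covariance matrices
    sigma_prior : std dev (km/s) of an isotropic Gaussian prior on true velocities
    Under H1 the stars share one true velocity V ~ N(0, S); under H2 each star
    has its own independent true velocity. Both marginals are Gaussian.
    """
    S = sigma_prior**2 * np.eye(3)
    d = np.concatenate([v1, v2])
    # H1: shared velocity -> cross-covariance S couples the two stars
    cov_h1 = np.block([[C1 + S, S], [S, C2 + S]])
    # H2: independent velocities -> block-diagonal covariance
    cov_h2 = np.block([[C1 + S, np.zeros((3, 3))], [np.zeros((3, 3)), C2 + S]])
    logp_h1 = multivariate_normal.logpdf(d, mean=np.zeros(6), cov=cov_h1)
    logp_h2 = multivariate_normal.logpdf(d, mean=np.zeros(6), cov=cov_h2)
    return logp_h1 - logp_h2

# toy example: two stars with nearly identical velocities and 2 km/s errors
rng = np.random.default_rng(0)
v_true = np.array([10.0, -15.0, 5.0])
C = 2.0**2 * np.eye(3)
v1 = v_true + rng.multivariate_normal(np.zeros(3), C)
v2 = v_true + rng.multivariate_normal(np.zeros(3), C)
print("log likelihood ratio (comoving candidate):", log_likelihood_ratio(v1, v2, C, C))
```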
Hollenbeak, Christopher S
2005-10-15
While risk-adjusted outcomes are often used to compare the performance of hospitals and physicians, the most appropriate functional form for the risk adjustment process is not always obvious for continuous outcomes such as costs. Semi-log models are used most often to correct skewness in cost data, but there has been limited research to determine whether the log transformation is sufficient or whether another transformation is more appropriate. This study explores the most appropriate functional form for risk-adjusting the cost of coronary artery bypass graft (CABG) surgery. Data included patients undergoing CABG surgery at four hospitals in the Midwest and were fit to a Box-Cox model with random coefficients (BCRC) using Markov chain Monte Carlo methods. Marginal likelihoods and Bayes factors were computed to compare alternative model specifications. Rankings of hospital performance were created from the simulation output and the rankings produced by Bayesian estimates were compared to rankings produced by standard models fit using classical methods. Results suggest that, for these data, the most appropriate functional form is not logarithmic, but corresponds to a Box-Cox transformation of -1. Furthermore, Bayes factors overwhelmingly rejected the natural log transformation. However, the hospital ranking induced by the BCRC model was not different from the ranking produced by maximum likelihood estimates of either the linear or semi-log model. Copyright (c) 2005 John Wiley & Sons, Ltd.
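As a minimal illustration of choosing a Box-Cox transformation by likelihood (a frequentist profile-likelihood sketch on synthetic cost-like data, not the paper's Bayesian random-coefficients model), the Box-Cox log-likelihood can be profiled over the transformation parameter with scipy:

```python
import numpy as np
from scipy import stats

# a sketch: profile the Box-Cox log-likelihood over the transformation
# parameter lambda for skewed, cost-like data (synthetic, log-normal here)
rng = np.random.default_rng(1)
costs = rng.lognormal(mean=10.0, sigma=0.6, size=500)   # hypothetical CABG costs

lambdas = np.linspace(-2.0, 2.0, 81)
llf = [stats.boxcox_llf(lmb, costs) for lmb in lambdas]

best = lambdas[int(np.argmax(llf))]
print(f"profile-likelihood estimate of lambda: {best:.2f}")
# lambda = 0 corresponds to the log transform; a value near -1 would instead
# support a reciprocal transformation, as reported for the CABG data.
```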
NASA Astrophysics Data System (ADS)
Sadegh, M.; Moftakhari, H.; AghaKouchak, A.
2017-12-01
Many natural hazards are driven by multiple forcing variables, and concurrent/consecutive extreme events significantly increase the risk of infrastructure/system failure. It is a common practice to use univariate analysis based upon a perceived ruling driver to estimate design quantiles and/or return periods of extreme events. A multivariate analysis, however, permits modeling simultaneous occurrence of multiple forcing variables. In this presentation, we introduce the Multi-hazard Assessment and Scenario Toolbox (MhAST) that comprehensively analyzes marginal and joint probability distributions of natural hazards. MhAST also offers a wide range of scenarios of return period and design levels and their likelihoods. The contribution of this study is fourfold: 1. comprehensive analysis of marginal and joint probability of multiple drivers through 17 continuous distributions and 26 copulas, 2. multiple scenario analysis of concurrent extremes based upon the most likely joint occurrence, one ruling variable, and weighted random sampling of joint occurrences with similar exceedance probabilities, 3. weighted average scenario analysis based on an expected event, and 4. uncertainty analysis of the most likely joint occurrence scenario using a Bayesian framework.
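A toy sketch of this kind of bivariate analysis is given below: it fits assumed Gumbel marginals and a Gumbel copula (by inverting Kendall's tau) to synthetic data and computes a joint "AND" exceedance probability and return period for a hypothetical design event. It is illustrative only and does not reproduce MhAST's search over 17 distributions and 26 copulas.

```python
import numpy as np
from scipy import stats

def gumbel_copula_cdf(u, v, theta):
    """Gumbel copula C(u, v) = exp(-[(-ln u)^theta + (-ln v)^theta]^(1/theta))."""
    return np.exp(-(((-np.log(u))**theta + (-np.log(v))**theta)**(1.0 / theta)))

# synthetic annual maxima of two drivers (e.g., river discharge and coastal level)
rng = np.random.default_rng(2)
x = stats.gumbel_r.rvs(loc=100, scale=20, size=60, random_state=rng)
y = 0.4 * (x - 100) + stats.gumbel_r.rvs(loc=2.0, scale=0.5, size=60, random_state=rng)

# fit marginal distributions (Gumbel assumed here for simplicity)
px = stats.gumbel_r.fit(x)
py = stats.gumbel_r.fit(y)

# fit the copula dependence parameter from Kendall's tau: theta = 1 / (1 - tau)
tau, _ = stats.kendalltau(x, y)
theta = 1.0 / (1.0 - tau)

# joint "AND" exceedance probability and return period for a design event
x_d, y_d = 150.0, 3.0
u = stats.gumbel_r.cdf(x_d, *px)
v = stats.gumbel_r.cdf(y_d, *py)
p_and = 1.0 - u - v + gumbel_copula_cdf(u, v, theta)
print(f"theta = {theta:.2f}, AND exceedance prob = {p_and:.4f}, "
      f"return period = {1.0 / p_and:.1f} years")
```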
The costs of turnover in nursing homes.
Mukamel, Dana B; Spector, William D; Limcangco, Rhona; Wang, Ying; Feng, Zhanlian; Mor, Vincent
2009-10-01
Turnover rates in nursing homes have been persistently high for decades, ranging upwards of 100%. The objective was to estimate the net costs associated with turnover of direct care staff in nursing homes. The sample comprised 902 nursing homes in California in 2005. Data included Medicaid cost reports, the Minimum Data Set, Medicare enrollment files, Census, and Area Resource File. We estimated total cost functions, which included, in addition to exogenous outputs and wages, the facility turnover rate. Instrumental variable limited information maximum likelihood techniques were used for estimation to deal with the endogeneity of turnover and costs. The cost functions exhibited the expected behavior, with initially increasing and then decreasing returns to scale. The ordinary least squares estimate did not show a significant association between costs and turnover. The instrumental variable estimate of turnover costs was negative and significant (P = 0.039). The marginal cost savings associated with a 10 percentage point increase in turnover for an average facility was $167,063 or 2.9% of annual total costs. The net savings associated with turnover offer an explanation for the persistence of this phenomenon over the last decades, despite the many policy initiatives to reduce it. Future policy efforts need to recognize the complex relationship between turnover and costs.
Data analysis in emission tomography using emission-count posteriors
NASA Astrophysics Data System (ADS)
Sitek, Arkadiusz
2012-11-01
A novel approach to the analysis of emission tomography data using the posterior probability of the number of emissions per voxel (emission count) conditioned on acquired tomographic data is explored. The posterior is derived from the prior and the Poisson likelihood of the emission-count data by marginalizing voxel activities. Based on emission-count posteriors, examples of Bayesian analysis including estimation and classification tasks in emission tomography are provided. The application of the method to computer simulations of 2D tomography is demonstrated. In particular, the minimum-mean-square-error point estimator of the emission count is demonstrated. The process of finding this estimator can be considered as a tomographic image reconstruction technique since the estimates of the number of emissions per voxel divided by voxel sensitivities and acquisition time are the estimates of the voxel activities. As an example of a classification task, a hypothesis stating that some region of interest (ROI) emitted at least or at most r-times the number of events in some other ROI is tested. The ROIs are specified by the user. The analysis described in this work provides new quantitative statistical measures that can be used in decision making in diagnostic imaging using emission tomography.
Woodhead, Jeffrey L; Paech, Franziska; Maurer, Martina; Engelhardt, Marc; Schmitt-Hoffmann, Anne H; Spickermann, Jochen; Messner, Simon; Wind, Mathias; Witschi, Anne-Therese; Krähenbühl, Stephan; Siler, Scott Q; Watkins, Paul B; Howell, Brett A
2018-06-07
Elevations of liver enzymes have been observed in clinical trials with BAL30072, a novel antibiotic. In vitro assays have identified potential mechanisms for the observed hepatotoxicity, including electron transport chain (ETC) inhibition and reactive oxygen species (ROS) generation. DILIsym, a quantitative systems pharmacology (QSP) model of drug-induced liver injury, has been used to predict the likelihood that each mechanism explains the observed toxicity. DILIsym was also used to predict the safety margin for a novel BAL30072 dosing scheme; it was predicted to be low. DILIsym was then used to recommend potential modifications to this dosing scheme; weight-adjusted dosing, together with a requirement to assay plasma alanine aminotransferase (ALT) daily and to stop dosing as soon as an ALT increase was observed, improved the predicted safety margin of BAL30072 and decreased the predicted likelihood of severe injury. This research demonstrates a potential application for QSP modeling in improving the safety profile of candidate drugs. © 2018 The Authors. Clinical and Translational Science published by Wiley Periodicals, Inc. on behalf of American Society for Clinical Pharmacology and Therapeutics.
Margin based ontology sparse vector learning algorithm and applied in biology science.
Gao, Wei; Qudair Baig, Abdul; Ali, Haidar; Sajjad, Wasim; Reza Farahani, Mohammad
2017-01-01
In the biology field, ontology applications involve a large amount of genetic information and chemical information about molecular structure, so each ontology concept conveys a great deal of information. In mathematical terms, the vector corresponding to an ontology concept is therefore often of very high dimension, which places greater demands on ontology algorithms. Against this background, we consider the design of an ontology sparse vector algorithm and its application in biology. In this paper, using knowledge of the marginal likelihood and marginal distribution, an optimization strategy for the margin-based ontology sparse vector learning algorithm is presented. Finally, the new algorithm is applied to the gene ontology and the plant ontology to verify its efficiency.
Efficient Bayesian experimental design for contaminant source identification
NASA Astrophysics Data System (ADS)
Zhang, J.; Zeng, L.
2013-12-01
In this study, an efficient full Bayesian approach is developed for the optimal sampling well location design and source parameter identification of groundwater contaminants. An information measure, i.e., the relative entropy, is employed to quantify the information gain from indirect concentration measurements in identifying unknown source parameters such as the release time, strength and location. In this approach, the sampling location that gives the maximum relative entropy is selected as the optimal one. Once the sampling location is determined, a Bayesian approach based on Markov Chain Monte Carlo (MCMC) is used to estimate unknown source parameters. In both the design and estimation, the contaminant transport equation is required to be solved many times to evaluate the likelihood. To reduce the computational burden, an interpolation method based on the adaptive sparse grid is utilized to construct a surrogate for the contaminant transport. The approximated likelihood can be evaluated directly from the surrogate, which greatly accelerates the design and estimation process. The accuracy and efficiency of our approach are demonstrated through numerical case studies. Compared with the traditional optimal design, which is based on the Gaussian linear assumption, the method developed in this study can cope with arbitrary nonlinearity. It can be used to assist in groundwater monitoring network design and identification of unknown contaminant sources. (Figure captions: contours of the expected information gain, whose maximum marks the optimal observing location; posterior marginal probability densities of the unknown parameters at the designed location compared with seven randomly chosen locations, with true values marked by vertical lines, showing that the parameters are better estimated with the designed location.)
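The design criterion can be illustrated with a small nested Monte Carlo sketch: for a hypothetical one-dimensional forward model with Gaussian measurement noise, the expected information gain (expected relative entropy between posterior and prior) is estimated at each candidate sampling location and the maximizer is selected. This is a toy stand-in for the paper's transport simulator and sparse-grid surrogate, with all model settings below assumed.

```python
import numpy as np
from scipy.special import logsumexp

rng = np.random.default_rng(3)

def forward(theta, x):
    """Hypothetical forward model: concentration at location x given
    source strength theta[0] and source location theta[1]."""
    strength, x0 = theta
    return strength * np.exp(-0.5 * ((x - x0) / 2.0) ** 2)

sigma = 0.05                                # measurement noise std dev
prior = lambda n: np.column_stack([rng.uniform(0.5, 2.0, n),    # strength
                                   rng.uniform(0.0, 10.0, n)])  # source location

def expected_information_gain(x, n_outer=300, n_inner=300):
    """Nested Monte Carlo estimate of E_y[ KL( p(theta | y, x) || p(theta) ) ]."""
    th_out = prior(n_outer)
    th_in = prior(n_inner)
    y = forward(th_out.T, x) + sigma * rng.standard_normal(n_outer)
    # log p(y_i | theta_i, x), dropping the constant that cancels below
    log_like = -0.5 * ((y - forward(th_out.T, x)) / sigma) ** 2
    # log of the evidence p(y_i | x), averaged over inner prior samples
    resid = y[:, None] - forward(th_in.T, x)[None, :]
    log_evid = logsumexp(-0.5 * (resid / sigma) ** 2, axis=1) - np.log(n_inner)
    return np.mean(log_like - log_evid)

candidates = np.linspace(0.0, 10.0, 11)
eig = [expected_information_gain(x) for x in candidates]
print("expected information gain per location:", np.round(eig, 3))
print("optimal sampling location:", candidates[int(np.argmax(eig))])
```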
ERIC Educational Resources Information Center
Mahmud, Jumailiyah; Sutikno, Muzayanah; Naga, Dali S.
2016-01-01
The aim of this study is to determine the difference in variance between the maximum likelihood and expected a posteriori estimation methods as a function of the number of items in an aptitude test. The variance reflects the accuracy of both the maximum likelihood and Bayes estimation methods. The test consists of three subtests, each with 40 multiple-choice…
A Comparison of a Bayesian and a Maximum Likelihood Tailored Testing Procedure.
ERIC Educational Resources Information Center
McKinley, Robert L.; Reckase, Mark D.
A study was conducted to compare tailored testing procedures based on a Bayesian ability estimation technique and on a maximum likelihood ability estimation technique. The Bayesian tailored testing procedure selected items so as to minimize the posterior variance of the ability estimate distribution, while the maximum likelihood tailored testing…
On non-parametric maximum likelihood estimation of the bivariate survivor function.
Prentice, R L
The likelihood function for the bivariate survivor function F, under independent censorship, is maximized to obtain a non-parametric maximum likelihood estimator F̂. F̂ may or may not be unique depending on the configuration of singly- and doubly-censored pairs. The likelihood function can be maximized by placing all mass on the grid formed by the uncensored failure times, or half lines beyond the failure time grid, or in the upper right quadrant beyond the grid. By accumulating the mass along lines (or regions) where the likelihood is flat, one obtains a partially maximized likelihood as a function of parameters that can be uniquely estimated. The score equations corresponding to these point mass parameters are derived, using a Lagrange multiplier technique to ensure unit total mass, and a modified Newton procedure is used to calculate the parameter estimates in some limited simulation studies. Some considerations for the further development of non-parametric bivariate survivor function estimators are briefly described.
N-mixture models for estimating population size from spatially replicated counts
Royle, J. Andrew
2004-01-01
Spatial replication is a common theme in count surveys of animals. Such surveys often generate sparse count data from which it is difficult to estimate population size while formally accounting for detection probability. In this article, I describe a class of models (N-mixture models) which allow for estimation of population size from such data. The key idea is to view site-specific population sizes, N, as independent random variables distributed according to some mixing distribution (e.g., Poisson). Prior parameters are estimated from the marginal likelihood of the data, having integrated over the prior distribution for N. Carroll and Lombard (1985, Journal of the American Statistical Association 80, 423-426) proposed a class of estimators based on mixing over a prior distribution for detection probability. Their estimator can be applied in limited settings, but is sensitive to prior parameter values that are fixed a priori. Spatial replication provides additional information regarding the parameters of the prior distribution on N that is exploited by the N-mixture models and which leads to reasonable estimates of abundance from sparse data. A simulation study demonstrates superior operating characteristics (bias, confidence interval coverage) of the N-mixture estimator compared to the Carroll and Lombard estimator. Both estimators are applied to point count data on six species of birds, illustrating the sensitivity to the choice of prior on p and substantially different estimates of abundance as a consequence.
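The marginal likelihood at the core of the N-mixture model can be written down in a few lines: the latent site abundance N ~ Poisson(lambda) is summed out of a binomial detection model for the repeated counts, and (lambda, p) are estimated by maximizing the result. The sketch below is an illustrative implementation on simulated data, not Royle's original code.

```python
import numpy as np
from scipy import stats, optimize

def nmixture_negloglik(params, counts, n_max=100):
    """Negative marginal log-likelihood of an N-mixture model.

    counts : (n_sites, n_visits) array of repeated counts
    params : (log lambda, logit p) on unconstrained scales
    The latent site abundance N ~ Poisson(lambda) is summed out up to n_max.
    """
    lam = np.exp(params[0])
    p = 1.0 / (1.0 + np.exp(-params[1]))
    N = np.arange(n_max + 1)                       # support of latent abundance
    log_prior = stats.poisson.logpmf(N, lam)       # log Pr(N)
    ll = 0.0
    for y in counts:                               # loop over sites
        if y.max() > n_max:
            return np.inf
        # log Pr(y_1, ..., y_T | N) for every candidate N
        log_cond = stats.binom.logpmf(y[:, None], N[None, :], p).sum(axis=0)
        ll += np.logaddexp.reduce(log_prior + log_cond)
    return -ll

# simulate sparse count data: 50 sites, 3 visits
rng = np.random.default_rng(4)
true_lam, true_p = 2.5, 0.4
N_true = rng.poisson(true_lam, size=50)
counts = rng.binomial(N_true[:, None], true_p, size=(50, 3))

res = optimize.minimize(nmixture_negloglik, x0=[0.0, 0.0], args=(counts,),
                        method="Nelder-Mead")
print("lambda_hat =", np.exp(res.x[0]), " p_hat =", 1 / (1 + np.exp(-res.x[1])))
```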
NASA Astrophysics Data System (ADS)
Pankow, C.; Brady, P.; Ochsner, E.; O'Shaughnessy, R.
2015-07-01
We introduce a highly parallelizable architecture for estimating parameters of compact binary coalescence using gravitational-wave data and waveform models. Using a spherical harmonic mode decomposition, the waveform is expressed as a sum over modes that depend on the intrinsic parameters (e.g., masses) with coefficients that depend on the observer-dependent extrinsic parameters (e.g., distance, sky position). The data are then prefiltered against those modes, at fixed intrinsic parameters, enabling efficient evaluation of the likelihood for generic source positions and orientations, independent of waveform length or generation time. We efficiently parallelize our intrinsic space calculation by integrating over all extrinsic parameters using a Monte Carlo integration strategy. Since the waveform generation and prefiltering happen only once, the cost of integration dominates the procedure. Also, we operate hierarchically, using information from existing gravitational-wave searches to identify the regions of parameter space to emphasize in our sampling. As proof of concept and verification of the result, we have implemented this algorithm using standard time-domain waveforms, processing each event in less than one hour on recent computing hardware. For most events we evaluate the marginalized likelihood (evidence) with statistical errors of ≲5%, and even smaller in many cases. With a bounded runtime independent of the waveform model starting frequency, a nearly unchanged strategy could estimate neutron star (NS)-NS parameters in the 2018 advanced LIGO era. Our algorithm is usable with any noise curve and existing time-domain model at any mass, including some waveforms which are computationally costly to evolve.
NASA Technical Reports Server (NTRS)
Bennett, C. L.; Kogut, A.; Hinshaw, G.; Banday, A. J.; Wright, E. L.; Gorski, K. M.; Wilkinson, D. T.; Weiss, R.; Smoot, G. F.; Meyer, S. S.
1994-01-01
The first two years of Cosmic Background Explorer (COBE) Differential Microwave Radiometers (DMR) observations of the cosmic microwave background (CMB) anisotropy are analyzed and compared with our previously published first year results. The results are consistent, but the addition of the second year of data increases the precision and accuracy of the detected CMB temperature fluctuations. The 2 yr 53 GHz data are characterized by rms temperature fluctuations of delta-T_rms(7 deg) = 44 +/- 7 micro-K and delta-T_rms(10 deg) = 30.5 +/- 2.7 micro-K at 7 deg and 10 deg angular resolution, respectively. The 53 x 90 GHz cross-correlation amplitude at zero lag is C(0)^(1/2) = 36 +/- 5 micro-K (68% CL) for the unsmoothed (7 deg resolution) DMR data. We perform a likelihood analysis of the cross-correlation function, with Monte Carlo simulations to infer biases of the method, for a power-law model of initial density fluctuations, P(k) proportional to k^n. The Monte Carlo simulations indicate that derived estimates of n are biased by +0.11 +/- 0.01, while the subset of simulations with a low quadrupole (as observed) indicate a bias of +0.31 +/- 0.04. Derived values for 68% confidence intervals are given corrected (and not corrected) for our estimated biases. Including the quadrupole anisotropy, the most likely quadrupole-normalized amplitude is Q_rms-PS = 14.3 (+5.2/-3.3) micro-K (12.8 (+5.2/-3.3) micro-K) with a spectral index n = 1.42 (+0.49/-0.55) (n = 1.53 (+0.49/-0.55)). With n fixed to 1.0 the most likely amplitude is 18.2 +/- 11.5 micro-K (17.4 +/- 1.5 micro-K). The marginal likelihood of n is 1.42 +/- 0.37 (1.53 +/- 0.37). Excluding the quadrupole anisotropy, the most likely quadrupole-normalized amplitude is Q_rms-PS = 17.4 (+7.5/-5.2) micro-K (15.8 (+7.5/-5.2) micro-K) with a spectral index n = 1.11 (+0.60/-0.55) (n = 1.22 (+0.60/-0.55)). With n fixed to 1.0 the most likely amplitude is 18.6 +/- 1.6 micro-K (18.2 +/- 1.6 micro-K). The marginal likelihood of n is 1.11 +/- 0.40 (1.22 +/- 0.40). Our best estimate of the dipole from the 2 yr DMR data is 3.363 +/- 0.024 mK toward Galactic coordinates (l, b) = (264.4 deg +/- 0.2 deg, 48.1 deg +/- 0.4 deg), and our best estimate of the rms quadrupole amplitude in our sky is 6 +/- 3 micro-K (68% CL).
The recursive maximum likelihood proportion estimator: User's guide and test results
NASA Technical Reports Server (NTRS)
Vanrooy, D. L.
1976-01-01
Implementation of the recursive maximum likelihood proportion estimator is described. A user's guide to programs as they currently exist on the IBM 360/67 at LARS, Purdue is included, and test results on LANDSAT data are described. On Hill County data, the algorithm yields results comparable to the standard maximum likelihood proportion estimator.
Nonparametric predictive inference for combining diagnostic tests with parametric copula
NASA Astrophysics Data System (ADS)
Muhammad, Noryanti; Coolen, F. P. A.; Coolen-Maturi, T.
2017-09-01
Measuring the accuracy of diagnostic tests is crucial in many application areas including medicine and health care. The Receiver Operating Characteristic (ROC) curve is a popular statistical tool for describing the performance of diagnostic tests. The area under the ROC curve (AUC) is often used as a measure of the overall performance of the diagnostic test. In this paper, we are interested in developing strategies for combining test results in order to increase diagnostic accuracy. We introduce nonparametric predictive inference (NPI) for combining two diagnostic test results while accounting for their dependence structure using a parametric copula. NPI is a frequentist statistical framework for inference on a future observation based on past data observations. NPI uses lower and upper probabilities to quantify uncertainty and is based on only a few modelling assumptions. A copula is a well-known statistical concept for modelling dependence of random variables: a joint distribution function whose marginals are all uniformly distributed, which can be used to model the dependence separately from the marginal distributions. In this research, we estimate the copula density using a parametric method, namely maximum likelihood estimation (MLE). We investigate the performance of the proposed method via data sets from the literature and discuss the results to show how our method performs for different families of copulas. Finally, we briefly outline related challenges and opportunities for future research.
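A small sketch of the parametric copula-fitting step is shown below: the copula density is fitted by maximum likelihood on rank-based pseudo-observations, here using a Clayton copula as an illustrative family (the choice of family and the simulated test scores are assumptions, not the paper's data).

```python
import numpy as np
from scipy.optimize import minimize_scalar

def clayton_log_density(u, v, theta):
    """Log density of the Clayton copula,
    c(u, v) = (1 + theta) * (u*v)^(-1-theta) * (u^-theta + v^-theta - 1)^(-2 - 1/theta)."""
    return (np.log1p(theta)
            - (1.0 + theta) * (np.log(u) + np.log(v))
            - (2.0 + 1.0 / theta) * np.log(u**(-theta) + v**(-theta) - 1.0))

def fit_clayton_mle(x, y):
    """Maximum likelihood fit of the Clayton parameter on pseudo-observations."""
    n = len(x)
    u = (np.argsort(np.argsort(x)) + 1) / (n + 1.0)   # rank-based pseudo-obs in (0, 1)
    v = (np.argsort(np.argsort(y)) + 1) / (n + 1.0)
    neg_ll = lambda theta: -np.sum(clayton_log_density(u, v, theta))
    res = minimize_scalar(neg_ll, bounds=(1e-3, 20.0), method="bounded")
    return res.x

# example with two positively dependent (simulated) diagnostic test scores
rng = np.random.default_rng(5)
z = rng.standard_normal(200)
score1 = z + 0.5 * rng.standard_normal(200)
score2 = z + 0.5 * rng.standard_normal(200)
print("fitted Clayton theta:", round(fit_clayton_mle(score1, score2), 2))
```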
Tang, Hongxiu; Cai, Weibin; Wang, Hongjing; Zhang, Qing; Qian, Ling; Shell, Duane F.; Newman, Ian M.; Yin, Ping
2013-01-01
Objectives This study examines the association between cultural orientation and drinking behaviors among university students. Cultural orientation is the measure of how the cultural values of individuals living in their own society are influenced by cultural values introduced from the outside. Methods In 2011, a cross-sectional survey collected data from 1279 university students from six universities in central China. Participants used a Likert scale to rank a series of statements reflecting cultural values from the previously validated Chinese Cultural Orientation Scale and answered questions about their drinking behaviors and socio-demographic characteristics. Results Statistically significant differences in cultural orientation were observed for gender, hometown and type of university attendance. Traditional-oriented students were more likely to be occasional drinkers or nondrinkers, while marginal-oriented students, bicultural-oriented students and western-oriented students were more likely to be regular drinkers. Bicultural orientation (OR = 1.80, P<0.05) and marginal orientation (OR = 1.64, P<0.05) increased the likelihood of a student being a regular drinker, compared to students with traditional orientations. Males (OR = 4.40, P<0.05) had a higher likelihood of regular drinking than females, graduate students (OR = 2.59, P<0.05) had a higher likelihood of regular drinking than undergraduates, students from urban areas (OR = 1.79, P<0.05) had a higher likelihood of regular drinking than those from towns/rural areas, and students attending key universities (OR = 0.48, P<0.05) had a lower likelihood of regular drinking than those attending general universities. Conclusions Cultural orientation influences drinking behaviors. Traditional cultural orientation was associated with less drinking while western cultural orientation, marginal cultural orientation and bicultural orientation were associated with more drinking. The role of gender, hometown and university attendance is partially moderated through the influence of cultural orientation. The relationship between a traditional cultural orientation and alcohol drinking suggests that traditional Chinese cultural values should be examined for their role in possibly reducing alcohol-related risks through education and policy initiatives. PMID:23359611
NASA Technical Reports Server (NTRS)
Walker, H. F.
1976-01-01
Likelihood equations determined by the two types of samples, which are necessary conditions for a maximum-likelihood estimate, are considered. These equations suggest certain successive-approximations iterative procedures for obtaining maximum-likelihood estimates. These are generalized steepest ascent (deflected gradient) procedures. It is shown that, with probability 1 as N_0 approaches infinity (regardless of the relative sizes of N_0 and N_i, i = 1, ..., m), these procedures converge locally to the strongly consistent maximum-likelihood estimates whenever the step size is between 0 and 2. Furthermore, the value of the step size which yields optimal local convergence rates is bounded from below by a number which always lies between 1 and 2.
NASA Astrophysics Data System (ADS)
Ferdous, Nazneen; Bhat, Chandra R.
2013-01-01
This paper proposes and estimates a spatial panel ordered-response probit model with temporal autoregressive error terms to analyze changes in urban land development intensity levels over time. Such a model structure maintains a close linkage between the land owner's decision (unobserved to the analyst) and the land development intensity level (observed by the analyst) and accommodates spatial interactions between land owners that lead to spatial spillover effects. In addition, the model structure incorporates spatial heterogeneity as well as spatial heteroscedasticity. The resulting model is estimated using a composite marginal likelihood (CML) approach that does not require any simulation machinery and that can be applied to data sets of any size. A simulation exercise indicates that the CML approach recovers the model parameters very well, even in the presence of high spatial and temporal dependence. In addition, the simulation results demonstrate that ignoring spatial dependency and spatial heterogeneity when both are actually present will lead to bias in parameter estimation. A demonstration exercise applies the proposed model to examine urban land development intensity levels using parcel-level data from Austin, Texas.
Interpretable inference on the mixed effect model with the Box-Cox transformation.
Maruo, K; Yamaguchi, Y; Noma, H; Gosho, M
2017-07-10
We derived results for inference on parameters of the marginal model of the mixed effect model with the Box-Cox transformation based on the asymptotic theory approach. We also provided a robust variance estimator of the maximum likelihood estimator of the parameters of this model in consideration of possible model misspecification. Using these results, we developed an inference procedure for the difference of the model median between treatment groups at the specified occasion in the context of mixed effects models for repeated measures analysis for randomized clinical trials, which provided interpretable estimates of the treatment effect. Simulation studies showed that our proposed method controlled the type I error of the statistical test for the model median difference in almost all situations and had moderate to high power compared with the existing methods. We illustrated our method with cluster of differentiation 4 (CD4) data in an AIDS clinical trial, where the interpretability of the analysis results based on our proposed method is demonstrated. Copyright © 2017 John Wiley & Sons, Ltd.
A general class of multinomial mixture models for anuran calling survey data
Royle, J. Andrew; Link, W.A.
2005-01-01
We propose a general framework for modeling anuran abundance using data collected from commonly used calling surveys. The data generated from calling surveys are indices of calling intensity (vocalization of males) that do not have a precise link to actual population size and are sensitive to factors that influence anuran behavior. We formulate a model for calling-index data in terms of the maximum potential calling index that could be observed at a site (the 'latent abundance class'), given its underlying breeding population, and we focus attention on estimating the distribution of this latent abundance class. A critical consideration in estimating the latent structure is imperfect detection, which causes the observed abundance index to be less than or equal to the latent abundance class. We specify a multinomial sampling model for the observed abundance index that is conditional on the latent abundance class. Estimation of the latent abundance class distribution is based on the marginal likelihood of the index data, having integrated over the latent class distribution. We apply the proposed modeling framework to data collected as part of the North American Amphibian Monitoring Program (NAAMP).
Chen, Rui; Hyrien, Ollivier
2011-01-01
This article deals with quasi- and pseudo-likelihood estimation in a class of continuous-time multi-type Markov branching processes observed at discrete points in time. “Conventional” and conditional estimation are discussed for both approaches. We compare their properties and identify situations where they lead to asymptotically equivalent estimators. Both approaches possess robustness properties, and coincide with maximum likelihood estimation in some cases. Quasi-likelihood functions involving only linear combinations of the data may be unable to estimate all model parameters. Remedial measures exist, including resorting either to non-linear functions of the data or to conditioning the moments on appropriate sigma-algebras. The method of pseudo-likelihood may also resolve this issue. We investigate the properties of these approaches in three examples: the pure birth process, the linear birth-and-death process, and a two-type process that generalizes the previous two examples. Simulation studies are conducted to evaluate performance in finite samples. PMID:21552356
Huang, Chiung-Yu; Qin, Jing
2013-01-01
The Canadian Study of Health and Aging (CSHA) employed a prevalent cohort design to study survival after onset of dementia, where patients with dementia were sampled and the onset time of dementia was determined retrospectively. The prevalent cohort sampling scheme favors individuals who survive longer. Thus, the observed survival times are subject to length bias. In recent years, there has been a rising interest in developing estimation procedures for prevalent cohort survival data that not only account for length bias but also actually exploit the incidence distribution of the disease to improve efficiency. This article considers semiparametric estimation of the Cox model for the time from dementia onset to death under a stationarity assumption with respect to the disease incidence. Under the stationarity condition, the semiparametric maximum likelihood estimation is expected to be fully efficient yet difficult to perform for statistical practitioners, as the likelihood depends on the baseline hazard function in a complicated way. Moreover, the asymptotic properties of the semiparametric maximum likelihood estimator are not well-studied. Motivated by the composite likelihood method (Besag 1974), we develop a composite partial likelihood method that retains the simplicity of the popular partial likelihood estimator and can be easily performed using standard statistical software. When applied to the CSHA data, the proposed method estimates a significant difference in survival between the vascular dementia group and the possible Alzheimer’s disease group, while the partial likelihood method for left-truncated and right-censored data yields a greater standard error and a 95% confidence interval covering 0, thus highlighting the practical value of employing a more efficient methodology. To check the assumption of stable disease for the CSHA data, we also present new graphical and numerical tests in the article. The R code used to obtain the maximum composite partial likelihood estimator for the CSHA data is available in the online Supplementary Material, posted on the journal web site. PMID:24000265
Bias Correction for the Maximum Likelihood Estimate of Ability. Research Report. ETS RR-05-15
ERIC Educational Resources Information Center
Zhang, Jinming
2005-01-01
Lord's bias function and the weighted likelihood estimation method are effective in reducing the bias of the maximum likelihood estimate of an examinee's ability under the assumption that the true item parameters are known. This paper presents simulation studies to determine the effectiveness of these two methods in reducing the bias when the item…
NASA Astrophysics Data System (ADS)
Shi, Lei; Guo, Lianghui; Ma, Yawei; Li, Yonghua; Wang, Weilai
2018-05-01
The technique of teleseismic receiver function H-κ stacking is popular for estimating the crustal thickness and Vp/Vs ratio. However, it has large uncertainty or ambiguity when the Moho multiples in the receiver function are not easily identified. We present an improved technique to estimate the crustal thickness and Vp/Vs ratio by joint constraints of receiver function and gravity data. The complete Bouguer gravity anomalies, composed of the anomalies due to the relief of the Moho interface and the heterogeneous density distribution within the crust, are associated with the crustal thickness, density and Vp/Vs ratio. According to their relationship formulae presented by Lowry and Pérez-Gussinyé, we invert the complete Bouguer gravity anomalies by using a common algorithm of likelihood estimation to obtain the crustal thickness and Vp/Vs ratio, and then utilize them to constrain the receiver function H-κ stacking result. We verified the improved technique on three synthetic crustal models and evaluated the influence of selected parameters; the results demonstrated that the novel technique could reduce the ambiguity and enhance the accuracy of estimation. A real-data test at two stations in the NE margin of the Tibetan Plateau showed that the improved technique provided reliable estimates of crustal thickness and Vp/Vs ratio.
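For reference, a schematic implementation of conventional H-κ stacking (the grid search the paper improves upon, not its gravity-constrained extension) is sketched below: for each trial (H, κ) the receiver-function amplitudes are stacked at the predicted times of the Moho Ps conversion and its multiples, and the maximum of the stack gives the estimates. The velocities, ray parameter, weights, and synthetic receiver function are illustrative assumptions.

```python
import numpy as np

vp, p = 6.3, 0.06            # crustal P velocity (km/s), ray parameter (s/km)
dt = 0.05                    # receiver-function sample interval (s)
t = np.arange(0, 40, dt)

def phase_times(H, kappa, vp=vp, p=p):
    """Predicted arrival times of the Moho Ps conversion and its multiples."""
    qs = np.sqrt((kappa / vp) ** 2 - p ** 2)   # S vertical slowness (Vs = vp / kappa)
    qp = np.sqrt((1.0 / vp) ** 2 - p ** 2)     # P vertical slowness
    return H * (qs - qp), H * (qs + qp), 2.0 * H * qs   # t_Ps, t_PpPs, t_PpSs+PsPs

# build a synthetic receiver function for a "true" crust of H = 45 km, kappa = 1.75
rf = np.zeros_like(t)
for tt, amp in zip(phase_times(45.0, 1.75), (1.0, 0.5, -0.4)):
    rf += amp * np.exp(-0.5 * ((t - tt) / 0.3) ** 2)

def hk_stack(rf, H_grid, k_grid, weights=(0.6, 0.3, 0.1)):
    """Stack receiver-function amplitudes at the predicted times for each (H, kappa)."""
    s = np.zeros((len(H_grid), len(k_grid)))
    for i, H in enumerate(H_grid):
        for j, k in enumerate(k_grid):
            tps, tppps, tppss = phase_times(H, k)
            a = np.interp([tps, tppps, tppss], t, rf)
            s[i, j] = weights[0] * a[0] + weights[1] * a[1] - weights[2] * a[2]
    return s

H_grid = np.arange(30.0, 60.0, 0.25)
k_grid = np.arange(1.6, 1.95, 0.005)
s = hk_stack(rf, H_grid, k_grid)
i, j = np.unravel_index(np.argmax(s), s.shape)
print(f"estimated H = {H_grid[i]:.2f} km, Vp/Vs = {k_grid[j]:.3f}")
```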
An improved approximate-Bayesian model-choice method for estimating shared evolutionary history
2014-01-01
Background To understand biological diversification, it is important to account for large-scale processes that affect the evolutionary history of groups of co-distributed populations of organisms. Such events predict temporally clustered divergence times, a pattern that can be estimated using genetic data from co-distributed species. I introduce a new approximate-Bayesian method for comparative phylogeographical model-choice that estimates the temporal distribution of divergences across taxa from multi-locus DNA sequence data. The model is an extension of that implemented in msBayes. Results By reparameterizing the model, introducing more flexible priors on demographic and divergence-time parameters, and implementing a non-parametric Dirichlet-process prior over divergence models, I improved the robustness, accuracy, and power of the method for estimating shared evolutionary history across taxa. Conclusions The results demonstrate that the improved performance of the new method is due to (1) more appropriate priors on divergence-time and demographic parameters that avoid prohibitively small marginal likelihoods for models with more divergence events, and (2) the Dirichlet process providing a flexible prior on divergence histories that does not strongly disfavor models with intermediate numbers of divergence events. The new method yields more robust estimates of posterior uncertainty, and thus greatly reduces the tendency to incorrectly estimate models of shared evolutionary history with strong support. PMID:24992937
Fast maximum likelihood estimation of mutation rates using a birth-death process.
Wu, Xiaowei; Zhu, Hongxiao
2015-02-07
Since fluctuation analysis was first introduced by Luria and Delbrück in 1943, it has been widely used to make inference about spontaneous mutation rates in cultured cells. Under certain model assumptions, the probability distribution of the number of mutants that appear in a fluctuation experiment can be derived explicitly, which provides the basis of mutation rate estimation. It has been shown that, among various existing estimators, the maximum likelihood estimator usually demonstrates some desirable properties such as consistency and lower mean squared error. However, its application to real experimental data is often hindered by slow computation of the likelihood due to the recursive form of the mutant-count distribution. We propose a fast maximum likelihood estimator of mutation rates, MLE-BD, based on a birth-death process model with a non-differential growth assumption. Simulation studies demonstrate that, compared with the conventional maximum likelihood estimator derived from the Luria-Delbrück distribution, MLE-BD achieves substantial improvement in computational speed and is applicable to arbitrarily large numbers of mutants. In addition, it still retains good accuracy in point estimation. Published by Elsevier Ltd.
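For context, the conventional estimator that MLE-BD is compared against can be sketched as follows: the Luria-Delbrück mutant-count probabilities are computed with the Ma-Sandri-Sarkar recursion and the expected number of mutations m is found by maximizing the resulting log-likelihood. The recursion is exactly the computational bottleneck the abstract refers to; the mutant counts below are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def ld_pmf(m, n_max):
    """Luria-Delbrück mutant-count probabilities p_0..p_{n_max} for an expected
    number of mutations m, via the Ma-Sandri-Sarkar recursion:
        p_0 = exp(-m),   p_n = (m / n) * sum_{k=0}^{n-1} p_k / (n - k + 1)."""
    p = np.zeros(n_max + 1)
    p[0] = np.exp(-m)
    for n in range(1, n_max + 1):
        k = np.arange(n)
        p[n] = (m / n) * np.sum(p[k] / (n - k + 1))
    return p

def ld_mle(mutant_counts):
    """Maximum likelihood estimate of m from observed mutant counts."""
    n_max = int(max(mutant_counts))
    idx = np.asarray(mutant_counts, dtype=int)
    def neg_loglik(m):
        p = ld_pmf(m, n_max)
        return -np.sum(np.log(p[idx] + 1e-300))
    return minimize_scalar(neg_loglik, bounds=(1e-3, 50.0), method="bounded").x

# hypothetical fluctuation-assay data: mutant counts from parallel cultures
counts = [0, 1, 0, 2, 5, 0, 1, 33, 0, 3, 1, 0, 7, 2, 0, 1, 4, 0, 0, 12]
print("MLE of expected mutations per culture m:", round(ld_mle(counts), 3))
# the mutation rate per cell division is then m divided by the final cell number
```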
Marginal regression approach for additive hazards models with clustered current status data.
Su, Pei-Fang; Chi, Yunchan
2014-01-15
Current status data arise naturally from tumorigenicity experiments, epidemiology studies, biomedicine, econometrics and demographic and sociology studies. Moreover, clustered current status data may occur with animals from the same litter in tumorigenicity experiments or with subjects from the same family in epidemiology studies. Because the only information extracted from current status data is whether the survival times are before or after the monitoring or censoring times, the nonparametric maximum likelihood estimator of the survival function converges at a rate of n^(1/3) to a complicated limiting distribution. Hence, semiparametric regression models such as the additive hazards model have been extended for independent current status data to derive the test statistics, whose distributions converge at a rate of n^(1/2), for testing the regression parameters. However, a straightforward application of these statistical methods to clustered current status data is not appropriate because intracluster correlation needs to be taken into account. Therefore, this paper proposes two estimating functions for estimating the parameters in the additive hazards model for clustered current status data. The comparative results from simulation studies are presented, and the application of the proposed estimating functions to one real data set is illustrated. Copyright © 2013 John Wiley & Sons, Ltd.
Education and black-white interracial marriage.
Gullickson, Aaron
2006-11-01
This article examines competing theoretical claims regarding how an individual's education will affect his or her likelihood of interracial marriage. I demonstrate that prior models of interracial marriage have failed to adequately distinguish the joint and marginal effects of education on interracial marriage and present a model capable of distinguishing these effects. I test this model on black-white interracial marriages using 1980, 1990, and 2000 U.S. census data. The results reveal partial support for status exchange theory within black male-white female unions and strong isolation of lower-class blacks from the interracial marriage market. Structural assimilation theory is not supported because the educational attainment of whites is not related in any consistent fashion to the likelihood of interracial marriage. The strong isolation of lower-class blacks from the interracial marriage market has gone unnoticed in prior research because of the failure of prior methods to distinguish joint and marginal effects.
Adaptive power priors with empirical Bayes for clinical trials.
Gravestock, Isaac; Held, Leonhard
2017-09-01
Incorporating historical information into the design and analysis of a new clinical trial has been the subject of much discussion as a way to increase the feasibility of trials in situations where patients are difficult to recruit. The best method to include this data is not yet clear, especially in the case when few historical studies are available. This paper looks at the power prior technique afresh in a binomial setting and examines some previously unexamined properties, such as Box P values, bias, and coverage. Additionally, it proposes an empirical Bayes-type approach to estimating the prior weight parameter by marginal likelihood. This estimate has advantages over previously criticised methods in that it varies commensurably with differences in the historical and current data and can choose weights near 1 when the data are similar enough. Fully Bayesian approaches are also considered. An analysis of the operating characteristics shows that the adaptive methods work well and that the various approaches have different strengths and weaknesses. Copyright © 2017 John Wiley & Sons, Ltd.
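The empirical Bayes idea can be sketched in the binomial setting described above: with a Beta initial prior, the power prior raises the historical likelihood to a weight delta, the marginal likelihood of the current data is beta-binomial in closed form, and delta is chosen to maximize it. The initial prior and the trial counts below are illustrative assumptions, not the paper's examples.

```python
import numpy as np
from scipy.special import betaln, comb
from scipy.optimize import minimize_scalar

def log_marginal(delta, y, n, y0, n0, a0=1.0, b0=1.0):
    """Log marginal likelihood of the current data (y successes out of n) when the
    power prior downweights historical data (y0 out of n0) by delta in [0, 1].
    With a Beta(a0, b0) initial prior, the power prior is
    Beta(a0 + delta*y0, b0 + delta*(n0 - y0)) and the marginal is beta-binomial."""
    a = a0 + delta * y0
    b = b0 + delta * (n0 - y0)
    return np.log(comb(n, y)) + betaln(a + y, b + n - y) - betaln(a, b)

# hypothetical trials: historical 20/50 responders, current 14/30 responders
y0, n0 = 20, 50
y, n = 14, 30
res = minimize_scalar(lambda d: -log_marginal(d, y, n, y0, n0),
                      bounds=(0.0, 1.0), method="bounded")
print(f"empirical Bayes weight delta = {res.x:.2f}")
# similar historical and current response rates push delta toward 1 (full borrowing),
# while conflicting data push it toward 0.
```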
An alternative method to measure the likelihood of a financial crisis in an emerging market
NASA Astrophysics Data System (ADS)
Özlale, Ümit; Metin-Özcan, Kıvılcım
2007-07-01
This paper utilizes an early warning system in order to measure the likelihood of a financial crisis in an emerging market economy. We introduce a methodology, where we can both obtain a likelihood series and analyze the time-varying effects of several macroeconomic variables on this likelihood. Since the issue is analyzed in a non-linear state space framework, the extended Kalman filter emerges as the optimal estimation algorithm. Taking the Turkish economy as our laboratory, the results indicate that both the derived likelihood measure and the estimated time-varying parameters are meaningful and can successfully explain the path that the Turkish economy had followed between 2000 and 2006. The estimated parameters also suggest that overvalued domestic currency, current account deficit and the increase in the default risk increase the likelihood of having an economic crisis in the economy. Overall, the findings in this paper suggest that the estimation methodology introduced in this paper can also be applied to other emerging market economies as well.
Maximum likelihood estimation of signal-to-noise ratio and combiner weight
NASA Technical Reports Server (NTRS)
Kalson, S.; Dolinar, S. J.
1986-01-01
An algorithm for estimating signal to noise ratio and combiner weight parameters for a discrete time series is presented. The algorithm is based upon the joint maximum likelihood estimate of the signal and noise power. The discrete-time series are the sufficient statistics obtained after matched filtering of a biphase modulated signal in additive white Gaussian noise, before maximum likelihood decoding is performed.
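A simplified sketch of the underlying joint ML estimate is given below, assuming the BPSK symbols are known (e.g., a pilot sequence); the report's algorithm also covers combiner-weight estimation and operation before decoding, which are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(6)

# matched-filter outputs r_k = d_k * sqrt(S) + n_k,  n_k ~ N(0, N)
true_S, true_N = 4.0, 1.0
d = rng.choice([-1.0, 1.0], size=2000)                 # known BPSK symbols
r = d * np.sqrt(true_S) + rng.normal(0.0, np.sqrt(true_N), size=d.size)

# joint ML estimates of signal amplitude and noise power (symbols known):
amp_hat = np.mean(d * r)                 # ML estimate of sqrt(S)
N_hat = np.mean((r - d * amp_hat) ** 2)  # ML estimate of the noise power
snr_hat = amp_hat**2 / N_hat

print(f"estimated SNR = {snr_hat:.2f} (true {true_S / true_N:.2f})")
# a combiner weight proportional to amp_hat / N_hat would then maximize the
# combined SNR when several such streams are added coherently.
```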
Changren Weng; Thomas L. Kubisiak; C. Dana Nelson; James P. Geaghan; Michael Stine
1999-01-01
Single marker regression and single marker maximum likelihood estimation were used to detect quantitative trait loci (QTLs) controlling the early height growth of longleaf pine and slash pine using a ((longleaf pine x slash pine) x slash pine) BC1 population consisting of 83 progeny. Maximum likelihood estimation was found to be more powerful than regression and could...
Investigating the Impact of Uncertainty about Item Parameters on Ability Estimation
ERIC Educational Resources Information Center
Zhang, Jinming; Xie, Minge; Song, Xiaolan; Lu, Ting
2011-01-01
Asymptotic expansions of the maximum likelihood estimator (MLE) and weighted likelihood estimator (WLE) of an examinee's ability are derived while item parameter estimators are treated as covariates measured with error. The asymptotic formulae present the amount of bias of the ability estimators due to the uncertainty of item parameter estimators.…
Racial disparities in prescription drug use for mental illness among population in US.
Han, Euna; Liu, Gordon G
2005-09-01
Racial minorities are a rapidly growing portion of the US population. Research suggests that racial minorities are more vulnerable to mental illness due to risk factors, such as higher rates of poverty. Given that the burden of mental illnesses is significant, equal likelihood of mental health services utilization is important to reduce such burden. Racial minorities have been known to use mental health services less than Whites. However, it is unclear whether racial disparity in prescription drug use for mental illnesses exists in a nationally representative sample. For a valid estimation of prescription drug use patterns, the characteristic in the distribution of prescription drug use should be accounted for in the estimation model. This study is intended to document whether there was a disparity in psychiatric drug use in both extensive and intensive margins between Whites and three racial minorities: Blacks, Hispanics, and Asian-Indians. The study looked at several specified mental illnesses, controlling for underlying health status and other confounding factors. Secondary data analysis was conducted using the multiyear Medical Expenditure Panel Survey (MEPS), a nationally representative panel sample from 1996 through 2000. This analysis provides estimates of the actual expenditure on prescription drug use for people with specified mental illnesses for this study, based on comparison of Whites and other racial minorities. We derived the estimates from the two-part model, a framework that adjusts the likelihood of using prescription drugs for the specified mental illnesses while estimating the total actual expenditures on prescription drugs among the users. This study found that Blacks, Hispanics, and Asian-Indians were less likely than Whites to use prescription drugs by 8.3, 6.1 and 23.6 percentage points, respectively, holding other factors constant in the sample, with at least one of the specified mental illnesses. The expenditure on prescription drugs for the specified mental illnesses differs between each of racial minorities (Blacks, Hispanics, and Asian-Indians) and Whites even after adjusting for the different likelihood of using those prescription drugs. Blacks, Hispanics, and Asian-Indians with the specified mental illnesses were estimated to spend 606.53 US dollars, 9.83 US dollars and 179.60 US dollars less per year, respectively, on their actual prescription drugs than Whites. This study concludes that three racial minorities: Blacks, Hispanics, and Asian-Indians, with the specified mental illnesses are less likely to use psychiatric drugs than Whites. Among users, racial minorities use less psychiatric drugs than Whites in terms of actual spending on those drugs. There is a need to focus on a program to reach out to racial minorities with a diagnosis of mental illnesses, and this program should consider the cultural specificity of each minority group regarding mental illnesses. In the development of mental health policy, it is crucial to understand the underlying non-socioeconomic factors which may significantly determine the access to mental health service. Also, education programs or other outreach programs for racial minorities are necessary to understand the different distribution of mental health services for racial minorities. Future research should examine the causes for racial disparity in the use of prescription drugs for mental illness both in the extensive and intensive margins. 
An in-depth analysis is needed to map out the attributes for the observed disparity between Whites and racial minorities in mental health service use.
Maximum likelihood estimation of finite mixture model for economic data
NASA Astrophysics Data System (ADS)
Phoong, Seuk-Yen; Ismail, Mohd Tahir
2014-06-01
A finite mixture model is a mixture model with a finite number of components. Such models provide a natural representation of heterogeneity across a finite number of latent classes, and are also known as latent class models or unsupervised learning models. Recently, fitting finite mixture models by maximum likelihood estimation has drawn considerable attention from statisticians. The main reason is that maximum likelihood estimation is a powerful statistical method which provides consistent estimates as the sample size increases to infinity. Thus, maximum likelihood estimation is used to fit a finite mixture model in the present paper in order to explore the relationship between nonlinear economic data. Specifically, a two-component normal mixture model is fitted by maximum likelihood estimation in order to investigate the relationship between stock market prices and rubber prices for the sampled countries. The results show a negative relationship between rubber price and stock market price for Malaysia, Thailand, the Philippines and Indonesia.
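The standard way to carry out such a maximum likelihood fit is the EM algorithm; the sketch below fits a two-component univariate normal mixture to synthetic data standing in for the price series (it is illustrative, not the paper's dataset or code).

```python
import numpy as np
from scipy.stats import norm

def em_two_component(x, n_iter=200):
    """EM algorithm for a two-component univariate normal mixture."""
    # crude initialization from the data quartiles
    w, mu1, mu2 = 0.5, np.percentile(x, 25), np.percentile(x, 75)
    s1 = s2 = np.std(x)
    for _ in range(n_iter):
        # E-step: posterior responsibility of component 1 for each observation
        p1 = w * norm.pdf(x, mu1, s1)
        p2 = (1 - w) * norm.pdf(x, mu2, s2)
        r = p1 / (p1 + p2)
        # M-step: weighted ML updates of the mixture parameters
        w = r.mean()
        mu1 = np.sum(r * x) / r.sum()
        mu2 = np.sum((1 - r) * x) / (1 - r).sum()
        s1 = np.sqrt(np.sum(r * (x - mu1) ** 2) / r.sum())
        s2 = np.sqrt(np.sum((1 - r) * (x - mu2) ** 2) / (1 - r).sum())
    return w, (mu1, s1), (mu2, s2)

# synthetic data: mixture of a calm regime and a volatile regime
rng = np.random.default_rng(7)
x = np.concatenate([rng.normal(0.0, 1.0, 700), rng.normal(3.0, 2.0, 300)])
w, comp1, comp2 = em_two_component(x)
print(f"weight = {w:.2f}, component 1 = {comp1}, component 2 = {comp2}")
```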
A Solution to Separation and Multicollinearity in Multiple Logistic Regression
Shen, Jianzhao; Gao, Sujuan
2010-01-01
In dementia screening tests, item selection for shortening an existing screening test can be achieved using multiple logistic regression. However, maximum likelihood estimates for such logistic regression models often experience serious bias or even non-existence because of separation and multicollinearity problems resulting from a large number of highly correlated items. Firth (1993, Biometrika, 80(1), 27–38) proposed a penalized likelihood estimator for generalized linear models and it was shown to reduce bias and the non-existence problems. The ridge regression has been used in logistic regression to stabilize the estimates in cases of multicollinearity. However, neither solves the problems for each other. In this paper, we propose a double penalized maximum likelihood estimator combining Firth’s penalized likelihood equation with a ridge parameter. We present a simulation study evaluating the empirical performance of the double penalized likelihood estimator in small to moderate sample sizes. We demonstrate the proposed approach using a current screening data from a community-based dementia study. PMID:20376286
Maximum Likelihood Estimation with Emphasis on Aircraft Flight Data
NASA Technical Reports Server (NTRS)
Iliff, K. W.; Maine, R. E.
1985-01-01
Accurate modeling of flexible space structures is an important field that is currently under investigation. Parameter estimation, using methods such as maximum likelihood, is one of the ways that the model can be improved. The maximum likelihood estimator has been used to extract stability and control derivatives from flight data for many years. Most of the literature on aircraft estimation concentrates on new developments and applications, assuming familiarity with basic estimation concepts. Some of these basic concepts are presented. The maximum likelihood estimator and the aircraft equations of motion that the estimator uses are briefly discussed. The basic concepts of minimization and estimation are examined for a simple computed aircraft example. The cost functions that are to be minimized during estimation are defined and discussed. Graphic representations of the cost functions are given to help illustrate the minimization process. Finally, the basic concepts are generalized, and estimation from flight data is discussed. Specific examples of estimation of structural dynamics are included. Some of the major conclusions for the computed example are also developed for the analysis of flight data.
NASA Technical Reports Server (NTRS)
Peters, B. C., Jr.; Walker, H. F.
1975-01-01
New results and insights concerning a previously published iterative procedure for obtaining maximum-likelihood estimates of the parameters for a mixture of normal distributions were discussed. It was shown that the procedure converges locally to the consistent maximum likelihood estimate as long as a specified parameter is bounded between two limits. Bound values were given to yield optimal local convergence.
Parameter identifiability and regional calibration for reservoir inflow prediction
NASA Astrophysics Data System (ADS)
Kolberg, Sjur; Engeland, Kolbjørn; Tøfte, Lena S.; Bruland, Oddbjørn
2013-04-01
The large hydropower producer Statkraft is currently testing regional, distributed models for operational reservoir inflow prediction. The need for simultaneous forecasts and consistent updating in a large number of catchments supports the shift from catchment-oriented to regional models. Low-quality naturalized inflow series in the reservoir catchments further encourage the use of donor catchments and regional simulation for calibration purposes. MCMC-based parameter estimation (the DREAM algorithm; Vrugt et al., 2009) is adapted to regional parameter estimation, and implemented within the open source ENKI framework. The likelihood is based on the concept of an effectively independent number of observations, spatially as well as in time. Marginal and conditional (around an optimum) parameter distributions for each catchment may be extracted, even though the MCMC algorithm itself is guided only by the regional likelihood surface. Early results indicate that the average performance loss associated with regional calibration (difference in Nash-Sutcliffe R2 between regionally and locally optimal parameters) is in the range of 0.06. The importance of the seasonal snow storage and melt in Norwegian mountain catchments probably contributes to the high degree of similarity among catchments. The evaluation continues for several regions, focusing on posterior parameter uncertainty and identifiability. Vrugt, J. A., C. J. F. ter Braak, C. G. H. Diks, B. A. Robinson, J. M. Hyman and D. Higdon: Accelerating Markov Chain Monte Carlo Simulation by Differential Evolution with Self-Adaptive Randomized Subspace Sampling. International Journal of Nonlinear Sciences and Numerical Simulation, 10(3), 273-290, 2009.
NASA Technical Reports Server (NTRS)
Peters, B. C., Jr.; Walker, H. F.
1975-01-01
A general iterative procedure is given for determining the consistent maximum likelihood estimates of normal distributions. In addition, a local maximum of the log-likelihood function, Newton's method, a method of scoring, and modifications of these procedures are discussed.
ERIC Educational Resources Information Center
Magis, David; Raiche, Gilles
2012-01-01
This paper focuses on two estimators of ability with logistic item response theory models: the Bayesian modal (BM) estimator and the weighted likelihood (WL) estimator. For the BM estimator, Jeffreys' prior distribution is considered, and the corresponding estimator is referred to as the Jeffreys modal (JM) estimator. It is established that under…
Extreme data compression for the CMB
NASA Astrophysics Data System (ADS)
Zablocki, Alan; Dodelson, Scott
2016-04-01
We apply the Karhunen-Loève methods to cosmic microwave background (CMB) data sets, and show that we can recover the input cosmology and obtain the marginalized likelihoods in Λ cold dark matter cosmologies in under a minute, much faster than Markov chain Monte Carlo methods. This is achieved by forming a linear combination of the power spectra at each multipole l, and solving a system of simultaneous equations such that the Fisher matrix is locally unchanged. Instead of carrying out a full likelihood evaluation over the whole parameter space, we need to evaluate the likelihood only for the parameter of interest, with the data compression effectively marginalizing over all other parameters. The weighting vectors contain insight about the physical effects of the parameters on the CMB anisotropy power spectrum Cl. The shape and amplitude of these vectors give an intuitive feel for the physics of the CMB, the sensitivity of the observed spectrum to cosmological parameters, and the relative sensitivity of different experiments to cosmological parameters. We test this method on exact theory Cl as well as on a Wilkinson Microwave Anisotropy Probe (WMAP)-like CMB data set generated from a random realization of a fiducial cosmology, comparing the compression results to those from a full likelihood analysis using CosmoMC. After showing that the method works, we apply it to the temperature power spectrum from the WMAP seven-year data release, and discuss the successes and limitations of our method as applied to a real data set.
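The abstract does not spell out the weighting scheme; a generic score-compression sketch, which likewise preserves the Fisher matrix to first order for Gaussian data with parameter-independent covariance, is given below with entirely made-up spectra and parameters (this is not the authors' code).

```python
import numpy as np

def score_compress(data, mu_fid, dmu_dtheta, cov):
    """Compress a data vector to one number per parameter: t_a = dmu/dtheta_a^T C^{-1} (d - mu_fid).

    For Gaussian data with parameter-independent covariance, this linear compression
    preserves the Fisher matrix at the fiducial point.
    """
    cinv = np.linalg.inv(cov)
    return dmu_dtheta @ cinv @ (data - mu_fid)

# Toy example: 500 "multipoles", 2 parameters.
rng = np.random.default_rng(2)
ell = np.arange(2, 502)
mu_fid = 1000.0 / ell ** 2                          # made-up fiducial spectrum
dmu = np.vstack([mu_fid, -np.log(ell) * mu_fid])    # derivatives w.r.t. two toy parameters
cov = np.diag((0.05 * mu_fid) ** 2)
data = mu_fid + 0.05 * mu_fid * rng.normal(size=ell.size)

t = score_compress(data, mu_fid, dmu, cov)          # two numbers instead of 500
print(t)
```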
Estimation Methods for Non-Homogeneous Regression - Minimum CRPS vs Maximum Likelihood
NASA Astrophysics Data System (ADS)
Gebetsberger, Manuel; Messner, Jakob W.; Mayr, Georg J.; Zeileis, Achim
2017-04-01
Non-homogeneous regression models are widely used to statistically post-process numerical weather prediction models. Such regression models correct for errors in mean and variance and are capable of forecasting a full probability distribution. To estimate the corresponding regression coefficients, CRPS minimization has been used in many meteorological post-processing studies over the last decade. In contrast to maximum likelihood estimation, CRPS minimization is claimed to yield more calibrated forecasts. Theoretically, both scoring rules, when used as optimization criteria, should locate the same (unknown) optimum. Discrepancies can arise from a wrong distributional assumption about the observed quantity. To address this theoretical concept, this study compares maximum likelihood and minimum CRPS estimation for different distributional assumptions. First, a synthetic case study shows that, for an appropriate distributional assumption, both estimation methods yield similar regression coefficients. The log-likelihood estimator is slightly more efficient. A real-world case study for surface temperature forecasts at different sites in Europe confirms these results but shows that surface temperature does not always follow the classical assumption of a Gaussian distribution. KEYWORDS: ensemble post-processing, maximum likelihood estimation, CRPS minimization, probabilistic temperature forecasting, distributional regression models
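A minimal sketch of the two optimization scores for a Gaussian predictive distribution, using the closed-form CRPS of a normal forecast, might look as follows; the toy predictor and coefficients are assumptions for illustration.

```python
import numpy as np
from scipy.stats import norm
from scipy.optimize import minimize

def crps_normal(y, mu, sigma):
    """Closed-form CRPS of a N(mu, sigma^2) forecast for observation y."""
    z = (y - mu) / sigma
    return sigma * (z * (2 * norm.cdf(z) - 1) + 2 * norm.pdf(z) - 1 / np.sqrt(np.pi))

def neg_log_lik(params, x, y):
    a, b, c = params
    mu, sigma = a + b * x, np.exp(c)
    return -np.sum(norm.logpdf(y, mu, sigma))

def mean_crps(params, x, y):
    a, b, c = params
    mu, sigma = a + b * x, np.exp(c)
    return np.mean(crps_normal(y, mu, sigma))

rng = np.random.default_rng(3)
x = rng.normal(size=500)                          # toy ensemble-mean predictor
y = 1.0 + 0.8 * x + rng.normal(scale=1.5, size=500)

ml_fit   = minimize(neg_log_lik, x0=[0.0, 0.5, 0.0], args=(x, y)).x
crps_fit = minimize(mean_crps,   x0=[0.0, 0.5, 0.0], args=(x, y)).x
print(ml_fit, crps_fit)   # similar coefficients when the Gaussian assumption holds
```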
The costs of turnover in nursing homes
Mukamel, Dana B.; Spector, William D.; Limcangco, Rhona; Wang, Ying; Feng, Zhanlian; Mor, Vincent
2009-01-01
Background Turnover rates in nursing homes have been persistently high for decades, ranging upwards of 100%. Objectives To estimate the net costs associated with turnover of direct care staff in nursing homes. Data and sample 902 nursing homes in California in 2005. Data included Medicaid cost reports, the Minimum Data Set (MDS), Medicare enrollment files, Census and the Area Resource File (ARF). Research Design We estimated total cost functions, which included, in addition to exogenous outputs and wages, the facility turnover rate. Instrumental variable (IV) limited information maximum likelihood techniques were used for estimation to deal with the endogeneity of turnover and costs. Results The cost functions exhibited the expected behavior, with initially increasing and then decreasing returns to scale. The ordinary least squares estimate did not show a significant association between costs and turnover. The IV estimate of turnover costs was negative and significant (p=0.039). The marginal cost savings associated with a 10 percentage point increase in turnover for an average facility was $167,063 or 2.9% of annual total costs. Conclusion The net savings associated with turnover offer an explanation for the persistence of this phenomenon over the last decades, despite the many policy initiatives to reduce it. Future policy efforts need to recognize the complex relationship between turnover and costs. PMID:19648834
Multiple-hit parameter estimation in monolithic detectors.
Hunter, William C J; Barrett, Harrison H; Lewellen, Tom K; Miyaoka, Robert S
2013-02-01
We examine a maximum-a-posteriori method for estimating the primary interaction position of gamma rays with multiple interaction sites (hits) in a monolithic detector. In assessing the performance of a multiple-hit estimator over that of a conventional one-hit estimator, we consider a few different detector and readout configurations of a 50-mm-wide square cerium-doped lutetium oxyorthosilicate block. For this study, we use simulated data from SCOUT, a Monte-Carlo tool for photon tracking and modeling scintillation-camera output. With this tool, we determine estimate bias and variance for a multiple-hit estimator and compare these with similar metrics for a one-hit maximum-likelihood estimator, which assumes full energy deposition in one hit. We also examine the effect of event filtering on these metrics; for this purpose, we use a likelihood threshold to reject signals that are not likely to have been produced under the assumed likelihood model. Depending on detector design, we observe a 1%-12% improvement of intrinsic resolution for a 1-or-2-hit estimator as compared with a 1-hit estimator. We also observe improved differentiation of photopeak events using a 1-or-2-hit estimator as compared with the 1-hit estimator; more than 6% of photopeak events that were rejected by likelihood filtering for the 1-hit estimator were accurately identified as photopeak events and positioned without loss of resolution by a 1-or-2-hit estimator; for PET, this equates to at least a 12% improvement in coincidence-detection efficiency with likelihood filtering applied.
An evaluation of percentile and maximum likelihood estimators of Weibull parameters
Stanley J. Zarnoch; Tommy R. Dell
1985-01-01
Two methods of estimating the three-parameter Weibull distribution were evaluated by computer simulation and field data comparison. Maximum likelihood estimators (MLE) with bias correction were calculated with the computer routine FITTER (Bailey 1974); percentile estimators (PCT) were those proposed by Zanakis (1979). The MLE estimators had smaller bias and...
The Equivalence of Two Methods of Parameter Estimation for the Rasch Model.
ERIC Educational Resources Information Center
Blackwood, Larry G.; Bradley, Edwin L.
1989-01-01
Two methods of estimating parameters in the Rasch model are compared. The equivalence of likelihood estimations from the model of G. J. Mellenbergh and P. Vijn (1981) and from usual unconditional maximum likelihood (UML) estimation is demonstrated. Mellenbergh and Vijn's model is a convenient method of calculating UML estimates. (SLD)
ERIC Educational Resources Information Center
Yang, Xiangdong; Poggio, John C.; Glasnapp, Douglas R.
2006-01-01
The effects of five ability estimators, that is, maximum likelihood estimator, weighted likelihood estimator, maximum a posteriori, expected a posteriori, and Owen's sequential estimator, on the performances of the item response theory-based adaptive classification procedure on multiple categories were studied via simulations. The following…
SMURC: High-Dimension Small-Sample Multivariate Regression With Covariance Estimation.
Bayar, Belhassen; Bouaynaya, Nidhal; Shterenberg, Roman
2017-03-01
We consider a high-dimension low sample-size multivariate regression problem that accounts for correlation of the response variables. The system is underdetermined as there are more parameters than samples. We show that the maximum likelihood approach with covariance estimation is senseless because the likelihood diverges. We subsequently propose a normalization of the likelihood function that guarantees convergence. We call this method small-sample multivariate regression with covariance (SMURC) estimation. We derive an optimization problem and its convex approximation to compute SMURC. Simulation results show that the proposed algorithm outperforms the regularized likelihood estimator with known covariance matrix and the sparse conditional Gaussian graphical model. We also apply SMURC to the inference of the wing-muscle gene network of the Drosophila melanogaster (fruit fly).
Cosmology with the largest galaxy cluster surveys: going beyond Fisher matrix forecasts
DOE Office of Scientific and Technical Information (OSTI.GOV)
Khedekar, Satej; Majumdar, Subhabrata, E-mail: satej@mpa-garching.mpg.de, E-mail: subha@tifr.res.in
2013-02-01
We make the first detailed MCMC likelihood study of cosmological constraints that are expected from some of the largest, ongoing and proposed, cluster surveys in different wave-bands and compare the estimates to the prevalent Fisher matrix forecasts. Mock catalogs of cluster counts expected from the surveys eROSITA, WFXT, RCS2, DES, and Planck, along with a mock dataset of follow-up mass calibrations, are analyzed for this purpose. A fair agreement between MCMC and Fisher results is found only in the case of minimal models. However, for many cases, the marginalized constraints obtained from Fisher and MCMC methods can differ by factors of 30-100%. The discrepancy can be alarmingly large for a time dependent dark energy equation of state, w(a); the Fisher methods are seen to under-estimate the constraints by as much as a factor of 4-5. Typically, Fisher estimates become more and more inappropriate as we move away from ΛCDM, to a constant-w dark energy to varying-w dark energy cosmologies. Fisher analysis also predicts incorrect parameter degeneracies. There are noticeable offsets in the likelihood contours obtained from Fisher methods, caused by an asymmetry in the posterior likelihood distribution as seen through an MCMC analysis. From the point of view of mass-calibration uncertainties, a high value of unknown scatter about the mean mass-observable relation, and its redshift dependence, is seen to have large degeneracies with the cosmological parameters σ8 and w(a) and can degrade the cosmological constraints considerably. We find that the addition of mass-calibrated cluster datasets can improve dark energy and σ8 constraints by factors of 2-3 over what can be obtained from CMB+SNe+BAO only. Finally, we show that a joint analysis of datasets of two (or more) different cluster surveys would significantly tighten cosmological constraints from using clusters only. Since details of future cluster surveys are still being planned, we emphasize that optimal survey design must be done using MCMC analysis rather than Fisher forecasting.
Ohlsson, Claes; Nethander, Maria; Karlsson, Magnus K; Rosengren, Björn E; Ribom, Eva; Mellström, Dan; Vandenput, Liesbeth
2018-03-12
The adrenal-derived hormones dehydroepiandrosterone (DHEA) and its sulfate (DHEAS) are the most abundant circulating hormones and their levels decline substantially with age. Many of the actions of DHEAS are considered to be mediated through metabolism into androgens and estrogens in peripheral target tissues. The predictive value of serum DHEA and DHEAS for the likelihood of falling is unknown. The aim of this study was, therefore, to assess the associations between baseline DHEA and DHEAS levels and incident fall risk in a large cohort of older men. Serum DHEA and DHEAS levels were analyzed with mass spectrometry in the population-based Osteoporotic Fractures in Men study in Sweden (n = 2516, age 69 to 81 years). Falls were ascertained every 4 months by mailed questionnaires. Associations between steroid hormones and falls were estimated by generalized estimating equations. During a mean follow-up of 2.7 years, 968 (38.5%) participants experienced a fall. High serum levels of both DHEA (odds ratio [OR] per SD increase 0.85; 95% CI, 0.78 to 0.92) and DHEAS (OR 0.88, 95% CI, 0.81 to 0.95) were associated with a lower incident fall risk in models adjusted for age, BMI, and prevalent falls. Further adjustment for serum sex steroids or age-related comorbidities only marginally attenuated the associations between DHEA or DHEAS and the likelihood of falling. Moreover, the point estimates for DHEA and DHEAS were only slightly reduced after adjustment for lean mass and/or grip strength. Also, the addition of the narrow walk test did not substantially alter the associations between serum DHEA or DHEAS and fall risk. Finally, the association with incident fall risk remained significant for DHEA but not for DHEAS after simultaneous adjustment for lean mass, grip strength, and the narrow walk test. This suggests that the associations between DHEA and DHEAS and falls are only partially mediated via muscle mass, muscle strength, and/or balance. In conclusion, older men with high DHEA or DHEAS levels have a lesser likelihood of a fall. © 2018 American Society for Bone and Mineral Research.
Estimating parameter of Rayleigh distribution by using Maximum Likelihood method and Bayes method
NASA Astrophysics Data System (ADS)
Ardianti, Fitri; Sutarman
2018-01-01
In this paper, we use maximum likelihood estimation and the Bayes method under several risk functions to estimate the parameter of the Rayleigh distribution and determine which method performs best. The prior used in the Bayes method is Jeffreys' non-informative prior. Maximum likelihood estimation and the Bayes method under the precautionary loss function, the entropy loss function, and the L1 loss function are compared. We compare these methods by bias and MSE values computed with an R program, and the results are displayed in tables to facilitate the comparisons.
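A minimal sketch of the maximum likelihood part of such a comparison (bias and MSE over repeated samples) is shown below in Python rather than R; the Bayes estimators under the various loss functions are omitted.

```python
import numpy as np

def rayleigh_mle(x):
    """Maximum likelihood estimate of the Rayleigh scale parameter sigma."""
    return np.sqrt(np.sum(x ** 2) / (2 * len(x)))

rng = np.random.default_rng(4)
sigma_true, n, reps = 2.0, 30, 5000
est = np.array([rayleigh_mle(rng.rayleigh(sigma_true, n)) for _ in range(reps)])

bias = est.mean() - sigma_true
mse  = np.mean((est - sigma_true) ** 2)
print(bias, mse)   # Monte Carlo bias and MSE of the MLE
```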
Consistency of Rasch Model Parameter Estimation: A Simulation Study.
ERIC Educational Resources Information Center
van den Wollenberg, Arnold L.; And Others
1988-01-01
The unconditional (simultaneous) maximum likelihood (UML) estimation procedure for the one-parameter logistic model produces biased estimators. The UML method is inconsistent and is not a good alternative to the conditional maximum likelihood method, at least with small numbers of items. The minimum chi-square estimation procedure produces unbiased…
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gibson, Eli; Biomedical Engineering, University of Western Ontario, London, Ontario; Centre for Medical Image Computing, University College London, London
Purpose: Defining prostate cancer (PCa) lesion clinical target volumes (CTVs) for multiparametric magnetic resonance imaging (mpMRI) could support focal boosting or treatment to improve outcomes or lower morbidity, necessitating appropriate CTV margins for mpMRI-defined gross tumor volumes (GTVs). This study aimed to identify CTV margins yielding 95% coverage of PCa tumors for prospective cases with high likelihood. Methods and Materials: Twenty-five men with biopsy-confirmed clinical stage T1 or T2 PCa underwent pre-prostatectomy mpMRI, yielding T2-weighted, dynamic contrast-enhanced, and apparent diffusion coefficient images. Digitized whole-mount histology was contoured and registered to mpMRI scans (error ≤2 mm). Four observers contoured lesion GTVs on each mpMRI scan. CTVs were defined by isotropic and anisotropic expansion from these GTVs and from multiparametric (unioned) GTVs from 2 to 3 scans. Histologic coverage (proportions of tumor area on co-registered histology inside the CTV, measured for Gleason scores [GSs] ≥6 and ≥7) and prostate sparing (proportions of prostate volume outside the CTV) were measured. Nonparametric histologic-coverage prediction intervals defined minimal margins yielding 95% coverage for prospective cases with 78% to 92% likelihood. Results: On analysis of 72 true-positive tumor detections, 95% coverage margins were 9 to 11 mm (GS ≥ 6) and 8 to 10 mm (GS ≥ 7) for single-sequence GTVs and were 8 mm (GS ≥ 6) and 6 mm (GS ≥ 7) for 3-sequence GTVs, yielding CTVs that spared 47% to 81% of prostate tissue for the majority of tumors. Inclusion of T2-weighted contours increased sparing for multiparametric CTVs with 95% coverage margins for GS ≥6, and inclusion of dynamic contrast-enhanced contours increased sparing for GS ≥7. Anisotropic 95% coverage margins increased the sparing proportions to 71% to 86%. Conclusions: Multiparametric magnetic resonance imaging–defined GTVs expanded by appropriate margins may support focal boosting or treatment of PCa; however, these margins, accounting for interobserver and intertumoral variability, may preclude highly conformal CTVs. Multiparametric GTVs and anisotropic margins may reduce the required margins and improve prostate sparing.
Maximum Likelihood Estimations and EM Algorithms with Length-biased Data
Qin, Jing; Ning, Jing; Liu, Hao; Shen, Yu
2012-01-01
Length-biased sampling has been well recognized in economics, industrial reliability, etiology applications, epidemiological, genetic, and cancer screening studies. Length-biased right-censored data have a unique data structure different from traditional survival data. The nonparametric and semiparametric estimation and inference methods for traditional survival data are not directly applicable to length-biased right-censored data. We propose new expectation-maximization algorithms for estimation based on full likelihoods involving infinite-dimensional parameters under three settings for length-biased data: estimating the nonparametric distribution function, estimating the nonparametric hazard function under an increasing failure rate constraint, and jointly estimating the baseline hazard function and the covariate coefficients under the Cox proportional hazards model. Extensive empirical simulation studies show that the maximum likelihood estimators perform well with moderate sample sizes and lead to more efficient estimators compared to the estimating equation approaches. The proposed estimates are also more robust to various right-censoring mechanisms. We prove the strong consistency properties of the estimators, and establish the asymptotic normality of the semiparametric maximum likelihood estimators under the Cox model using modern empirical process theory. We apply the proposed methods to a prevalent cohort medical study. Supplemental materials are available online. PMID:22323840
Regression estimators for generic health-related quality of life and quality-adjusted life years.
Basu, Anirban; Manca, Andrea
2012-01-01
To develop regression models for outcomes with truncated supports, such as health-related quality of life (HRQoL) data, and account for features typical of such data such as a skewed distribution, spikes at 1 or 0, and heteroskedasticity. Regression estimators based on features of the Beta distribution. First, both a single equation and a 2-part model are presented, along with estimation algorithms based on maximum-likelihood, quasi-likelihood, and Bayesian Markov-chain Monte Carlo methods. A novel Bayesian quasi-likelihood estimator is proposed. Second, a simulation exercise is presented to assess the performance of the proposed estimators against ordinary least squares (OLS) regression for a variety of HRQoL distributions that are encountered in practice. Finally, the performance of the proposed estimators is assessed by using them to quantify the treatment effect on QALYs in the EVALUATE hysterectomy trial. Overall model fit is studied using several goodness-of-fit tests such as Pearson's correlation test, link and reset tests, and a modified Hosmer-Lemeshow test. The simulation results indicate that the proposed methods are more robust in estimating covariate effects than OLS, especially when the effects are large or the HRQoL distribution has a large spike at 1. Quasi-likelihood techniques are more robust than maximum likelihood estimators. When applied to the EVALUATE trial, all but the maximum likelihood estimators produce unbiased estimates of the treatment effect. One and 2-part Beta regression models provide flexible approaches to regress the outcomes with truncated supports, such as HRQoL, on covariates, after accounting for many idiosyncratic features of the outcomes distribution. This work will provide applied researchers with a practical set of tools to model outcomes in cost-effectiveness analysis.
2010-06-01
GMKPF represents a better and more flexible alternative to the Gaussian Maximum Likelihood (GML) and Exponential Maximum Likelihood (EML) estimators for clock offset estimation in non-Gaussian or non-exponential settings, yielding more accurate results relative to GML and EML when the network delays are modeled in terms of a single non-Gaussian/non-exponential distribution or as a ...
Maximum-likelihood estimation of parameterized wavefronts from multifocal data
Sakamoto, Julia A.; Barrett, Harrison H.
2012-01-01
A method for determining the pupil phase distribution of an optical system is demonstrated. Coefficients in a wavefront expansion were estimated using likelihood methods, where the data consisted of multiple irradiance patterns near focus. Proof-of-principle results were obtained in both simulation and experiment. Large-aberration wavefronts were handled in the numerical study. Experimentally, we discuss the handling of nuisance parameters. Fisher information matrices, Cramér-Rao bounds, and likelihood surfaces are examined. ML estimates were obtained by simulated annealing to deal with numerous local extrema in the likelihood function. Rapid processing techniques were employed to reduce the computational time. PMID:22772282
Tian, Guo-Liang; Li, Hui-Qiong
2017-08-01
Some existing confidence interval methods and hypothesis testing methods in the analysis of a contingency table with incomplete observations in both margins entirely depend on an underlying assumption that the sampling distribution of the observed counts is a product of independent multinomial/binomial distributions for complete and incomplete counts. However, it can be shown that this independence assumption is incorrect and can result in unreliable conclusions because of the under-estimation of the uncertainty. Therefore, the first objective of this paper is to derive the valid joint sampling distribution of the observed counts in a contingency table with incomplete observations in both margins. The second objective is to provide a new framework for analyzing incomplete contingency tables based on the derived joint sampling distribution of the observed counts by developing a Fisher scoring algorithm to calculate maximum likelihood estimates of parameters of interest, bootstrap confidence interval methods, and bootstrap hypothesis testing methods. We compare the differences between the valid sampling distribution and the sampling distribution under the independence assumption. Simulation studies showed that average/expected confidence-interval widths of parameters based on the sampling distribution under the independence assumption are shorter than those based on the new sampling distribution, yielding unrealistic results. A real data set is analyzed to illustrate the application of the new sampling distribution for incomplete contingency tables, and the analysis results again confirm the conclusions obtained from the simulation studies.
A comparison of abundance estimates from extended batch-marking and Jolly–Seber-type experiments
Cowen, Laura L E; Besbeas, Panagiotis; Morgan, Byron J T; Schwarz, Carl J
2014-01-01
Little attention has been paid to the use of multi-sample batch-marking studies, as it is generally assumed that an individual's capture history is necessary for fully efficient estimates. However, recently, Huggins et al. (2010) present a pseudo-likelihood for a multi-sample batch-marking study where they used estimating equations to solve for survival and capture probabilities and then derived abundance estimates using a Horvitz–Thompson-type estimator. We have developed and maximized the likelihood for batch-marking studies. We use data simulated from a Jolly–Seber-type study and convert this to what would have been obtained from an extended batch-marking study. We compare our abundance estimates obtained from the Crosbie–Manly–Arnason–Schwarz (CMAS) model with those of the extended batch-marking model to determine the efficiency of collecting and analyzing batch-marking data. We found that estimates of abundance were similar for all three estimators: CMAS, Huggins, and our likelihood. Gains are made when using unique identifiers and employing the CMAS model in terms of precision; however, the likelihood typically had lower mean square error than the pseudo-likelihood method of Huggins et al. (2010). When faced with designing a batch-marking study, researchers can be confident in obtaining unbiased abundance estimators. Furthermore, they can design studies in order to reduce mean square error by manipulating capture probabilities and sample size. PMID:24558576
DOE Office of Scientific and Technical Information (OSTI.GOV)
Eifler, Tim; Krause, Elisabeth; Dodelson, Scott
2014-05-28
Systematic uncertainties that have been subdominant in past large-scale structure (LSS) surveys are likely to exceed statistical uncertainties of current and future LSS data sets, potentially limiting the extraction of cosmological information. Here we present a general framework (PCA marginalization) to consistently incorporate systematic effects into a likelihood analysis. This technique naturally accounts for degeneracies between nuisance parameters and can substantially reduce the dimension of the parameter space that needs to be sampled. As a practical application, we apply PCA marginalization to account for baryonic physics as an uncertainty in cosmic shear tomography. Specifically, we use CosmoLike to run simulated likelihood analyses on three independent sets of numerical simulations, each covering a wide range of baryonic scenarios differing in cooling, star formation, and feedback mechanisms. We simulate a Stage III (Dark Energy Survey) and Stage IV (Large Synoptic Survey Telescope/Euclid) survey and find a substantial bias in cosmological constraints if baryonic physics is not accounted for. We then show that PCA marginalization (employing at most 3 to 4 nuisance parameters) removes this bias. Our study demonstrates that it is possible to obtain robust, precise constraints on the dark energy equation of state even in the presence of large levels of systematic uncertainty in astrophysical processes. We conclude that the PCA marginalization technique is a powerful, general tool for addressing many of the challenges facing the precision cosmology program.
Profile-Likelihood Approach for Estimating Generalized Linear Mixed Models with Factor Structures
ERIC Educational Resources Information Center
Jeon, Minjeong; Rabe-Hesketh, Sophia
2012-01-01
In this article, the authors suggest a profile-likelihood approach for estimating complex models by maximum likelihood (ML) using standard software and minimal programming. The method works whenever setting some of the parameters of the model to known constants turns the model into a standard model. An important class of models that can be…
Multiple-Hit Parameter Estimation in Monolithic Detectors
Barrett, Harrison H.; Lewellen, Tom K.; Miyaoka, Robert S.
2014-01-01
We examine a maximum-a-posteriori method for estimating the primary interaction position of gamma rays with multiple interaction sites (hits) in a monolithic detector. In assessing the performance of a multiple-hit estimator over that of a conventional one-hit estimator, we consider a few different detector and readout configurations of a 50-mm-wide square cerium-doped lutetium oxyorthosilicate block. For this study, we use simulated data from SCOUT, a Monte-Carlo tool for photon tracking and modeling scintillation-camera output. With this tool, we determine estimate bias and variance for a multiple-hit estimator and compare these with similar metrics for a one-hit maximum-likelihood estimator, which assumes full energy deposition in one hit. We also examine the effect of event filtering on these metrics; for this purpose, we use a likelihood threshold to reject signals that are not likely to have been produced under the assumed likelihood model. Depending on detector design, we observe a 1%–12% improvement of intrinsic resolution for a 1-or-2-hit estimator as compared with a 1-hit estimator. We also observe improved differentiation of photopeak events using a 1-or-2-hit estimator as compared with the 1-hit estimator; more than 6% of photopeak events that were rejected by likelihood filtering for the 1-hit estimator were accurately identified as photopeak events and positioned without loss of resolution by a 1-or-2-hit estimator; for PET, this equates to at least a 12% improvement in coincidence-detection efficiency with likelihood filtering applied. PMID:23193231
Profile-likelihood Confidence Intervals in Item Response Theory Models.
Chalmers, R Philip; Pek, Jolynn; Liu, Yang
2017-01-01
Confidence intervals (CIs) are fundamental inferential devices which quantify the sampling variability of parameter estimates. In item response theory, CIs have been primarily obtained from large-sample Wald-type approaches based on standard error estimates, derived from the observed or expected information matrix, after parameters have been estimated via maximum likelihood. An alternative approach to constructing CIs is to quantify sampling variability directly from the likelihood function with a technique known as profile-likelihood confidence intervals (PL CIs). In this article, we introduce PL CIs for item response theory models, compare PL CIs to classical large-sample Wald-type CIs, and demonstrate important distinctions among these CIs. CIs are then constructed for parameters directly estimated in the specified model and for transformed parameters which are often obtained post-estimation. Monte Carlo simulation results suggest that PL CIs perform consistently better than Wald-type CIs for both non-transformed and transformed parameters.
Expected versus Observed Information in SEM with Incomplete Normal and Nonnormal Data
ERIC Educational Resources Information Center
Savalei, Victoria
2010-01-01
Maximum likelihood is the most common estimation method in structural equation modeling. Standard errors for maximum likelihood estimates are obtained from the associated information matrix, which can be estimated from the sample using either expected or observed information. It is known that, with complete data, estimates based on observed or…
From least squares to multilevel modeling: A graphical introduction to Bayesian inference
NASA Astrophysics Data System (ADS)
Loredo, Thomas J.
2016-01-01
This tutorial presentation will introduce some of the key ideas and techniques involved in applying Bayesian methods to problems in astrostatistics. The focus will be on the big picture: understanding the foundations (interpreting probability, Bayes's theorem, the law of total probability and marginalization), making connections to traditional methods (propagation of errors, least squares, chi-squared, maximum likelihood, Monte Carlo simulation), and highlighting problems where a Bayesian approach can be particularly powerful (Poisson processes, density estimation and curve fitting with measurement error). The "graphical" component of the title reflects an emphasis on pictorial representations of some of the math, but also on the use of graphical models (multilevel or hierarchical models) for analyzing complex data. Code for some examples from the talk will be available to participants, in Python and in the Stan probabilistic programming language.
Markov chain Monte Carlo estimation of quantum states
NASA Astrophysics Data System (ADS)
Diguglielmo, James; Messenger, Chris; Fiurášek, Jaromír; Hage, Boris; Samblowski, Aiko; Schmidt, Tabea; Schnabel, Roman
2009-03-01
We apply a Bayesian data analysis scheme known as Markov chain Monte Carlo to the tomographic reconstruction of quantum states. This method yields a vector, known as the Markov chain, which contains the full statistical information concerning all reconstruction parameters, including their statistical correlations, with no a priori assumptions as to the form of the distribution from which it has been obtained. From this vector we can derive, e.g., the marginal distributions and uncertainties of all model parameters, and also of other quantities such as the purity of the reconstructed state. We demonstrate the utility of this scheme by reconstructing the Wigner function of phase-diffused squeezed states. These states possess non-Gaussian statistics and therefore represent a nontrivial case of tomographic reconstruction. We compare our results to those obtained through pure maximum-likelihood and Fisher information approaches.
Testing Genetic Pleiotropy with GWAS Summary Statistics for Marginal and Conditional Analyses.
Deng, Yangqing; Pan, Wei
2017-12-01
There is growing interest in testing genetic pleiotropy, which is when a single genetic variant influences multiple traits. Several methods have been proposed; however, these methods have some limitations. First, all the proposed methods are based on the use of individual-level genotype and phenotype data; in contrast, for logistical, and other, reasons, summary statistics of univariate SNP-trait associations are typically only available based on meta- or mega-analyzed large genome-wide association study (GWAS) data. Second, existing tests are based on marginal pleiotropy, which cannot distinguish between direct and indirect associations of a single genetic variant with multiple traits due to correlations among the traits. Hence, it is useful to consider conditional analysis, in which a subset of traits is adjusted for another subset of traits. For example, in spite of substantial lowering of low-density lipoprotein cholesterol (LDL) with statin therapy, some patients still maintain high residual cardiovascular risk, and, for these patients, it might be helpful to reduce their triglyceride (TG) level. For this purpose, in order to identify new therapeutic targets, it would be useful to identify genetic variants with pleiotropic effects on LDL and TG after adjusting the latter for LDL; otherwise, a pleiotropic effect of a genetic variant detected by a marginal model could simply be due to its association with LDL only, given the well-known correlation between the two types of lipids. Here, we develop a new pleiotropy testing procedure based only on GWAS summary statistics that can be applied for both marginal analysis and conditional analysis. Although the main technical development is based on published union-intersection testing methods, care is needed in specifying conditional models to avoid invalid statistical estimation and inference. In addition to the previously used likelihood ratio test, we also propose using generalized estimating equations under the working independence model for robust inference. We provide numerical examples based on both simulated and real data, including two large lipid GWAS summary association datasets based on ∼100,000 and ∼189,000 samples, respectively, to demonstrate the difference between marginal and conditional analyses, as well as the effectiveness of our new approach. Copyright © 2017 by the Genetics Society of America.
NASA Technical Reports Server (NTRS)
Peters, B. C., Jr.; Walker, H. F.
1978-01-01
This paper addresses the problem of obtaining numerically maximum-likelihood estimates of the parameters for a mixture of normal distributions. In recent literature, a certain successive-approximations procedure, based on the likelihood equations, was shown empirically to be effective in numerically approximating such maximum-likelihood estimates; however, the reliability of this procedure was not established theoretically. Here, we introduce a general iterative procedure, of the generalized steepest-ascent (deflected-gradient) type, which is just the procedure known in the literature when the step-size is taken to be 1. We show that, with probability 1 as the sample size grows large, this procedure converges locally to the strongly consistent maximum-likelihood estimate whenever the step-size lies between 0 and 2. We also show that the step-size which yields optimal local convergence rates for large samples is determined in a sense by the 'separation' of the component normal densities and is bounded below by a number between 1 and 2.
NASA Technical Reports Server (NTRS)
Peters, B. C., Jr.; Walker, H. F.
1976-01-01
The problem of obtaining numerically maximum likelihood estimates of the parameters for a mixture of normal distributions is addressed. In recent literature, a certain successive approximations procedure, based on the likelihood equations, is shown empirically to be effective in numerically approximating such maximum-likelihood estimates; however, the reliability of this procedure was not established theoretically. Here, a general iterative procedure is introduced, of the generalized steepest-ascent (deflected-gradient) type, which is just the procedure known in the literature when the step-size is taken to be 1. With probability 1 as the sample size grows large, it is shown that this procedure converges locally to the strongly consistent maximum-likelihood estimate whenever the step-size lies between 0 and 2. The step-size which yields optimal local convergence rates for large samples is determined in a sense by the separation of the component normal densities and is bounded below by a number between 1 and 2.
Bootstrap Standard Errors for Maximum Likelihood Ability Estimates When Item Parameters Are Unknown
ERIC Educational Resources Information Center
Patton, Jeffrey M.; Cheng, Ying; Yuan, Ke-Hai; Diao, Qi
2014-01-01
When item parameter estimates are used to estimate the ability parameter in item response models, the standard error (SE) of the ability estimate must be corrected to reflect the error carried over from item calibration. For maximum likelihood (ML) ability estimates, a corrected asymptotic SE is available, but it requires a long test and the…
NASA Astrophysics Data System (ADS)
Sutawanir
2015-12-01
Mortality tables play an important role in actuarial studies such as life annuities, premium determination, premium reserves, pension plan valuation, and pension funding. Some known mortality tables are the CSO mortality table, the Indonesian Mortality Table, the Bowers mortality table, and the Japan mortality table. For actuarial applications, some tables are constructed under different environments such as single decrement, double decrement, and multiple decrement. There are two approaches to mortality table construction: a mathematical approach and a statistical approach. Distribution models and estimation theory are the statistical concepts used in mortality table construction. This article aims to discuss the statistical approach to mortality table construction. The distributional assumptions are the uniform distribution of deaths (UDD) and constant force (exponential). Moment estimation and maximum likelihood are used to estimate the mortality parameter. Moment estimation methods are easier to manipulate than maximum likelihood estimation (MLE); however, the complete mortality data are not used in the moment estimation method. Maximum likelihood exploits all available information in mortality estimation. Some MLE equations are complicated and are solved using numerical methods. The article focuses on single decrement estimation using moment and maximum likelihood estimation; an extension to double decrement is also introduced. A simple dataset is used to illustrate the mortality estimation and the resulting mortality table.
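As a hedged illustration of the constant-force estimate discussed above, with toy deaths and exposure (the moment-style figure uses the common initial-exposure approximation E + D/2, an assumption of this example):

```python
import numpy as np

# Toy single-decrement data for one age interval: deaths and central (person-years) exposure.
deaths = 18
central_exposure = 1450.0

# Constant-force (exponential) assumption: the MLE of the force of mortality is D / exposure.
mu_hat = deaths / central_exposure
q_constant_force = 1.0 - np.exp(-mu_hat)

# A simple moment-style (actuarial) estimate, approximating initial exposure by E + D/2.
q_moment = deaths / (central_exposure + deaths / 2.0)

print(q_constant_force, q_moment)   # one-year mortality rates under the two assumptions
```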
NASA Technical Reports Server (NTRS)
Thadani, S. G.
1977-01-01
The Maximum Likelihood Estimation of Signature Transformation (MLEST) algorithm is used to obtain maximum likelihood estimates (MLE) of affine transformation. The algorithm has been evaluated for three sets of data: simulated (training and recognition segment pairs), consecutive-day (data gathered from Landsat images), and geographical-extension (large-area crop inventory experiment) data sets. For each set, MLEST signature extension runs were made to determine MLE values and the affine-transformed training segment signatures were used to classify the recognition segments. The classification results were used to estimate wheat proportions at 0 and 1% threshold values.
Differentially Private Synthesization of Multi-Dimensional Data using Copula Functions
Li, Haoran; Xiong, Li; Jiang, Xiaoqian
2014-01-01
Differential privacy has recently emerged in private statistical data release as one of the strongest privacy guarantees. Most of the existing techniques that generate differentially private histograms or synthetic data only work well for single dimensional or low-dimensional histograms. They become problematic for high dimensional and large domain data due to increased perturbation error and computation complexity. In this paper, we propose DPCopula, a differentially private data synthesization technique using Copula functions for multi-dimensional data. The core of our method is to compute a differentially private copula function from which we can sample synthetic data. Copula functions are used to describe the dependence between multivariate random vectors and allow us to build the multivariate joint distribution using one-dimensional marginal distributions. We present two methods for estimating the parameters of the copula functions with differential privacy: maximum likelihood estimation and Kendall’s τ estimation. We present formal proofs for the privacy guarantee as well as the convergence property of our methods. Extensive experiments using both real datasets and synthetic datasets demonstrate that DPCopula generates highly accurate synthetic multi-dimensional data with significantly better utility than state-of-the-art techniques. PMID:25405241
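A non-private sketch of the Kendall's-τ route to a Gaussian copula correlation is given below; the differential-privacy perturbation that defines DPCopula is deliberately omitted, and the marginals and sample sizes are made up.

```python
import numpy as np
from scipy.stats import kendalltau, norm

rng = np.random.default_rng(5)
# Toy bivariate data with dependence and arbitrary marginals
z = rng.multivariate_normal([0, 0], [[1, 0.6], [0.6, 1]], size=2000)
x, y = np.exp(z[:, 0]), z[:, 1] ** 3

# Estimate the Gaussian-copula correlation from Kendall's tau: rho = sin(pi * tau / 2).
tau, _ = kendalltau(x, y)
rho = np.sin(np.pi * tau / 2)

# Sample synthetic data: draw from the fitted copula, then invert the empirical marginals.
u = norm.cdf(rng.multivariate_normal([0, 0], [[1, rho], [rho, 1]], size=2000))
synth_x = np.quantile(x, u[:, 0])
synth_y = np.quantile(y, u[:, 1])
print(rho)
```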
ERIC Educational Resources Information Center
Jones, Douglas H.
The progress of modern mental test theory depends very much on the techniques of maximum likelihood estimation, and many popular applications make use of likelihoods induced by logistic item response models. While, in reality, item responses are nonreplicate within a single examinee and the logistic models are only ideal, practitioners make…
Data Series Subtraction with Unknown and Unmodeled Background Noise
NASA Technical Reports Server (NTRS)
Vitale, Stefano; Congedo, Giuseppe; Dolesi, Rita; Ferroni, Valerio; Hueller, Mauro; Vetrugno, Daniele; Weber, William Joseph; Audley, Heather; Danzmann, Karsten; Diepholz, Ingo;
2014-01-01
LISA Pathfinder (LPF), the precursor mission to a gravitational wave observatory of the European Space Agency, will measure the degree to which two test masses can be put into free fall, aiming to demonstrate a suppression of disturbance forces corresponding to a residual relative acceleration with a power spectral density (PSD) below (30 fm s⁻²/√Hz)² around 1 mHz. In LPF data analysis, the disturbance forces are obtained as the difference between the acceleration data and a linear combination of other measured data series. In many circumstances, the coefficients for this linear combination are obtained by fitting these data series to the acceleration, and the disturbance forces appear then as the data series of the residuals of the fit. Thus the background noise or, more precisely, its PSD, whose knowledge is needed to build up the likelihood function in ordinary maximum likelihood fitting, is here unknown, and its estimate constitutes instead one of the goals of the fit. In this paper we present a fitting method that does not require knowledge of the PSD of the background noise. The method is based on the analytical marginalization of the posterior parameter probability density with respect to the background noise PSD, and returns an estimate both for the fitting parameters and for the PSD. We show that both these estimates are unbiased, and that, when using averaged Welch periodograms for the residuals, the estimate of the PSD is consistent, as its error tends to zero with the inverse square root of the number of averaged periodograms. Additionally, we find that the method is equivalent to some implementations of iteratively reweighted least-squares fitting. We have tested the method both on simulated data of known PSD and on data from several experiments performed with the LISA Pathfinder end-to-end mission simulator.
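Reading the equivalence to iteratively reweighted least squares literally, a toy frequency-domain sketch (not the LPF pipeline) could iterate between a weighted fit and a residual-periodogram noise estimate; the signal model, coupling coefficient, and smoothing length are assumptions of this example.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 4096
x = rng.normal(size=n)                        # measured auxiliary data series
noise = np.cumsum(rng.normal(size=n)) * 1e-2  # colored background noise, PSD unknown
a = 0.7 * x + noise                           # "acceleration" data series

X, A = np.fft.rfft(x), np.fft.rfft(a)
w = np.ones_like(X.real)                      # start with flat spectral weights
for _ in range(10):
    # Weighted least squares for the coupling coefficient in the frequency domain
    c = np.sum(w * np.conj(X) * A).real / np.sum(w * np.abs(X) ** 2)
    resid = A - c * X
    # Re-estimate the noise PSD from a smoothed residual periodogram and update the weights
    psd = np.convolve(np.abs(resid) ** 2, np.ones(64) / 64, mode="same")
    w = 1.0 / np.maximum(psd, 1e-20)
print(c)   # estimate of the 0.7 coupling; the residual periodogram estimates the noise PSD
```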
Multivariate meta-analysis: a robust approach based on the theory of U-statistic.
Ma, Yan; Mazumdar, Madhu
2011-10-30
Meta-analysis is the methodology for combining findings from similar research studies asking the same question. When the question of interest involves multiple outcomes, multivariate meta-analysis is used to synthesize the outcomes simultaneously, taking into account the correlation between the outcomes. Likelihood-based approaches, in particular the restricted maximum likelihood (REML) method, are commonly utilized in this context. REML assumes a multivariate normal distribution for the random-effects model. This assumption is difficult to verify, especially for meta-analyses with a small number of component studies. The use of REML also requires iterative estimation between parameters, needing moderately high computation time, especially when the dimension of the outcomes is large. A multivariate method of moments (MMM) is available and is shown to perform equally well to REML. However, there is a lack of information on the performance of these two methods when the true data distribution is far from normality. In this paper, we propose a new nonparametric and non-iterative method for multivariate meta-analysis on the basis of the theory of U-statistics and compare the properties of these three procedures under both normal and skewed data through simulation studies. It is shown that the effect on estimates from REML because of non-normal data distribution is marginal and that the estimates from the MMM and U-statistic-based approaches are very similar. Therefore, we conclude that for performing multivariate meta-analysis, the U-statistic estimation procedure is a viable alternative to REML and MMM. Easy implementation of all three methods is illustrated by their application to data from two published meta-analyses from the fields of hip fracture and periodontal disease. We discuss ideas for future research based on U-statistics for testing the significance of between-study heterogeneity and for extending the work to the meta-regression setting. Copyright © 2011 John Wiley & Sons, Ltd.
Computation of nonparametric convex hazard estimators via profile methods.
Jankowski, Hanna K; Wellner, Jon A
2009-05-01
This paper proposes a profile likelihood algorithm to compute the nonparametric maximum likelihood estimator of a convex hazard function. The maximisation is performed in two steps: First the support reduction algorithm is used to maximise the likelihood over all hazard functions with a given point of minimum (or antimode). Then it is shown that the profile (or partially maximised) likelihood is quasi-concave as a function of the antimode, so that a bisection algorithm can be applied to find the maximum of the profile likelihood, and hence also the global maximum. The new algorithm is illustrated using both artificial and real data, including lifetime data for Canadian males and females.
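A schematic of the two-step profiling idea is sketched below; the support-reduction inner solver is replaced by a placeholder profile function, and a golden-section search over the antimode stands in for the bisection step (illustrative only).

```python
import numpy as np

def profile_loglik(antimode):
    """Placeholder for the inner step: the log-likelihood maximised over all convex
    hazards whose minimum lies at `antimode` (support-reduction solver not shown)."""
    return -(antimode - 2.3) ** 2 / (1 + 0.1 * antimode)   # toy quasi-concave profile

def maximise_unimodal(f, lo, hi, tol=1e-6):
    """Golden-section search for the maximiser of a quasi-concave function on [lo, hi]."""
    g = (np.sqrt(5) - 1) / 2
    a, b = lo, hi
    c, d = b - g * (b - a), a + g * (b - a)
    while b - a > tol:
        if f(c) < f(d):
            a = c
        else:
            b = d
        c, d = b - g * (b - a), a + g * (b - a)
    return (a + b) / 2

antimode_hat = maximise_unimodal(profile_loglik, 0.0, 10.0)
print(antimode_hat)   # maximiser of the profile, hence of the full likelihood
```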
Extreme data compression for the CMB
Zablocki, Alan; Dodelson, Scott
2016-04-28
We apply the Karhunen-Loève methods to cosmic microwave background (CMB) data sets, and show that we can recover the input cosmology and obtain the marginalized likelihoods in Λ cold dark matter cosmologies in under a minute, much faster than Markov chain Monte Carlo methods. This is achieved by forming a linear combination of the power spectra at each multipole l, and solving a system of simultaneous equations such that the Fisher matrix is locally unchanged. Instead of carrying out a full likelihood evaluation over the whole parameter space, we need to evaluate the likelihood only for the parameter of interest, with the data compression effectively marginalizing over all other parameters. The weighting vectors contain insight about the physical effects of the parameters on the CMB anisotropy power spectrum Cl. The shape and amplitude of these vectors give an intuitive feel for the physics of the CMB, the sensitivity of the observed spectrum to cosmological parameters, and the relative sensitivity of different experiments to cosmological parameters. We test this method on exact theory Cl as well as on a Wilkinson Microwave Anisotropy Probe (WMAP)-like CMB data set generated from a random realization of a fiducial cosmology, comparing the compression results to those from a full likelihood analysis using CosmoMC. Furthermore, after showing that the method works, we apply it to the temperature power spectrum from the WMAP seven-year data release, and discuss the successes and limitations of our method as applied to a real data set.
ERIC Educational Resources Information Center
Penfield, Randall D.; Bergeron, Jennifer M.
2005-01-01
This article applies a weighted maximum likelihood (WML) latent trait estimator to the generalized partial credit model (GPCM). The relevant equations required to obtain the WML estimator using the Newton-Raphson algorithm are presented, and a simulation study is described that compared the properties of the WML estimator to those of the maximum…
Rate of convergence of k-step Newton estimators to efficient likelihood estimators
Steve Verrill
2007-01-01
We make use of Cramér conditions together with the well-known local quadratic convergence of Newton's method to establish the asymptotic closeness of k-step Newton estimators to efficient likelihood estimators. In Verrill and Johnson [2007. Confidence bounds and hypothesis tests for normal distribution coefficients of variation. USDA Forest Products Laboratory Research...
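As a hedged illustration of a k-step Newton estimator started from a √n-consistent initial value, here is a Cauchy-location example; the model and the choice of the sample median as the starting estimator are assumptions of this sketch, not taken from the report.

```python
import numpy as np

def k_step_newton(x, theta0, k=3):
    """k Newton steps on the Cauchy location log-likelihood, started from a
    root-n-consistent estimator (here, the sample median)."""
    theta = theta0
    for _ in range(k):
        u = x - theta
        score = np.sum(2 * u / (1 + u ** 2))
        hess = np.sum(2 * (u ** 2 - 1) / (1 + u ** 2) ** 2)
        theta = theta - score / hess
    return theta

rng = np.random.default_rng(7)
x = rng.standard_cauchy(500) + 4.0
print(np.median(x), k_step_newton(x, np.median(x), k=3))
```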
Can, Seda; van de Schoot, Rens; Hox, Joop
2015-06-01
Because variables may be correlated in the social and behavioral sciences, multicollinearity might be problematic. This study investigates the effect of collinearity manipulated at the within and between levels of a two-level confirmatory factor analysis by Monte Carlo simulation. Furthermore, the influence of the size of the intraclass correlation coefficient (ICC) and of the estimation method (maximum likelihood estimation with robust chi-squares and standard errors, or Bayesian estimation) on the convergence rate is investigated. The other variables of interest were the rate of inadmissible solutions and the relative parameter and standard error bias at the between level. The results showed that inadmissible solutions were obtained when there was between-level collinearity and the estimation method was maximum likelihood. In the within-level multicollinearity condition, all of the solutions were admissible but the bias values were higher compared with the between-level collinearity condition. Bayesian estimation appeared to be robust in obtaining admissible parameters but the relative bias was higher than for maximum likelihood estimation. Finally, as expected, high ICC produced less biased results compared to medium ICC conditions.
An EM Algorithm for Maximum Likelihood Estimation of Process Factor Analysis Models
ERIC Educational Resources Information Center
Lee, Taehun
2010-01-01
In this dissertation, an Expectation-Maximization (EM) algorithm is developed and implemented to obtain maximum likelihood estimates of the parameters and the associated standard error estimates characterizing temporal flows for the latent variable time series following stationary vector ARMA processes, as well as the parameters defining the…
Information Foraging Across the Life Span: Search and Switch in Unknown Patches.
Chin, Jessie; Payne, Brennan R; Fu, Wai-Tat; Morrow, Daniel G; Stine-Morrow, Elizabeth A L
2015-07-01
In this study, we used a word search puzzle paradigm to investigate age differences in the rate of information gain (RG; i.e., word gain as a function of time) and the cues used to make patch-departure decisions in information foraging. In general, the likelihood of patch departure increased as the profitability of the patch decreased. Both younger and older adults persisted past the point of optimality as defined by the marginal value theorem (Charnov, 1976), which assumes perfect knowledge of the foraging ecology. Nevertheless, there was evidence that adults were rational in terms of being sensitive to the change in RG when making patch-departure decisions. However, given the limitations in cognitive resources and knowledge about the ecology, the estimation of RG may not be accurate. Younger adults were more likely to leave the puzzle as the long-term RG incrementally decreased, whereas older adults were more likely to leave the puzzle as the local RG decreased. However, older adults with better executive control were more likely to adjust their likelihood of patch-departure decisions to the long-term change in RG. Thus, age-dependent reliance on the long-term or local change in RG to make patch-departure decisions might be due to individual differences in executive control. Copyright © 2015 Cognitive Science Society, Inc.
Multiple robustness in factorized likelihood models.
Molina, J; Rotnitzky, A; Sued, M; Robins, J M
2017-09-01
We consider inference under a nonparametric or semiparametric model with likelihood that factorizes as the product of two or more variation-independent factors. We are interested in a finite-dimensional parameter that depends on only one of the likelihood factors and whose estimation requires the auxiliary estimation of one or several nuisance functions. We investigate general structures conducive to the construction of so-called multiply robust estimating functions, whose computation requires postulating several dimension-reducing models but which have mean zero at the true parameter value provided one of these models is correct.
A model for incomplete longitudinal multivariate ordinal data.
Liu, Li C
2008-12-30
In studies where multiple outcome items are repeatedly measured over time, missing data often occur. A longitudinal item response theory model is proposed for analysis of multivariate ordinal outcomes that are repeatedly measured. Under the MAR assumption, this model accommodates missing data at any level (missing item at any time point and/or missing time point). It allows for multiple random subject effects and the estimation of item discrimination parameters for the multiple outcome items. The covariates in the model can be at any level. Assuming either a probit or logistic response function, maximum marginal likelihood estimation is described utilizing multidimensional Gauss-Hermite quadrature for integration of the random effects. An iterative Fisher-scoring solution, which provides standard errors for all model parameters, is used. A data set from a longitudinal prevention study is used to motivate the application of the proposed model. In this study, multiple ordinal items of health behavior are repeatedly measured over time. Because of a planned missing design, subjects answered only two-third of all items at a given point. Copyright 2008 John Wiley & Sons, Ltd.
The Effect of Benefits, Premiums, and Health Risk on Health Plan Choice in the Medicare Program
Atherly, Adam; Dowd, Bryan E; Feldman, Roger
2004-01-01
Objective To estimate the effect of Medicare+Choice (M+C) plan premiums and benefits and individual beneficiary characteristics on the probability of enrollment in a Medicare+Choice plan. Data Source Individual data from the Medicare Current Beneficiary Survey were combined with plan-level data from Medicare Compare. Study Design Health plan choices, including the Medicare+Choice/Fee-for-Service decision and the choice of plan within the M+C sector, were modeled using limited information maximum likelihood nested logit. Principal Findings Premiums have a significant effect on plan selection, with an estimated out-of-pocket premium elasticity of −0.134 and an insurer-perspective elasticity of −4.57. Beneficiaries are responsive to plan characteristics, with prescription drug benefits having the largest marginal effect. Sicker beneficiaries were more likely to choose plans with drug benefits and diabetics were more likely to pick plans with vision coverage. Conclusions Plan characteristics significantly impact beneficiaries' decisions to enroll in Medicare M+C plans and individuals sort themselves systematically into plans based on individual characteristics. PMID:15230931
Maximum likelihood estimation for Cox's regression model under nested case-control sampling.
Scheike, Thomas H; Juul, Anders
2004-04-01
Nested case-control sampling is designed to reduce the costs of large cohort studies. It is important to estimate the parameters of interest as efficiently as possible. We present a new maximum likelihood estimator (MLE) for nested case-control sampling in the context of Cox's proportional hazards model. The MLE is computed by the EM-algorithm, which is easy to implement in the proportional hazards setting. Standard errors are estimated by a numerical profile likelihood approach based on EM aided differentiation. The work was motivated by a nested case-control study that hypothesized that insulin-like growth factor I was associated with ischemic heart disease. The study was based on a population of 3784 Danes and 231 cases of ischemic heart disease where controls were matched on age and gender. We illustrate the use of the MLE for these data and show how the maximum likelihood framework can be used to obtain information additional to the relative risk estimates of covariates.
Five Methods for Estimating Angoff Cut Scores with IRT
ERIC Educational Resources Information Center
Wyse, Adam E.
2017-01-01
This article illustrates five different methods for estimating Angoff cut scores using item response theory (IRT) models. These include maximum likelihood (ML), expected a priori (EAP), modal a priori (MAP), and weighted maximum likelihood (WML) estimators, as well as the most commonly used approach based on translating ratings through the test…
NASA Astrophysics Data System (ADS)
Mahaboob, B.; Venkateswarlu, B.; Sankar, J. Ravi; Balasiddamuni, P.
2017-11-01
This paper uses matrix calculus techniques to obtain the Nonlinear Least Squares Estimator (NLSE), the Maximum Likelihood Estimator (MLE), and a linear pseudo model for a nonlinear regression model. David Pollard and Peter Radchenko [1] explained analytic techniques to compute the NLSE. However, the present research paper introduces an innovative method to compute the NLSE using principles of multivariate calculus. This study is concerned with new optimization techniques used to compute the MLE and the NLSE. Anh [2] derived the NLSE and MLE of a heteroscedastic regression model. Lemcoff [3] discussed a procedure to obtain a linear pseudo model for a nonlinear regression model. In this research article, a new technique is developed to obtain the linear pseudo model for a nonlinear regression model using multivariate calculus. The linear pseudo model of Edmond Malinvaud [4] is explained in a very different way in this paper. In 2006, David Pollard et al. used empirical process techniques to study the asymptotics of the LSE (least-squares estimator) for fitting a nonlinear regression function. In Jae Myung [13] provided a conceptual introduction to maximum likelihood estimation in his work "Tutorial on maximum likelihood estimation".
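The NLSE itself can be illustrated with a generic Gauss-Newton iteration; the sketch below fits an invented exponential model y = a·exp(b·x) + e and is a textbook scheme, not the matrix-calculus derivation developed in the paper.

```python
# Generic Gauss-Newton sketch for a nonlinear least-squares estimator (NLSE).
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(0, 2, 50)
a_true, b_true = 2.0, 0.7
y = a_true * np.exp(b_true * x) + rng.normal(scale=0.1, size=x.size)

def f(beta):
    a, b = beta
    return a * np.exp(b * x)

def jacobian(beta):
    a, b = beta
    return np.column_stack([np.exp(b * x), a * x * np.exp(b * x)])

beta = np.array([1.0, 0.1])                   # starting values
for _ in range(50):
    r = y - f(beta)                           # residuals
    J = jacobian(beta)
    step = np.linalg.solve(J.T @ J, J.T @ r)  # normal equations of the linearized model
    beta = beta + step
    if np.linalg.norm(step) < 1e-10:
        break

print("NLSE estimates (a, b):", np.round(beta, 3))
```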
Sensitivity of Fit Indices to Misspecification in Growth Curve Models
ERIC Educational Resources Information Center
Wu, Wei; West, Stephen G.
2010-01-01
This study investigated the sensitivity of fit indices to model misspecification in within-individual covariance structure, between-individual covariance structure, and marginal mean structure in growth curve models. Five commonly used fit indices were examined, including the likelihood ratio test statistic, root mean square error of…
Brooks, Merrian J; Marshal, Michael P; McCauley, Heather L; Douaihy, Antoine; Miller, Elizabeth
2016-11-09
Hopefulness has been associated with increased treatment retention and reduced substance abuse among adults, and may be a promising modifiable factor to leverage in substance abuse treatment settings. Few studies have assessed the relationship between hopefulness and substance use in adolescents, particularly those with high-risk backgrounds. We explored whether high hope is associated with a lower likelihood of engaging in a variety of substance use behaviors in a sample of marginalized adolescents. Using logistic regression, we assessed results from a cross-sectional anonymous youth behavior survey (n = 256 youth, ages 14 to 19). We recruited from local youth-serving agencies (e.g., homeless shelters, group homes, short-term detention). The sample was almost 60% male and two-thirds African American. Unadjusted models showed youth with higher hope had 50-58% decreased odds (p < .05) of endorsing heavy episodic drinking, daily tobacco use, recent or lifetime marijuana use, and sex after using substances. Adjusted models showed a 52% decreased odds of lifetime marijuana use with higher hope, and a trend towards less sex after substance use (AOR = 0.481; p = .065). No other substance use behaviors remained significantly associated with higher hope scores in adjusted models. Hopefulness may contribute to a decreased likelihood of substance use in adolescents. Focusing on hope may be one modifiable target in a comprehensive primary or secondary substance use prevention program.
Estimating Marginal Returns to Education. NBER Working Paper No. 16474
ERIC Educational Resources Information Center
Carneiro, Pedro; Heckman, James J.; Vytlacil, Edward J.
2010-01-01
This paper estimates the marginal returns to college for individuals induced to enroll in college by different marginal policy changes. The recent instrumental variables literature seeks to estimate this parameter, but in general it does so only under strong assumptions that are tested and found wanting. We show how to utilize economic theory and…
Migration, Business Formation, and the Informal Economy in Urban Mexico
Riosmena, Fernando
2013-01-01
Although the informal economy has grown rapidly in several developing nations, and migration and informality may be related to similar types of credit constraints and market failures, previous research has not systematically attempted to identify if migrant households are more likely to start informal and formal businesses alike and if this association varies across local contexts. We examine the relationship between prior U.S. migration and the creation of both formal and informal businesses in urban Mexico using several criteria to indirectly assess sector location. We use data from 56 communities from the Mexican Migration Project to estimate multilevel survival and nonmultilevel competing risk models predicting the likelihood of informal, formal, and no business formation. The recent return migration of the household head is strongly associated with informal business creation, particularly in economically dynamic areas. On the other hand, migrants are only marginally more likely to start formal businesses in highly economically dynamic sending areas. PMID:23721676
Killiches, Matthias; Czado, Claudia
2018-03-22
We propose a model for unbalanced longitudinal data, where the univariate margins can be selected arbitrarily and the dependence structure is described with the help of a D-vine copula. We show that our approach is an extremely flexible extension of the widely used linear mixed model if the correlation is homogeneous over the considered individuals. As an alternative to joint maximum likelihood estimation, a sequential estimation approach for the D-vine copula is provided and validated in a simulation study. The model can handle missing values without being forced to discard data. Since conditional distributions are known analytically, we can easily make predictions for future events. For model selection, we adjust the Bayesian information criterion to our situation. In an application to heart surgery data, our model clearly performs better than competing linear mixed models. © 2018, The International Biometric Society.
Tengher-Barna, Iulia; Hequet, Delphine; Reboul-Marty, Jeanne; Frassati-Biaggi, Annonciade; Seince, Nathalie; Rodrigues-Faure, Anabela; Uzan, Michèle; Ziol, Marianne
2009-02-01
Margin resection status is a major risk factor for the development of local recurrence in breast conservation therapy for carcinoma. Tumor bed excision sent as separate orientated cavity margins represents a tool to verify the completeness of the carcinoma resection. We aimed to (1) determine the prevalence of positive cavity margins and their influence on subsequent surgical treatment and (2) identify potential predictive factors for positive cavity margins. From 2003 to 2006, 107 consecutive patients (57 years; range 30-88) who underwent a lumpectomy for carcinoma with four orientated cavity margins were selected. Preoperative clinical, radiological and histological data, perioperative macroscopic characteristics and definitive histological analysis results were recorded. Lumpectomy or cavity margins were considered positive when the distance from carcinoma to the margin was less than or equal to 3 mm. Histological examination of cavity margins showed carcinoma in 38 patients (35%), thereby modifying subsequent surgical therapy in 33 cases. Examination of the cavity margins led (1) to avoiding surgical re-excision in 20 cases (lumpectomy margins were positive and the cavity margins negative), and (2) to performing a mastectomy or a re-excision in 13 cases (carcinoma was detected in the cavity margins although the lumpectomy margins were negative, or tumor size was greater than 3 cm). Among the preoperative and perioperative parameters, tumor size on US scan and macroscopic tumor size were predictive factors for positive cavity margins, whereas characteristics of the carcinoma determined on biopsy samples and the macroscopic status of the lumpectomy margins were not. Our study confirms that the systematic practice of cavity margin resection avoids surgical re-excision and reduces the likelihood of underestimating the extent of the tumor.
Bayesian multiple-source localization in an uncertain ocean environment.
Dosso, Stan E; Wilmut, Michael J
2011-06-01
This paper considers simultaneous localization of multiple acoustic sources when properties of the ocean environment (water column and seabed) are poorly known. A Bayesian formulation is developed in which the environmental parameters, noise statistics, and locations and complex strengths (amplitudes and phases) of multiple sources are considered to be unknown random variables constrained by acoustic data and prior information. Two approaches are considered for estimating source parameters. Focalization maximizes the posterior probability density (PPD) over all parameters using adaptive hybrid optimization. Marginalization integrates the PPD using efficient Markov-chain Monte Carlo methods to produce joint marginal probability distributions for source ranges and depths, from which source locations are obtained. This approach also provides quantitative uncertainty analysis for all parameters, which can aid in understanding of the inverse problem and may be of practical interest (e.g., source-strength probability distributions). In both approaches, closed-form maximum-likelihood expressions for source strengths and noise variance at each frequency allow these parameters to be sampled implicitly, substantially reducing the dimensionality and difficulty of the inversion. Examples are presented of both approaches applied to single- and multi-frequency localization of multiple sources in an uncertain shallow-water environment, and a Monte Carlo performance evaluation study is carried out. © 2011 Acoustical Society of America
Bayesian inference based on stationary Fokker-Planck sampling.
Berrones, Arturo
2010-06-01
A novel formalism for Bayesian learning in the context of complex inference models is proposed. The method is based on the use of the stationary Fokker-Planck (SFP) approach to sample from the posterior density. Stationary Fokker-Planck sampling generalizes the Gibbs sampler algorithm to arbitrary and unknown conditional densities. Through the SFP procedure, approximate analytical expressions for the conditionals and marginals of the posterior can be constructed. At each stage of SFP, the approximate conditionals are used to define a Gibbs sampling process, which is convergent to the full joint posterior. Using the analytical marginals, efficient learning methods for artificial neural networks are outlined. Offline and incremental Bayesian inference and maximum likelihood estimation from the posterior are performed in classification and regression examples. A comparison of SFP with other Monte Carlo strategies for the general problem of sampling from arbitrary densities is also presented. It is shown that SFP is able to jump large low-probability regions without the need for careful tuning of any step-size parameter. In fact, the SFP method requires only a small set of meaningful parameters that can be selected following clear, problem-independent guidelines. The computational cost of SFP, measured in terms of loss function evaluations, grows linearly with the given model's dimension.
Comparing hierarchical models via the marginalized deviance information criterion.
Quintero, Adrian; Lesaffre, Emmanuel
2018-07-20
Hierarchical models are extensively used in pharmacokinetics and longitudinal studies. When the estimation is performed from a Bayesian approach, model comparison is often based on the deviance information criterion (DIC). In hierarchical models with latent variables, there are several versions of this statistic: the conditional DIC (cDIC), which includes the latent variables in the focus of the analysis, and the marginalized DIC (mDIC), which integrates them out. Despite the asymptotic and coherency difficulties of cDIC, this alternative is usually used in Markov chain Monte Carlo (MCMC) methods for hierarchical models because of practical convenience. The mDIC criterion is more appropriate in most cases but requires integration of the likelihood, which is computationally demanding and not implemented in Bayesian software. Therefore, we consider a method to compute mDIC by generating replicate samples of the latent variables that need to be integrated out. This alternative can be easily conducted from the MCMC output of Bayesian packages and is widely applicable to hierarchical models in general. Additionally, we propose some approximations in order to reduce the computational complexity for large-sample situations. The method is illustrated with simulated data sets and two medical studies, evidencing that cDIC may be misleading whilst mDIC appears pertinent. Copyright © 2018 John Wiley & Sons, Ltd.
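A rough sketch of the replicate-sampling idea, under strong simplifications: for each (placeholder) posterior draw of the hyperparameters of a normal-normal hierarchical model, the latent group effects are integrated out by Monte Carlo, and the resulting marginal log-likelihoods are combined into a DIC. The data and "MCMC draws" below are simulated stand-ins, not output from an actual Bayesian fit.

```python
# Sketch: marginalized DIC from (placeholder) posterior draws of hyperparameters,
# with latent group effects integrated out by replicate Monte Carlo samples.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(2)

# Toy data: 10 groups, 5 observations each; placeholder draws of (mu, tau, sigma).
y = rng.normal(loc=rng.normal(0, 1, size=(10, 1)), scale=0.5, size=(10, 5))
draws = np.column_stack([rng.normal(0, 0.2, 500),           # mu
                         np.abs(rng.normal(1, 0.1, 500)),   # tau
                         np.abs(rng.normal(0.5, 0.05, 500))])  # sigma

def marginal_loglik(mu, tau, sigma, n_rep=200):
    """log p(y | mu, tau, sigma): group effects b_i ~ N(mu, tau) are integrated
    out by averaging the likelihood over n_rep replicate draws of b_i."""
    b = rng.normal(mu, tau, size=(n_rep, y.shape[0]))             # (replicates, groups)
    logf = norm.logpdf(y[None, :, :], loc=b[:, :, None], scale=sigma).sum(axis=2)
    m = logf.max(axis=0)                                          # log-mean-exp per group
    return np.sum(m + np.log(np.exp(logf - m).mean(axis=0)))

loglik = np.array([marginal_loglik(*d) for d in draws])
loglik_at_mean = marginal_loglik(*draws.mean(axis=0))
# DIC = Dbar + pD, written on the marginal scale.
mDIC = -4.0 * loglik.mean() + 2.0 * loglik_at_mean
print("marginalized DIC:", round(mDIC, 1))
```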
Dynamic frailty models based on compound birth-death processes.
Putter, Hein; van Houwelingen, Hans C
2015-07-01
Frailty models are used in survival analysis to model unobserved heterogeneity. They accommodate such heterogeneity by the inclusion of a random term, the frailty, which is assumed to multiply the hazard of a subject (individual frailty) or the hazards of all subjects in a cluster (shared frailty). Typically, the frailty term is assumed to be constant over time. This is a restrictive assumption and extensions to allow for time-varying or dynamic frailties are of interest. In this paper, we extend the auto-correlated frailty models of Henderson and Shimakura and of Fiocco, Putter and van Houwelingen, developed for longitudinal count data and discrete survival data, to continuous survival data. We present a rigorous construction of the frailty processes in continuous time based on compound birth-death processes. When the frailty processes are used as mixtures in models for survival data, we derive the marginal hazards and survival functions and the marginal bivariate survival functions and cross-ratio function. We derive distributional properties of the processes, conditional on observed data, and show how to obtain the maximum likelihood estimators of the parameters of the model using a (stochastic) expectation-maximization algorithm. The methods are applied to a publicly available data set. © The Author 2015. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Explaining the effect of event valence on unrealistic optimism.
Gold, Ron S; Brown, Mark G
2009-05-01
People typically exhibit 'unrealistic optimism' (UO): they believe they have a lower chance of experiencing negative events and a higher chance of experiencing positive events than does the average person. UO has been found to be greater for negative than positive events. This 'valence effect' has been explained in terms of motivational processes. An alternative explanation is provided by the 'numerosity model', which views the valence effect simply as a by-product of a tendency for likelihood estimates pertaining to the average member of a group to increase with the size of the group. Predictions made by the numerosity model were tested in two studies. In each, UO for a single event was assessed. In Study 1 (n = 115 students), valence was manipulated by framing the event either negatively or positively, and participants estimated their own likelihood and that of the average student at their university. In Study 2 (n = 139 students), valence was again manipulated and participants again estimated their own likelihood; additionally, group size was manipulated by having participants estimate the likelihood of the average student in a small, medium-sized, or large group. In each study, the valence effect was found, but was due to an effect on estimates of own likelihood, not the average person's likelihood. In Study 2, valence did not interact with group size. The findings contradict the numerosity model, but are in accord with the motivational explanation. Implications for health education are discussed.
Nagelkerke, Nico; Fidler, Vaclav
2015-01-01
The problem of discrimination and classification is central to much of epidemiology. Here we consider the estimation of a logistic regression/discrimination function from training samples, when one of the training samples is subject to misclassification or mislabeling, e.g., diseased individuals are incorrectly classified/labeled as healthy controls. We show that this leads to a zero-inflated binomial model with a defective logistic regression or discrimination function, whose parameters can be estimated using standard statistical methods such as maximum likelihood. These parameters can be used to estimate the probability of true group membership among those, possibly erroneously, classified as controls. Two examples are analyzed and discussed. A simulation study explores the properties of the maximum likelihood parameter estimates and the estimates of the number of mislabeled observations.
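One plausible reading of the defective logistic formulation is sketched below: true case status follows a logistic model, but a case carries its correct label only with probability pi, so the observed labels follow P(z = 1 | x) = pi·expit(x'beta). The parameterization and data are illustrative and may not match the authors' exact specification.

```python
# Sketch: maximum likelihood for a "defective" logistic model in which a fraction
# of true cases are mislabeled as controls.
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit

rng = np.random.default_rng(3)
n = 2000
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true, pi_true = np.array([-0.5, 1.2]), 0.8          # 20% of cases mislabeled
true_case = rng.random(n) < expit(X @ beta_true)
z = true_case & (rng.random(n) < pi_true)                # observed (possibly wrong) label

def neg_loglik(params):
    beta, pi = params[:-1], expit(params[-1])            # keep pi in (0, 1)
    q = pi * expit(X @ beta)                             # P(z = 1 | x): defective logistic
    return -np.sum(np.where(z, np.log(q), np.log1p(-q)))

fit = minimize(neg_loglik, x0=np.array([0.0, 0.0, 1.0]), method="BFGS")
print("beta:", np.round(fit.x[:-1], 2), " pi:", round(float(expit(fit.x[-1])), 2))
```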
NASA Technical Reports Server (NTRS)
Cash, W.
1979-01-01
Many problems in the experimental estimation of parameters for models can be solved through use of the likelihood ratio test. Applications of the likelihood ratio, with particular attention to photon counting experiments, are discussed. The procedures presented solve a greater range of problems than those currently in use, yet are no more difficult to apply. The procedures are proved analytically, and examples from current problems in astronomy are discussed.
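In modern notation, the Poisson likelihood ratio described here is usually written as a statistic of the form C = 2·sum(m - n·ln m) over channels, whose change between nested models behaves approximately as chi-square. The sketch below applies it to an invented photon-counting spectrum with and without a spectral line.

```python
# Sketch: likelihood-ratio (Cash-style) fitting of Poisson photon counts.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(4)
channels = np.arange(100)
background, amplitude = 5.0, 12.0
line = amplitude * np.exp(-0.5 * ((channels - 40.0) / 3.0) ** 2)
counts = rng.poisson(background + line)

def cash(model_counts):
    m = np.clip(model_counts, 1e-9, None)       # guard against non-positive rates
    return 2.0 * np.sum(m - counts * np.log(m))

# Null model: constant background only.
c0 = minimize(lambda p: cash(np.full(channels.size, p[0])),
              x0=[counts.mean()], method="Nelder-Mead").fun

# Alternative model: background plus a Gaussian line with free amplitude.
def model(p):
    b, a = p
    return b + a * np.exp(-0.5 * ((channels - 40.0) / 3.0) ** 2)

c1 = minimize(lambda p: cash(model(p)), x0=[counts.mean(), 1.0],
              method="Nelder-Mead").fun

print("Delta C (approx. chi-square with 1 dof):", round(c0 - c1, 1))
```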
Simulating the effect of non-linear mode coupling in cosmological parameter estimation
NASA Astrophysics Data System (ADS)
Kiessling, A.; Taylor, A. N.; Heavens, A. F.
2011-09-01
Fisher Information Matrix methods are commonly used in cosmology to estimate the accuracy that cosmological parameters can be measured with a given experiment and to optimize the design of experiments. However, the standard approach usually assumes both data and parameter estimates are Gaussian-distributed. Further, for survey forecasts and optimization it is usually assumed that the power-spectrum covariance matrix is diagonal in Fourier space. However, in the low-redshift Universe, non-linear mode coupling will tend to correlate small-scale power, moving information from lower to higher order moments of the field. This movement of information will change the predictions of cosmological parameter accuracy. In this paper we quantify this loss of information by comparing naïve Gaussian Fisher matrix forecasts with a maximum likelihood parameter estimation analysis of a suite of mock weak lensing catalogues derived from N-body simulations, based on the SUNGLASS pipeline, for a 2D and tomographic shear analysis of a Euclid-like survey. In both cases, we find that the 68 per cent confidence area of the Ωm-σ8 plane increases by a factor of 5. However, the marginal errors increase by just 20-40 per cent. We propose a new method to model the effects of non-linear shear-power mode coupling in the Fisher matrix by approximating the shear-power distribution as a multivariate Gaussian with a covariance matrix derived from the mock weak lensing survey. We find that this approximation can reproduce the 68 per cent confidence regions of the full maximum likelihood analysis in the Ωm-σ8 plane to high accuracy for both 2D and tomographic weak lensing surveys. Finally, we perform a multiparameter analysis of Ωm, σ8, h, ns, w0 and wa to compare the Gaussian and non-linear mode-coupled Fisher matrix contours. The 6D volume of the 1σ error contours for the non-linear Fisher analysis is a factor of 3 larger than for the Gaussian case, and the shape of the 68 per cent confidence volume is modified. We propose that future Fisher matrix estimates of cosmological parameter accuracies should include mode-coupling effects.
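For readers unfamiliar with the baseline being tested here, the sketch below shows a generic Gaussian Fisher-matrix forecast for a toy two-parameter power spectrum; the spectrum, its parameter derivatives, and the variance model are invented, whereas the actual analysis uses tomographic shear spectra and simulation-derived covariances.

```python
# Schematic Gaussian Fisher-matrix forecast: F_ab = sum_l dC_l/dp_a dC_l/dp_b / var(C_l).
import numpy as np

ell = np.arange(10, 2000, dtype=float)
amp, slope = 1.0, -1.5                        # toy parameters of C_l = amp * ell**slope
C = amp * ell ** slope
var_C = 2.0 * C ** 2 / (2 * ell + 1)          # cosmic-variance-like Gaussian variance

dC = np.vstack([ell ** slope,                          # dC/d(amp)
                amp * np.log(ell) * ell ** slope])     # dC/d(slope)

F = (dC / var_C) @ dC.T                       # 2x2 Fisher matrix
cov = np.linalg.inv(F)
print("marginal 1-sigma errors:", np.sqrt(np.diag(cov)))
print("correlation:", cov[0, 1] / np.sqrt(cov[0, 0] * cov[1, 1]))
```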
Cosmic shear measurement with maximum likelihood and maximum a posteriori inference
NASA Astrophysics Data System (ADS)
Hall, Alex; Taylor, Andy
2017-06-01
We investigate the problem of noise bias in maximum likelihood and maximum a posteriori estimators for cosmic shear. We derive the leading and next-to-leading order biases and compute them in the context of galaxy ellipticity measurements, extending previous work on maximum likelihood inference for weak lensing. We show that a large part of the bias on these point estimators can be removed using information already contained in the likelihood when a galaxy model is specified, without the need for external calibration. We test these bias-corrected estimators on simulated galaxy images similar to those expected from planned space-based weak lensing surveys, with promising results. We find that the introduction of an intrinsic shape prior can help with mitigation of noise bias, such that the maximum a posteriori estimate can be made less biased than the maximum likelihood estimate. Second-order terms offer a check on the convergence of the estimators, but are largely subdominant. We show how biases propagate to shear estimates, demonstrating in our simple set-up that shear biases can be reduced by orders of magnitude and potentially to within the requirements of planned space-based surveys at mild signal-to-noise ratio. We find that second-order terms can exhibit significant cancellations at low signal-to-noise ratio when Gaussian noise is assumed, which has implications for inferring the performance of shear-measurement algorithms from simplified simulations. We discuss the viability of our point estimators as tools for lensing inference, arguing that they allow for the robust measurement of ellipticity and shear.
ERIC Educational Resources Information Center
Molenaar, Peter C. M.; Nesselroade, John R.
1998-01-01
Pseudo-Maximum Likelihood (p-ML) and Asymptotically Distribution Free (ADF) estimation methods for estimating dynamic factor model parameters within a covariance structure framework were compared through a Monte Carlo simulation. Both methods appear to give consistent model parameter estimates, but only ADF gives standard errors and chi-square…
Exploring Neutrino Oscillation Parameter Space with a Monte Carlo Algorithm
NASA Astrophysics Data System (ADS)
Espejel, Hugo; Ernst, David; Cogswell, Bernadette; Latimer, David
2015-04-01
The χ2 (or likelihood) function for a global analysis of neutrino oscillation data is first calculated as a function of the neutrino mixing parameters. A computational challenge is to obtain the minima or the allowed regions for the mixing parameters. The conventional approach is to calculate the χ2 (or likelihood) function on a grid for a large number of points, and then marginalize over the likelihood function. As the number of parameters increases with the number of neutrinos, making the calculation numerically efficient becomes necessary. We implement a new Monte Carlo algorithm (D. Foreman-Mackey, D. W. Hogg, D. Lang and J. Goodman, Publications of the Astronomical Society of the Pacific, 125 306 (2013)) to determine its computational efficiency at finding the minima and allowed regions. We examine a realistic example to compare the historical and the new methods.
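The cited algorithm is the affine-invariant ensemble sampler distributed as the emcee package; the sketch below (assuming emcee is installed) runs it on a toy two-parameter Gaussian chi-square standing in for a real oscillation fit and reads a marginalized interval off the flattened chain.

```python
# Sketch: marginalizing a toy likelihood with the emcee ensemble sampler.
import numpy as np
import emcee

def log_prob(theta):
    # Toy chi-square: Gaussian centred on invented "true" mixing parameters.
    d = theta - np.array([0.31, 2.5e-3])
    cov = np.array([[0.02 ** 2, 0.0], [0.0, (1e-4) ** 2]])
    return -0.5 * d @ np.linalg.solve(cov, d)

ndim, nwalkers = 2, 32
p0 = np.array([0.31, 2.5e-3]) + 1e-4 * np.random.randn(nwalkers, ndim)
sampler = emcee.EnsembleSampler(nwalkers, ndim, log_prob)
sampler.run_mcmc(p0, 2000)

chain = sampler.get_chain(discard=500, flat=True)
lo, hi = np.percentile(chain[:, 0], [16, 84])     # marginalized 68% interval, parameter 1
print(f"parameter 1: 68% interval [{lo:.3f}, {hi:.3f}]")
```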
Quantifying uncertainty in geoacoustic inversion. II. Application to broadband, shallow-water data.
Dosso, Stan E; Nielsen, Peter L
2002-01-01
This paper applies the new method of fast Gibbs sampling (FGS) to estimate the uncertainties of seabed geoacoustic parameters in a broadband, shallow-water acoustic survey, with the goal of interpreting the survey results and validating the method for experimental data. FGS applies a Bayesian approach to geoacoustic inversion based on sampling the posterior probability density to estimate marginal probability distributions and parameter covariances. This requires knowledge of the statistical distribution of the data errors, including both measurement and theory errors, which is generally not available. Invoking the simplifying assumption of independent, identically distributed Gaussian errors allows a maximum-likelihood estimate of the data variance and leads to a practical inversion algorithm. However, it is necessary to validate these assumptions, i.e., to verify that the parameter uncertainties obtained represent meaningful estimates. To this end, FGS is applied to a geoacoustic experiment carried out at a site off the west coast of Italy where previous acoustic and geophysical studies have been performed. The parameter uncertainties estimated via FGS are validated by comparison with: (i) the variability in the results of inverting multiple independent data sets collected during the experiment; (ii) the results of FGS inversion of synthetic test cases designed to simulate the experiment and data errors; and (iii) the available geophysical ground truth. Comparisons are carried out for a number of different source bandwidths, ranges, and levels of prior information, and indicate that FGS provides reliable and stable uncertainty estimates for the geoacoustic inverse problem.
Robust Flutter Margin Analysis that Incorporates Flight Data
NASA Technical Reports Server (NTRS)
Lind, Rick; Brenner, Martin J.
1998-01-01
An approach for computing worst-case flutter margins has been formulated in a robust stability framework. Uncertainty operators are included with a linear model to describe modeling errors and flight variations. The structured singular value, mu, computes a stability margin that directly accounts for these uncertainties. This approach introduces a new method of computing flutter margins and an associated new parameter for describing these margins. The mu margins are robust margins that indicate worst-case stability estimates with respect to the defined uncertainty. Worst-case flutter margins are computed for the F/A-18 Systems Research Aircraft using uncertainty sets generated by flight data analysis. The robust margins demonstrate flight conditions for flutter may lie closer to the flight envelope than previously estimated by p-k analysis.
NASA Technical Reports Server (NTRS)
Grove, R. D.; Bowles, R. L.; Mayhew, S. C.
1972-01-01
A maximum likelihood parameter estimation procedure and program were developed for the extraction of the stability and control derivatives of aircraft from flight test data. Nonlinear six-degree-of-freedom equations describing aircraft dynamics were used to derive sensitivity equations for quasilinearization. The maximum likelihood function with quasilinearization was used to derive the parameter change equations, the covariance matrices for the parameters and measurement noise, and the performance index function. The maximum likelihood estimator was mechanized into an iterative estimation procedure utilizing a real time digital computer and graphic display system. This program was developed for 8 measured state variables and 40 parameters. Test cases were conducted with simulated data for validation of the estimation procedure and program. The program was applied to a V/STOL tilt wing aircraft, a military fighter airplane, and a light single engine airplane. The particular nonlinear equations of motion, derivation of the sensitivity equations, addition of accelerations into the algorithm, operational features of the real time digital system, and test cases are described.
Hock, Sabrina; Hasenauer, Jan; Theis, Fabian J
2013-01-01
Diffusion is a key component of many biological processes such as chemotaxis, developmental differentiation and tissue morphogenesis. Recently, it has become possible to assess the spatial gradients caused by diffusion in vitro and in vivo using microscopy-based imaging techniques. The resulting time series of two-dimensional, high-resolution images, in combination with mechanistic models, enable the quantitative analysis of the underlying mechanisms. However, such model-based analysis is still challenging due to measurement noise and sparse observations, which result in uncertainties in the model parameters. We introduce a likelihood function for image-based measurements with log-normally distributed noise. Based upon this likelihood function, we formulate the maximum likelihood estimation problem, which is solved using PDE-constrained optimization methods. To assess the uncertainty and practical identifiability of the parameters, we introduce profile likelihoods for diffusion processes. As proof of concept, we model certain aspects of the guidance of dendritic cells towards lymphatic vessels, an example of haptotaxis. Using a realistic set of artificial measurement data, we estimate the five kinetic parameters of this model and compute profile likelihoods. Our novel approach for the estimation of model parameters from image data, as well as the proposed identifiability analysis approach, is widely applicable to diffusion processes. The profile-likelihood-based method provides more rigorous uncertainty bounds than local approximation methods.
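The profile-likelihood idea can be sketched with a much simpler model than the PDE system used in the paper: fix the parameter of interest on a grid, re-optimize the remaining parameters at each grid point, and read confidence bounds off the resulting profile. The exponential-decay model with log-normal noise below is an invented stand-in.

```python
# Sketch: profile likelihood for one parameter of a toy model with log-normal noise.
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(5)
t = np.linspace(0.1, 5.0, 40)
k_true, a_true, sigma = 0.8, 3.0, 0.15
y = a_true * np.exp(-k_true * t) * np.exp(sigma * rng.normal(size=t.size))

def neg_loglik(k, a):
    mu = np.log(a) - k * t                                  # mean on the log scale
    return 0.5 * np.sum((np.log(y) - mu) ** 2) / sigma ** 2

def profile(k):
    # Re-optimize the nuisance parameter a for this fixed decay rate k.
    return minimize_scalar(lambda a: neg_loglik(k, a),
                           bounds=(0.1, 10.0), method="bounded").fun

k_grid = np.linspace(0.5, 1.1, 61)
prof = np.array([profile(k) for k in k_grid])
prof -= prof.min()                                          # profile negative log-likelihood
inside = k_grid[prof <= 1.92]                               # ~95% CI via chi-square(1)/2
print(f"profile-likelihood 95% CI for k: [{inside.min():.2f}, {inside.max():.2f}]")
```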
A Bootstrap Generalization of Modified Parallel Analysis for IRT Dimensionality Assessment
ERIC Educational Resources Information Center
Finch, Holmes; Monahan, Patrick
2008-01-01
This article introduces a bootstrap generalization to the Modified Parallel Analysis (MPA) method of test dimensionality assessment using factor analysis. This methodology, based on the use of Marginal Maximum Likelihood nonlinear factor analysis, provides for the calculation of a test statistic based on a parametric bootstrap using the MPA…
ERIC Educational Resources Information Center
Weissman, Alexander
2013-01-01
Convergence of the expectation-maximization (EM) algorithm to a global optimum of the marginal log likelihood function for unconstrained latent variable models with categorical indicators is presented. The sufficient conditions under which global convergence of the EM algorithm is attainable are provided in an information-theoretic context by…
Unwinding the hairball graph: Pruning algorithms for weighted complex networks
NASA Astrophysics Data System (ADS)
Dianati, Navid
2016-01-01
Empirical networks of weighted dyadic relations often contain "noisy" edges that alter the global characteristics of the network and obfuscate the most important structures therein. Graph pruning is the process of identifying the most significant edges according to a generative null model and extracting the subgraph consisting of those edges. Here, we focus on integer-weighted graphs commonly arising when weights count the occurrences of an "event" relating the nodes. We introduce a simple and intuitive null model related to the configuration model of network generation and derive two significance filters from it: the marginal likelihood filter (MLF) and the global likelihood filter (GLF). The former is a fast algorithm assigning a significance score to each edge based on the marginal distribution of edge weights, whereas the latter is an ensemble approach which takes into account the correlations among edges. We apply these filters to the network of air traffic volume between US airports and recover a geographically faithful representation of the graph. Furthermore, compared with thresholding based on edge weight, we show that our filters extract a larger and significantly sparser giant component.
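A rough sketch of a marginal-likelihood-style filter is given below: each integer edge weight is scored against a binomial null whose success probability is built from the endpoint strengths and the total weight. The precise null model of the paper may differ; this is an illustrative variant on an invented edge list.

```python
# Sketch: binomial significance filter for integer-weighted edges.
import numpy as np
from scipy.stats import binom

# Toy weighted edge list: (node u, node v, integer weight).
edges = [(0, 1, 40), (0, 2, 3), (1, 2, 25), (2, 3, 2), (3, 4, 30), (0, 4, 1)]

strength, T = {}, 0
for u, v, w in edges:
    strength[u] = strength.get(u, 0) + w
    strength[v] = strength.get(v, 0) + w
    T += w

significant = []
for u, v, w in edges:
    # Assumed null: each unit of total weight lands on (u, v) with probability
    # proportional to the product of the endpoint strengths.
    p_null = 2 * (strength[u] / (2 * T)) * (strength[v] / (2 * T))
    p_value = binom.sf(w - 1, T, p_null)          # P(weight >= w) under the null
    if p_value < 0.05:
        significant.append((u, v, w, p_value))

for u, v, w, p in significant:
    print(f"edge ({u},{v}) weight {w}: p = {p:.2e}")
```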
Some Small Sample Results for Maximum Likelihood Estimation in Multidimensional Scaling.
ERIC Educational Resources Information Center
Ramsay, J. O.
1980-01-01
Some aspects of the small sample behavior of maximum likelihood estimates in multidimensional scaling are investigated with Monte Carlo techniques. In particular, the chi square test for dimensionality is examined and a correction for bias is proposed and evaluated. (Author/JKS)
A spatially explicit capture-recapture estimator for single-catch traps.
Distiller, Greg; Borchers, David L
2015-11-01
Single-catch traps are frequently used in live-trapping studies of small mammals. Thus far, a likelihood for single-catch traps has proven elusive and usually the likelihood for multicatch traps is used for spatially explicit capture-recapture (SECR) analyses of such data. Previous work found the multicatch likelihood to provide a robust estimator of average density. We build on a recently developed continuous-time model for SECR to derive a likelihood for single-catch traps. We use this to develop an estimator based on observed capture times and compare its performance by simulation to that of the multicatch estimator for various scenarios with nonconstant density surfaces. While the multicatch estimator is found to be a surprisingly robust estimator of average density, its performance deteriorates with high trap saturation and increasing density gradients. Moreover, it is found to be a poor estimator of the height of the detection function. By contrast, the single-catch estimators of density, distribution, and detection function parameters are found to be unbiased or nearly unbiased in all scenarios considered. This gain comes at the cost of higher variance. If there is no interest in interpreting the detection function parameters themselves, and if density is expected to be fairly constant over the survey region, then the multicatch estimator performs well with single-catch traps. However if accurate estimation of the detection function is of interest, or if density is expected to vary substantially in space, then there is merit in using the single-catch estimator when trap saturation is above about 60%. The estimator's performance is improved if care is taken to place traps so as to span the range of variables that affect animal distribution. As a single-catch likelihood with unknown capture times remains intractable for now, researchers using single-catch traps should aim to incorporate timing devices with their traps.
Maintained Individual Data Distributed Likelihood Estimation (MIDDLE)
Boker, Steven M.; Brick, Timothy R.; Pritikin, Joshua N.; Wang, Yang; von Oertzen, Timo; Brown, Donald; Lach, John; Estabrook, Ryne; Hunter, Michael D.; Maes, Hermine H.; Neale, Michael C.
2015-01-01
Maintained Individual Data Distributed Likelihood Estimation (MIDDLE) is a novel paradigm for research in the behavioral, social, and health sciences. The MIDDLE approach is based on the seemingly-impossible idea that data can be privately maintained by participants and never revealed to researchers, while still enabling statistical models to be fit and scientific hypotheses tested. MIDDLE rests on the assumption that participant data should belong to, be controlled by, and remain in the possession of the participants themselves. Distributed likelihood estimation refers to fitting statistical models by sending an objective function and vector of parameters to each participants’ personal device (e.g., smartphone, tablet, computer), where the likelihood of that individual’s data is calculated locally. Only the likelihood value is returned to the central optimizer. The optimizer aggregates likelihood values from responding participants and chooses new vectors of parameters until the model converges. A MIDDLE study provides significantly greater privacy for participants, automatic management of opt-in and opt-out consent, lower cost for the researcher and funding institute, and faster determination of results. Furthermore, if a participant opts into several studies simultaneously and opts into data sharing, these studies automatically have access to individual-level longitudinal data linked across all studies. PMID:26717128
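A conceptual sketch of the distributed-likelihood loop is shown below: each "participant" keeps its data inside a closure and returns only a log-likelihood value for the current parameter vector, which the central optimizer aggregates. Networking, consent management, and the structural models actually used by MIDDLE are omitted; the normal model and data are invented.

```python
# Sketch: distributed likelihood estimation where only likelihood values are shared.
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(6)

def make_participant(private_data):
    # The raw data never leave this closure; only a scalar is returned.
    def local_loglik(params):
        mu, log_sigma = params
        return float(np.sum(norm.logpdf(private_data, mu, np.exp(log_sigma))))
    return local_loglik

participants = [make_participant(rng.normal(1.5, 2.0, size=rng.integers(20, 60)))
                for _ in range(100)]

def aggregate_negloglik(params):
    # "Send" the parameter vector to every device and sum the returned values.
    return -sum(p(params) for p in participants)

fit = minimize(aggregate_negloglik, x0=np.array([0.0, 0.0]), method="Nelder-Mead")
print("mu:", round(fit.x[0], 2), "sigma:", round(float(np.exp(fit.x[1])), 2))
```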
Zhan, Tingting; Chevoneva, Inna; Iglewicz, Boris
2010-01-01
The family of weighted likelihood estimators largely overlaps with minimum divergence estimators. They are robust to data contamination compared with the MLE. We define the class of generalized weighted likelihood estimators (GWLE), provide its influence function and discuss the efficiency requirements. We introduce a new truncated cubic-inverse weight, which is both first- and second-order efficient and more robust than previously reported weights. We also discuss new ways of selecting the smoothing bandwidth and weighted starting values for the iterative algorithm. The advantage of the truncated cubic-inverse weight is illustrated in a simulation study of a three-component normal mixture model with large overlaps and heavy contamination. A real data example is also provided. PMID:20835375
Bayesian structural equation modeling in sport and exercise psychology.
Stenling, Andreas; Ivarsson, Andreas; Johnson, Urban; Lindwall, Magnus
2015-08-01
Bayesian statistics is on the rise in mainstream psychology, but applications in sport and exercise psychology research are scarce. In this article, the foundations of Bayesian analysis are introduced, and we will illustrate how to apply Bayesian structural equation modeling in a sport and exercise psychology setting. More specifically, we contrasted a confirmatory factor analysis on the Sport Motivation Scale II estimated with the most commonly used estimator, maximum likelihood, and a Bayesian approach with weakly informative priors for cross-loadings and correlated residuals. The results indicated that the model with Bayesian estimation and weakly informative priors provided a good fit to the data, whereas the model estimated with a maximum likelihood estimator did not produce a well-fitting model. The reasons for this discrepancy between maximum likelihood and Bayesian estimation are discussed as well as potential advantages and caveats with the Bayesian approach.
Kohler, Pamela K; Manhart, Lisa E; Lafferty, William E
2008-04-01
The role that sex education plays in the initiation of sexual activity and risk of teen pregnancy and sexually transmitted disease (STD) is controversial in the United States. Despite several systematic reviews, few epidemiologic evaluations of the effectiveness of these programs on a population level have been conducted. Among never-married heterosexual adolescents, aged 15-19 years, who participated in Cycle 6 (2002) of the National Survey of Family Growth and reported on formal sex education received before their first sexual intercourse (n = 1719), we compared the sexual health risks of adolescents who received abstinence-only and comprehensive sex education to those of adolescents who received no formal sex education. Weighted multivariate logistic regression generated population-based estimates. Adolescents who received comprehensive sex education were significantly less likely to report teen pregnancy (OR(adj) = .4, 95% CI = .22- .69, p = .001) than those who received no formal sex education, whereas there was no significant effect of abstinence-only education (OR(adj) = .7, 95% CI = .38-1.45, p = .38). Abstinence-only education did not reduce the likelihood of engaging in vaginal intercourse (OR(adj) = .8, 95% CI = .51-1.31, p = .40), but comprehensive sex education was marginally associated with a lower likelihood of reporting having engaged in vaginal intercourse (OR(adj) = .7, 95% CI = .49-1.02, p = .06). Neither abstinence-only nor comprehensive sex education significantly reduced the likelihood of reported STD diagnoses (OR(adj) = 1.7, 95% CI = .57-34.76, p = .36 and OR(adj) = 1.8, 95% CI = .67-5.00, p = .24 respectively). Teaching about contraception was not associated with increased risk of adolescent sexual activity or STD. Adolescents who received comprehensive sex education had a lower risk of pregnancy than adolescents who received abstinence-only or no sex education.
New estimates of the CMB angular power spectra from the WMAP 5 year low-resolution data
NASA Astrophysics Data System (ADS)
Gruppuso, A.; de Rosa, A.; Cabella, P.; Paci, F.; Finelli, F.; Natoli, P.; de Gasperis, G.; Mandolesi, N.
2009-11-01
A quadratic maximum likelihood (QML) estimator is applied to the Wilkinson Microwave Anisotropy Probe (WMAP) 5 year low-resolution maps to compute the cosmic microwave background angular power spectra (APS) at large scales for both temperature and polarization. Estimates and error bars for the six APS are provided up to l = 32 and compared, when possible, to those obtained by the WMAP team, without finding any inconsistency. The conditional likelihood slices are also computed for the Cl of all the six power spectra from l = 2 to 10 through a pixel-based likelihood code. Both the codes treat the covariance for (T, Q, U) in a single matrix without employing any approximation. The inputs of both the codes (foreground-reduced maps, related covariances and masks) are provided by the WMAP team. The peaks of the likelihood slices are always consistent with the QML estimates within the error bars; however, an excellent agreement occurs when the QML estimates are used as a fiducial power spectrum instead of the best-fitting theoretical power spectrum. By the full computation of the conditional likelihood on the estimated spectra, the value of the temperature quadrupole CTTl=2 is found to be less than 2σ away from the WMAP 5 year Λ cold dark matter best-fitting value. The BB spectrum is found to be well consistent with zero, and upper limits on the B modes are provided. The parity odd signals TB and EB are found to be consistent with zero.
Handling nonresponse in surveys: analytic corrections compared with converting nonresponders.
Jenkins, Paul; Earle-Richardson, Giulia; Burdick, Patrick; May, John
2008-02-01
A large health survey was combined with a simulation study to contrast the reduction in bias achieved by double sampling versus two weighting methods based on propensity scores. The survey used a census of one New York county and double sampling in six others. Propensity scores were modeled as a logistic function of demographic variables and were used in conjunction with a random uniform variate to simulate response in the census. These data were used to estimate the prevalence of chronic disease in a population whose parameters were defined as values from the census. Significant (p < 0.0001) predictors in the logistic function included multiple (vs. single) occupancy (odds ratio (OR) = 1.3), bank card ownership (OR = 2.1), gender (OR = 1.5), home ownership (OR = 1.3), head of household's age (OR = 1.4), and income >$18,000 (OR = 0.8). The model likelihood ratio chi-square was significant (p < 0.0001), with the area under the receiver operating characteristic curve = 0.59. Double-sampling estimates were marginally closer to population values than those from either weighting method. However, the variance was also greater (p < 0.01). The reduction in bias for point estimation from double sampling may be more than offset by the increased variance associated with this method.
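The propensity-weighting idea can be sketched as follows (assuming scikit-learn is available): response propensities are modeled by logistic regression on covariates observed for the whole frame, and respondents are weighted by the inverse of their estimated propensity when estimating prevalence. The covariates, response mechanism, and prevalence model below are invented.

```python
# Sketch: inverse-propensity weighting for survey nonresponse.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(9)
n = 5000
x = rng.normal(size=(n, 2))                                  # covariates known for everyone
p_respond = 1 / (1 + np.exp(-(0.3 + 0.8 * x[:, 0])))
responded = rng.random(n) < p_respond
disease = rng.random(n) < 1 / (1 + np.exp(-(-1.5 + 0.6 * x[:, 0])))  # prevalence depends on x

model = LogisticRegression().fit(x, responded)
propensity = model.predict_proba(x)[:, 1]

naive = disease[responded].mean()
weights = 1.0 / propensity[responded]
weighted = np.sum(weights * disease[responded]) / weights.sum()
print(f"true: {disease.mean():.3f}  naive: {naive:.3f}  IPW: {weighted:.3f}")
```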
Sadreameli, S Christy; Riekert, Kristin A; Matsui, Elizabeth C; Rand, Cynthia S; Eakin, Michelle N
2018-05-03
Urban minority children are at risk for poor asthma outcomes and may not receive appropriate primary or subspecialty care. We hypothesized that preschool children with asthma whose caregivers reported more barriers to care would be less likely to have seen their primary care provider (PCP) or an asthma subspecialist and more likely to have had a recent emergency department (ED) visit for asthma. The Barriers to Care Questionnaire (BCQ) measures expectations, knowledge, marginalization, pragmatics, and skills. We assessed asthma control via TRACK and these outcomes: PCP visits for asthma in the past six months, subspecialty care (allergist or pulmonologist) in the past two years, and ED visits in the past three months. A total of 395 caregivers (96% African-American, 82% low-income, 96% Medicaid) completed the BCQ. Sixty percent (N = 236) of children had uncontrolled asthma, 86% had seen a PCP, 23% had seen a subspecialist, and 29% had an ED visit. Barriers related to marginalization were associated with a decreased likelihood of PCP (OR 0.95, p = 0.014) and subspecialty visits (OR 0.92, p = 0.019). The overall BCQ score was associated with a decreased likelihood of subspecialty care (OR 0.98, p = 0.027). Barriers related to expectations, knowledge, pragmatics, and skills were not associated with any of the care outcomes. Among low-income, predominantly African-American preschool children with asthma, primary and subspecialty care were less likely if caregivers reported past negative experiences with the healthcare system (marginalization). Clinicians who serve at-risk populations should be sensitive to families' past experiences and should consider designing interventions to target the most commonly reported barriers. Copyright © 2018. Published by Elsevier Inc.
Measuring fit of sequence data to phylogenetic model: gain of power using marginal tests.
Waddell, Peter J; Ota, Rissa; Penny, David
2009-10-01
Testing fit of data to model is fundamentally important to any science, but publications in the field of phylogenetics rarely do this. Such analyses discard fundamental aspects of science as prescribed by Karl Popper. Indeed, not without cause, Popper (Unended quest: an intellectual autobiography. Fontana, London, 1976) once argued that evolutionary biology was unscientific as its hypotheses were untestable. Here we trace developments in assessing fit from Penny et al. (Nature 297:197-200, 1982) to the present. We compare the general log-likelihood ratio statistic (the G or G² statistic) between the evolutionary tree model and the multinomial model with that of marginalized tests applied to an alignment (using placental mammal coding sequence data). It is seen that the most general test does not reject the fit of data to model (P approximately 0.5), but the marginalized tests do. Tests on pairwise frequency (F) matrices strongly (P < 0.001) reject the most general phylogenetic (GTR) models commonly in use. It is also clear (P < 0.01) that the sequences are not stationary in their nucleotide composition. Deviations from stationarity and homogeneity seem to be unevenly distributed amongst taxa, and not necessarily those expected from examining other regions of the genome. By marginalizing the 4^t patterns of the i.i.d. model to observed and expected parsimony counts (that is, from constant sites, to singletons, to parsimony-informative characters of minimum possible length), the likelihood ratio test regains power, and it too rejects the evolutionary model with P < 0.001. Given such behavior over relatively recent evolutionary time, readers in general should maintain a healthy skepticism of results, as the scale of the systematic errors in published trees may really be far larger than the analytical methods (e.g., bootstrap) report.
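The general statistic in question has the familiar form G = 2·sum(O·ln(O/E)); a minimal sketch on invented category counts is given below, with the degrees of freedom left as a placeholder since they depend on the model being tested.

```python
# Sketch: G (log-likelihood ratio) goodness-of-fit statistic for category counts.
import numpy as np
from scipy.stats import chi2

observed = np.array([520, 310, 95, 40, 20, 15])               # invented pattern counts
expected = np.array([500, 330, 90, 45, 22, 13], dtype=float)  # model expectations
expected *= observed.sum() / expected.sum()                    # match totals

mask = observed > 0
G = 2.0 * np.sum(observed[mask] * np.log(observed[mask] / expected[mask]))
dof = 3                                                        # placeholder: categories - 1 - fitted params
print(f"G = {G:.2f}, p = {chi2.sf(G, dof):.3f}")
```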
Fleischer, A B; Feldman, S R; Barlow, J O; Zheng, B; Hahn, H B; Chuang, T Y; Draft, K S; Golitz, L E; Wu, E; Katz, A S; Maize, J C; Knapp, T; Leshin, B
2001-02-01
Basal cell carcinoma (BCC) is the most common cutaneous malignancy. Surgical experience and physician specialty may affect the outcome quality of surgical excision of BCC. We performed a multicenter retrospective study of BCC excisions submitted to the respective Departments of Pathology at 4 major university medical centers. Our outcome measure was presence of histologic evidence of tumor present in surgical margins of excision specimens (incomplete excision). Clinician experience was defined as the number of excisions that a clinician performed during the study interval. The analytic sample pool included 1459 tumors that met all inclusion and exclusion criteria. Analyses included univariate and multivariate techniques involving the entire sample and separate subsample analyses that excluded 2 outlying dermatologists. Tumor was present at the surgical margins in 243 (16.6%) of 1459 specimens. A patient's sex, age, and tumor size were not significantly related to the presence of tumor in the surgical margin. Physician experience did not demonstrate a significant difference either in the entire sample (P <.09) or in the subsample analysis (P >.30). Tumors of the head and neck were more likely to be incompletely excised than truncal tumors in all the analyses (P <.03). Compared with dermatologists, otolaryngologists (P <.02) and plastic surgeons (P <.008) were more likely to incompletely excise tumors; however, subsample analysis for plastic surgeons found only a trend toward significance (P <.10). Dermatologists and general surgeons did not differ in the likelihood of performing an incomplete excision (P >.4). The physician specialty may affect the quality of care in the surgical management of BCC.
Fuzzy multinomial logistic regression analysis: A multi-objective programming approach
NASA Astrophysics Data System (ADS)
Abdalla, Hesham A.; El-Sayed, Amany A.; Hamed, Ramadan
2017-05-01
Parameter estimation for multinomial logistic regression is usually based on maximizing the likelihood function. For large well-balanced datasets, Maximum Likelihood (ML) estimation is a satisfactory approach. Unfortunately, ML can fail completely or at least produce poor results in terms of estimated probabilities and confidence intervals of parameters, especially for small datasets. In this study, a new approach based on fuzzy concepts is proposed to estimate the parameters of the multinomial logistic regression. The study assumes that the parameters of multinomial logistic regression are fuzzy. Based on the extension principle stated by Zadeh and Bárdossy's proposition, a multi-objective programming approach is suggested to estimate these fuzzy parameters. A simulation study is used to evaluate the performance of the new approach versus the maximum likelihood (ML) approach. Results show that the new proposed model outperforms ML in cases of small datasets.
Donato, David I.
2012-01-01
This report presents the mathematical expressions and the computational techniques required to compute maximum-likelihood estimates for the parameters of the National Descriptive Model of Mercury in Fish (NDMMF), a statistical model used to predict the concentration of methylmercury in fish tissue. The expressions and techniques reported here were prepared to support the development of custom software capable of computing NDMMF parameter estimates more quickly and using less computer memory than is currently possible with available general-purpose statistical software. Computation of maximum-likelihood estimates for the NDMMF by numerical solution of a system of simultaneous equations through repeated Newton-Raphson iterations is described. This report explains the derivation of the mathematical expressions required for computational parameter estimation in sufficient detail to facilitate future derivations for any revised versions of the NDMMF that may be developed.
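The Newton-Raphson scheme itself is generic; the sketch below applies it to a simple logistic regression rather than the NDMMF, iterating the parameter vector with updates built from the score vector and the observed information, which is the same style of repeated Newton-Raphson iteration the report describes.

```python
# Sketch: generic Newton-Raphson maximum likelihood (here for logistic regression).
import numpy as np
from scipy.special import expit

rng = np.random.default_rng(7)
n = 500
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true = np.array([-1.0, 2.0])
y = (rng.random(n) < expit(X @ beta_true)).astype(float)

beta = np.zeros(2)
for it in range(25):
    p = expit(X @ beta)
    score = X.T @ (y - p)                          # gradient of the log-likelihood
    hessian = -(X * (p * (1 - p))[:, None]).T @ X  # second derivative matrix
    step = np.linalg.solve(-hessian, score)        # Newton-Raphson update
    beta = beta + step
    if np.max(np.abs(score)) < 1e-8:
        break

print(f"converged in {it + 1} iterations, beta = {np.round(beta, 2)}")
```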
Worst-Case Flutter Margins from F/A-18 Aircraft Aeroelastic Data
NASA Technical Reports Server (NTRS)
Lind, Rick; Brenner, Marty
1997-01-01
An approach for computing worst-case flutter margins has been formulated in a robust stability framework. Uncertainty operators are included with a linear model to describe modeling errors and flight variations. The structured singular value, mu, computes a stability margin which directly accounts for these uncertainties. This approach introduces a new method of computing flutter margins and an associated new parameter for describing these margins. The mu margins are robust margins which indicate worst-case stability estimates with respect to the defined uncertainty. Worst-case flutter margins are computed for the F/A-18 SRA using uncertainty sets generated by flight data analysis. The robust margins demonstrate flight conditions for flutter may lie closer to the flight envelope than previously estimated by p-k analysis.
Mass Properties for Space Systems Standards Development
NASA Technical Reports Server (NTRS)
Beech, Geoffrey
2013-01-01
Current Verbiage in S-120 Applies to Dry Mass. Mass Margin is difference between Required Mass and Predicted Mass. Performance Margin is difference between Predicted Performance and Required Performance. Performance estimates and corresponding margin should be based on Predicted Mass (and other inputs). Contractor Mass Margin reserved from Performance Margin. Remaining performance margin allocated according to mass partials. Compliance can be evaluated effectively by comparison of three areas (preferably on a single sheet). Basic and Predicted Mass (including historical trend). Aggregate potential changes (threats and opportunities) which gives Mass Forecast. Mass Maturity by category (Estimated/Calculated/Actual).
Maximum Likelihood Estimation of Nonlinear Structural Equation Models.
ERIC Educational Resources Information Center
Lee, Sik-Yum; Zhu, Hong-Tu
2002-01-01
Developed an EM type algorithm for maximum likelihood estimation of a general nonlinear structural equation model in which the E-step is completed by a Metropolis-Hastings algorithm. Illustrated the methodology with results from a simulation study and two real examples using data from previous studies. (SLD)
Bounded Linear Stability Analysis - A Time Delay Margin Estimation Approach for Adaptive Control
NASA Technical Reports Server (NTRS)
Nguyen, Nhan T.; Ishihara, Abraham K.; Krishnakumar, Kalmanje Srinlvas; Bakhtiari-Nejad, Maryam
2009-01-01
This paper presents a method for estimating time delay margin for model-reference adaptive control of systems with almost linear structured uncertainty. The bounded linear stability analysis method seeks to represent the conventional model-reference adaptive law by a locally bounded linear approximation within a small time window using the comparison lemma. The locally bounded linear approximation of the combined adaptive system is cast in a form of an input-time-delay differential equation over a small time window. The time delay margin of this system represents a local stability measure and is computed analytically by a matrix measure method, which provides a simple analytical technique for estimating an upper bound of time delay margin. Based on simulation results for a scalar model-reference adaptive control system, both the bounded linear stability method and the matrix measure method are seen to provide a reasonably accurate and yet not too conservative time delay margin estimation.
The discrete Laplace exponential family and estimation of Y-STR haplotype frequencies.
Andersen, Mikkel Meyer; Eriksen, Poul Svante; Morling, Niels
2013-07-21
Estimating haplotype frequencies is important in e.g. forensic genetics, where the frequencies are needed to calculate the likelihood ratio for the evidential weight of a DNA profile found at a crime scene. Estimation is naturally based on a population model, motivating the investigation of the Fisher-Wright model of evolution for haploid lineage DNA markers. An exponential family (a class of probability distributions that is well understood in probability theory such that inference is easily made by using existing software) called the 'discrete Laplace distribution' is described. We illustrate how well the discrete Laplace distribution approximates a more complicated distribution that arises by investigating the well-known population genetic Fisher-Wright model of evolution by a single-step mutation process. It was shown how the discrete Laplace distribution can be used to estimate haplotype frequencies for haploid lineage DNA markers (such as Y-chromosomal short tandem repeats), which in turn can be used to assess the evidential weight of a DNA profile found at a crime scene. This was done by making inference in a mixture of multivariate, marginally independent, discrete Laplace distributions using the EM algorithm to estimate the probabilities of membership of a set of unobserved subpopulations. The discrete Laplace distribution can be used to estimate haplotype frequencies with lower prediction error than other existing estimators. Furthermore, the calculations could be performed on a normal computer. This method was implemented in the freely available open source software R that is supported on Linux, MacOS and MS Windows. Copyright © 2013 Elsevier Ltd. All rights reserved.
Mixture Rasch Models with Joint Maximum Likelihood Estimation
ERIC Educational Resources Information Center
Willse, John T.
2011-01-01
This research provides a demonstration of the utility of mixture Rasch models. Specifically, a model capable of estimating a mixture partial credit model using joint maximum likelihood is presented. Like the partial credit model, the mixture partial credit model has the beneficial feature of being appropriate for analysis of assessment data…
Model uncertainty estimation and risk assessment are essential to environmental management and informed decision making on pollution mitigation strategies. In this study, we apply a probabilistic methodology, which combines Bayesian Monte Carlo simulation and Maximum Likelihood e...
The Effects of Model Misspecification and Sample Size on LISREL Maximum Likelihood Estimates.
ERIC Educational Resources Information Center
Baldwin, Beatrice
The robustness of LISREL computer program maximum likelihood estimates under specific conditions of model misspecification and sample size was examined. The population model used in this study contains one exogenous variable; three endogenous variables; and eight indicator variables, two for each latent variable. Conditions of model…
Maximum likelihood estimates, from censored data, for mixed-Weibull distributions
NASA Astrophysics Data System (ADS)
Jiang, Siyuan; Kececioglu, Dimitri
1992-06-01
A new algorithm for estimating the parameters of mixed-Weibull distributions from censored data is presented. The algorithm follows the principle of maximum likelihood estimation (MLE) through the expectation-maximization (EM) algorithm, and it is derived for both postmortem and nonpostmortem time-to-failure data. It is concluded that the concept of the EM algorithm is easy to understand and apply (only elementary statistics and calculus are required). The log-likelihood function cannot decrease after an EM sequence; this important feature was observed in all of the numerical calculations. The MLEs of the nonpostmortem data were obtained successfully for mixed-Weibull distributions with up to 14 parameters (a five-subpopulation mixed-Weibull distribution). Numerical examples indicate that some of the log-likelihood functions of the mixed-Weibull distributions have multiple local maxima; therefore, the algorithm should start at several initial guesses of the parameter set.
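To make the estimation target concrete, the sketch below writes down the censored-data log-likelihood that such an EM algorithm maximizes, for the simplest two-subpopulation case. The parameterization (mixing weight w, shapes b, scales e) and the restriction to right-censored failure data are simplifying assumptions of this illustration, not the paper's full formulation.

    import numpy as np

    def weib_pdf(t, shape, scale):
        z = t / scale
        return (shape / scale) * z ** (shape - 1.0) * np.exp(-z ** shape)

    def weib_sf(t, shape, scale):
        return np.exp(-(t / scale) ** shape)

    def mixed_weibull_loglik(params, t, censored):
        """Censored-data log-likelihood for a two-subpopulation mixed Weibull.

        params = (w, shape1, scale1, shape2, scale2); `censored` is a boolean
        array marking right-censored observations (suspensions).
        """
        w, b1, e1, b2, e2 = params
        pdf = w * weib_pdf(t, b1, e1) + (1 - w) * weib_pdf(t, b2, e2)
        sf = w * weib_sf(t, b1, e1) + (1 - w) * weib_sf(t, b2, e2)
        return np.sum(np.log(pdf[~censored])) + np.sum(np.log(sf[censored]))

An EM or direct numerical maximization would repeatedly evaluate such a function; as the abstract notes, the possibility of multiple local maxima makes several starting points advisable.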
Ning, Jing; Chen, Yong; Piao, Jin
2017-07-01
Publication bias occurs when the published research results are systematically unrepresentative of the population of studies that have been conducted, and is a potential threat to meaningful meta-analysis. The Copas selection model provides a flexible framework for correcting estimates and offers considerable insight into the publication bias. However, maximizing the observed likelihood under the Copas selection model is challenging because the observed data contain very little information on the latent variable. In this article, we study a Copas-like selection model and propose an expectation-maximization (EM) algorithm for estimation based on the full likelihood. Empirical simulation studies show that the EM algorithm and its associated inferential procedure perform well and avoid the non-convergence problem encountered when maximizing the observed likelihood. © The Author 2017. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Likelihood-based confidence intervals for estimating floods with given return periods
NASA Astrophysics Data System (ADS)
Martins, Eduardo Sávio P. R.; Clarke, Robin T.
1993-06-01
This paper discusses aspects of the calculation of likelihood-based confidence intervals for T-year floods, with particular reference to (1) the two-parameter gamma distribution; (2) the Gumbel distribution; (3) the two-parameter log-normal distribution, and other distributions related to the normal by Box-Cox transformations. Calculation of the confidence limits is straightforward using the Nelder-Mead algorithm with a constraint incorporated, although care is necessary to ensure convergence either of the Nelder-Mead algorithm, or of the Newton-Raphson calculation of maximum-likelihood estimates. Methods are illustrated using records from 18 gauging stations in the basin of the River Itajai-Acu, State of Santa Catarina, southern Brazil. A small and restricted simulation compared likelihood-based confidence limits with those given by use of the central limit theorem; for the same confidence probability, the confidence limits of the simulation were wider than those of the central limit theorem, which failed more frequently to contain the true quantile being estimated. The paper discusses possible applications of likelihood-based confidence intervals in other areas of hydrological analysis.
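As one concrete instance of the likelihood-based intervals discussed above, the sketch below profiles the Gumbel log-likelihood over the T-year quantile and collects the values that pass the chi-squared cutoff. The grid limits, the 95% default, and the use of scipy's built-in Gumbel fit are assumptions of this illustration rather than the paper's constrained Nelder-Mead or Newton-Raphson implementation, and the grid presumes positive flood magnitudes.

    import numpy as np
    from scipy.optimize import minimize_scalar
    from scipy.stats import chi2, gumbel_r

    def gumbel_loglik(x, loc, scale):
        z = (x - loc) / scale
        return -len(x) * np.log(scale) - np.sum(z) - np.sum(np.exp(-z))

    def profile_ci_quantile(x, T=100, level=0.95):
        """Likelihood-based confidence interval for the T-year flood under a Gumbel model."""
        yT = -np.log(-np.log(1.0 - 1.0 / T))            # reduced Gumbel variate for return period T
        loc, scale = gumbel_r.fit(x)                    # unconstrained maximum likelihood estimates
        lmax = gumbel_loglik(x, loc, scale)
        xT_hat = loc + scale * yT
        cutoff = lmax - 0.5 * chi2.ppf(level, df=1)

        def profile(xT):                                # maximize over the scale with xT held fixed
            res = minimize_scalar(
                lambda s: -gumbel_loglik(x, xT - s * yT, s),
                bounds=(1e-6, 10 * scale), method="bounded")
            return -res.fun

        grid = np.linspace(0.5 * xT_hat, 2.0 * xT_hat, 400)
        inside = grid[np.array([profile(v) for v in grid]) >= cutoff]
        return xT_hat, inside.min(), inside.max()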
Chan, Siew Foong; Deeks, Jonathan J; Macaskill, Petra; Irwig, Les
2008-01-01
To compare three predictive models based on logistic regression to estimate adjusted likelihood ratios allowing for interdependency between diagnostic variables (tests). This study was a review of the theoretical basis, assumptions, and limitations of published models, and a statistical extension of methods and application to a case study of the diagnosis of obstructive airways disease based on history and clinical examination. Albert's method includes an offset term to estimate an adjusted likelihood ratio for combinations of tests. The Spiegelhalter and Knill-Jones method uses the unadjusted likelihood ratio for each test as a predictor and computes shrinkage factors to allow for interdependence. Knottnerus' method differs from the other methods because it requires sequencing of tests, which limits its application to situations where there are few tests and substantial data. Although parameter estimates differed between the models, predicted "posttest" probabilities were generally similar. Construction of predictive models using logistic regression is preferred to the independence Bayes' approach when it is important to adjust for dependence between test errors. Methods to estimate adjusted likelihood ratios from predictive models should be considered in preference to a standard logistic regression model to facilitate ease of interpretation and application. Albert's method provides the most straightforward approach.
An Evaluation of Hierarchical Bayes Estimation for the Two- Parameter Logistic Model.
ERIC Educational Resources Information Center
Kim, Seock-Ho
Hierarchical Bayes procedures for the two-parameter logistic item response model were compared for estimating item parameters. Simulated data sets were analyzed using two different Bayes estimation procedures, the two-stage hierarchical Bayes estimation (HB2) and the marginal Bayesian with known hyperparameters (MB), and marginal maximum…
Soil erosion and management activities on forested slopes
Robert R. Ziemer
1986-01-01
Some of the most productive forests in the Western United States grow on marginally stable mountainous slopes, where disturbance increases the likelihood of erosion. Much of the public's concern about, and, consequently, most of the research on, erosion from these forested areas is related more to the degradation of stream resources by eroded material than to the...
Social preferences toward energy generation with woody biomass from public forests in Montana, USA
Robert M. Campbell; Tyron J. Venn; Nathaniel M. Anderson
2016-01-01
In Montana, USA, there are substantial opportunities for mechanized thinning treatments on public forests to reduce the likelihood of severe and damaging wildfires and improve forest health. These treatments produce residues that can be used to generate renewable energy and displace fossil fuels. The choice modeling method is employed to examine the marginal...
State Need-Based Aid and Four-Year College Student Retention: A Statewide Study
ERIC Educational Resources Information Center
McFall, Kara Lynn
2013-01-01
Every college age student should have the opportunity to attend college and earn a degree, but the fiscal realities for lower income students prevent the majority from attending and the vast majority from completing college, thus perpetuating an intergenerational trend of limited postsecondary education and a likelihood of marginal income and…
Ting, Chih-Chung; Yu, Chia-Chen; Maloney, Laurence T.
2015-01-01
In Bayesian decision theory, knowledge about the probabilities of possible outcomes is captured by a prior distribution and a likelihood function. The prior reflects past knowledge and the likelihood summarizes current sensory information. The two combined (integrated) form a posterior distribution that allows estimation of the probability of different possible outcomes. In this study, we investigated the neural mechanisms underlying Bayesian integration using a novel lottery decision task in which both prior knowledge and likelihood information about reward probability were systematically manipulated on a trial-by-trial basis. Consistent with Bayesian integration, as sample size increased, subjects tended to weigh likelihood information more compared with prior information. Using fMRI in humans, we found that the medial prefrontal cortex (mPFC) correlated with the mean of the posterior distribution, a statistic that reflects the integration of prior knowledge and likelihood of reward probability. Subsequent analysis revealed that both prior and likelihood information were represented in mPFC and that the neural representations of prior and likelihood in mPFC reflected changes in the behaviorally estimated weights assigned to these different sources of information in response to changes in the environment. Together, these results establish the role of mPFC in prior-likelihood integration and highlight its involvement in representing and integrating these distinct sources of information. PMID:25632152
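The prior-likelihood integration manipulated in that task has a textbook conjugate analogue. The sketch below combines a Beta prior over reward probability with binomial samples of different sizes; the numbers are arbitrary illustrations, not the experimental stimuli, but they show the same qualitative effect reported above: larger samples pull the posterior toward the likelihood.

    from scipy.stats import beta

    a0, b0 = 6.0, 4.0                          # prior over reward probability, mean 0.6

    for n, k in [(5, 2), (50, 20)]:            # small vs. large samples, both with 40% successes
        posterior = beta(a0 + k, b0 + n - k)   # conjugate update: prior combined with likelihood
        print(n, round(posterior.mean(), 3))   # 0.533 for n=5; 0.433 for n=50, closer to the data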
NASA Technical Reports Server (NTRS)
Chittineni, C. B.
1979-01-01
The problem of estimating label imperfections and the use of the estimation in identifying mislabeled patterns is presented. Expressions for the maximum likelihood estimates of classification errors and a priori probabilities are derived from the classification of a set of labeled patterns. Expressions also are given for the asymptotic variances of probability of correct classification and proportions. Simple models are developed for imperfections in the labels and for classification errors and are used in the formulation of a maximum likelihood estimation scheme. Schemes are presented for the identification of mislabeled patterns in terms of threshold on the discriminant functions for both two-class and multiclass cases. Expressions are derived for the probability that the imperfect label identification scheme will result in a wrong decision and are used in computing thresholds. The results of practical applications of these techniques in the processing of remotely sensed multispectral data are presented.
Simple Penalties on Maximum-Likelihood Estimates of Genetic Parameters to Reduce Sampling Variation
Meyer, Karin
2016-01-01
Multivariate estimates of genetic parameters are subject to substantial sampling variation, especially for smaller data sets and more than a few traits. A simple modification of standard, maximum-likelihood procedures for multivariate analyses to estimate genetic covariances is described, which can improve estimates by substantially reducing their sampling variances. This is achieved by maximizing the likelihood subject to a penalty. Borrowing from Bayesian principles, we propose a mild, default penalty—derived assuming a Beta distribution of scale-free functions of the covariance components to be estimated—rather than laboriously attempting to determine the stringency of penalization from the data. An extensive simulation study is presented, demonstrating that such penalties can yield very worthwhile reductions in loss, i.e., the difference from population values, for a wide range of scenarios and without distorting estimates of phenotypic covariances. Moreover, mild default penalties tend not to increase loss in difficult cases and, on average, achieve reductions in loss of similar magnitude to computationally demanding schemes to optimize the degree of penalization. Pertinent details required for the adaptation of standard algorithms to locate the maximum of the likelihood function are outlined. PMID:27317681
ERIC Educational Resources Information Center
Beauducel, Andre; Herzberg, Philipp Yorck
2006-01-01
This simulation study compared maximum likelihood (ML) estimation with weighted least squares means and variance adjusted (WLSMV) estimation. The study was based on confirmatory factor analyses with 1, 2, 4, and 8 factors, based on 250, 500, 750, and 1,000 cases, and on 5, 10, 20, and 40 variables with 2, 3, 4, 5, and 6 categories. There was no…
A maximum pseudo-profile likelihood estimator for the Cox model under length-biased sampling
Huang, Chiung-Yu; Qin, Jing; Follmann, Dean A.
2012-01-01
This paper considers semiparametric estimation of the Cox proportional hazards model for right-censored and length-biased data arising from prevalent sampling. To exploit the special structure of length-biased sampling, we propose a maximum pseudo-profile likelihood estimator, which can handle time-dependent covariates and is consistent under covariate-dependent censoring. Simulation studies show that the proposed estimator is more efficient than its competitors. A data analysis illustrates the methods and theory. PMID:23843659
ERIC Educational Resources Information Center
Klein, Andreas G.; Muthen, Bengt O.
2007-01-01
In this article, a nonlinear structural equation model is introduced and a quasi-maximum likelihood method for simultaneous estimation and testing of multiple nonlinear effects is developed. The focus of the new methodology lies on efficiency, robustness, and computational practicability. Monte-Carlo studies indicate that the method is highly…
Likelihood-Based Confidence Intervals in Exploratory Factor Analysis
ERIC Educational Resources Information Center
Oort, Frans J.
2011-01-01
In exploratory or unrestricted factor analysis, all factor loadings are free to be estimated. In oblique solutions, the correlations between common factors are free to be estimated as well. The purpose of this article is to show how likelihood-based confidence intervals can be obtained for rotated factor loadings and factor correlations, by…
Estimation of Complex Generalized Linear Mixed Models for Measurement and Growth
ERIC Educational Resources Information Center
Jeon, Minjeong
2012-01-01
Maximum likelihood (ML) estimation of generalized linear mixed models (GLMMs) is technically challenging because of the intractable likelihoods that involve high dimensional integrations over random effects. The problem is magnified when the random effects have a crossed design and thus the data cannot be reduced to small independent clusters. A…
ERIC Educational Resources Information Center
Adank, Patti
2012-01-01
The role of speech production mechanisms in difficult speech comprehension is the subject of on-going debate in speech science. Two Activation Likelihood Estimation (ALE) analyses were conducted on neuroimaging studies investigating difficult speech comprehension or speech production. Meta-analysis 1 included 10 studies contrasting comprehension…
Deep Unfolding for Topic Models.
Chien, Jen-Tzung; Lee, Chao-Hsi
2018-02-01
Deep unfolding provides an approach to integrate probabilistic generative models and deterministic neural networks. Such an approach benefits from deep representation, easy interpretation, flexible learning, and stochastic modeling. This study develops the unsupervised and supervised learning of deep unfolded topic models for document representation and classification. Conventionally, the unsupervised and supervised topic models are inferred via the variational inference algorithm, where the model parameters are estimated by maximizing the lower bound of the logarithm of the marginal likelihood using input documents without and with class labels, respectively. The representation capability or classification accuracy is constrained by the variational lower bound and the model parameters tied across the inference procedure. This paper aims to relax these constraints by directly maximizing the end performance criterion and continuously untying the parameters in the learning process via deep unfolding inference (DUI). The inference procedure is treated as layer-wise learning in a deep neural network. The end performance is iteratively improved by using the estimated topic parameters according to the exponentiated updates. Deep learning of topic models is therefore implemented through a back-propagation procedure. Experimental results show the merits of DUI with an increasing number of layers compared with variational inference in unsupervised as well as supervised topic models.
Semiparametric regression analysis of interval-censored competing risks data.
Mao, Lu; Lin, Dan-Yu; Zeng, Donglin
2017-09-01
Interval-censored competing risks data arise when each study subject may experience an event or failure from one of several causes and the failure time is not observed directly but rather is known to lie in an interval between two examinations. We formulate the effects of possibly time-varying (external) covariates on the cumulative incidence or sub-distribution function of competing risks (i.e., the marginal probability of failure from a specific cause) through a broad class of semiparametric regression models that captures both proportional and non-proportional hazards structures for the sub-distribution. We allow each subject to have an arbitrary number of examinations and accommodate missing information on the cause of failure. We consider nonparametric maximum likelihood estimation and devise a fast and stable EM-type algorithm for its computation. We then establish the consistency, asymptotic normality, and semiparametric efficiency of the resulting estimators for the regression parameters by appealing to modern empirical process theory. In addition, we show through extensive simulation studies that the proposed methods perform well in realistic situations. Finally, we provide an application to a study on HIV-1 infection with different viral subtypes. © 2017, The International Biometric Society.
Variational Bayesian identification and prediction of stochastic nonlinear dynamic causal models.
Daunizeau, J; Friston, K J; Kiebel, S J
2009-11-01
In this paper, we describe a general variational Bayesian approach for approximate inference on nonlinear stochastic dynamic models. This scheme extends established approximate inference on hidden-states to cover: (i) nonlinear evolution and observation functions, (ii) unknown parameters and (precision) hyperparameters and (iii) model comparison and prediction under uncertainty. Model identification or inversion entails the estimation of the marginal likelihood or evidence of a model. This difficult integration problem can be finessed by optimising a free-energy bound on the evidence using results from variational calculus. This yields a deterministic update scheme that optimises an approximation to the posterior density on the unknown model variables. We derive such a variational Bayesian scheme in the context of nonlinear stochastic dynamic hierarchical models, for both model identification and time-series prediction. The computational complexity of the scheme is comparable to that of an extended Kalman filter, which is critical when inverting high dimensional models or long time-series. Using Monte-Carlo simulations, we assess the estimation efficiency of this variational Bayesian approach using three stochastic variants of chaotic dynamic systems. We also demonstrate the model comparison capabilities of the method, its self-consistency and its predictive power.
Tolkoff, Max R; Alfaro, Michael E; Baele, Guy; Lemey, Philippe; Suchard, Marc A
2018-05-01
Phylogenetic comparative methods explore the relationships between quantitative traits adjusting for shared evolutionary history. This adjustment often occurs through a Brownian diffusion process along the branches of the phylogeny that generates model residuals or the traits themselves. For high-dimensional traits, inferring all pair-wise correlations within the multivariate diffusion is limiting. To circumvent this problem, we propose phylogenetic factor analysis (PFA) that assumes a small unknown number of independent evolutionary factors arise along the phylogeny and these factors generate clusters of dependent traits. Set in a Bayesian framework, PFA provides measures of uncertainty on the factor number and groupings, combines both continuous and discrete traits, integrates over missing measurements and incorporates phylogenetic uncertainty with the help of molecular sequences. We develop Gibbs samplers based on dynamic programming to estimate the PFA posterior distribution, over 3-fold faster than for multivariate diffusion and a further order-of-magnitude more efficiently in the presence of latent traits. We further propose a novel marginal likelihood estimator for previously impractical models with discrete data and find that PFA also provides a better fit than multivariate diffusion in evolutionary questions in columbine flower development, placental reproduction transitions and triggerfish fin morphometry.
Hierarchical Probabilistic Inference of Cosmic Shear
NASA Astrophysics Data System (ADS)
Schneider, Michael D.; Hogg, David W.; Marshall, Philip J.; Dawson, William A.; Meyers, Joshua; Bard, Deborah J.; Lang, Dustin
2015-07-01
Point estimators for the shearing of galaxy images induced by gravitational lensing involve a complex inverse problem in the presence of noise, pixelization, and model uncertainties. We present a probabilistic forward modeling approach to gravitational lensing inference that has the potential to mitigate the biased inferences in most common point estimators and is practical for upcoming lensing surveys. The first part of our statistical framework requires specification of a likelihood function for the pixel data in an imaging survey given parameterized models for the galaxies in the images. We derive the lensing shear posterior by marginalizing over all intrinsic galaxy properties that contribute to the pixel data (i.e., not limited to galaxy ellipticities) and learn the distributions for the intrinsic galaxy properties via hierarchical inference with a suitably flexible conditional probability distribution specification. We use importance sampling to separate the modeling of small imaging areas from the global shear inference, thereby rendering our algorithm computationally tractable for large surveys. With simple numerical examples we demonstrate the improvements in accuracy from our importance sampling approach, as well as the significance of the conditional distribution specification for the intrinsic galaxy properties when the data are generated from an unknown number of distinct galaxy populations with different morphological characteristics.
Approximated maximum likelihood estimation in multifractal random walks
NASA Astrophysics Data System (ADS)
Løvsletten, O.; Rypdal, M.
2012-04-01
We present an approximated maximum likelihood method for the multifractal random walk processes of [E. Bacry et al., Phys. Rev. E 64, 026103 (2001)]. The likelihood is computed using a Laplace approximation and a truncation in the dependency structure for the latent volatility. The procedure is implemented as a package in the R computer language. Its performance is tested on synthetic data and compared to an inference approach based on the generalized method of moments. The method is applied to estimate parameters for various financial stock indices.
Two models for evaluating landslide hazards
Davis, J.C.; Chung, C.-J.; Ohlmacher, G.C.
2006-01-01
Two alternative procedures for estimating landslide hazards were evaluated using data on topographic digital elevation models (DEMs) and bedrock lithologies in an area adjacent to the Missouri River in Atchison County, Kansas, USA. The two procedures are based on the likelihood ratio model but utilize different assumptions. The empirical likelihood ratio model is based on non-parametric empirical univariate frequency distribution functions under an assumption of conditional independence, while the multivariate logistic discriminant model assumes that likelihood ratios can be expressed in terms of logistic functions. The relative hazards of occurrence of landslides were estimated by an empirical likelihood ratio model and by multivariate logistic discriminant analysis. Predictor variables consisted of grids containing topographic elevations, slope angles, and slope aspects calculated from a 30-m DEM. An integer grid of coded bedrock lithologies taken from digitized geologic maps was also used as a predictor variable. Both statistical models yield relative estimates in the form of the proportion of total map area predicted to already contain or to be the site of future landslides. The stabilities of estimates were checked by cross-validation of results from random subsamples, using each of the two procedures. Cell-by-cell comparisons of hazard maps made by the two models show that the two sets of estimates are virtually identical. This suggests that the empirical likelihood ratio and the logistic discriminant analysis models are robust with respect to the conditional independence assumption and the logistic function assumption, respectively, and that either model can be used successfully to evaluate landslide hazards. © 2006.
Li, Xiang; Kuk, Anthony Y C; Xu, Jinfeng
2014-12-10
Human biomonitoring of exposure to environmental chemicals is important. Individual monitoring is not viable because of low individual exposure level or insufficient volume of materials and the prohibitive cost of taking measurements from many subjects. Pooling of samples is an efficient and cost-effective way to collect data. Estimation is, however, complicated as individual values within each pool are not observed but are only known up to their average or weighted average. The distribution of such averages is intractable when the individual measurements are lognormally distributed, which is a common assumption. We propose to replace the intractable distribution of the pool averages by a Gaussian likelihood to obtain parameter estimates. If the pool size is large, this method produces statistically efficient estimates, but regardless of pool size, the method yields consistent estimates as the number of pools increases. An empirical Bayes (EB) Gaussian likelihood approach, as well as its Bayesian analog, is developed to pool information from various demographic groups by using a mixed-effect formulation. We also discuss methods to estimate the underlying mean-variance relationship and to select a good model for the means, which can be incorporated into the proposed EB or Bayes framework. By borrowing strength across groups, the EB estimator is more efficient than the individual group-specific estimator. Simulation results show that the EB Gaussian likelihood estimates outperform a previous method proposed for the National Health and Nutrition Examination Surveys with much smaller bias and better coverage in interval estimation, especially after correction of bias. Copyright © 2014 John Wiley & Sons, Ltd.
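A minimal sketch of the central idea follows: the intractable distribution of a pool average of lognormal measurements is replaced by a Gaussian working likelihood matched to its mean and variance, which is then maximized numerically. The pool sizes, the true parameters, and the empirical Bayes and mean-variance modeling layers of the paper are not reproduced; the simulated data are illustrative only.

    import numpy as np
    from scipy.optimize import minimize
    from scipy.stats import norm

    def neg_loglik(theta, pool_means, pool_sizes):
        """Gaussian working likelihood for averages of pools of lognormal measurements."""
        mu, log_sigma = theta
        sigma2 = np.exp(log_sigma) ** 2
        m_ind = np.exp(mu + 0.5 * sigma2)                          # individual-level mean
        v_ind = (np.exp(sigma2) - 1.0) * np.exp(2 * mu + sigma2)   # individual-level variance
        sd_pool = np.sqrt(v_ind / pool_sizes)                      # variance shrinks with pool size
        return -np.sum(norm.logpdf(pool_means, loc=m_ind, scale=sd_pool))

    rng = np.random.default_rng(1)
    sizes = np.full(40, 8)                                         # 40 pools of 8 subjects each
    pools = np.array([rng.lognormal(0.5, 0.8, s).mean() for s in sizes])
    fit = minimize(neg_loglik, x0=[0.0, 0.0], args=(pools, sizes))
    mu_hat, sigma_hat = fit.x[0], np.exp(fit.x[1])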
Harbert, Robert S; Nixon, Kevin C
2015-08-01
Plant distributions have long been understood to be correlated with the environmental conditions to which species are adapted. Climate is one of the major components driving species distributions. Therefore, it is expected that the plants coexisting in a community are reflective of the local environment, particularly climate. Presented here is a method for the estimation of climate from local plant species coexistence data. The method, Climate Reconstruction Analysis using Coexistence Likelihood Estimation (CRACLE), is a likelihood-based method that employs specimen collection data at a global scale for the inference of species climate tolerance. CRACLE calculates the maximum joint likelihood of coexistence given individual species climate tolerance characterization to estimate the expected climate. Plant distribution data for more than 4000 species were used to show that this method accurately infers expected climate profiles for 165 sites with diverse climatic conditions. Estimates differ from the WorldClim global climate model by less than 1.5°C on average for mean annual temperature and less than ∼250 mm for mean annual precipitation. This is a significant improvement upon other plant-based climate-proxy methods. CRACLE validates long hypothesized interactions between climate and local associations of plant species. Furthermore, CRACLE successfully estimates climate that is consistent with the widely used WorldClim model and therefore may be applied to the quantitative estimation of paleoclimate in future studies. © 2015 Botanical Society of America, Inc.
Marginal Economic Value of Streamflow: A Case Study for the Colorado River Basin
NASA Astrophysics Data System (ADS)
Brown, Thomas C.; Harding, Benjamin L.; Payton, Elizabeth A.
1990-12-01
The marginal economic value of streamflow leaving forested areas in the Colorado River Basin was estimated by determining the impact on water use of a small change in streamflow and then applying economic value estimates to the water use changes. The effect on water use of a change in streamflow was estimated with a network flow model that simulated salinity levels and the routing of flow to consumptive uses and hydroelectric dams throughout the Basin. The results show that, under current water management institutions, the marginal value of streamflow in the Colorado River Basin is largely determined by nonconsumptive water uses, principally energy production, rather than by consumptive agricultural or municipal uses. The analysis demonstrates the importance of a systems framework in estimating the marginal value of streamflow.
On Time Delay Margin Estimation for Adaptive Control and Optimal Control Modification
NASA Technical Reports Server (NTRS)
Nguyen, Nhan T.
2011-01-01
This paper presents methods for estimating time delay margin for adaptive control of input delay systems with almost linear structured uncertainty. The bounded linear stability analysis method seeks to represent an adaptive law by a locally bounded linear approximation within a small time window. The time delay margin of this input delay system represents a local stability measure and is computed analytically by three methods: Pade approximation, Lyapunov-Krasovskii method, and the matrix measure method. These methods are applied to the standard model-reference adaptive control, s-modification adaptive law, and optimal control modification adaptive law. The windowing analysis results in non-unique estimates of the time delay margin since it is dependent on the length of a time window and parameters which vary from one time window to the next. The optimal control modification adaptive law overcomes this limitation in that, as the adaptive gain tends to infinity and if the matched uncertainty is linear, then the closed-loop input delay system tends to an LTI system. A lower bound of the time delay margin of this system can then be estimated uniquely without the need for the windowing analysis. Simulation results demonstrate the feasibility of the bounded linear stability method for time delay margin estimation.
Salem, Shady; Chang, Sam S; Clark, Peter E; Davis, Rodney; Herrell, S Duke; Kordan, Yakup; Wills, Marcia L; Shappell, Scott B; Baumgartner, Roxelyn; Phillips, Sharon; Smith, Joseph A; Cookson, Michael S; Barocas, Daniel A
2010-10-01
Whole mount processing is more resource intensive than routine systematic sampling of radical retropubic prostatectomy specimens. We compared whole mount and systematic sampling for detecting pathological outcomes, and compared the prognostic value of pathological findings across pathological methods. We included men (608 whole mount and 525 systematic sampling samples) with no prior treatment who underwent radical retropubic prostatectomy at Vanderbilt University Medical Center between January 2000 and June 2008. We used univariate and multivariate analysis to compare the pathological outcome detection rate between pathological methods. Kaplan-Meier curves and the log rank test were used to compare the prognostic value of pathological findings across pathological methods. There were no significant differences between the whole mount and the systematic sampling groups in detecting extraprostatic extension (25% vs 30%), positive surgical margins (31% vs 31%), pathological Gleason score less than 7 (49% vs 43%), 7 (39% vs 43%) or greater than 7 (12% vs 13%), seminal vesicle invasion (8% vs 10%) or lymph node involvement (3% vs 5%). Tumor volume was higher in the systematic sampling group and whole mount detected more multiple surgical margins (each p <0.01). There were no significant differences in the likelihood of biochemical recurrence between the pathological methods when patients were stratified by pathological outcome. Except for estimated tumor volume and multiple margins whole mount and systematic sampling yield similar pathological information. Each method stratifies patients into comparable risk groups for biochemical recurrence. Thus, while whole mount is more resource intensive, it does not appear to result in improved detection of clinically important pathological outcomes or prognostication. Copyright © 2010 American Urological Association Education and Research, Inc. Published by Elsevier Inc. All rights reserved.
Maximum-likelihood methods in wavefront sensing: stochastic models and likelihood functions
Barrett, Harrison H.; Dainty, Christopher; Lara, David
2008-01-01
Maximum-likelihood (ML) estimation in wavefront sensing requires careful attention to all noise sources and all factors that influence the sensor data. We present detailed probability density functions for the output of the image detector in a wavefront sensor, conditional not only on wavefront parameters but also on various nuisance parameters. Practical ways of dealing with nuisance parameters are described, and final expressions for likelihoods and Fisher information matrices are derived. The theory is illustrated by discussing Shack–Hartmann sensors, and computational requirements are discussed. Simulation results show that ML estimation can significantly increase the dynamic range of a Shack–Hartmann sensor with four detectors and that it can reduce the residual wavefront error when compared with traditional methods. PMID:17206255
A non-stationary cost-benefit based bivariate extreme flood estimation approach
NASA Astrophysics Data System (ADS)
Qi, Wei; Liu, Junguo
2018-02-01
Cost-benefit analysis and flood frequency analysis have been integrated into a comprehensive framework to estimate cost-effective design values. However, previous cost-benefit based extreme flood estimation is based on stationary assumptions and analyzes dependent flood variables separately. A Non-Stationary Cost-Benefit based bivariate design flood estimation (NSCOBE) approach is developed in this study to investigate the influence of non-stationarities in both the dependence of flood variables and the marginal distributions on extreme flood estimation. The dependence is modeled utilizing copula functions. Previous design flood selection criteria are not suitable for NSCOBE since they ignore the time-changing dependence of flood variables. Therefore, a risk calculation approach is proposed based on non-stationarities in both marginal probability distributions and copula functions. A case study with 54-year observed data is utilized to illustrate the application of NSCOBE. Results show NSCOBE can effectively integrate non-stationarities in both copula functions and marginal distributions into cost-benefit based design flood estimation. It is also found that there is a trade-off between maximum probability of exceedance calculated from copula functions and marginal distributions. This study for the first time provides a new approach towards a better understanding of the influence of non-stationarities in both copula functions and marginal distributions on extreme flood estimation, and could be beneficial to cost-benefit based non-stationary bivariate design flood estimation across the world.
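The copula side of such a framework can be illustrated with a stationary sketch: a Gumbel copula linking two flood variables (say, peak and volume) and the joint "AND" exceedance probability it implies. The copula family, the parameter theta = 2, and the 0.99 marginal levels are illustrative assumptions; a non-stationary version would simply let theta and the marginal parameters vary with time.

    import numpy as np

    def gumbel_copula(u, v, theta):
        """Gumbel copula C(u, v) = exp(-[(-ln u)^theta + (-ln v)^theta]^(1/theta)), theta >= 1."""
        return np.exp(-(((-np.log(u)) ** theta + (-np.log(v)) ** theta) ** (1.0 / theta)))

    def joint_exceedance(u, v, theta):
        """P(U > u and V > v) for marginal non-exceedance probabilities u and v."""
        return 1.0 - u - v + gumbel_copula(u, v, theta)

    # 100-year marginal levels for peak and volume, moderate upper-tail dependence
    p_and = joint_exceedance(0.99, 0.99, theta=2.0)
    print(1.0 / p_and)   # effective joint return period in years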
ERIC Educational Resources Information Center
Han, Kyung T.; Guo, Fanmin
2014-01-01
The full-information maximum likelihood (FIML) method makes it possible to estimate and analyze structural equation models (SEM) even when data are partially missing, enabling incomplete data to contribute to model estimation. The cornerstone of FIML is the missing-at-random (MAR) assumption. In (unidimensional) computerized adaptive testing…
Constrained Maximum Likelihood Estimation for Two-Level Mean and Covariance Structure Models
ERIC Educational Resources Information Center
Bentler, Peter M.; Liang, Jiajuan; Tang, Man-Lai; Yuan, Ke-Hai
2011-01-01
Maximum likelihood is commonly used for the estimation of model parameters in the analysis of two-level structural equation models. Constraints on model parameters could be encountered in some situations such as equal factor loadings for different factors. Linear constraints are the most common ones and they are relatively easy to handle in…
Bayesian logistic regression approaches to predict incorrect DRG assignment.
Suleiman, Mani; Demirhan, Haydar; Boyd, Leanne; Girosi, Federico; Aksakalli, Vural
2018-05-07
Episodes of care involving similar diagnoses and treatments and requiring similar levels of resource utilisation are grouped to the same Diagnosis-Related Group (DRG). In jurisdictions which implement DRG based payment systems, DRGs are a major determinant of funding for inpatient care. Hence, service providers often dedicate auditing staff to the task of checking that episodes have been coded to the correct DRG. The use of statistical models to estimate an episode's probability of DRG error can significantly improve the efficiency of clinical coding audits. This study implements Bayesian logistic regression models with weakly informative prior distributions to estimate the likelihood that episodes require a DRG revision, comparing these models with each other and to classical maximum likelihood estimates. All Bayesian approaches had more stable model parameters than maximum likelihood. The best performing Bayesian model improved overall classification performance by 6% compared to maximum likelihood, and by 34% compared to random classification. We found that the original DRG, the coder, and the day of coding all have a significant effect on the likelihood of DRG error. Use of Bayesian approaches has improved model parameter stability and classification accuracy. This method has already led to improved audit efficiency in an operational capacity.
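A minimal sketch of the modeling idea rather than the study's implementation: logistic regression with a weakly informative Gaussian prior on the coefficients, fitted here by maximizing the penalized (posterior) log-likelihood. The prior scale of 2.5, the simulated predictors, and the MAP rather than fully Bayesian fit are assumptions of this illustration.

    import numpy as np
    from scipy.optimize import minimize
    from scipy.special import expit

    def neg_log_posterior(beta, X, y, prior_sd=2.5):
        """Logistic log-likelihood plus a weakly informative N(0, prior_sd^2) prior on each coefficient."""
        eta = X @ beta
        loglik = np.sum(y * eta - np.logaddexp(0.0, eta))
        logprior = -0.5 * np.sum((beta / prior_sd) ** 2)
        return -(loglik + logprior)

    rng = np.random.default_rng(0)
    X = np.column_stack([np.ones(200), rng.normal(size=(200, 3))])     # intercept + 3 predictors
    y = rng.binomial(1, expit(X @ np.array([-1.0, 0.8, 0.0, -0.5])))
    fit = minimize(neg_log_posterior, x0=np.zeros(4), args=(X, y))
    beta_map = fit.x          # MAP estimates, shrunk toward zero relative to maximum likelihood

The Gaussian prior acts much like a ridge penalty, keeping coefficients finite and stable when revisions are rare, which is the kind of stability gain the study reports.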
Risk Indicators for Periodontitis in US Adults: NHANES 2009 to 2012.
Eke, Paul I; Wei, Liang; Thornton-Evans, Gina O; Borrell, Luisa N; Borgnakke, Wenche S; Dye, Bruce; Genco, Robert J
2016-10-01
Through the use of optimal surveillance measures and standard case definitions, it is now possible to more accurately determine population-average risk profiles for severe (SP) and non-severe periodontitis (NSP) in adults (aged 30 years and older) in the United States. Data from the 2009 to 2012 National Health and Nutrition Examination Survey were used, which, for the first time, used the "gold standard" full-mouth periodontitis surveillance protocol to classify severity of periodontitis following suggested Centers for Disease Control/American Academy of Periodontology case definitions. Probabilities of periodontitis by: 1) sociodemographics, 2) behavioral factors, and 3) comorbid conditions were assessed using prevalence ratios (PRs) estimated by predicted marginal probability from multivariable generalized logistic regression models. Analyses were further stratified by sex for each classification of periodontitis. Likelihood of total periodontitis (TP) increased with age for overall and NSP relative to non-periodontitis. Compared with non-Hispanic whites, TP was more likely in Hispanics (adjusted [a]PR = 1.38; 95% confidence interval 95% CI: 1.26 to 1.52) and non-Hispanic blacks (aPR = 1.35; 95% CI: 1.22 to 1.50), whereas SP was most likely in non-Hispanic blacks (aPR = 1.82; 95% CI: 1.44 to 2.31). There was at least a 50% greater likelihood of TP in current smokers compared with non-smokers. In males, likelihood of TP in adults aged 65 years and older was greater (aPR = 2.07; 95% CI: 1.76 to 2.43) than adults aged 30 to 44 years. This probability was even greater in women (aPR = 3.15; 95% CI: 2.63 to 3.77). Likelihood of TP was higher in current smokers relative to non-smokers regardless of sex and periodontitis classification. TP was more likely in men with uncontrolled diabetes mellitus (DM) compared with adults without DM. Assessment of risk profiles for periodontitis in adults in the United States based on gold standard periodontal measures show important differences by severity of disease and sex. Cigarette smoking, specifically current smoking, remains an important modifiable risk for all levels of periodontitis severity. Higher likelihood of TP in older adults and in males with uncontrolled DM is noteworthy. These findings could improve identification of target populations for effective public health interventions to improve periodontal health of adults in the United States.
Lin, Feng-Chang; Zhu, Jun
2012-01-01
We develop continuous-time models for the analysis of environmental or ecological monitoring data such that subjects are observed at multiple monitoring time points across space. Of particular interest are additive hazards regression models where the baseline hazard function can take on flexible forms. We consider time-varying covariates and take into account spatial dependence via autoregression in space and time. We develop statistical inference for the regression coefficients via partial likelihood. Asymptotic properties, including consistency and asymptotic normality, are established for parameter estimates under suitable regularity conditions. Feasible algorithms utilizing existing statistical software packages are developed for computation. We also consider a simpler additive hazards model with homogeneous baseline hazard and develop hypothesis testing for homogeneity. A simulation study demonstrates that the statistical inference using partial likelihood has sound finite-sample properties and offers a viable alternative to maximum likelihood estimation. For illustration, we analyze data from an ecological study that monitors bark beetle colonization of red pines in a plantation of Wisconsin.
Load estimator (LOADEST): a FORTRAN program for estimating constituent loads in streams and rivers
Runkel, Robert L.; Crawford, Charles G.; Cohn, Timothy A.
2004-01-01
LOAD ESTimator (LOADEST) is a FORTRAN program for estimating constituent loads in streams and rivers. Given a time series of streamflow, additional data variables, and constituent concentration, LOADEST assists the user in developing a regression model for the estimation of constituent load (calibration). Explanatory variables within the regression model include various functions of streamflow, decimal time, and additional user-specified data variables. The formulated regression model then is used to estimate loads over a user-specified time interval (estimation). Mean load estimates, standard errors, and 95 percent confidence intervals are developed on a monthly and(or) seasonal basis. The calibration and estimation procedures within LOADEST are based on three statistical estimation methods. The first two methods, Adjusted Maximum Likelihood Estimation (AMLE) and Maximum Likelihood Estimation (MLE), are appropriate when the calibration model errors (residuals) are normally distributed. Of the two, AMLE is the method of choice when the calibration data set (time series of streamflow, additional data variables, and concentration) contains censored data. The third method, Least Absolute Deviation (LAD), is an alternative to maximum likelihood estimation when the residuals are not normally distributed. LOADEST output includes diagnostic tests and warnings to assist the user in determining the appropriate estimation method and in interpreting the estimated loads. This report describes the development and application of LOADEST. Sections of the report describe estimation theory, input/output specifications, sample applications, and installation instructions.
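The kind of rating-curve regression LOADEST calibrates can be sketched in its simplest form: log load regressed on log streamflow plus seasonal terms and fitted by ordinary least squares. The program's predefined multi-term models, centering conventions, AMLE/MLE/LAD estimators, and retransformation bias corrections are not reproduced here; the naive back-transform in the sketch deliberately omits any bias correction.

    import numpy as np

    def fit_load_model(q, conc, dectime):
        """Fit ln(load) = b0 + b1*ln(Q) + b2*sin(2*pi*t) + b3*cos(2*pi*t) by least squares."""
        load = conc * q                                    # instantaneous load (concentration x flow)
        X = np.column_stack([np.ones_like(q), np.log(q),
                             np.sin(2 * np.pi * dectime), np.cos(2 * np.pi * dectime)])
        coef, *_ = np.linalg.lstsq(X, np.log(load), rcond=None)
        return coef

    def predict_load(coef, q, dectime):
        X = np.column_stack([np.ones_like(q), np.log(q),
                             np.sin(2 * np.pi * dectime), np.cos(2 * np.pi * dectime)])
        return np.exp(X @ coef)                            # naive back-transform, no bias correction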
Maximum Likelihood Shift Estimation Using High Resolution Polarimetric SAR Clutter Model
NASA Astrophysics Data System (ADS)
Harant, Olivier; Bombrun, Lionel; Vasile, Gabriel; Ferro-Famil, Laurent; Gay, Michel
2011-03-01
This paper deals with a Maximum Likelihood (ML) shift estimation method in the context of High Resolution (HR) Polarimetric SAR (PolSAR) clutter. Texture modeling is exposed and the generalized ML texture tracking method is extended to the merging of various sensors. Some results on displacement estimation on the Argentiere glacier in the Mont Blanc massif using dual-pol TerraSAR-X (TSX) and quad-pol RADARSAT-2 (RS2) sensors are finally discussed.
Zeng, Chan; Newcomer, Sophia R; Glanz, Jason M; Shoup, Jo Ann; Daley, Matthew F; Hambidge, Simon J; Xu, Stanley
2013-12-15
The self-controlled case series (SCCS) method is often used to examine the temporal association between vaccination and adverse events using only data from patients who experienced such events. Conditional Poisson regression models are used to estimate incidence rate ratios, and these models perform well with large or medium-sized case samples. However, in some vaccine safety studies, the adverse events studied are rare and the maximum likelihood estimates may be biased. Several bias correction methods have been examined in case-control studies using conditional logistic regression, but none of these methods have been evaluated in studies using the SCCS design. In this study, we used simulations to evaluate 2 bias correction approaches, the Firth penalized maximum likelihood method and Cordeiro and McCullagh's bias reduction after maximum likelihood estimation, with small sample sizes in studies using the SCCS design. The simulations showed that the bias under the SCCS design with a small number of cases can be large and is also sensitive to a short risk period. The Firth correction method provides finite and less biased estimates than the maximum likelihood method and Cordeiro and McCullagh's method. However, limitations still exist when the risk period in the SCCS design is short relative to the entire observation period.
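The Firth correction referred to above penalizes the log-likelihood with half the log-determinant of the Fisher information. The sketch below applies that penalty to an ordinary (unconditional) logistic regression with separation-prone toy data; it is not the conditional Poisson SCCS likelihood used in the study, and the toy counts are hypothetical.

    import numpy as np
    from scipy.optimize import minimize
    from scipy.special import expit

    def neg_firth_loglik(beta, X, y):
        """Logistic log-likelihood plus Firth's 0.5*log|I(beta)| penalty."""
        eta = X @ beta
        p = expit(eta)
        loglik = np.sum(y * eta - np.logaddexp(0.0, eta))
        W = p * (1.0 - p)
        info = X.T @ (X * W[:, None])                     # Fisher information X' W X
        sign, logdet = np.linalg.slogdet(info)
        return -(loglik + 0.5 * logdet)

    # toy data with no events in the unexposed group: ordinary ML diverges, Firth stays finite
    X = np.column_stack([np.ones(20), np.r_[np.zeros(10), np.ones(10)]])
    y = np.r_[np.zeros(10), np.zeros(1), np.ones(9)]
    beta_firth = minimize(neg_firth_loglik, x0=np.zeros(2), args=(X, y)).x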
Nonparametric probability density estimation by optimization theoretic techniques
NASA Technical Reports Server (NTRS)
Scott, D. W.
1976-01-01
Two nonparametric probability density estimators are considered. The first is the kernel estimator. The problem of choosing the kernel scaling factor based solely on a random sample is addressed. An interactive mode is discussed and an algorithm proposed to choose the scaling factor automatically. The second nonparametric probability estimate uses penalty function techniques with the maximum likelihood criterion. A discrete maximum penalized likelihood estimator is proposed and is shown to be consistent in the mean square error. A numerical implementation technique for the discrete solution is discussed and examples displayed. An extensive simulation study compares the integrated mean square error of the discrete and kernel estimators. The robustness of the discrete estimator is demonstrated graphically.
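A sketch of the first estimator discussed, the kernel density estimator, with the scaling factor set by a common normal-reference rule rather than the interactive or automatic selection procedures of the report; the discrete maximum penalized likelihood estimator is not reproduced.

    import numpy as np

    def gaussian_kde_est(x_grid, sample, h=None):
        """Gaussian kernel density estimate; h defaults to a normal-reference scaling factor."""
        n = sample.size
        if h is None:
            h = 1.06 * sample.std(ddof=1) * n ** (-1.0 / 5.0)
        z = (x_grid[:, None] - sample[None, :]) / h
        return np.exp(-0.5 * z ** 2).sum(axis=1) / (n * h * np.sqrt(2.0 * np.pi))

    rng = np.random.default_rng(0)
    data = rng.normal(size=200)
    grid = np.linspace(-4, 4, 101)
    density = gaussian_kde_est(grid, data)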
Improved Margin of Error Estimates for Proportions in Business: An Educational Example
ERIC Educational Resources Information Center
Arzumanyan, George; Halcoussis, Dennis; Phillips, G. Michael
2015-01-01
This paper presents the Agresti & Coull "Adjusted Wald" method for computing confidence intervals and margins of error for common proportion estimates. The presented method is easily implementable by business students and practitioners and provides more accurate estimates of proportions particularly in extreme samples and small…
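The Agresti & Coull adjustment is short enough to state directly; a sketch with the usual 95% critical value follows (the article's tables and classroom example are not reproduced, and z = 1.96 is the assumed default).

    import math

    def adjusted_wald(successes, n, z=1.96):
        """Agresti & Coull 'adjusted Wald' interval and margin of error for a proportion."""
        n_adj = n + z ** 2
        p_adj = (successes + z ** 2 / 2.0) / n_adj
        moe = z * math.sqrt(p_adj * (1.0 - p_adj) / n_adj)
        return p_adj - moe, p_adj + moe, moe

    low, high, margin = adjusted_wald(successes=2, n=25)   # an extreme small-sample case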
Sparse Bayesian Learning for Identifying Imaging Biomarkers in AD Prediction
Shen, Li; Qi, Yuan; Kim, Sungeun; Nho, Kwangsik; Wan, Jing; Risacher, Shannon L.; Saykin, Andrew J.
2010-01-01
We apply sparse Bayesian learning methods, automatic relevance determination (ARD) and predictive ARD (PARD), to Alzheimer's disease (AD) classification to make accurate predictions and identify critical imaging markers relevant to AD at the same time. ARD is one of the most successful Bayesian feature selection methods. PARD is a powerful Bayesian feature selection method that provides sparse models that are easy to interpret. PARD selects the model with the best estimate of the predictive performance instead of choosing the one with the largest marginal model likelihood. Comparative study with support vector machine (SVM) shows that ARD/PARD in general outperform SVM in terms of prediction accuracy. Additional comparison with surface-based general linear model (GLM) analysis shows that regions with the strongest signals are identified by both GLM and ARD/PARD. While the GLM P-map returns significant regions all over the cortex, ARD/PARD provide a small number of relevant and meaningful imaging markers with predictive power, including both cortical and subcortical measures. PMID:20879451
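For the regression analogue of ARD, general-purpose libraries already provide an implementation. The toy sketch below uses scikit-learn's ARDRegression on simulated data (a linear model, not the study's classifiers or imaging features) to show irrelevant coefficients being driven toward zero.

    import numpy as np
    from sklearn.linear_model import ARDRegression

    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 10))
    y = 2.0 * X[:, 0] - 1.0 * X[:, 3] + rng.normal(scale=0.1, size=100)   # only 2 relevant features

    ard = ARDRegression()
    ard.fit(X, y)
    print(np.round(ard.coef_, 2))   # coefficients of irrelevant features shrink toward zero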
Score tests for independence in semiparametric competing risks models.
Saïd, Mériem; Ghazzali, Nadia; Rivest, Louis-Paul
2009-12-01
A popular model for competing risks postulates the existence of a latent unobserved failure time for each risk. Assuming that these underlying failure times are independent is attractive since it allows standard statistical tools for right-censored lifetime data to be used in the analysis. This paper proposes simple independence score tests for the validity of this assumption when the individual risks are modeled using semiparametric proportional hazards regressions. It assumes that covariates are available, making the model identifiable. The score tests are derived for alternatives that specify that copulas are responsible for a possible dependency between the competing risks. The test statistics are constructed by adding to the partial likelihoods for the individual risks an explanatory variable for the dependency between the risks. A variance estimator is derived by writing the score function and the Fisher information matrix for the marginal models as stochastic integrals. Pitman efficiencies are used to compare test statistics. A simulation study and a numerical example illustrate the methodology proposed in this paper.
Migration, business formation, and the informal economy in urban Mexico.
Sheehan, Connor M; Riosmena, Fernando
2013-07-01
Although the informal economy has grown rapidly in several developing nations, and migration and informality may be related to similar types of credit constraints and market failures, previous research has not systematically attempted to identify if migrant households are more likely to start informal and formal businesses alike and if this association varies across local contexts. We examine the relationship between prior US migration and the creation of both formal and informal businesses in urban Mexico using several criteria to indirectly assess sector location. We use data from 56 communities from the Mexican Migration Project to estimate multilevel survival and nonmultilevel competing risk models predicting the likelihood of informal, formal, and no business formation. The recent return migration of the household head is strongly associated with informal business creation, particularly in economically dynamic areas. On the other hand, migrants are only marginally more likely to start formal businesses in highly economically dynamic sending areas. Published by Elsevier Inc.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Beer, M.
1980-12-01
The maximum likelihood method for the multivariate normal distribution is applied to the case of several individual eigenvalues. Correlated Monte Carlo estimates of the eigenvalue are assumed to follow this prescription and aspects of the assumption are examined. Monte Carlo cell calculations using the SAM-CE and VIM codes for the TRX-1 and TRX-2 benchmark reactors, and SAM-CE full core results are analyzed with this method. Variance reductions of a few percent to a factor of 2 are obtained from maximum likelihood estimation as compared with the simple average and the minimum variance individual eigenvalue. The numerical results verify that the use of sample variances and correlation coefficients in place of the corresponding population statistics still leads to nearly minimum variance estimation for a sufficient number of histories and aggregates.
Cosmological parameter estimation using Particle Swarm Optimization
NASA Astrophysics Data System (ADS)
Prasad, J.; Souradeep, T.
2014-03-01
Constraining parameters of a theoretical model from observational data is an important exercise in cosmology. There are many theoretically motivated models which demand a greater number of cosmological parameters than the standard model of cosmology uses, and they make the problem of parameter estimation challenging. It is a common practice to employ Bayesian formalism for parameter estimation, for which, in general, the likelihood surface is probed. For the standard cosmological model with six parameters, the likelihood surface is quite smooth and does not have local maxima, and sampling-based methods like the Markov Chain Monte Carlo (MCMC) method are quite successful. However, when there are a large number of parameters or the likelihood surface is not smooth, other methods may be more effective. In this paper, we have demonstrated the application of another method inspired from artificial intelligence, called Particle Swarm Optimization (PSO), for estimating cosmological parameters from Cosmic Microwave Background (CMB) data taken from the WMAP satellite.
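A minimal particle swarm optimization sketch of the kind described, here maximizing a toy two-parameter log-likelihood with a single peak rather than a CMB likelihood; the swarm size, inertia weight, and acceleration constants are common textbook defaults, not the settings used in the paper.

    import numpy as np

    def pso_maximize(loglike, bounds, n_particles=30, n_iter=200, w=0.7, c1=1.5, c2=1.5, seed=0):
        """Basic particle swarm optimization of a log-likelihood over box bounds."""
        rng = np.random.default_rng(seed)
        lo, hi = bounds[:, 0], bounds[:, 1]
        pos = rng.uniform(lo, hi, size=(n_particles, len(lo)))
        vel = np.zeros_like(pos)
        pbest, pbest_val = pos.copy(), np.array([loglike(p) for p in pos])
        gbest = pbest[np.argmax(pbest_val)].copy()
        for _ in range(n_iter):
            r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
            vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
            pos = np.clip(pos + vel, lo, hi)
            vals = np.array([loglike(p) for p in pos])
            improved = vals > pbest_val
            pbest[improved], pbest_val[improved] = pos[improved], vals[improved]
            gbest = pbest[np.argmax(pbest_val)].copy()
        return gbest, pbest_val.max()

    # toy likelihood surface with a single peak at (0.3, 0.7)
    loglike = lambda p: -((p[0] - 0.3) ** 2 + (p[1] - 0.7) ** 2) / 0.01
    best, best_val = pso_maximize(loglike, np.array([[0.0, 1.0], [0.0, 1.0]]))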
Huckfeldt, Peter J; Sood, Neeraj; Escarce, José J; Grabowski, David C; Newhouse, Joseph P
2014-03-01
Medicare continues to implement payment reforms that shift reimbursement from fee-for-service toward episode-based payment, affecting average and marginal payment. We contrast the effects of two reforms for home health agencies. The home health interim payment system in 1997 lowered both types of payment; our conceptual model predicts a decline in the likelihood of use and costs, both of which we find. The home health prospective payment system in 2000 raised average but lowered marginal payment with theoretically ambiguous effects; we find a modest increase in use and costs. We find little substantive effect of either policy on readmissions or mortality. Copyright © 2014 Elsevier B.V. All rights reserved.
The Extended-Image Tracking Technique Based on the Maximum Likelihood Estimation
NASA Technical Reports Server (NTRS)
Tsou, Haiping; Yan, Tsun-Yee
2000-01-01
This paper describes an extended-image tracking technique based on maximum likelihood estimation. The target image is assumed to have a known profile covering more than one element of a focal plane detector array. It is assumed that the relative position between the imager and the target is changing with time and that each pixel of the received target image is disturbed by independent additive white Gaussian noise. When a rotation-invariant movement between imager and target is considered, the maximum-likelihood-based image tracking technique described in this paper is a closed-loop structure capable of providing iterative updates of the movement estimate by calculating the loop feedback signals from a weighted correlation between the currently received target image and the previously estimated reference image in the transform domain. The movement estimate is then used to direct the imager to closely follow the moving target. This image tracking technique has many potential applications, including free-space optical communications and astronomy, where accurate and stabilized optical pointing is essential.
Reyes-Valdés, M H; Stelly, D M
1995-01-01
Frequencies of meiotic configurations in cytogenetic stocks are dependent on chiasma frequencies in segments defined by centromeres, breakpoints, and telomeres. The expectation maximization algorithm is proposed as a general method to perform maximum likelihood estimations of the chiasma frequencies in the intervals between such locations. The estimates can be translated via mapping functions into genetic maps of cytogenetic landmarks. One set of observational data was analyzed to exemplify application of these methods, results of which were largely concordant with other comparable data. The method was also tested by Monte Carlo simulation of frequencies of meiotic configurations from a monotelodisomic translocation heterozygote, assuming six different sample sizes. The estimate averages were always close to the values given initially to the parameters. The maximum likelihood estimation procedures can be extended readily to other kinds of cytogenetic stocks and allow the pooling of diverse cytogenetic data to collectively estimate lengths of segments, arms, and chromosomes. Images Fig. 1 PMID:7568226
ERIC Educational Resources Information Center
Magis, David; Raiche, Gilles
2010-01-01
In this article the authors focus on the issue of the nonuniqueness of the maximum likelihood (ML) estimator of proficiency level in item response theory (with special attention to logistic models). The usual maximum a posteriori (MAP) method offers a good alternative within that framework; however, this article highlights some drawbacks of its…
Duan, Fumei; Wang, Yong; Wang, Ying; Zhao, Han
2018-06-16
The calculation of marginal abatement costs of CO2 plays a vital role in meeting China's 2020 emission reduction targets by providing a reference for determining carbon tax and carbon trading pricing. However, most existing research has used only one method to examine regional and industrial marginal abatement costs, and almost no studies have predicted future marginal abatement costs from the perspective of CO2 emission efficiency. To fill these gaps, this paper first estimates marginal abatement costs of CO2 in three major industries of 30 provinces in China from 2005 to 2015 under three assumptions. Second, based on the principle of fairness and efficiency, China's 2020 emission reduction targets are decomposed by province. Based on the ZSG-C-DDF model, the marginal abatement costs of CO2 in all provinces in China in 2020 are estimated and compared with the marginal abatement costs of 2005 to 2015. The results show that (1) from 2005 to 2015, marginal abatement costs of CO2 in all provinces show a fluctuating upward trend; (2) compared with the marginal abatement costs of primary industry or tertiary industry, most provinces have lower marginal abatement costs for secondary industry; (3) the average marginal abatement costs of CO2 for China in 2020 are 2766.882 Yuan/tonne for the 40% carbon intensity reduction target and 3334.836 Yuan/tonne for the 45% target, showing that the higher the emission reduction target, the higher the marginal abatement costs of CO2; and (4) overall, the average marginal abatement costs of CO2 in China by 2020 are higher than those in 2005-2015. The empirical analysis in this paper can provide multiple references for environmental policy makers.
ERIC Educational Resources Information Center
James, David E.; Schraw, Gregory; Kuch, Fred
2015-01-01
We present an equation, derived from standard statistical theory, that can be used to estimate sampling margin of error for student evaluations of teaching (SETs). We use the equation to examine the effect of sample size, response rates and sample variability on the estimated sampling margin of error, and present results in four tables that allow…
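The abstract does not reproduce the equation itself; as a hedged sketch, one common form such a margin-of-error calculation can take, a proportion-based margin with a finite-population correction for a fixed class roster, is:

```python
import math

def margin_of_error(n, N, p=0.5, z=1.96):
    """Approximate sampling margin of error for a proportion.

    n: number of respondents, N: class enrollment, p: assumed proportion,
    z: critical value (1.96 for 95% confidence).
    A finite-population correction is included because SET samples come
    from a small, fixed roster rather than an infinite population.
    """
    se = math.sqrt(p * (1 - p) / n)
    fpc = math.sqrt((N - n) / (N - 1)) if N > 1 else 1.0
    return z * se * fpc

# Example: 20 of 35 enrolled students respond.
print(round(margin_of_error(n=20, N=35), 3))
```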
Occupancy Modeling Species-Environment Relationships with Non-ignorable Survey Designs.
Irvine, Kathryn M; Rodhouse, Thomas J; Wright, Wilson J; Olsen, Anthony R
2018-05-26
Statistical models supporting inferences about species occurrence patterns in relation to environmental gradients are fundamental to ecology and conservation biology. A common implicit assumption is that the sampling design is ignorable and does not need to be formally accounted for in analyses. The analyst assumes data are representative of the desired population and statistical modeling proceeds. However, if datasets from probability and non-probability surveys are combined or unequal selection probabilities are used, the design may be non-ignorable. We outline the use of pseudo-maximum likelihood estimation for site-occupancy models to account for such non-ignorable survey designs. This estimation method accounts for the survey design by properly weighting the pseudo-likelihood equation. In our empirical example, legacy and newer randomly selected locations were surveyed for bats to bridge a historic statewide effort with an ongoing nationwide program. We provide a worked example using bat acoustic detection/non-detection data and show how analysts can diagnose whether their design is ignorable. Using simulations we assessed whether our approach is viable for modeling datasets composed of sites contributed outside of a probability design. Pseudo-maximum likelihood estimates differed from the usual maximum likelihood occupancy estimates for some bat species. Using simulations we show the maximum likelihood estimator of species-environment relationships with non-ignorable sampling designs was biased, whereas the pseudo-likelihood estimator was design-unbiased. However, in our simulation study the designs composed of a large proportion of legacy or non-probability sites resulted in estimation issues for standard errors. These issues were likely a result of highly variable weights confounded by small sample sizes (5% or 10% sampling intensity and 4 revisits). Aggregating datasets from multiple sources logically supports larger sample sizes and potentially increases spatial extents for statistical inferences. Our results suggest that ignoring the mechanism for how locations were selected for data collection (e.g., the sampling design) could result in erroneous model-based conclusions. Therefore, in order to ensure robust and defensible recommendations for evidence-based conservation decision-making, the survey design information in addition to the data themselves must be available for analysts. Details for constructing the weights used in estimation and code for implementation are provided. This article is protected by copyright. All rights reserved.
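As an illustration of the weighting idea only (not the authors' exact model or weights), the sketch below maximizes a design-weighted pseudo-log-likelihood for a basic single-season occupancy model with constant occupancy and detection probabilities; the data and survey weights are simulated.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit, comb

def weighted_occupancy_nll(params, y, J, w):
    """Design-weighted negative pseudo-log-likelihood for a constant-psi,
    constant-p occupancy model. y: detections per site, J: visits per site,
    w: survey weights (e.g., inverse inclusion probabilities)."""
    psi, p = expit(params)
    lik = np.where(
        y > 0,
        psi * comb(J, y) * p**y * (1 - p)**(J - y),   # detected at least once
        psi * (1 - p)**J + (1 - psi),                 # never detected
    )
    return -np.sum(w * np.log(lik))

rng = np.random.default_rng(1)
n_sites, J = 200, 4
z = rng.random(n_sites) < 0.4                      # true occupancy states
y = rng.binomial(J, 0.3, n_sites) * z              # detections only at occupied sites
w = rng.choice([1.0, 3.0], size=n_sites)           # hypothetical unequal weights

fit = minimize(weighted_occupancy_nll, x0=[0.0, 0.0],
               args=(y, np.full(n_sites, J), w))
print("psi, p estimates:", expit(fit.x))
```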
Collinear Latent Variables in Multilevel Confirmatory Factor Analysis
van de Schoot, Rens; Hox, Joop
2014-01-01
Because variables may be correlated in the social and behavioral sciences, multicollinearity might be problematic. This study investigates the effect of collinearity, manipulated at the within and between levels of a two-level confirmatory factor analysis, by Monte Carlo simulation. Furthermore, the influence of the size of the intraclass correlation coefficient (ICC) and of the estimation method (maximum likelihood estimation with robust chi-squares and standard errors versus Bayesian estimation) on the convergence rate is investigated. The other variables of interest were the rate of inadmissible solutions and the relative parameter and standard error bias at the between level. The results showed that inadmissible solutions were obtained when there was between-level collinearity and the estimation method was maximum likelihood. In the within-level multicollinearity condition, all of the solutions were admissible, but the bias values were higher compared with the between-level collinearity condition. Bayesian estimation appeared to be robust in obtaining admissible parameters, but the relative bias was higher than for maximum likelihood estimation. Finally, as expected, high ICC produced less biased results compared to medium ICC conditions. PMID:29795827
How much to trust the senses: Likelihood learning
Sato, Yoshiyuki; Kording, Konrad P.
2014-01-01
Our brain often needs to estimate unknown variables from imperfect information. Our knowledge about the statistical distributions of quantities in our environment (called priors) and currently available information from sensory inputs (called likelihood) are the basis of all Bayesian models of perception and action. While we know that priors are learned, most studies of prior-likelihood integration simply assume that subjects know about the likelihood. However, as the quality of sensory inputs change over time, we also need to learn about new likelihoods. Here, we show that human subjects readily learn the distribution of visual cues (likelihood function) in a way that can be predicted by models of statistically optimal learning. Using a likelihood that depended on color context, we found that a learned likelihood generalized to new priors. Thus, we conclude that subjects learn about likelihood. PMID:25398975
Balashov, Iu S; Grigor'eva, L A; Leonovich, S A
2009-01-01
A method is developed for visually estimating the biological age of living unfed tick females from visible changes in the depth of the marginal groove and in the structure of the alloscutum cuticle during natural ageing. In recently activated individuals, the body is convex and the marginal groove is exposed, demonstrating distinctly visible cuticular microfolds (Figs 1-4). In attenuated ticks, the body is flattened and the marginal fold overlays the marginal groove, concealing cuticular microfolds (Figs 5-8).
The performance of different propensity score methods for estimating marginal hazard ratios.
Austin, Peter C
2013-07-20
Propensity score methods are increasingly being used to reduce or minimize the effects of confounding when estimating the effects of treatments, exposures, or interventions when using observational or non-randomized data. Under the assumption of no unmeasured confounders, previous research has shown that propensity score methods allow for unbiased estimation of linear treatment effects (e.g., differences in means or proportions). However, in biomedical research, time-to-event outcomes occur frequently. There is a paucity of research into the performance of different propensity score methods for estimating the effect of treatment on time-to-event outcomes. Furthermore, propensity score methods allow for the estimation of marginal or population-average treatment effects. We conducted an extensive series of Monte Carlo simulations to examine the performance of propensity score matching (1:1 greedy nearest-neighbor matching within propensity score calipers), stratification on the propensity score, inverse probability of treatment weighting (IPTW) using the propensity score, and covariate adjustment using the propensity score to estimate marginal hazard ratios. We found that both propensity score matching and IPTW using the propensity score allow for the estimation of marginal hazard ratios with minimal bias. Of these two approaches, IPTW using the propensity score resulted in estimates with lower mean squared error when estimating the effect of treatment in the treated. Stratification on the propensity score and covariate adjustment using the propensity score result in biased estimation of both marginal and conditional hazard ratios. Applied researchers are encouraged to use propensity score matching and IPTW using the propensity score when estimating the relative effect of treatment on time-to-event outcomes. Copyright © 2012 John Wiley & Sons, Ltd.
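A minimal sketch of the IPTW workflow summarized above, assuming simulated data, scikit-learn for the propensity model, and lifelines for the weighted Cox fit with robust variance; the variable names and simulation settings are placeholders, not the paper's design.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from lifelines import CoxPHFitter

rng = np.random.default_rng(0)
n = 2000
x1, x2 = rng.normal(size=n), rng.normal(size=n)
ps_true = 1 / (1 + np.exp(-(0.5 * x1 - 0.5 * x2)))
treat = rng.binomial(1, ps_true)
# Exponential event times with a conditional log-hazard ratio of -0.5 for treatment.
rate = np.exp(-0.5 * treat + 0.3 * x1 + 0.3 * x2)
time = rng.exponential(1 / rate)
cutoff = np.quantile(time, 0.8)
event = (time < cutoff).astype(int)       # simple administrative censoring
time = np.minimum(time, cutoff)

# Propensity score model and stabilized inverse probability of treatment weights.
covars = np.column_stack([x1, x2])
ps = LogisticRegression().fit(covars, treat).predict_proba(covars)[:, 1]
w = np.where(treat == 1, treat.mean() / ps, (1 - treat.mean()) / (1 - ps))

df = pd.DataFrame({"time": time, "event": event, "treat": treat, "w": w})
cph = CoxPHFitter()
cph.fit(df, duration_col="time", event_col="event", weights_col="w", robust=True)
print(cph.summary[["coef", "exp(coef)"]])  # marginal (log) hazard ratio for treatment
```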
Clavel, Julien; Aristide, Leandro; Morlon, Hélène
2018-06-19
Working with high-dimensional phylogenetic comparative datasets is challenging because likelihood-based multivariate methods suffer from low statistical performance as the number of traits p approaches the number of species n, and because computational complications occur when p exceeds n. Alternative phylogenetic comparative methods have recently been proposed to deal with the large p, small n scenario, but their use and performance are limited. Here we develop a penalized likelihood framework to deal with high-dimensional comparative datasets. We propose various penalizations and methods for selecting the intensity of the penalties. We apply this general framework to the estimation of parameters (the evolutionary trait covariance matrix and parameters of the evolutionary model) and model comparison for the high-dimensional multivariate Brownian (BM), Early-burst (EB), Ornstein-Uhlenbeck (OU) and Pagel's lambda models. We show using simulations that our penalized likelihood approach dramatically improves the estimation of evolutionary trait covariance matrices and model parameters when p approaches n, and allows for their accurate estimation when p equals or exceeds n. In addition, we show that penalized likelihood models can be efficiently compared using the Generalized Information Criterion (GIC). We implement these methods, as well as the related estimation of ancestral states and the computation of phylogenetic PCA, in the R packages RPANDA and mvMORPH. Finally, we illustrate the utility of the new proposed framework by evaluating evolutionary model fit, analyzing integration patterns, and reconstructing evolutionary trajectories for a high-dimensional 3-D dataset of brain shape in the New World monkeys. We find clear support for an Early-burst model, suggesting an early diversification of brain morphology during the ecological radiation of the clade. Penalized likelihood offers an efficient way to deal with high-dimensional multivariate comparative data.
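As a generic illustration of the penalized-likelihood idea (ignoring the phylogenetic covariance structure that RPANDA and mvMORPH handle), the sketch below shrinks a trait covariance estimate toward its diagonal and selects the penalty by held-out Gaussian log-likelihood; all data are simulated.

```python
import numpy as np
from scipy.stats import multivariate_normal

def shrunk_cov(X, lam):
    """Linear shrinkage of the sample covariance toward its diagonal,
    one simple form of penalized covariance estimation."""
    S = np.cov(X, rowvar=False, bias=True)
    target = np.diag(np.diag(S))
    return (1 - lam) * S + lam * target

def heldout_loglik(X_train, X_test, lam):
    R = shrunk_cov(X_train, lam)
    mu = X_train.mean(axis=0)
    return multivariate_normal(mean=mu, cov=R, allow_singular=True).logpdf(X_test).sum()

rng = np.random.default_rng(0)
p, n = 50, 30                        # more traits than species (p > n)
true_cov = 0.5 * np.eye(p) + 0.5     # strongly integrated traits
X = rng.multivariate_normal(np.zeros(p), true_cov, size=n)

lams = np.linspace(0.05, 0.95, 19)
scores = [heldout_loglik(X[: n // 2], X[n // 2:], lam) for lam in lams]
print("selected penalty:", lams[int(np.argmax(scores))])
```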
Method and system for diagnostics of apparatus
NASA Technical Reports Server (NTRS)
Gorinevsky, Dimitry (Inventor)
2012-01-01
Proposed is a method, implemented in software, for estimating fault state of an apparatus outfitted with sensors. At each execution period the method processes sensor data from the apparatus to obtain a set of parity parameters, which are further used for estimating fault state. The estimation method formulates a convex optimization problem for each fault hypothesis and employs a convex solver to compute fault parameter estimates and fault likelihoods for each fault hypothesis. The highest likelihoods and corresponding parameter estimates are transmitted to a display device or an automated decision and control system. The obtained accurate estimate of fault state can be used to improve safety, performance, or maintenance processes for the apparatus.
On the occurrence of false positives in tests of migration under an isolation with migration model
Hey, Jody; Chung, Yujin; Sethuraman, Arun
2015-01-01
The population genetic study of divergence is often done using a Bayesian genealogy sampler, like those implemented in IMa2 and related programs, and these analyses frequently include a likelihood-ratio test of the null hypothesis of no migration between populations. Cruickshank and Hahn (2014, Molecular Ecology, 23, 3133–3157) recently reported a high rate of false positive test results with IMa2 for data simulated with small numbers of loci under models with no migration and recent splitting times. We confirm these findings and discover that they are caused by a failure of the assumptions underlying likelihood ratio tests that arises when using marginal likelihoods for a subset of model parameters. We also show that for small data sets, with little divergence between samples from two populations, an excellent fit can often be found by a model with a low migration rate and recent splitting time and a model with a high migration rate and a deep splitting time. PMID:26456794
NASA Astrophysics Data System (ADS)
Dang, H.; Wang, A. S.; Sussman, Marc S.; Siewerdsen, J. H.; Stayman, J. W.
2014-09-01
Sequential imaging studies are conducted in many clinical scenarios. Prior images from previous studies contain a great deal of patient-specific anatomical information and can be used in conjunction with subsequent imaging acquisitions to maintain image quality while enabling radiation dose reduction (e.g., through sparse angular sampling, reduction in fluence, etc.). However, patient motion between images in such sequences results in misregistration between the prior image and current anatomy. Existing prior-image-based approaches often include only a simple rigid registration step that can be insufficient for capturing complex anatomical motion, introducing detrimental effects in subsequent image reconstruction. In this work, we propose a joint framework that estimates the 3D deformation between an unregistered prior image and the current anatomy (based on a subsequent data acquisition) and reconstructs the current anatomical image using a model-based reconstruction approach that includes regularization based on the deformed prior image. This framework is referred to as deformable prior image registration, penalized-likelihood estimation (dPIRPLE). Central to this framework is the inclusion of a 3D B-spline-based free-form-deformation model into the joint registration-reconstruction objective function. The proposed framework is solved using a maximization strategy whereby alternating updates to the registration parameters and image estimates are applied, allowing for improvements in both the registration and reconstruction throughout the optimization process. Cadaver experiments were conducted on a cone-beam CT testbench emulating a lung nodule surveillance scenario. Superior reconstruction accuracy and image quality were demonstrated using the dPIRPLE algorithm as compared to more traditional reconstruction methods including filtered backprojection, penalized-likelihood estimation (PLE), prior image penalized-likelihood estimation (PIPLE) without registration, and prior image penalized-likelihood estimation with rigid registration of a prior image (PIRPLE) over a wide range of sampling sparsity and exposure levels.
Likelihood ratios for glaucoma diagnosis using spectral-domain optical coherence tomography.
Lisboa, Renato; Mansouri, Kaweh; Zangwill, Linda M; Weinreb, Robert N; Medeiros, Felipe A
2013-11-01
To present a methodology for calculating likelihood ratios for glaucoma diagnosis for continuous retinal nerve fiber layer (RNFL) thickness measurements from spectral-domain optical coherence tomography (spectral-domain OCT). Observational cohort study. A total of 262 eyes of 187 patients with glaucoma and 190 eyes of 100 control subjects were included in the study. Subjects were recruited from the Diagnostic Innovations Glaucoma Study. Eyes with preperimetric and perimetric glaucomatous damage were included in the glaucoma group. The control group was composed of healthy eyes with normal visual fields from subjects recruited from the general population. All eyes underwent RNFL imaging with Spectralis spectral-domain OCT. Likelihood ratios for glaucoma diagnosis were estimated for specific global RNFL thickness measurements using a methodology based on estimating the tangents to the receiver operating characteristic (ROC) curve. Likelihood ratios could be determined for continuous values of average RNFL thickness. Average RNFL thickness values lower than 86 μm were associated with positive likelihood ratios (ie, likelihood ratios greater than 1), whereas RNFL thickness values higher than 86 μm were associated with negative likelihood ratios (ie, likelihood ratios smaller than 1). A modified Fagan nomogram was provided to assist calculation of posttest probability of disease from the calculated likelihood ratios and pretest probability of disease. The methodology allowed calculation of likelihood ratios for specific RNFL thickness values. By avoiding arbitrary categorization of test results, it potentially allows for an improved integration of test results into diagnostic clinical decision making. Copyright © 2013. Published by Elsevier Inc.
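A rough sketch of how likelihood ratios for a continuous measurement can be obtained via a density-ratio view, which equals the slope (tangent) of the ROC curve at the corresponding operating point; this is not the authors' exact tangent-fitting procedure, and the RNFL values below are simulated.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)
# Hypothetical average RNFL thickness values (micrometers).
rnfl_glaucoma = rng.normal(70, 12, 262)
rnfl_healthy = rng.normal(97, 10, 190)

f_disease = gaussian_kde(rnfl_glaucoma)
f_healthy = gaussian_kde(rnfl_healthy)

def likelihood_ratio(x):
    """LR for a specific RNFL value: density in glaucoma over density in controls.
    This ratio equals the slope of the ROC curve at that test value."""
    return f_disease(x) / f_healthy(x)

def posttest_prob(pretest, lr):
    """Fagan-nomogram arithmetic: pretest probability -> odds -> posttest probability."""
    odds = pretest / (1 - pretest) * lr
    return odds / (1 + odds)

for x in (70, 86, 100):
    lr = float(likelihood_ratio(x))
    print(x, round(lr, 2), round(posttest_prob(0.3, lr), 2))
```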
ERIC Educational Resources Information Center
Wothke, Werner; Burket, George; Chen, Li-Sue; Gao, Furong; Shu, Lianghua; Chia, Mike
2011-01-01
It has been known for some time that item response theory (IRT) models may exhibit a likelihood function of a respondent's ability which may have multiple modes, flat modes, or both. These conditions, often associated with guessing of multiple-choice (MC) questions, can introduce uncertainty and bias to ability estimation by maximum likelihood…
F-8C adaptive flight control extensions. [for maximum likelihood estimation
NASA Technical Reports Server (NTRS)
Stein, G.; Hartmann, G. L.
1977-01-01
An adaptive concept which combines gain-scheduled control laws with explicit maximum likelihood estimation (MLE) identification to provide the scheduling values is described. The MLE algorithm was improved by incorporating attitude data, estimating gust statistics for setting filter gains, and improving parameter tracking during changing flight conditions. A lateral MLE algorithm was designed to improve true air speed and angle of attack estimates during lateral maneuvers. Relationships between the pitch axis sensors inherent in the MLE design were examined and used for sensor failure detection. Design details and simulation performance are presented for each of the three areas investigated.
Quantum state estimation when qubits are lost: a no-data-left-behind approach
Williams, Brian P.; Lougovski, Pavel
2017-04-06
We present an approach to Bayesian mean estimation of quantum states using hyperspherical parametrization and an experiment-specific likelihood which allows utilization of all available data, even when qubits are lost. With this method, we report the first closed-form Bayesian mean and maximum likelihood estimates for the ideal single qubit. Due to computational constraints, we utilize numerical sampling to determine the Bayesian mean estimate for a photonic two-qubit experiment in which our novel analysis reduces burdens associated with experimental asymmetries and inefficiencies. This method can be applied to quantum states of any dimension and experimental complexity.
Eisenhauer, Philipp; Heckman, James J.; Mosso, Stefano
2015-01-01
We compare the performance of maximum likelihood (ML) and simulated method of moments (SMM) estimation for dynamic discrete choice models. We construct and estimate a simplified dynamic structural model of education that captures some basic features of educational choices in the United States in the 1980s and early 1990s. We use estimates from our model to simulate a synthetic dataset and assess the ability of ML and SMM to recover the model parameters on this sample. We investigate the performance of alternative tuning parameters for SMM. PMID:26494926
Zhou, Xiang
2017-12-01
Linear mixed models (LMMs) are among the most commonly used tools for genetic association studies. However, the standard method for estimating variance components in LMMs, the restricted maximum likelihood estimation method (REML), suffers from several important drawbacks: REML requires individual-level genotypes and phenotypes from all samples in the study, is computationally slow, and produces downward-biased estimates in case-control studies. To remedy these drawbacks, we present an alternative framework for variance component estimation, which we refer to as MQS. MQS is based on the method of moments (MoM) and the minimal norm quadratic unbiased estimation (MINQUE) criterion, and brings two seemingly unrelated methods, the renowned Haseman-Elston (HE) regression and the recent LD score regression (LDSC), into the same unified statistical framework. With this new framework, we provide an alternative but mathematically equivalent form of HE that allows for the use of summary statistics. We provide an exact estimation form of LDSC to yield unbiased and statistically more efficient estimates. A key feature of our method is its ability to pair marginal z-scores computed using all samples with SNP correlation information computed using a small random subset of individuals (or individuals from a proper reference panel), while producing estimates that can be almost as accurate as if both quantities were computed using the full data. As a result, our method produces unbiased and statistically efficient estimates and makes use of summary statistics, while remaining computationally efficient for large data sets. Using simulations and applications to 37 phenotypes from 8 real data sets, we illustrate the benefits of our method for estimating and partitioning SNP heritability in population studies as well as for heritability estimation in family studies. Our method is implemented in the GEMMA software package, freely available at www.xzlab.org/software.html.
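A bare-bones sketch of the individual-level Haseman-Elston regression that the abstract places in this framework (not the summary-statistic form or the MQS estimator itself); the genotypes and phenotypes are simulated.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, h2_true = 500, 1000, 0.5

# Simulate standardized genotypes, a genetic relatedness matrix (GRM), and phenotypes.
G = rng.binomial(2, 0.5, size=(n, m)).astype(float)
G = (G - G.mean(0)) / G.std(0)
K = G @ G.T / m
beta = rng.normal(0, np.sqrt(h2_true / m), m)
y = G @ beta + rng.normal(0, np.sqrt(1 - h2_true), n)
y = (y - y.mean()) / y.std()

# Cross-product HE regression: regress off-diagonal phenotype products on GRM entries.
# For standardized phenotypes, E[y_i * y_j] = h2 * K_ij for i != j, so the
# least-squares slope through the origin estimates SNP heritability.
iu = np.triu_indices(n, k=1)
prod = np.outer(y, y)[iu]
k_off = K[iu]
h2_he = np.sum(k_off * prod) / np.sum(k_off**2)
print("HE estimate of SNP heritability:", round(h2_he, 3))
```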
NASA Astrophysics Data System (ADS)
Aminah, Agustin Siti; Pawitan, Gandhi; Tantular, Bertho
2017-03-01
So far, most of the data published by Statistics Indonesia (BPS), the provider of national statistics, are still limited to the district level. Sample sizes at smaller area levels are insufficient, so direct estimation of poverty indicators produces high standard errors, and analyses based on them are unreliable. To solve this problem, an estimation method that provides better accuracy by combining survey data with other auxiliary data is required. One method often used for this is Small Area Estimation (SAE). Among the many SAE methods, one is Empirical Best Linear Unbiased Prediction (EBLUP). The EBLUP method based on maximum likelihood (ML) procedures does not account for the loss of degrees of freedom due to estimating β with β̂. This drawback motivates the use of the restricted maximum likelihood (REML) procedure. This paper proposes EBLUP with the REML procedure for estimating poverty indicators by modeling the average household expenditure per capita, and implements a bootstrap procedure to calculate the MSE (Mean Square Error) in order to compare the accuracy of the EBLUP method with that of direct estimation. Results show that the EBLUP method reduces the MSE in small area estimation.
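A compact sketch of one common way the EBLUP/REML combination is implemented, an area-level (Fay-Herriot) model with the area-effect variance estimated by REML; the direct estimates, sampling variances, and covariate are simulated placeholders.

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(0)
m = 40                                      # number of small areas
X = np.column_stack([np.ones(m), rng.normal(size=m)])
beta_true, sv2_true = np.array([2.0, 1.0]), 0.25
D = rng.uniform(0.2, 0.8, m)                # known sampling variances of direct estimates
theta = X @ beta_true + rng.normal(0, np.sqrt(sv2_true), m)
y = theta + rng.normal(0, np.sqrt(D))       # direct (survey) estimates

def neg_reml(sv2):
    """Negative REML log-likelihood (up to a constant) for the area-effect variance."""
    V = sv2 + D
    Vi = 1.0 / V
    XtVX = X.T @ (X * Vi[:, None])
    beta = np.linalg.solve(XtVX, X.T @ (Vi * y))
    r = y - X @ beta
    return 0.5 * (np.sum(np.log(V)) + np.linalg.slogdet(XtVX)[1] + np.sum(r**2 * Vi))

sv2 = minimize_scalar(neg_reml, bounds=(1e-6, 10.0), method="bounded").x
Vi = 1.0 / (sv2 + D)
beta = np.linalg.solve(X.T @ (X * Vi[:, None]), X.T @ (Vi * y))
gamma = sv2 / (sv2 + D)
eblup = gamma * y + (1 - gamma) * (X @ beta)   # shrink direct estimates toward the fit
print("REML variance estimate:", round(sv2, 3))
```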
On the Existence and Uniqueness of JML Estimates for the Partial Credit Model
ERIC Educational Resources Information Center
Bertoli-Barsotti, Lucio
2005-01-01
A necessary and sufficient condition is given in this paper for the existence and uniqueness of the maximum likelihood (the so-called joint maximum likelihood) estimate of the parameters of the Partial Credit Model. This condition is stated in terms of a structural property of the pattern of the data matrix that can be easily verified on the basis…
Use of Bayes theorem to correct size-specific sampling bias in growth data.
Troynikov, V S
1999-03-01
The Bayesian decomposition of the posterior distribution was used to develop a likelihood function to correct bias in the estimates of population parameters from data collected randomly with size-specific selectivity. Positive distributions with time as a parameter were used for parametrization of growth data. Numerical illustrations are provided. Alternative applications of the likelihood to estimate selectivity parameters are discussed.
ATAC Autocuer Modeling Analysis.
1981-01-01
The analysis of the simple rectangular segmentation (1) is based on detection and estimation theory (2). This approach uses the concept of maximum ...continuous wave forms. In order to develop the principles of maximum likelihood, it is convenient to develop the principles for the "classical...the concept of maximum likelihood is significant in that it provides the optimum performance of the detection/estimation problem. With a knowledge of
Heersink, Daniel K; Caley, Peter; Paini, Dean R; Barry, Simon C
2016-05-01
The cost of an uncontrolled incursion of invasive alien species (IAS) arising from undetected entry through ports can be substantial, and knowledge of port-specific risks is needed to help allocate limited surveillance resources. Quantifying the establishment likelihood of such an incursion requires quantifying the ability of a species to enter, establish, and spread. Estimation of the approach rate of IAS into ports provides a measure of likelihood of entry. Data on the approach rate of IAS are typically sparse, and the combinations of risk factors relating to country of origin and port of arrival diverse. This presents challenges to making formal statistical inference on establishment likelihood. Here we demonstrate how these challenges can be overcome with judicious use of mixed-effects models when estimating the incursion likelihood into Australia of the European (Apis mellifera) and Asian (A. cerana) honeybees, along with the invasive parasites of biosecurity concern they host (e.g., Varroa destructor). Our results demonstrate how skewed the establishment likelihood is, with one-tenth of the ports accounting for 80% or more of the likelihood for both species. These results have been utilized by biosecurity agencies in the allocation of resources to the surveillance of maritime ports. © 2015 Society for Risk Analysis.
An evaluation of portion size estimation aids: precision, ease of use and likelihood of future use.
Faulkner, Gemma P; Livingstone, M Barbara E; Pourshahidi, L Kirsty; Spence, Michelle; Dean, Moira; O'Brien, Sinead; Gibney, Eileen R; Wallace, Julie Mw; McCaffrey, Tracy A; Kerr, Maeve A
2016-09-01
The present study aimed to evaluate the precision, ease of use and likelihood of future use of portion size estimation aids (PSEA). A range of PSEA were used to estimate the serving sizes of a range of commonly eaten foods and rated for ease of use and likelihood of future usage. For each food, participants selected their preferred PSEA from a range of options including: quantities and measures; reference objects; measuring; and indicators on food packets. These PSEA were used to serve out various foods (e.g. liquid, amorphous, and composite dishes). Ease of use and likelihood of future use were noted. The foods were weighed to determine the precision of each PSEA. Males and females aged 18-64 years (n = 120). The quantities and measures were the most precise PSEA (lowest range of weights for estimated portion sizes). However, participants preferred household measures (e.g. 200 ml disposable cup) - deemed easy to use (median rating of 5), likely to use again in future (all scored either 4 or 5 on a scale from 1='not very likely' to 5='very likely to use again') and precise (narrow range of weights for estimated portion sizes). The majority indicated they would most likely use the PSEA preparing a meal (94 %), particularly dinner (86 %) in the home (89 %; all P<0·001) for amorphous grain foods. Household measures may be precise, easy to use and acceptable aids for estimating the appropriate portion size of amorphous grain foods.
Empirical likelihood inference in randomized clinical trials.
Zhang, Biao
2017-01-01
In individually randomized controlled trials, in addition to the primary outcome, information is often available on a number of covariates prior to randomization. This information is frequently utilized to undertake adjustment for baseline characteristics in order to increase the precision of the estimation of average treatment effects; such adjustment is usually performed via covariate adjustment in outcome regression models. Although the use of covariate adjustment is widely seen as desirable for making treatment effect estimates more precise and the corresponding hypothesis tests more powerful, there are considerable concerns that objective inference in randomized clinical trials can potentially be compromised. In this paper, we study an empirical likelihood approach to covariate adjustment and propose two unbiased estimating functions that automatically decouple evaluation of average treatment effects from regression modeling of covariate-outcome relationships. The resulting empirical likelihood estimator of the average treatment effect is as efficient as the existing efficient adjusted estimators when separate treatment-specific working regression models are correctly specified, yet is at least as efficient as the existing efficient adjusted estimators for any given treatment-specific working regression models, whether or not they coincide with the true treatment-specific covariate-outcome relationships. We present a simulation study to compare the finite sample performance of various methods along with some results on analysis of a data set from an HIV clinical trial. The simulation results indicate that the proposed empirical likelihood approach is more efficient and powerful than its competitors when the working covariate-outcome relationships by treatment status are misspecified.
Royalty, Anne Beeson
2008-01-01
In recent years the cost of health insurance has been increasing much faster than wages. In the face of these rising costs, many employers will have to make difficult decisions about whether to cut back health benefits or to compensate workers with lower wages or lower wage growth. In this paper, we ask the question, "Which do workers value more -- one additional dollar's worth of health benefits or one more dollar in their pockets?" Using a new approach to obtaining estimates of insured workers' marginal valuation of health benefits this paper estimates how much, on average, employees value the marginal dollar paid by employers for their workers' health insurance. We find that insured workers value the marginal health premium dollar at significantly less than the marginal wage dollar. However, workers value insurance generosity very highly. The marginal dollar spent on health insurance that adds an additional dollar's worth of observable dimensions of plan generosity, such as lower deductibles or coverage of additional services, is valued at significantly more than one dollar.
Development of advanced techniques for rotorcraft state estimation and parameter identification
NASA Technical Reports Server (NTRS)
Hall, W. E., Jr.; Bohn, J. G.; Vincent, J. H.
1980-01-01
An integrated methodology for rotorcraft system identification consists of rotorcraft mathematical modeling, three distinct data processing steps, and a technique for designing inputs to improve the identifiability of the data. These elements are as follows: (1) a Kalman filter smoother algorithm which estimates states and sensor errors from error corrupted data. Gust time histories and statistics may also be estimated; (2) a model structure estimation algorithm for isolating a model which adequately explains the data; (3) a maximum likelihood algorithm for estimating the parameters and estimates for the variance of these estimates; and (4) an input design algorithm, based on a maximum likelihood approach, which provides inputs to improve the accuracy of parameter estimates. Each step is discussed with examples to both flight and simulated data cases.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bakhtiari, M; Schmitt, J; Sarfaraz, M
2015-06-15
Purpose: To establish a minimum number of patients required to obtain statistically accurate Planning Target Volume (PTV) margins for prostate Intensity Modulated Radiation Therapy (IMRT). Methods: A total of 320 prostate patients, comprising 9311 daily setups, were analyzed. These patients had undergone IMRT treatments. Daily localization was done using the skin marks, and the proper shifts were determined by CBCT to match the prostate gland. The Van Herk formalism is used to obtain the margins from the systematic and random setup variations. The total patient population was divided into different grouping sizes, varying from 1 group of 320 patients to 64 groups of 5 patients. Each grouping was used to determine the average PTV margin and its associated standard deviation. Results: Analyzing all 320 patients led to an average Superior-Inferior margin of 1.15 cm. The grouping with 10 patients per group (32 groups) resulted in an average PTV margin between 0.6-1.7 cm with a mean value of 1.09 cm and a standard deviation (STD) of 0.30 cm. As the number of patients per group increases, the mean value of the average margin between groups tends to converge to the true average PTV margin of 1.15 cm and the STD decreases. For groups of 20, 64, and 160 patients, Superior-Inferior margins of 1.12, 1.14, and 1.16 cm with STDs of 0.22, 0.11, and 0.01 cm were found, respectively. A similar tendency was observed for the Left-Right and Anterior-Posterior margins. Conclusion: The estimation of the required PTV margin strongly depends on the number of patients studied. According to this study, at least ∼60 patients are needed to calculate a statistically acceptable PTV margin for a criterion of STD < 0.1 cm. Numbers greater than ∼60 patients do little to increase the accuracy of the PTV margin estimation.
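A short sketch of the Van Herk margin recipe referred to above (margin = 2.5 Sigma + 0.7 sigma, with Sigma the SD of per-patient mean setup errors and sigma the RMS of per-patient random-error SDs), applied to simulated daily shifts; the grouping loop mirrors the idea of recomputing the margin on patient subsets.

```python
import numpy as np

def van_herk_margin(shifts_by_patient):
    """shifts_by_patient: list of 1-D arrays of daily shifts (cm) along one axis.
    Returns the PTV margin 2.5*Sigma + 0.7*sigma (Van Herk formalism)."""
    means = np.array([s.mean() for s in shifts_by_patient])
    sds = np.array([s.std(ddof=1) for s in shifts_by_patient])
    Sigma = means.std(ddof=1)            # systematic component
    sigma = np.sqrt(np.mean(sds**2))     # random component (RMS of per-patient SDs)
    return 2.5 * Sigma + 0.7 * sigma

rng = np.random.default_rng(0)
def simulate_patient():
    systematic = rng.normal(0, 0.35)                         # cm, hypothetical
    return systematic + rng.normal(0, 0.3, size=rng.integers(25, 35))

patients = [simulate_patient() for _ in range(320)]
print("all 320 patients:", round(van_herk_margin(patients), 2), "cm")

# Spread of the margin estimate when only groups of 10 patients are used.
groups = [patients[i:i + 10] for i in range(0, 320, 10)]
margins = [van_herk_margin(g) for g in groups]
print("groups of 10: mean %.2f cm, SD %.2f cm"
      % (np.mean(margins), np.std(margins, ddof=1)))
```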
Empirical Likelihood in Nonignorable Covariate-Missing Data Problems.
Xie, Yanmei; Zhang, Biao
2017-04-20
Missing covariate data occurs often in regression analysis, which frequently arises in the health and social sciences as well as in survey sampling. We study methods for the analysis of a nonignorable covariate-missing data problem in an assumed conditional mean function when some covariates are completely observed but other covariates are missing for some subjects. We adopt the semiparametric perspective of Bartlett et al. (Improving upon the efficiency of complete case analysis when covariates are MNAR. Biostatistics 2014;15:719-30) on regression analyses with nonignorable missing covariates, in which they have introduced the use of two working models, the working probability model of missingness and the working conditional score model. In this paper, we study an empirical likelihood approach to nonignorable covariate-missing data problems with the objective of effectively utilizing the two working models in the analysis of covariate-missing data. We propose a unified approach to constructing a system of unbiased estimating equations, where there are more equations than unknown parameters of interest. One useful feature of these unbiased estimating equations is that they naturally incorporate the incomplete data into the data analysis, making it possible to seek efficient estimation of the parameter of interest even when the working regression function is not specified to be the optimal regression function. We apply the general methodology of empirical likelihood to optimally combine these unbiased estimating equations. We propose three maximum empirical likelihood estimators of the underlying regression parameters and compare their efficiencies with other existing competitors. We present a simulation study to compare the finite-sample performance of various methods with respect to bias, efficiency, and robustness to model misspecification. The proposed empirical likelihood method is also illustrated by an analysis of a data set from the US National Health and Nutrition Examination Survey (NHANES).
Robust analysis of semiparametric renewal process models
Lin, Feng-Chang; Truong, Young K.; Fine, Jason P.
2013-01-01
A rate model is proposed for a modulated renewal process comprising a single long sequence, where the covariate process may not capture the dependencies in the sequence as in standard intensity models. We consider partial likelihood-based inferences under a semiparametric multiplicative rate model, which has been widely studied in the context of independent and identical data. Under an intensity model, gap times in a single long sequence may be used naively in the partial likelihood, with variance estimation utilizing the observed information matrix. Under a rate model, the gap times cannot be treated as independent and studying the partial likelihood is much more challenging. We employ a mixing condition in the application of limit theory for stationary sequences to obtain consistency and asymptotic normality. The estimator's variance is quite complicated owing to the unknown gap-time dependence structure. We adapt block bootstrapping and cluster variance estimators to the partial likelihood. Simulation studies and an analysis of a semiparametric extension of a popular model for neural spike train data demonstrate the practical utility of the rate approach in comparison with the intensity approach. PMID:24550568
Gran, Jon Michael; Røysland, Kjetil; Wolbers, Marcel; Didelez, Vanessa; Sterne, Jonathan A C; Ledergerber, Bruno; Furrer, Hansjakob; von Wyl, Viktor; Aalen, Odd O
2010-11-20
When estimating the effect of treatment on HIV using data from observational studies, standard methods may produce biased estimates due to the presence of time-dependent confounders. Such confounding can be present when a covariate, affected by past exposure, is both a predictor of the future exposure and the outcome. One example is the CD4 cell count, being a marker for disease progression for HIV patients, but also a marker for treatment initiation and influenced by treatment. Fitting a marginal structural model (MSM) using inverse probability weights is one way to give appropriate adjustment for this type of confounding. In this paper we study a simple and intuitive approach to estimate similar treatment effects, using observational data to mimic several randomized controlled trials. Each 'trial' is constructed based on individuals starting treatment in a certain time interval. An overall effect estimate for all such trials is found using composite likelihood inference. The method offers an alternative to the use of inverse probability of treatment weights, which is unstable in certain situations. The estimated parameter is not identical to the one of an MSM, it is conditioned on covariate values at the start of each mimicked trial. This allows the study of questions that are not that easily addressed fitting an MSM. The analysis can be performed as a stratified weighted Cox analysis on the joint data set of all the constructed trials, where each trial is one stratum. The model is applied to data from the Swiss HIV cohort study. Copyright © 2010 John Wiley & Sons, Ltd.
SU-E-J-188: Theoretical Estimation of Margin Necessary for Markerless Motion Tracking
DOE Office of Scientific and Technical Information (OSTI.GOV)
Patel, R; Block, A; Harkenrider, M
2015-06-15
Purpose: To estimate the margin necessary to adequately cover the target using markerless motion tracking (MMT) of lung lesions, given the uncertainty in tracking and the size of the target. Methods: Simulations were developed in Matlab to determine the effect of tumor size and tracking uncertainty on the margin necessary to achieve adequate coverage of the target. For simplicity, the lung tumor was approximated by a circle on a 2D radiograph. The tumor was varied in size from a diameter of 0.1 − 30 mm in increments of 0.1 mm. From our previous studies using dual energy markerless motion tracking, we estimated tracking uncertainties in x and y to have a standard deviation of 2 mm. A Gaussian was used to simulate the deviation between the tracked location and true target location. For each size tumor, 100,000 deviations were randomly generated, and the margin necessary to achieve at least 95% coverage 95% of the time was recorded. Additional simulations were run for varying uncertainties to demonstrate the effect of the tracking accuracy on the margin size. Results: The simulations showed an inverse relationship between tumor size and the margin necessary to achieve 95% coverage 95% of the time using the MMT technique. The margin decreased exponentially with target size. An increase in tracking accuracy expectedly showed a decrease in margin size as well. Conclusion: In our clinic a 5 mm expansion of the internal target volume (ITV) is used to define the planning target volume (PTV). These simulations show that for tracking accuracies in x and y better than 2 mm, the margin required is less than 5 mm. This simple simulation can provide physicians with a guideline estimate of the margin necessary for clinical use of MMT, based on the accuracy of their tracking and the size of the tumor.
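A small Python re-creation of the kind of simulation described (the original used Matlab): a circular target, Gaussian tracking errors with 2 mm SD, and a search for the smallest margin giving at least 95% geometric coverage in 95% of deviations; the sample sizes and the coverage test are simplified assumptions, not the paper's exact setup.

```python
import numpy as np

rng = np.random.default_rng(0)

def coverage_fraction(radius, margin, dx, dy, n_pts=800):
    """Fraction of a circular target (given radius) covered when an aperture of
    radius + margin is centered at the tracked, offset location (dx, dy)."""
    theta = rng.uniform(0, 2 * np.pi, n_pts)
    rr = radius * np.sqrt(rng.uniform(0, 1, n_pts))   # uniform points in the target disk
    px, py = rr * np.cos(theta), rr * np.sin(theta)
    return np.mean((px - dx) ** 2 + (py - dy) ** 2 <= (radius + margin) ** 2)

def required_margin(diameter_mm, sigma_mm=2.0, n_dev=500):
    radius = diameter_mm / 2.0
    dx, dy = rng.normal(0, sigma_mm, n_dev), rng.normal(0, sigma_mm, n_dev)
    for margin in np.arange(0.0, 15.0, 0.25):          # mm
        cov = np.array([coverage_fraction(radius, margin, x, y) for x, y in zip(dx, dy)])
        if np.mean(cov >= 0.95) >= 0.95:               # 95% coverage, 95% of the time
            return margin
    return np.nan

for d in (5.0, 15.0, 30.0):
    print("tumor diameter %4.1f mm -> margin %.2f mm" % (d, required_margin(d)))
```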
Preisser, John S; Long, D Leann; Stamm, John W
2017-01-01
Marginalized zero-inflated count regression models have recently been introduced for the statistical analysis of dental caries indices and other zero-inflated count data as alternatives to traditional zero-inflated and hurdle models. Unlike the standard approaches, the marginalized models directly estimate overall exposure or treatment effects by relating covariates to the marginal mean count. This article discusses model interpretation and model class choice according to the research question being addressed in caries research. Two data sets, one consisting of fictional dmft counts in 2 groups and the other on DMFS among schoolchildren from a randomized clinical trial comparing 3 toothpaste formulations to prevent incident dental caries, are analyzed with negative binomial hurdle, zero-inflated negative binomial, and marginalized zero-inflated negative binomial models. In the first example, estimates of treatment effects vary according to the type of incidence rate ratio (IRR) estimated by the model. Estimates of IRRs in the analysis of the randomized clinical trial were similar despite their distinctive interpretations. The choice of statistical model class should match the study's purpose, while accounting for the broad decline in children's caries experience, such that dmft and DMFS indices more frequently generate zero counts. Marginalized (marginal mean) models for zero-inflated count data should be considered for direct assessment of exposure effects on the marginal mean dental caries count in the presence of high frequencies of zero counts. © 2017 S. Karger AG, Basel.
Preisser, John S.; Long, D. Leann; Stamm, John W.
2017-01-01
Marginalized zero-inflated count regression models have recently been introduced for the statistical analysis of dental caries indices and other zero-inflated count data as alternatives to traditional zero-inflated and hurdle models. Unlike the standard approaches, the marginalized models directly estimate overall exposure or treatment effects by relating covariates to the marginal mean count. This article discusses model interpretation and model class choice according to the research question being addressed in caries research. Two datasets, one consisting of fictional dmft counts in two groups and the other on DMFS among schoolchildren from a randomized clinical trial (RCT) comparing three toothpaste formulations to prevent incident dental caries, are analysed with negative binomial hurdle (NBH), zero-inflated negative binomial (ZINB), and marginalized zero-inflated negative binomial (MZINB) models. In the first example, estimates of treatment effects vary according to the type of incidence rate ratio (IRR) estimated by the model. Estimates of IRRs in the analysis of the RCT were similar despite their distinctive interpretations. Choice of statistical model class should match the study’s purpose, while accounting for the broad decline in children’s caries experience, such that dmft and DMFS indices more frequently generate zero counts. Marginalized (marginal mean) models for zero-inflated count data should be considered for direct assessment of exposure effects on the marginal mean dental caries count in the presence of high frequencies of zero counts. PMID:28291962
Efficient estimators for likelihood ratio sensitivity indices of complex stochastic dynamics.
Arampatzis, Georgios; Katsoulakis, Markos A; Rey-Bellet, Luc
2016-03-14
We demonstrate that centered likelihood ratio estimators for the sensitivity indices of complex stochastic dynamics are highly efficient with low, constant in time variance and consequently they are suitable for sensitivity analysis in long-time and steady-state regimes. These estimators rely on a new covariance formulation of the likelihood ratio that includes as a submatrix a Fisher information matrix for stochastic dynamics and can also be used for fast screening of insensitive parameters and parameter combinations. The proposed methods are applicable to broad classes of stochastic dynamics such as chemical reaction networks, Langevin-type equations and stochastic models in finance, including systems with a high dimensional parameter space and/or disparate decorrelation times between different observables. Furthermore, they are simple to implement as a standard observable in any existing simulation algorithm without additional modifications.
He, Ye; Lin, Huazhen; Tu, Dongsheng
2018-06-04
In this paper, we introduce a single-index threshold Cox proportional hazard model to select and combine biomarkers to identify patients who may be sensitive to a specific treatment. A penalized smoothed partial likelihood is proposed to estimate the parameters in the model. A simple, efficient, and unified algorithm is presented to maximize this likelihood function. The estimators based on this likelihood function are shown to be consistent and asymptotically normal. Under mild conditions, the proposed estimators also achieve the oracle property. The proposed approach is evaluated through simulation analyses and application to the analysis of data from two clinical trials, one involving patients with locally advanced or metastatic pancreatic cancer and one involving patients with resectable lung cancer. Copyright © 2018 John Wiley & Sons, Ltd.
Efficient estimators for likelihood ratio sensitivity indices of complex stochastic dynamics
NASA Astrophysics Data System (ADS)
Arampatzis, Georgios; Katsoulakis, Markos A.; Rey-Bellet, Luc
2016-03-01
We demonstrate that centered likelihood ratio estimators for the sensitivity indices of complex stochastic dynamics are highly efficient with low, constant in time variance and consequently they are suitable for sensitivity analysis in long-time and steady-state regimes. These estimators rely on a new covariance formulation of the likelihood ratio that includes as a submatrix a Fisher information matrix for stochastic dynamics and can also be used for fast screening of insensitive parameters and parameter combinations. The proposed methods are applicable to broad classes of stochastic dynamics such as chemical reaction networks, Langevin-type equations and stochastic models in finance, including systems with a high dimensional parameter space and/or disparate decorrelation times between different observables. Furthermore, they are simple to implement as a standard observable in any existing simulation algorithm without additional modifications.
Efficient estimators for likelihood ratio sensitivity indices of complex stochastic dynamics
DOE Office of Scientific and Technical Information (OSTI.GOV)
Arampatzis, Georgios; Katsoulakis, Markos A.; Rey-Bellet, Luc
2016-03-14
We demonstrate that centered likelihood ratio estimators for the sensitivity indices of complex stochastic dynamics are highly efficient with low, constant in time variance and consequently they are suitable for sensitivity analysis in long-time and steady-state regimes. These estimators rely on a new covariance formulation of the likelihood ratio that includes as a submatrix a Fisher information matrix for stochastic dynamics and can also be used for fast screening of insensitive parameters and parameter combinations. The proposed methods are applicable to broad classes of stochastic dynamics such as chemical reaction networks, Langevin-type equations and stochastic models in finance, including systems with a high dimensional parameter space and/or disparate decorrelation times between different observables. Furthermore, they are simple to implement as a standard observable in any existing simulation algorithm without additional modifications.
NASA Astrophysics Data System (ADS)
Pan, Zhen; Anderes, Ethan; Knox, Lloyd
2018-05-01
One of the major targets for next-generation cosmic microwave background (CMB) experiments is the detection of the primordial B-mode signal. Planning is under way for Stage-IV experiments that are projected to have instrumental noise small enough to make lensing and foregrounds the dominant source of uncertainty for estimating the tensor-to-scalar ratio r from polarization maps. This makes delensing a crucial part of future CMB polarization science. In this paper we present a likelihood method for estimating the tensor-to-scalar ratio r from CMB polarization observations, which combines the benefits of a full-scale likelihood approach with the tractability of the quadratic delensing technique. This method is a pixel space, all order likelihood analysis of the quadratic delensed B modes, and it essentially builds upon the quadratic delenser by taking into account all order lensing and pixel space anomalies. Its tractability relies on a crucial factorization of the pixel space covariance matrix of the polarization observations which allows one to compute the full Gaussian approximate likelihood profile, as a function of r , at the same computational cost of a single likelihood evaluation.
NASA Astrophysics Data System (ADS)
Nourali, Mahrouz; Ghahraman, Bijan; Pourreza-Bilondi, Mohsen; Davary, Kamran
2016-09-01
In the present study, DREAM(ZS), Differential Evolution Adaptive Metropolis combined with both formal and informal likelihood functions, is used to investigate the uncertainty of parameters of the HEC-HMS model in the Tamar watershed, Golestan province, Iran. In order to assess the uncertainty of the 24 parameters used in HMS, three flood events were used to calibrate and one flood event was used to validate the posterior distributions. Moreover, the performance of seven different likelihood functions (L1-L7) was assessed by means of the DREAM(ZS) approach. Four likelihood functions (L1-L4), namely Nash-Sutcliffe (NS) efficiency, normalized absolute error (NAE), index of agreement (IOA), and Chiew-McMahon efficiency (CM), are considered informal, whereas the remaining three (L5-L7) are formal. L5 focuses on the relationship between traditional least squares fitting and Bayesian inference, and L6 is a heteroscedastic maximum likelihood error (HMLE) estimator. Finally, in likelihood function L7, serial dependence of residual errors is accounted for using a first-order autoregressive (AR) model of the residuals. According to the results, the sensitivities of the parameters strongly depend on the likelihood function and vary for different likelihood functions. Most of the parameters were better defined by the formal likelihood functions L5 and L7 and showed a high sensitivity to model performance. Posterior cumulative distributions corresponding to the informal likelihood functions L1, L2, L3, L4 and the formal likelihood function L6 are approximately the same for most of the sub-basins, and these likelihood functions have an almost similar effect on the sensitivity of parameters. The 95% total prediction uncertainty bounds bracketed most of the observed data. Considering all the statistical indicators and criteria of uncertainty assessment, including RMSE, KGE, NS, P-factor and R-factor, the results showed that the DREAM(ZS) algorithm performed better under the formal likelihood functions L5 and L7, but likelihood function L5 may result in biased and unreliable estimation of parameters due to violation of the residual-error assumptions. Thus, likelihood function L7 provides the posterior distribution of model parameters credibly and can therefore be employed for further applications.
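For reference, three of the informal criteria named above have simple closed forms; a minimal sketch (generic implementations, not tied to the HEC-HMS setup, and using one common normalization for NAE) is:

```python
import numpy as np

def nash_sutcliffe(obs, sim):
    """NS efficiency: 1 minus the ratio of residual to observed variance."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)

def normalized_absolute_error(obs, sim):
    """One common NAE form: absolute error normalized by the mean absolute
    deviation of the observations."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return np.sum(np.abs(sim - obs)) / np.sum(np.abs(obs - obs.mean()))

def index_of_agreement(obs, sim):
    """Willmott's index of agreement."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    denom = np.sum((np.abs(sim - obs.mean()) + np.abs(obs - obs.mean())) ** 2)
    return 1.0 - np.sum((obs - sim) ** 2) / denom

obs = np.array([12.0, 30.0, 55.0, 41.0, 18.0])   # hypothetical flood hydrograph (m^3/s)
sim = np.array([10.0, 33.0, 50.0, 45.0, 20.0])
print(nash_sutcliffe(obs, sim),
      normalized_absolute_error(obs, sim),
      index_of_agreement(obs, sim))
```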
NASA Astrophysics Data System (ADS)
Ariffin, Syaiba Balqish; Midi, Habshah
2014-06-01
This article is concerned with the performance of the logistic ridge regression estimation technique in the presence of multicollinearity and high leverage points. In logistic regression, multicollinearity exists among predictors and in the information matrix. The maximum likelihood estimator suffers a huge setback in the presence of multicollinearity, which causes regression estimates to have unduly large standard errors. To remedy this problem, a logistic ridge regression estimator is put forward. It is evident that the logistic ridge regression estimator outperforms the maximum likelihood approach in handling multicollinearity. The effect of high leverage points on the performance of the logistic ridge regression estimator is then investigated through a real data set and a simulation study. The findings signify that the logistic ridge regression estimator fails to provide better parameter estimates in the presence of both high leverage points and multicollinearity.
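A minimal sketch contrasting a near-maximum-likelihood logistic fit with a ridge-penalized fit on nearly collinear predictors; scikit-learn's L2 penalty stands in for the logistic ridge estimator here, and the data are simulated.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.05, size=n)        # nearly collinear with x1
X = np.column_stack([x1, x2])
y = rng.binomial(1, 1 / (1 + np.exp(-(x1 + x2))))

# Near-maximum-likelihood fit: very weak penalty, collinear coefficients destabilize.
ml_fit = LogisticRegression(penalty="l2", C=1e6, max_iter=5000).fit(X, y)
# Ridge fit: a moderate penalty shrinks and stabilizes the collinear pair.
ridge_fit = LogisticRegression(penalty="l2", C=1.0, max_iter=5000).fit(X, y)
print("near-ML coefficients:", ml_fit.coef_.round(2))
print("ridge coefficients:  ", ridge_fit.coef_.round(2))
```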
NASA Astrophysics Data System (ADS)
Trifonov, A. P.; Korchagin, Yu. E.; Korol'kov, S. V.
2018-05-01
We synthesize the quasi-likelihood, maximum-likelihood, and quasioptimal algorithms for estimating the arrival time and duration of a radio signal with unknown amplitude and initial phase. The discrepancies between the hardware and software realizations of the estimation algorithm are shown. The characteristics of the synthesized-algorithm operation efficiency are obtained. Asymptotic expressions for the biases, variances, and the correlation coefficient of the arrival-time and duration estimates, which hold true for large signal-to-noise ratios, are derived. The accuracy losses of the estimates of the radio-signal arrival time and duration because of the a priori ignorance of the amplitude and initial phase are determined.
Liu, Xiaoming; Fu, Yun-Xin; Maxwell, Taylor J.; Boerwinkle, Eric
2010-01-01
It is known that sequencing error can bias estimation of evolutionary or population genetic parameters. This problem is more prominent in deep resequencing studies because of their large sample size n, and a higher probability of error at each nucleotide site. We propose a new method based on the composite likelihood of the observed SNP configurations to infer population mutation rate θ = 4Neμ, population exponential growth rate R, and error rate ɛ, simultaneously. Using simulation, we show the combined effects of the parameters, θ, n, ɛ, and R on the accuracy of parameter estimation. We compared our maximum composite likelihood estimator (MCLE) of θ with other θ estimators that take into account the error. The results show the MCLE performs well when the sample size is large or the error rate is high. Using parametric bootstrap, composite likelihood can also be used as a statistic for testing the model goodness-of-fit of the observed DNA sequences. The MCLE method is applied to sequence data on the ANGPTL4 gene in 1832 African American and 1045 European American individuals. PMID:19952140
INFERRING THE ECCENTRICITY DISTRIBUTION
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hogg, David W.; Bovy, Jo; Myers, Adam D., E-mail: david.hogg@nyu.ed
2010-12-20
Standard maximum-likelihood estimators for binary-star and exoplanet eccentricities are biased high, in the sense that the estimated eccentricity tends to be larger than the true eccentricity. As with most non-trivial observables, a simple histogram of estimated eccentricities is not a good estimate of the true eccentricity distribution. Here, we develop and test a hierarchical probabilistic method for performing the relevant meta-analysis, that is, inferring the true eccentricity distribution, taking as input the likelihood functions for the individual star eccentricities, or samplings of the posterior probability distributions for the eccentricities (under a given, uninformative prior). The method is a simple implementation of a hierarchical Bayesian model; it can also be seen as a kind of heteroscedastic deconvolution. It can be applied to any quantity measured with finite precision (other orbital parameters, or indeed any astronomical measurements of any kind, including magnitudes, distances, or photometric redshifts), so long as the measurements have been communicated as a likelihood function or a posterior sampling.
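A minimal sketch of the hierarchical importance-sampling idea, under the simplifying assumptions of a Beta population model and per-star posterior samples drawn under a uniform interim prior on [0, 1]; the function names and the grid search are hypothetical, not the paper's implementation.

```python
import numpy as np
from scipy.stats import beta

def population_log_like(a, b, post_samples):
    """Hierarchical log-likelihood of a Beta(a, b) eccentricity population.

    post_samples: list of 1-D arrays of posterior eccentricity samples for
    each star, drawn under a uniform interim prior, so each importance weight
    is simply the Beta population density evaluated at the sample.
    """
    return sum(np.log(np.mean(beta.pdf(s, a, b)) + 1e-300) for s in post_samples)

# hypothetical grid search over the two population parameters, given `samples`
# a_grid, b_grid = np.linspace(0.2, 5, 40), np.linspace(0.2, 5, 40)
# best = max(((a, b) for a in a_grid for b in b_grid),
#            key=lambda p: population_log_like(p[0], p[1], samples))
```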
Precision Parameter Estimation and Machine Learning
NASA Astrophysics Data System (ADS)
Wandelt, Benjamin D.
2008-12-01
I discuss the strategy of "Acceleration by Parallel Precomputation and Learning" (APPLe), which can vastly accelerate parameter estimation in high-dimensional parameter spaces with costly likelihood functions, using trivially parallel computing to speed up sequential exploration of parameter space. This strategy efficiently combines the power of distributed computing with machine learning and Markov chain Monte Carlo techniques to explore a likelihood function, posterior distribution, or χ²-surface. It is particularly successful in cases where computing the likelihood is costly and the number of parameters is moderate or large. We apply this technique to two central problems in cosmology: the solution of the cosmological parameter estimation problem with sufficient accuracy for the Planck data using PICo, and the detailed calculation of cosmological helium and hydrogen recombination with RICO. Since the APPLe approach is designed to use massively parallel resources to speed up problems that are inherently serial, we can bring the power of distributed computing to bear on parameter estimation problems. We have demonstrated this with the Cosmology@Home project.
Methods for estimating drought streamflow probabilities for Virginia streams
Austin, Samuel H.
2014-01-01
Maximum likelihood logistic regression model equations used to estimate drought flow probabilities for Virginia streams are presented for 259 hydrologic basins in Virginia. Winter streamflows were used to estimate the likelihood of streamflows during the subsequent drought-prone summer months. The maximum likelihood logistic regression models identify probable streamflows from 5 to 8 months in advance. More than 5 million streamflow daily values collected over the period of record (January 1, 1900 through May 16, 2012) were compiled and analyzed over a minimum 10-year (maximum 112-year) period of record. The analysis yielded the 46,704 equations with statistically significant fit statistics and parameter ranges published in two tables in this report. These model equations produce summer month (July, August, and September) drought flow threshold probabilities as a function of streamflows during the previous winter months (November, December, January, and February). Example calculations are provided, demonstrating how to use the equations to estimate probable streamflows as much as 8 months in advance.
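A toy sketch of the modeling idea: a maximum likelihood logistic regression of a summer drought indicator on winter streamflow, fitted with statsmodels. The data values and the log-flow predictor below are illustrative assumptions, not the report's published equations.

```python
import numpy as np
import statsmodels.api as sm

# hypothetical records: mean winter flow (cfs) and a 0/1 indicator that the
# following summer's flow fell below a drought threshold
winter_flow = np.array([120., 85., 210., 60., 150., 95., 180., 70., 110., 90.])
summer_drought = np.array([0, 1, 0, 1, 0, 0, 0, 1, 1, 0])

X = sm.add_constant(np.log(winter_flow))          # log-flow as the predictor
model = sm.Logit(summer_drought, X).fit(disp=0)   # maximum likelihood fit
print(model.params)

# probability of a summer drought given a winter flow of 100 cfs
x_new = np.column_stack([np.ones(1), np.log([100.0])])
print(model.predict(x_new))
```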
NASA Astrophysics Data System (ADS)
Alsing, Justin; Wandelt, Benjamin; Feeney, Stephen
2018-07-01
Many statistical models in cosmology can be simulated forwards but have intractable likelihood functions. Likelihood-free inference methods allow us to perform Bayesian inference from these models using only forward simulations, free from any likelihood assumptions or approximations. Likelihood-free inference generically involves simulating mock data and comparing to the observed data; this comparison in data space suffers from the curse of dimensionality and requires compression of the data to a small number of summary statistics to be tractable. In this paper, we first use massive asymptotically optimal data compression to reduce the dimensionality of the data space to just one number per parameter, providing a natural and optimal framework for summary-statistic choice in likelihood-free inference. Second, we present the first cosmological application of Density Estimation Likelihood-Free Inference (DELFI), which learns a parametrized model for the joint distribution of data and parameters, yielding both the parameter posterior and the model evidence. This approach is conceptually simple, requires less tuning than traditional Approximate Bayesian Computation approaches to likelihood-free inference, and can give high-fidelity posteriors from orders of magnitude fewer forward simulations. As an additional bonus, it enables parameter inference and Bayesian model comparison simultaneously. We demonstrate DELFI with massive data compression on an analysis of the joint light-curve analysis supernova data, as a simple validation case study. We show that high-fidelity posterior inference is possible for full-scale cosmological data analyses with as few as ~10^4 simulations, with substantial scope for further improvement, demonstrating the scalability of likelihood-free inference to large and complex cosmological data sets.
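The massive data compression step can be illustrated by score compression of a Gaussian-likelihood data vector to one summary per parameter. The sketch below is a generic implementation under the assumption of a parameter-independent covariance; it is not the paper's pipeline, and the function name is hypothetical.

```python
import numpy as np

def score_compress(data, mu, dmu_dtheta, cov):
    """Compress a data vector to one number per parameter.

    data        : observed data vector, shape (N,)
    mu          : model mean at fiducial parameters, shape (N,)
    dmu_dtheta  : derivatives of the mean w.r.t. each parameter, shape (n_params, N)
    cov         : data covariance, shape (N, N), assumed parameter-independent

    Returns the score-like summaries t_a = dmu_a^T C^{-1} (d - mu), which are
    asymptotically optimal (lossless in the Fisher sense) inputs for
    likelihood-free inference.
    """
    cinv = np.linalg.inv(cov)
    return dmu_dtheta @ cinv @ (data - mu)
```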
HIERARCHICAL PROBABILISTIC INFERENCE OF COSMIC SHEAR
DOE Office of Scientific and Technical Information (OSTI.GOV)
Schneider, Michael D.; Dawson, William A.; Hogg, David W.
2015-07-01
Point estimators for the shearing of galaxy images induced by gravitational lensing involve a complex inverse problem in the presence of noise, pixelization, and model uncertainties. We present a probabilistic forward modeling approach to gravitational lensing inference that has the potential to mitigate the biased inferences in most common point estimators and is practical for upcoming lensing surveys. The first part of our statistical framework requires specification of a likelihood function for the pixel data in an imaging survey given parameterized models for the galaxies in the images. We derive the lensing shear posterior by marginalizing over all intrinsic galaxy properties that contribute to the pixel data (i.e., not limited to galaxy ellipticities) and learn the distributions for the intrinsic galaxy properties via hierarchical inference with a suitably flexible conditional probability distribution specification. We use importance sampling to separate the modeling of small imaging areas from the global shear inference, thereby rendering our algorithm computationally tractable for large surveys. With simple numerical examples we demonstrate the improvements in accuracy from our importance sampling approach, as well as the significance of the conditional distribution specification for the intrinsic galaxy properties when the data are generated from an unknown number of distinct galaxy populations with different morphological characteristics.
Bayesian Recurrent Neural Network for Language Modeling.
Chien, Jen-Tzung; Ku, Yuan-Chu
2016-02-01
A language model (LM) computes the probability of a word sequence and thus provides the solution to word prediction for a variety of information systems. A recurrent neural network (RNN) is powerful for learning the large-span dynamics of a word sequence in continuous space. However, training an RNN-LM is an ill-posed problem because of the large number of parameters implied by a large dictionary size and a high-dimensional hidden layer. This paper presents a Bayesian approach to regularize the RNN-LM and applies it to continuous speech recognition. We aim to penalize overly complex RNN-LMs by compensating for the uncertainty of the estimated model parameters, which is represented by a Gaussian prior. The objective function of the Bayesian classification network is formed as a regularized cross-entropy error function. The regularized model is constructed not only by calculating the regularized parameters according to the maximum a posteriori criterion but also by estimating the Gaussian hyperparameter through maximization of the marginal likelihood. A rapid approximation to the Hessian matrix is developed to implement the Bayesian RNN-LM (BRNN-LM) by selecting a small set of salient outer products. The proposed BRNN-LM achieves a sparser model than the RNN-LM. Experiments on different corpora show the robustness of system performance obtained by applying the rapid BRNN-LM under different conditions.
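The regularized cross-entropy objective can be sketched as a cross-entropy loss plus the negative log of a zero-mean Gaussian prior on the weights, which reduces to an L2 penalty. The softmax-classifier form and the fixed hyperparameter alpha below are simplifications of the paper's RNN setting, in which alpha is tuned by maximizing the marginal likelihood.

```python
import numpy as np

def regularized_cross_entropy(W, X, y_onehot, alpha):
    """MAP objective for a softmax classifier: cross-entropy plus the negative
    log of a zero-mean Gaussian prior on the weights (an L2 penalty of
    strength alpha, a stand-in for the marginal-likelihood-tuned
    hyperparameter)."""
    logits = X @ W
    logits -= logits.max(axis=1, keepdims=True)          # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    nll = -np.sum(y_onehot * log_probs)                   # cross-entropy term
    prior = 0.5 * alpha * np.sum(W**2)                    # Gaussian prior term
    return nll + prior
```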
Jurassic-Cretaceous paleogeography, paleoclimate and upwelling of the northern margin of Tethys
DOE Office of Scientific and Technical Information (OSTI.GOV)
Golonka, J.; Krobicki, M.
The Jurassic and Cretaceous global paleogeographic reconstructions illustrate the changing configuration of mountains, land, shallow seas, and deep ocean basins. Active plate boundaries, such as spreading centers and subduction zones, are also shown. Maps for the Pliensbachian, Toarcian, Bathonian, Oxfordian-Kimmeridgian, Tithonian-Berriasian, Valanginian, Albian, Turonian, and Maastrichtian were generated. The outlines of paleogeography are used as input for paleoclimatic modeling. The PALEOCLIMATE program models global atmospheric pressure, derives paleo-wind directions, and estimates the likelihood of coastal upwelling. The program is based on the paleoclimatic methods first developed by Judith Parrish, adopted by C. R. Scotese, and modified by M. I. Ross. The maps depict air pressure, wind directions, humid zones, and areas favorable for upwelling conditions plotted on the paleogeographic background. Paleoclimate modeling suggests that prevailing Jurassic-Cretaceous wind directions in the northern Tethys area were from the north-northeast. These winds were parallel to the axis of the Czorsztyn ridge. The ridge was uplifted between the Magura and Pieniny basins as a result of extension during the Jurassic supercontinent breakup. Upwelling may have been induced at the southeastern margin of the ridge. The model is consistent with the rock record, especially from the upper part of the ammonitico rosso-type Czorsztyn formation. The mass occurrence of Tithonian and Berriasian brachiopods was probably controlled by upwelling-induced trophic relationships, which resulted in the intense growth of benthic organisms on the ridge. This is additionally supported by the presence of phosphorites at localities corresponding to the continental shelf/slope transition.
Effects of time-shifted data on flight determined stability and control derivatives
NASA Technical Reports Server (NTRS)
Steers, S. T.; Iliff, K. W.
1975-01-01
Flight data were shifted in time by various increments to assess the effects of time shifts on estimates of stability and control derivatives produced by a maximum likelihood estimation method. Derivatives could be extracted from flight data with the maximum likelihood estimation method even if there was a considerable time shift in the data. Time shifts degraded the estimates of the derivatives, but the degradation was in a consistent rather than a random pattern. Time shifts in the control variables caused the most degradation, and the lateral-directional rotary derivatives were affected the most by time shifts in any variable.
Beyond valence in the perception of likelihood: the role of emotion specificity.
DeSteno, D; Petty, R E; Wegener, D T; Rucker, D D
2000-03-01
Positive and negative moods have been shown to increase likelihood estimates of future events matching these states in valence (e.g., E. J. Johnson & A. Tversky, 1983). In the present article, 4 studies provide evidence that this congruency bias (a) is not limited to valence but functions in an emotion-specific manner, (b) derives from the informational value of emotions, and (c) is not the inevitable outcome of likelihood assessment under heightened emotion. Specifically, Study 1 demonstrates that sadness and anger, 2 distinct, negative emotions, differentially bias likelihood estimates of sad and angering events. Studies 2 and 3 replicate this finding in addition to supporting an emotion-as-information (cf. N. Schwarz & G. L. Clore, 1983), as opposed to a memory-based, mediating process for the bias. Finally, Study 4 shows that when the source of the emotion is salient, a reversal of the bias can occur given greater cognitive effort aimed at accuracy.
Pointwise nonparametric maximum likelihood estimator of stochastically ordered survivor functions
Park, Yongseok; Taylor, Jeremy M. G.; Kalbfleisch, John D.
2012-01-01
In this paper, we consider estimation of survivor functions from groups of observations with right-censored data when the groups are subject to a stochastic ordering constraint. Many methods and algorithms have been proposed to estimate distribution functions under such restrictions, but none have completely satisfactory properties when the observations are censored. We propose a pointwise constrained nonparametric maximum likelihood estimator, which is defined at each time t by the estimates of the survivor functions subject to constraints applied at time t only. We also propose an efficient method to obtain the estimator. The estimator of each constrained survivor function is shown to be nonincreasing in t, and its consistency and asymptotic distribution are established. A simulation study suggests better small and large sample properties than for alternative estimators. An example using prostate cancer data illustrates the method. PMID:23843661
Closed-loop carrier phase synchronization techniques motivated by likelihood functions
NASA Technical Reports Server (NTRS)
Tsou, H.; Hinedi, S.; Simon, M.
1994-01-01
This article reexamines the notion of closed-loop carrier phase synchronization motivated by the theory of maximum a posteriori phase estimation with emphasis on the development of new structures based on both maximum-likelihood and average-likelihood functions. The criterion of performance used for comparison of all the closed-loop structures discussed is the mean-squared phase error for a fixed-loop bandwidth.
Austin, Peter C
2010-04-22
Multilevel logistic regression models are increasingly being used to analyze clustered data in medical, public health, epidemiological, and educational research. Procedures for estimating the parameters of such models are available in many statistical software packages. There is currently little evidence on the minimum number of clusters necessary to reliably fit multilevel regression models. We conducted a Monte Carlo study to compare the performance of different statistical software procedures for estimating multilevel logistic regression models when the number of clusters was low. We examined procedures available in BUGS, HLM, R, SAS, and Stata. We found that there were qualitative differences in the performance of different software procedures for estimating multilevel logistic models when the number of clusters was low. Among the likelihood-based procedures, estimation methods based on adaptive Gauss-Hermite approximations to the likelihood (glmer in R and xtlogit in Stata) or adaptive Gaussian quadrature (Proc NLMIXED in SAS) tended to have superior performance for estimating variance components when the number of clusters was small, compared to software procedures based on penalized quasi-likelihood. However, only Bayesian estimation with BUGS allowed for accurate estimation of variance components when there were fewer than 10 clusters. For all statistical software procedures, estimation of variance components tended to be poor when there were only five subjects per cluster, regardless of the number of clusters.
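To illustrate how quadrature-based marginal likelihoods for such models work, the sketch below fits a random-intercept logistic regression with ordinary (non-adaptive) Gauss-Hermite quadrature. This is a simplified stand-in for the adaptive quadrature used by glmer or PROC NLMIXED; the function and argument names are assumptions.

```python
import numpy as np
from scipy.special import expit

def neg_marginal_loglik(params, y, X, cluster, n_quad=15):
    """Random-intercept logistic regression via (non-adaptive) Gauss-Hermite
    quadrature over the cluster-level random effect b ~ N(0, sigma^2)."""
    beta, log_sigma = params[:-1], params[-1]
    sigma = np.exp(log_sigma)
    nodes, weights = np.polynomial.hermite.hermgauss(n_quad)
    b = np.sqrt(2.0) * sigma * nodes                 # quadrature points for b
    eta_fixed = X @ beta
    ll = 0.0
    for c in np.unique(cluster):
        idx = cluster == c
        # cluster likelihood evaluated at each quadrature node
        p = expit(eta_fixed[idx, None] + b[None, :])
        lik = np.prod(np.where(y[idx, None] == 1, p, 1 - p), axis=0)
        ll += np.log(np.dot(weights, lik) / np.sqrt(np.pi) + 1e-300)
    return -ll

# usage (hypothetical data): scipy.optimize.minimize(neg_marginal_loglik, x0,
#                                                    args=(y, X, cluster))
```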
Approximate likelihood calculation on a phylogeny for Bayesian estimation of divergence times.
dos Reis, Mario; Yang, Ziheng
2011-07-01
The molecular clock provides a powerful way to estimate species divergence times. If information on some species divergence times is available from the fossil or geological record, it can be used to calibrate a phylogeny and estimate divergence times for all nodes in the tree. The Bayesian method provides a natural framework to incorporate different sources of information concerning divergence times, such as information in the fossil and molecular data. Current models of sequence evolution are intractable in a Bayesian setting, and Markov chain Monte Carlo (MCMC) is used to generate the posterior distribution of divergence times and evolutionary rates. This method is computationally expensive, as it involves the repeated calculation of the likelihood function. Here, we explore the use of Taylor expansion to approximate the likelihood during MCMC iteration. The approximation is much faster than conventional likelihood calculation. However, the approximation is expected to be poor when the proposed parameters are far from the likelihood peak. We explore the use of parameter transforms (square root, logarithm, and arcsine) to improve the approximation to the likelihood curve. We found that the new methods, particularly the arcsine-based transform, provided very good approximations under relaxed clock models and also under the global clock model when the global clock is not seriously violated. The approximation is poorer for analysis under the global clock when the global clock is seriously wrong and should thus not be used. The results suggest that the approximate method may be useful for Bayesian dating analysis using large data sets.
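A minimal one-parameter illustration of the transform-then-Taylor idea: a binomial log-likelihood is expanded to second order around its maximum in the arcsine-square-root transform, where it is close to quadratic. The numerical second derivative and the example counts are illustrative assumptions, not the paper's phylogenetic setting.

```python
import numpy as np

k, n = 30, 100                                    # observed successes out of n trials
loglik = lambda p: k * np.log(p) + (n - k) * np.log(1 - p)

# work in the arcsine-square-root transform and expand to second order at the MLE
to_x = lambda p: np.arcsin(np.sqrt(p))
to_p = lambda x: np.sin(x) ** 2
x_hat = to_x(k / n)
h = 1e-4
d2 = (loglik(to_p(x_hat + h)) - 2 * loglik(to_p(x_hat)) + loglik(to_p(x_hat - h))) / h**2

approx = lambda p: loglik(k / n) + 0.5 * d2 * (to_x(p) - x_hat) ** 2

for p in (0.1, 0.3, 0.5, 0.7):
    print(p, loglik(p), approx(p))                # the approximation tracks the exact curve
```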
Lirio, R B; Dondériz, I C; Pérez Abalo, M C
1992-08-01
The methodology of Receiver Operating Characteristic curves based on the signal detection model is extended to evaluate the accuracy of two-stage diagnostic strategies. A computer program is developed for the maximum likelihood estimation of parameters that characterize the sensitivity and specificity of two-stage classifiers according to this extended methodology. Its use is briefly illustrated with data collected in a two-stage screening for auditory defects.
Tamayao, Mili-Ann M; Michalek, Jeremy J; Hendrickson, Chris; Azevedo, Inês M L
2015-07-21
We characterize regionally specific life cycle CO2 emissions per mile traveled for plug-in hybrid electric vehicles (PHEVs) and battery electric vehicles (BEVs) across the United States under alternative assumptions for regional electricity emission factors, regional boundaries, and charging schemes. We find that estimates based on marginal vs. average grid emission factors differ by as much as 50% (using North American Electric Reliability Corporation (NERC) regional boundaries). Use of state boundaries versus NERC region boundaries results in estimates that differ by as much as 120% for the same location (using average emission factors). We argue that consumption-based marginal emission factors are conceptually appropriate for evaluating the emissions implications of policies that increase electric vehicle sales or use in a region. We also examine generation-based marginal emission factors to assess robustness. Using these two estimates of NERC region marginal emission factors, we find the following: (1) delayed charging (i.e., starting at midnight) leads to higher emissions in most cases due largely to increased coal in the marginal generation mix at night; (2) the Chevrolet Volt has higher expected life cycle emissions than the Toyota Prius hybrid electric vehicle (the most efficient U.S. gasoline vehicle) across the U.S. in nearly all scenarios; (3) the Nissan Leaf BEV has lower life cycle emissions than the Prius in the western U.S. and in Texas, but the Prius has lower emissions in the northern Midwest regardless of assumed charging scheme and marginal emissions estimation method; (4) in other regions the lowest emitting vehicle depends on charge timing and emission factor estimation assumptions.
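The basic per-mile comparison reduces to multiplying electricity use per mile by a grid emission factor, versus dividing per-gallon emissions by fuel economy. The numbers below are rough illustrative assumptions, not the paper's regional estimates.

```python
# Illustrative numbers only, not the paper's data.
def ev_gco2_per_mile(kwh_per_mile, grid_gco2_per_kwh):
    """Electric-vehicle operating emissions per mile."""
    return kwh_per_mile * grid_gco2_per_kwh

def gasoline_gco2_per_mile(mpg, gco2_per_gallon=11_000):
    """Gasoline emissions per mile; ~8,900 g/gal tailpipe plus an assumed
    upstream contribution."""
    return gco2_per_gallon / mpg

print(ev_gco2_per_mile(0.30, 450))   # BEV on an assumed average grid factor
print(ev_gco2_per_mile(0.30, 700))   # same BEV on a coal-heavy marginal mix
print(gasoline_gco2_per_mile(50))    # an efficient hybrid for comparison
```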
NASA Astrophysics Data System (ADS)
Ben Abdessalem, Anis; Dervilis, Nikolaos; Wagg, David; Worden, Keith
2018-01-01
This paper will introduce the use of the approximate Bayesian computation (ABC) algorithm for model selection and parameter estimation in structural dynamics. ABC is a likelihood-free method typically used when the likelihood function is either intractable or cannot be approached in a closed form. To circumvent the evaluation of the likelihood function, simulation from a forward model is at the core of the ABC algorithm. The algorithm offers the possibility to use different metrics and summary statistics representative of the data to carry out Bayesian inference. The efficacy of the algorithm in structural dynamics is demonstrated through three different illustrative examples of nonlinear system identification: cubic and cubic-quintic models, the Bouc-Wen model and the Duffing oscillator. The obtained results suggest that ABC is a promising alternative to deal with model selection and parameter estimation issues, specifically for systems with complex behaviours.
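A bare-bones ABC rejection sampler conveys the core of the approach: draw parameters from the prior, simulate, and keep draws whose summary statistics fall within a tolerance of the observed ones. The function names, the Euclidean distance, and the tolerance below are assumptions; the paper's algorithm and metrics may differ.

```python
import numpy as np

def abc_rejection(simulate, summary, obs, prior_sample, n_draws=20000, eps=0.1):
    """Minimal ABC rejection sampler.

    simulate(theta) -> synthetic data; summary(data) -> summary statistics;
    prior_sample()  -> one draw from the prior.
    Draws whose simulated summaries fall within eps of the observed summaries
    are retained as samples from the approximate posterior.
    """
    s_obs = np.asarray(summary(obs))
    accepted = []
    for _ in range(n_draws):
        theta = prior_sample()
        s_sim = np.asarray(summary(simulate(theta)))
        if np.linalg.norm(s_sim - s_obs) < eps:
            accepted.append(theta)
    return np.array(accepted)
```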
NASA Technical Reports Server (NTRS)
Iliff, Kenneth W.
1987-01-01
The aircraft parameter estimation problem is used to illustrate the utility of parameter estimation, which applies to many engineering and scientific fields. Maximum likelihood estimation has been used to extract stability and control derivatives from flight data for many years. This paper presents some of the basic concepts of aircraft parameter estimation and briefly surveys the literature in the field. The maximum likelihood estimator is discussed, and the basic concepts of minimization and estimation are examined for a simple simulated aircraft example. The cost functions that are to be minimized during estimation are defined and discussed. Graphic representations of the cost functions are given to illustrate the minimization process. Finally, the basic concepts are generalized, and estimation from flight data is discussed. Some of the major conclusions for the simulated example are also developed for the analysis of flight data from the F-14, highly maneuverable aircraft technology (HiMAT), and space shuttle vehicles.
Estimation of submarine mass failure probability from a sequence of deposits with age dates
Geist, Eric L.; Chaytor, Jason D.; Parsons, Thomas E.; ten Brink, Uri S.
2013-01-01
The empirical probability of submarine mass failure is quantified from a sequence of dated mass-transport deposits. Several different techniques are described to estimate the parameters for a suite of candidate probability models. The techniques, previously developed for analyzing paleoseismic data, include maximum likelihood and Type II (Bayesian) maximum likelihood methods derived from renewal process theory and Monte Carlo methods. The estimated mean return time from these methods, unlike estimates from a simple arithmetic mean of the center age dates and standard likelihood methods, includes the effects of age-dating uncertainty and of open time intervals before the first and after the last event. The likelihood techniques are evaluated using Akaike’s Information Criterion (AIC) and Akaike’s Bayesian Information Criterion (ABIC) to select the optimal model. The techniques are applied to mass transport deposits recorded in two Integrated Ocean Drilling Program (IODP) drill sites located in the Ursa Basin, northern Gulf of Mexico. Dates of the deposits were constrained by regional bio- and magnetostratigraphy from a previous study. Results of the analysis indicate that submarine mass failures in this location occur primarily according to a Poisson process in which failures are independent and return times follow an exponential distribution. However, some of the model results suggest that submarine mass failures may occur quasiperiodically at one of the sites (U1324). The suite of techniques described in this study provides quantitative probability estimates of submarine mass failure occurrence, for any number of deposits and age uncertainty distributions.
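As a simplified illustration of the likelihood-and-AIC machinery (ignoring the age-dating uncertainty and open intervals that the study incorporates), the sketch below fits an exponential return-time model to hypothetical inter-event times by maximum likelihood and computes its AIC.

```python
import numpy as np

ages = np.array([5.0, 12.0, 21.0, 33.0, 40.0])   # hypothetical deposit ages (ka)
intervals = np.diff(ages)                          # inter-event times

lam = 1.0 / intervals.mean()                       # MLE of the Poisson-process rate
loglik = len(intervals) * np.log(lam) - lam * intervals.sum()
aic = 2 * 1 - 2 * loglik                           # one free parameter
print("mean return time:", 1 / lam, "ka;  AIC:", aic)
```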
Marginal economic value of streamflow: A case study for the Colorado River Basin
Thomas C. Brown; Benjamin L. Harding; Elizabeth A. Payton
1990-01-01
The marginal economic value of streamflow leaving forested areas in the Colorado River Basin was estimated by determining the impact on water use of a small change in streamflow and then applying economic value estimates to the water use changes. The effect on water use of a change in streamflow was estimated with a network flow model that simulated salinity levels and...
Fast and accurate estimation of the covariance between pairwise maximum likelihood distances.
Gil, Manuel
2014-01-01
Pairwise evolutionary distances are a model-based summary statistic for a set of molecular sequences. They represent the leaf-to-leaf path lengths of the underlying phylogenetic tree. Estimates of pairwise distances with overlapping paths covary because of shared mutation events. It is desirable to take this covariance structure into account to increase precision in any process that compares or combines distances. This paper introduces a fast estimator for the covariance of two pairwise maximum likelihood distances, estimated under general Markov models. The estimator is based on a conjecture (going back to Nei & Jin, 1989) which links the covariance to path lengths. It is proven here under a simple symmetric substitution model. A simulation shows that the estimator outperforms previously published ones in terms of the mean squared error.
Physician response to financial incentives when choosing drugs to treat breast cancer.
Epstein, Andrew J; Johnson, Scott J
2012-12-01
This paper considers physician agency in choosing drugs to treat metastatic breast cancer, a clinical setting in which patients have few protections from physicians' rent seeking. Physicians have explicit financial incentives attached to each potential drug treatment, with profit margins ranging more than a hundred fold. SEER-Medicare claims and Medispan pricing data were formed into a panel of 4,503 patients who were diagnosed with metastatic breast cancer and treated with anti-cancer drugs from 1992 to 2002. We analyzed the effects of product attributes, including profit margin, randomized controlled trial citations, FDA label, generic status, and other covariates on therapy choice. Instruments and drug fixed effects were used to control for omitted variables and possible measurement error associated with margin. We find that increasing physician margin by 10% yields between an 11 and 177% increase in the likelihood of drug choice on average across drugs. Physicians were more likely to use drugs with which they had experience, had more citations, and were FDA-approved to treat breast cancer. Oncologists are susceptible to financial incentives when choosing drugs, though other factors play a large role in their choice of drug.
Paiva, Thiago da Silva; Shao, Chen; Fernandes, Noemi Mendes; Borges, Bárbara do Nascimento; da Silva-Neto, Inácio Domingos
2016-01-01
Interphase specimens, aspects of physiological reorganization, and divisional morphogenesis were investigated in a strain of a hypotrichous ciliate highly similar to Urostyla grandis Ehrenberg (type species of Urostyla), collected from a mangrove area in the estuary of the Paraíba do Sul river (Rio de Janeiro, Brazil). The results revealed that, although interphase specimens fall within the known morphologic variability of U. grandis, the morphogenetic processes show conspicuous differences. The parental adoral zone is entirely renewed during morphogenesis, and the marginal cirri exhibit a unique combination of developmental modes, in which the left marginal rows originate from multiple anlagen arising from the innermost left marginal cirral row, whereas the right marginal ciliature originates from individual within-row anlagen. Based on these characteristics, a new subspecies, U. grandis wiackowskii subsp. nov., is proposed, and consequently U. grandis grandis Ehrenberg, stat. nov. is established. Bayesian and maximum-likelihood analyses of the 18S rDNA unambiguously placed U. grandis wiackowskii as the adelphotaxon of a cluster formed by other U. grandis sequences. The implications of these findings for the systematics of Urostyla are discussed.
Technical Note: Approximate Bayesian parameterization of a process-based tropical forest model
NASA Astrophysics Data System (ADS)
Hartig, F.; Dislich, C.; Wiegand, T.; Huth, A.
2014-02-01
Inverse parameter estimation of process-based models is a long-standing problem in many scientific disciplines. A key question for inverse parameter estimation is how to define the metric that quantifies how well model predictions fit to the data. This metric can be expressed by general cost or objective functions, but statistical inversion methods require a particular metric, the probability of observing the data given the model parameters, known as the likelihood. For technical and computational reasons, likelihoods for process-based stochastic models are usually based on general assumptions about variability in the observed data, and not on the stochasticity generated by the model. Only in recent years have new methods become available that allow the generation of likelihoods directly from stochastic simulations. Previous applications of these approximate Bayesian methods have concentrated on relatively simple models. Here, we report on the application of a simulation-based likelihood approximation for FORMIND, a parameter-rich individual-based model of tropical forest dynamics. We show that approximate Bayesian inference, based on a parametric likelihood approximation placed in a conventional Markov chain Monte Carlo (MCMC) sampler, performs well in retrieving known parameter values from virtual inventory data generated by the forest model. We analyze the results of the parameter estimation, examine its sensitivity to the choice and aggregation of model outputs and observed data (summary statistics), and demonstrate the application of this method by fitting the FORMIND model to field data from an Ecuadorian tropical forest. Finally, we discuss how this approach differs from approximate Bayesian computation (ABC), another method commonly used to generate simulation-based likelihood approximations. Our results demonstrate that simulation-based inference, which offers considerable conceptual advantages over more traditional methods for inverse parameter estimation, can be successfully applied to process-based models of high complexity. The methodology is particularly suitable for heterogeneous and complex data structures and can easily be adjusted to other model types, including most stochastic population and individual-based models. Our study therefore provides a blueprint for a fairly general approach to parameter estimation of stochastic process-based models.
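A schematic version of a simulation-based (parametric) likelihood approximation placed inside a Metropolis sampler is shown below. The normal approximation to a single scalar summary and all function names are simplifying assumptions relative to the FORMIND application, which uses richer summary statistics.

```python
import numpy as np
from scipy.stats import norm

def synthetic_loglik(theta, simulate, summary, s_obs, n_rep=100):
    """Parametric likelihood approximation: run the stochastic simulator n_rep
    times at theta, fit a normal distribution to the resulting summary
    statistic, and evaluate the observed summary under it."""
    sims = np.array([summary(simulate(theta)) for _ in range(n_rep)])
    return norm.logpdf(s_obs, loc=sims.mean(), scale=sims.std(ddof=1) + 1e-12)

def metropolis(theta0, log_post, n_iter=2000, step=0.1, seed=0):
    """Random-walk Metropolis sampler for a scalar parameter."""
    rng = np.random.default_rng(seed)
    chain, lp = [theta0], log_post(theta0)
    for _ in range(n_iter):
        prop = chain[-1] + rng.normal(scale=step)
        lp_prop = log_post(prop)
        if np.log(rng.random()) < lp_prop - lp:
            chain.append(prop); lp = lp_prop
        else:
            chain.append(chain[-1])
    return np.array(chain)
```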
Technical Note: Approximate Bayesian parameterization of a complex tropical forest model
NASA Astrophysics Data System (ADS)
Hartig, F.; Dislich, C.; Wiegand, T.; Huth, A.
2013-08-01
Inverse parameter estimation of process-based models is a long-standing problem in ecology and evolution. A key problem of inverse parameter estimation is to define a metric that quantifies how well model predictions fit to the data. Such a metric can be expressed by general cost or objective functions, but statistical inversion approaches are based on a particular metric, the probability of observing the data given the model, known as the likelihood. Deriving likelihoods for dynamic models requires making assumptions about the probability for observations to deviate from mean model predictions. For technical reasons, these assumptions are usually derived without explicit consideration of the processes in the simulation. Only in recent years have new methods become available that allow generating likelihoods directly from stochastic simulations. Previous applications of these approximate Bayesian methods have concentrated on relatively simple models. Here, we report on the application of a simulation-based likelihood approximation for FORMIND, a parameter-rich individual-based model of tropical forest dynamics. We show that approximate Bayesian inference, based on a parametric likelihood approximation placed in a conventional MCMC, performs well in retrieving known parameter values from virtual field data generated by the forest model. We analyze the results of the parameter estimation, examine the sensitivity towards the choice and aggregation of model outputs and observed data (summary statistics), and show results from using this method to fit the FORMIND model to field data from an Ecuadorian tropical forest. Finally, we discuss differences of this approach to Approximate Bayesian Computing (ABC), another commonly used method to generate simulation-based likelihood approximations. Our results demonstrate that simulation-based inference, which offers considerable conceptual advantages over more traditional methods for inverse parameter estimation, can successfully be applied to process-based models of high complexity. The methodology is particularly suited to heterogeneous and complex data structures and can easily be adjusted to other model types, including most stochastic population and individual-based models. Our study therefore provides a blueprint for a fairly general approach to parameter estimation of stochastic process-based models in ecology and evolution.
Kang, Xian; Hong, Dennis; Anvari, Mehran; Tiboni, Maria; Amin, Nalin; Gmora, Scott
2017-05-01
Laparoscopic Roux-en-Y gastric bypass (LRYGB) surgery is a safe and effective procedure for patients with severe obesity. One potential complication of LRYGB is the development of marginal ulcers (MUs). Nonsteroidal anti-inflammatory drugs (NSAIDs) are known to significantly increase the likelihood of developing marginal ulcers after surgery. However, the risk associated with low-dose aspirin consumption is not well defined. We examined the impact of daily low-dose aspirin (81 mg) on the development of marginal ulcers following LRYGB. A retrospective cohort design studied patients undergoing LRYGB surgery, between January 2009 and January 2013, at a single, high-volume bariatric center in Ontario, Canada. The marginal ulcer rate of patients taking low-dose aspirin after surgery was compared to that of the control patients who did not take any NSAID. Diagnosis of MU was confirmed by upper endoscopy in patients presenting with symptoms and a history indicative of marginal ulceration. A chi-square test of independence was performed to examine the difference in marginal ulcer rates. A total of 1016 patients underwent LRYGB. Patients taking aspirin were more likely to be male, older, and have diabetes than patients not taking NSAIDs. Of the 1016 patients, 145 (14.3%) took low-dose aspirin following LRYGB and the rest did not (n = 871, 85.7%). The incidence of marginal ulceration was not significantly different between the two treatment groups (12/145, 8.3% versus 90/871, 10.3%; p = 0.45). Patients treated with LRYGB at our institution were not at increased risk of marginal ulcer formation when taking low-dose aspirin after surgery.
Psychometric Properties of IRT Proficiency Estimates
ERIC Educational Resources Information Center
Kolen, Michael J.; Tong, Ye
2010-01-01
Psychometric properties of item response theory proficiency estimates are considered in this paper. Proficiency estimators based on summed scores and pattern scores include non-Bayes maximum likelihood and test characteristic curve estimators and Bayesian estimators. The psychometric properties investigated include reliability, conditional…
Estimating the Cost of Standardized Student Testing in the United States.
ERIC Educational Resources Information Center
Phelps, Richard P.
2000-01-01
Describes and contrasts different methods of estimating costs of standardized testing. Using a cost-accounting approach, compares gross and marginal costs and considers testing objects (test materials and services, personnel and student time, and administrative/building overhead). Social marginal costs of replacing existing tests with a national…
Multivariate Non-Symmetric Stochastic Models for Spatial Dependence Models
NASA Astrophysics Data System (ADS)
Haslauer, C. P.; Bárdossy, A.
2017-12-01
A copula-based multivariate framework allows more flexibility in describing different kinds of dependence than models relying on the confining assumption of symmetric Gaussian dependence: different quantiles can be modelled with a different degree of dependence, and it is demonstrated how this can be expected given process understanding. Maximum-likelihood-based multivariate parameter estimation yields stable and reliable results; not only are improved results obtained in cross-validation-based measures of uncertainty, but also a more realistic spatial structure of uncertainty compared with second-order models of dependence. As much information as is available is included in the parameter estimation: incorporating censored measurements (e.g., values below the detection limit, or ones above the sensitive range of the measurement device) yields more realistic spatial models; the proportion of true zeros can be estimated jointly with, and distinguished from, censored measurements, which allows inferences about the age of a contaminant in the system; and secondary information (categorical and on the rational scale) has been used to improve the estimation of the primary variable. These copula-based multivariate statistical techniques are demonstrated on hydraulic conductivity observations at the Borden (Canada) site, the MADE site (USA), and a large regional groundwater-quality data set in southwest Germany. Fields of spatially distributed K were simulated with identical marginal distributions and identical second-order spatial moments, yet substantially differing solute-transport characteristics when numerical tracer tests were performed. A statistical methodology is shown that allows the delineation of a boundary layer separating homogeneous parts of a spatial data set. The effects of this boundary layer (macro structure) and the spatial dependence of K (micro structure) on solute-transport behaviour are shown.
Quantum-state reconstruction by maximizing likelihood and entropy.
Teo, Yong Siah; Zhu, Huangjun; Englert, Berthold-Georg; Řeháček, Jaroslav; Hradil, Zdeněk
2011-07-08
Quantum-state reconstruction on a finite number of copies of a quantum system with informationally incomplete measurements, as a rule, does not yield a unique result. We derive a reconstruction scheme where both the likelihood and the von Neumann entropy functionals are maximized in order to systematically select the most-likely estimator with the largest entropy, that is, the least-bias estimator, consistent with a given set of measurement data. This is equivalent to the joint consideration of our partial knowledge and ignorance about the ensemble to reconstruct its identity. An interesting structure of such estimators will also be explored.
Likelihoods for fixed rank nomination networks
HOFF, PETER; FOSDICK, BAILEY; VOLFOVSKY, ALEX; STOVEL, KATHERINE
2014-01-01
Many studies that gather social network data use survey methods that lead to censored, missing, or otherwise incomplete information. For example, the popular fixed rank nomination (FRN) scheme, often used in studies of schools and businesses, asks study participants to nominate and rank at most a small number of contacts or friends, leaving the existence of other relations uncertain. However, most statistical models are formulated in terms of completely observed binary networks. Statistical analyses of FRN data with such models ignore the censored and ranked nature of the data and could potentially result in misleading statistical inference. To investigate this possibility, we compare Bayesian parameter estimates obtained from a likelihood for complete binary networks with those obtained from likelihoods that are derived from the FRN scheme, and therefore accommodate the ranked and censored nature of the data. We show analytically and via simulation that the binary likelihood can provide misleading inference, particularly for certain model parameters that relate network ties to characteristics of individuals and pairs of individuals. We also compare these different likelihoods in a data analysis of several adolescent social networks. For some of these networks, the parameter estimates from the binary and FRN likelihoods lead to different conclusions, indicating the importance of analyzing FRN data with a method that accounts for the FRN survey design. PMID:25110586
NASA Astrophysics Data System (ADS)
Park, Jisang
In this dissertation, we investigate MIMO stability margin inference of a large number of controllers using pre-established stability margins of a small number of nu-gap-wise adjacent controllers. The generalized stability margin and the nu-gap metric are inherently able to handle MIMO system analysis without the necessity of repeating multiple channel-by-channel SISO analyses. This research consists of three parts: (i) development of a decision support tool for inference of the stability margin, (ii) computational considerations for yielding the maximal stability margin with the minimal nu-gap metric in a less conservative manner, and (iii) experiment design for estimating the generalized stability margin with an assured error bound. A modern problem from aerospace control involves the certification of a large set of potential controllers with either a single plant or a fleet of potential plant systems, with both plants and controllers being MIMO and, for the moment, linear. Experiments on a limited number of controller/plant pairs should establish the stability and a certain level of margin of the complete set. We consider this certification problem for a set of controllers and provide algorithms for selecting an efficient subset for testing. This is done for a finite set of candidate controllers and, at least for SISO plants, for an infinite set. In doing this, the nu-gap metric will be the main tool. We provide a theorem restricting the radius of a ball in the parameter space so that the controller can guarantee a prescribed level of stability and performance if the parameters of the controllers are contained in the ball. Computational examples are given, including one of certification of an aircraft engine controller. The overarching aim is to introduce truly MIMO margin calculations and to understand their efficacy in certifying stability over a set of controllers and in replacing legacy single-loop gain and phase margin calculations. We consider methods for the computation of maximal MIMO stability margins b_{P,C}, minimal nu-gap metrics δ_ν, and the maximal difference between these two values, through the use of scaling and weighting functions. We propose simultaneous scaling selections that attempt to maximize the generalized stability margin and minimize the nu-gap. The minimization of the nu-gap by scaling involves a non-convex optimization. We modify the XY-centering algorithm to handle this non-convexity. This is done for applications in controller certification. Estimating the generalized stability margin with an accurate error bound has significant impact on controller certification. We analyze an error bound of the generalized stability margin as the infinity norm of the MIMO empirical transfer function estimate (ETFE). Input signal design to reduce the error on the estimate is also studied. We suggest running the system for a certain amount of time prior to recording of each output data set. The assured upper bound of estimation error can be tuned by the amount of the pre-experiment.
celerite: Scalable 1D Gaussian Processes in C++, Python, and Julia
NASA Astrophysics Data System (ADS)
Foreman-Mackey, Daniel; Agol, Eric; Ambikasaran, Sivaram; Angus, Ruth
2017-09-01
celerite provides fast and scalable Gaussian Process (GP) Regression in one dimension and is implemented in C++, Python, and Julia. The celerite API is designed to be familiar to users of george and, like george, celerite is designed to efficiently evaluate the marginalized likelihood of a dataset under a GP model. This can then be used alongside a non-linear optimization or posterior inference library for the best results.
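A minimal usage sketch following the package's documented Python interface; the kernel choice, parameter values, and toy data are arbitrary assumptions.

```python
import numpy as np
import celerite
from celerite import terms

# toy data
np.random.seed(42)
t = np.sort(np.random.uniform(0, 10, 200))
yerr = 0.1 * np.ones_like(t)
y = np.sin(t) + yerr * np.random.randn(len(t))

# a single "real" (exponential) term; parameters are log-amplitude and log-decay
kernel = terms.RealTerm(log_a=0.0, log_c=-1.0)
gp = celerite.GP(kernel, mean=0.0)
gp.compute(t, yerr)                  # scalable factorization of the GP covariance
print(gp.log_likelihood(y))          # marginalized likelihood of the data under the GP
```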
DOE Office of Scientific and Technical Information (OSTI.GOV)
Andrews, Stephen A.; Sigeti, David E.
These are a set of slides about Bayesian hypothesis testing in which many hypotheses are tested. The conclusions are the following: the value of the Bayes factor obtained when using the median of the posterior marginal is almost the minimum value of the Bayes factor; the value of τ² which minimizes the Bayes factor is a reasonable choice for this parameter; and this allows a likelihood ratio to be computed which is the least favorable to H0.
NASA Technical Reports Server (NTRS)
Parrish, R. V.; Steinmetz, G. G.
1972-01-01
A method of parameter extraction for stability and control derivatives of aircraft from flight test data, implementing maximum likelihood estimation, has been developed and successfully applied to actual lateral flight test data from a modern sophisticated jet fighter. This application demonstrates the important role played by the analyst in combining engineering judgment and estimator statistics to yield meaningful results. During the analysis, the problems of uniqueness of the extracted set of parameters and of longitudinal coupling effects were encountered and resolved. The results for all flight runs are presented in tabular form and as time history comparisons between the estimated states and the actual flight test data.
Effect of sampling rate and record length on the determination of stability and control derivatives
NASA Technical Reports Server (NTRS)
Brenner, M. J.; Iliff, K. W.; Whitman, R. K.
1978-01-01
Flight data from five aircraft were used to assess the effects of sampling rate and record length reductions on estimates of stability and control derivatives produced by a maximum likelihood estimation method. Derivatives could be extracted from flight data with the maximum likelihood estimation method even if there were considerable reductions in sampling rate and/or record length. Small amplitude pulse maneuvers showed greater degradation of the derivative estimates than large amplitude pulse maneuvers when these reductions were made. Reducing the sampling rate was found to be more desirable than reducing the record length as a method of lessening the total computation time required without greatly degrading the quality of the estimates.
Characterization, parameter estimation, and aircraft response statistics of atmospheric turbulence
NASA Technical Reports Server (NTRS)
Mark, W. D.
1981-01-01
A non-Gaussian three-component model of atmospheric turbulence is postulated that accounts for readily observable features of turbulence velocity records, their autocorrelation functions, and their spectra. Methods for computing probability density functions and mean exceedance rates of a generic aircraft response variable are developed using non-Gaussian turbulence characterizations readily extracted from velocity recordings. A maximum likelihood method is developed for optimal estimation of the integral scale and intensity of records possessing von Karman transverse or longitudinal spectra. Formulas for the variances of such parameter estimates are developed. The maximum likelihood and least-squares approaches are combined to yield a method for estimating the autocorrelation function parameters of a two-component model for turbulence.
Objectively combining AR5 instrumental period and paleoclimate climate sensitivity evidence
NASA Astrophysics Data System (ADS)
Lewis, Nicholas; Grünwald, Peter
2018-03-01
Combining instrumental period evidence regarding equilibrium climate sensitivity with largely independent paleoclimate proxy evidence should enable a more constrained sensitivity estimate to be obtained. Previous, subjective Bayesian approaches involved selection of a prior probability distribution reflecting the investigators' beliefs about climate sensitivity. Here a recently developed approach employing two different statistical methods—objective Bayesian and frequentist likelihood-ratio—is used to combine instrumental period and paleoclimate evidence based on data presented and assessments made in the IPCC Fifth Assessment Report. Probabilistic estimates from each source of evidence are represented by posterior probability density functions (PDFs) of physically-appropriate form that can be uniquely factored into a likelihood function and a noninformative prior distribution. The three-parameter form is shown accurately to fit a wide range of estimated climate sensitivity PDFs. The likelihood functions relating to the probabilistic estimates from the two sources are multiplicatively combined and a prior is derived that is noninformative for inference from the combined evidence. A posterior PDF that incorporates the evidence from both sources is produced using a single-step approach, which avoids the order-dependency that would arise if Bayesian updating were used. Results are compared with an alternative approach using the frequentist signed root likelihood ratio method. Results from these two methods are effectively identical, and provide a 5-95% range for climate sensitivity of 1.1-4.05 K (median 1.87 K).
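The multiplicative combination of independent lines of evidence can be sketched on a grid of sensitivity values; the two likelihood shapes and the flat prior below are illustrative stand-ins, not the paper's AR5-based likelihoods or its objective (noninformative) prior.

```python
import numpy as np

S = np.linspace(0.1, 10, 2000)                 # climate sensitivity grid (K)

# stand-in likelihoods for the two lines of evidence (illustrative shapes only)
like_instr = np.exp(-0.5 * ((1 / S - 1 / 2.0) / 0.15) ** 2)   # skewed toward high S
like_paleo = np.exp(-0.5 * ((np.log(S) - np.log(2.5)) / 0.5) ** 2)

combined = like_instr * like_paleo             # independent evidence multiplies
post = combined / np.trapz(combined, S)        # simplification: flat prior posterior

cdf = np.cumsum(post) * (S[1] - S[0])
print("5-95% range (K):", np.interp([0.05, 0.95], cdf, S))
```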
Maximal likelihood correspondence estimation for face recognition across pose.
Li, Shaoxin; Liu, Xin; Chai, Xiujuan; Zhang, Haihong; Lao, Shihong; Shan, Shiguang
2014-10-01
Due to the misalignment of image features, the performance of many conventional face recognition methods degrades considerably in the across-pose scenario. To address this problem, many image matching-based methods are proposed to estimate semantic correspondence between faces in different poses. In this paper, we aim to solve two critical problems of previous image matching-based correspondence learning methods: 1) failure to fully exploit face-specific structure information in correspondence estimation and 2) failure to learn personalized correspondence for each probe image. To this end, we first build a model, termed the morphable displacement field (MDF), to encode face-specific structure information of semantic correspondence from a set of real samples of correspondences calculated from 3D face models. Then, we propose a maximal likelihood correspondence estimation (MLCE) method to learn personalized correspondence based on a maximal likelihood frontal face assumption. After obtaining the semantic correspondence encoded in the learned displacement, we can synthesize virtual frontal images of the profile faces for subsequent recognition. Using a linear discriminant analysis method with pixel-intensity features, state-of-the-art performance is achieved on three multipose benchmarks, i.e., the CMU-PIE, FERET, and MultiPIE databases. Owing to the rational MDF regularization and the usage of the novel maximal likelihood objective, the proposed MLCE method can reliably learn correspondence between faces in different poses even in complex, unconstrained environments, i.e., the Labeled Faces in the Wild database.
Elastic thickness estimates at northeast passive margin of North America and its implications
NASA Astrophysics Data System (ADS)
Kumar, R. T. Ratheesh; Maji, Tanmay K.; Kandpal, Suresh Ch; Sengupta, D.; Nair, Rajesh R.
2011-06-01
Global estimates of the elastic thickness (Te) of the structure of passive continental margins show wide and varying results owing to the use of different methodologies. Earlier estimates of the elastic thickness of the North Atlantic passive continental margins that used flexural modelling yielded Te values of ~20-100 km. Here, we compare these estimates with the Te values obtained using orthonormalized Hermite multitaper recovered isostatic coherence functions. We discuss how Te is correlated with heat flow distribution and depth of necking. The E-W segment in the southern study region, comprising Nova Scotia and the Southern Grand Banks, shows low Te values, while the NE-SW zones, viz., Western Greenland, Labrador, the Orphan Basin, and the Northern Grand Bank, show comparatively high Te values. As expected, Te broadly reflects the depth of the 200-400°C isotherm below the weak surface sediment layer at the time of loading, and at the margins most of the loading occurred during rifting. We infer that these low Te measurements indicate Te frozen into the lithosphere. This could be due to the passive nature of the margin when the loads were emplaced during the continental break-up process at high temperature gradients.
Meyer, Karin; Kirkpatrick, Mark
2005-01-01
Principal component analysis is a widely used 'dimension reduction' technique, albeit generally at a phenotypic level. It is shown that we can estimate genetic principal components directly through a simple reparameterisation of the usual linear mixed model. This is applicable to any analysis fitting multiple correlated genetic effects, whether effects for individual traits or sets of random regression coefficients to model trajectories. Depending on the magnitude of genetic correlation, a subset of the principal components generally suffices to capture the bulk of genetic variation. Corresponding estimates of genetic covariance matrices are more parsimonious, have reduced rank and are smoothed, with the number of parameters required to model the dispersion structure reduced from k(k + 1)/2 to m(2k - m + 1)/2 for k effects and m principal components. Estimation of these parameters, the largest eigenvalues and pertaining eigenvectors of the genetic covariance matrix, via restricted maximum likelihood using derivatives of the likelihood, is described. It is shown that reduced rank estimation can reduce computational requirements of multivariate analyses substantially. An application to the analysis of eight traits recorded via live ultrasound scanning of beef cattle is given. PMID:15588566
A simulation study on Bayesian Ridge regression models for several collinearity levels
NASA Astrophysics Data System (ADS)
Efendi, Achmad; Effrihan
2017-12-01
When analyzing data with a multiple regression model in the presence of collinearity, one or several predictor variables are usually omitted from the model. Sometimes, however, there are reasons, for instance medical or economic ones, why all predictors are important and should be included in the model. Ridge regression is commonly used to cope with collinearity: weights for the predictor variables are introduced when estimating the parameters, and estimation can then follow the likelihood approach. A Bayesian version of this estimation is an alternative. The Bayesian approach has not matched the likelihood approach in popularity because of some difficulties, chiefly computational; with the recent improvement of computational methodology, however, this caveat is no longer a serious problem. This paper discusses a simulation study for evaluating the characteristics of Bayesian Ridge regression parameter estimates. Several simulation settings are considered, based on a variety of collinearity levels and sample sizes. The results show that the Bayesian method gives better performance for relatively small sample sizes, while for the other settings it performs similarly to the likelihood method.
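As a rough illustration of one such simulation setting, the sketch below compares a Bayesian ridge fit with ordinary least squares on strongly collinear predictors. It assumes scikit-learn's BayesianRidge as the Bayesian implementation and LinearRegression as the likelihood-based comparator; the collinearity level, sample size, and coefficients are invented for illustration, not taken from the paper.

```python
# Minimal sketch of one simulation setting: compare Bayesian ridge regression
# with ordinary least squares on collinear predictors. The collinearity level
# (rho), sample size, and coefficients are illustrative only.
import numpy as np
from sklearn.linear_model import BayesianRidge, LinearRegression

rng = np.random.default_rng(0)
n, rho = 30, 0.95                      # small sample, strong collinearity
beta = np.array([1.0, 2.0, -1.5])

cov = np.full((3, 3), rho) + (1 - rho) * np.eye(3)
X = rng.multivariate_normal(np.zeros(3), cov, size=n)
y = X @ beta + rng.normal(scale=1.0, size=n)

ols = LinearRegression().fit(X, y)
bayes = BayesianRidge().fit(X, y)

print("true:       ", beta)
print("OLS:        ", ols.coef_.round(3))
print("Bayes ridge:", bayes.coef_.round(3))
```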
Ye, Xin; Garikapati, Venu M.; You, Daehyun; ...
2017-11-08
Most multinomial choice models (e.g., the multinomial logit model) adopted in practice assume an extreme-value Gumbel distribution for the random components (error terms) of utility functions. This distributional assumption offers a closed-form likelihood expression when the utility maximization principle is applied to model choice behaviors. As a result, model coefficients can be easily estimated using the standard maximum likelihood estimation method. However, maximum likelihood estimators are consistent and efficient only if distributional assumptions on the random error terms are valid. It is therefore critical to test the validity of underlying distributional assumptions on the error terms that form the basis of parameter estimation and policy evaluation. In this paper, a practical yet statistically rigorous method is proposed to test the validity of the distributional assumption on the random components of utility functions in both the multinomial logit (MNL) model and multiple discrete-continuous extreme value (MDCEV) model. Based on a semi-nonparametric approach, a closed-form likelihood function that nests the MNL or MDCEV model being tested is derived. The proposed method allows traditional likelihood ratio tests to be used to test violations of the standard Gumbel distribution assumption. Simulation experiments are conducted to demonstrate that the proposed test yields acceptable Type-I and Type-II error probabilities at commonly available sample sizes. The test is then applied to three real-world discrete and discrete-continuous choice models. For all three models, the proposed test rejects the validity of the standard Gumbel distribution in most utility functions, calling for the development of robust choice models that overcome adverse effects of violations of distributional assumptions on the error terms in random utility functions.
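The mechanics of the nested comparison are those of the familiar likelihood ratio test. The sketch below shows only those generic mechanics, with placeholder log-likelihoods and degrees of freedom; it does not reproduce the semi-nonparametric likelihood actually derived in the paper.

```python
# Generic likelihood-ratio test mechanics used by the proposed method: the
# restricted model (standard Gumbel errors) is nested in the semi-nonparametric
# model, so 2*(llf_unrestricted - llf_restricted) is compared against a
# chi-square distribution. The log-likelihoods and degrees of freedom below are
# placeholders, not results from the paper.
from scipy.stats import chi2

def lr_test(llf_restricted: float, llf_unrestricted: float, df: int) -> float:
    """Return the p-value of a likelihood-ratio test for nested models."""
    stat = 2.0 * (llf_unrestricted - llf_restricted)
    return chi2.sf(stat, df)

p = lr_test(llf_restricted=-1050.2, llf_unrestricted=-1041.7, df=4)
print(f"LR test p-value: {p:.4f}")   # a small p-value rejects the Gumbel assumption
```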
Validation of a Host Response Assay, SeptiCyte™ LAB, for Discriminating Sepsis from SIRS in the ICU.
Miller III, Russell R; Lopansri, Bert K; Burke, John P; Levy, Mitchell; Opal, Steven; Rothman, Richard E; D'Alessio, Franco R; Sidhaye, Venkataramana K; Aggarwal, Neil R; Balk, Robert; Greenberg, Jared A; Yoder, Mark; Patel, Gourang; Gilbert, Emily; Afshar, Majid; Parada, Jorge P; Martin, Greg S; Esper, Annette M; Kempker, Jordan A; Narasimhan, Mangala; Tsegaye, Adey; Hahn, Stella; Mayo, Paul; van der Poll, Tom; Schultz, Marcus J; Scicluna, Brendon P; Klein Klouwenberg, Peter; Rapisarda, Antony; Seldon, Therese A; McHugh, Leo C; Yager, Thomas D; Cermelli, Silvia; Sampson, Dayle; Rothwell, Victoria; Newman, Richard; Bhide, Shruti; Kirk, James T; Navalkar, Krupa; Davis, Roy F; Brandon, Roslyn A; Brandon, Richard B
2018-04-06
A molecular test to distinguish between sepsis and systemic inflammation of non-infectious etiology could potentially have clinical utility. This study evaluated the diagnostic performance of a molecular host response assay (SeptiCyte™ LAB) designed to distinguish between sepsis and non-infectious systemic inflammation in critically ill adults. The study employed a prospective, observational, non-interventional design and recruited a heterogeneous cohort of adult critical care patients from seven sites in the USA (N=249). An additional group of 198 patients, recruited in the large MARS consortium trial in the Netherlands (clinicaltrials.gov identifier: NCT01905033), was also tested and analyzed, for a total of 447 patients in the study. Performance of SeptiCyte™ LAB was compared with retrospective physician diagnosis by a panel of three experts. In receiver operating characteristic curve analysis, SeptiCyte™ LAB had an estimated area under the curve of 0.82-0.89 for discriminating sepsis from non-infectious systemic inflammation. The relative likelihood of sepsis versus non-infectious systemic inflammation was found to increase with increasing test score (range 0-10). In a forward logistic regression analysis, the diagnostic performance of the assay improved only marginally when used in combination with other clinical and laboratory variables, including procalcitonin. Performance of the assay was not significantly affected by demographic variables, including age, sex, or race/ethnicity. SeptiCyte™ LAB appears to be a promising diagnostic tool to complement physician assessment of infection likelihood in critically ill adult patients with systemic inflammation. Clinical trial registrations are available at clinicaltrials.gov, IDs NCT02127502 and NCT01905033.
Radio weak lensing shear measurement in the visibility domain - I. Methodology
NASA Astrophysics Data System (ADS)
Rivi, M.; Miller, L.; Makhathini, S.; Abdalla, F. B.
2016-12-01
The high sensitivity of the new generation of radio telescopes such as the Square Kilometre Array (SKA) will allow cosmological weak lensing measurements at radio wavelengths that are competitive with optical surveys. We present an adaptation to radio data of lensfit, a method for galaxy shape measurement originally developed and used for optical weak lensing surveys. This likelihood method uses an analytical galaxy model and performs a Bayesian marginalization of the likelihood over nuisance parameters. It has the feature of working directly in the visibility domain, which is the natural approach to adopt with radio interferometer data, avoiding systematics introduced by the imaging process. As a proof of concept, we provide results for visibility simulations of individual galaxies with flux density S ≥ 10 μJy at the phase centre of the proposed SKA1-MID baseline configuration, adopting 12 frequency channels in the band 950-1190 MHz. Weak lensing shear measurements from a population of galaxies with realistic flux and scalelength distributions are obtained after natural gridding of the raw visibilities. Shear measurements are expected to be affected by `noise bias': we estimate the bias in the method as a function of signal-to-noise ratio (SNR). We obtain additive and multiplicative bias values that are comparable to SKA1 requirements for SNR > 18 and SNR > 30, respectively. The multiplicative bias for SNR > 10 is comparable to that found in ground-based optical surveys such as CFHTLenS, and we anticipate that similar shear measurement calibration strategies to those used for optical surveys may be used to good effect in the analysis of SKA radio interferometer data.
Hybrid optimization and Bayesian inference techniques for a non-smooth radiation detection problem
Stefanescu, Razvan; Schmidt, Kathleen; Hite, Jason; ...
2016-12-12
In this paper, we propose several algorithms to recover the location and intensity of a radiation source located in a simulated 250 × 180 m block of an urban center based on synthetic measurements. Radioactive decay and detection are Poisson random processes, so we employ likelihood functions based on this distribution. Owing to the domain geometry and the proposed response model, the negative logarithm of the likelihood is only piecewise continuously differentiable, and it has multiple local minima. To address these difficulties, we investigate three hybrid algorithms composed of mixed optimization techniques. For global optimization, we consider simulated annealing, particle swarm, and genetic algorithms, which rely solely on objective function evaluations; that is, they do not evaluate the gradient of the objective function. By employing early stopping criteria for the global optimization methods, a pseudo-optimum point is obtained. This is subsequently utilized as the initial value by the deterministic implicit filtering method, which is able to find local extrema of non-smooth functions, to finish the search in a narrow domain. These new hybrid techniques, combining global optimization and implicit filtering, address difficulties associated with the non-smooth response, and they are shown to significantly decrease the computational time relative to the global optimization methods alone. To quantify uncertainties associated with the source location and intensity, we employ the delayed rejection adaptive Metropolis and DiffeRential Evolution Adaptive Metropolis algorithms. Finally, marginal densities of the source properties are obtained, and the means of the chains agree well with the estimates produced by the hybrid algorithms.
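A minimal sketch of the Poisson likelihood idea is given below. It assumes a simple inverse-square-plus-background response and hypothetical detector positions; the paper's actual response model, urban geometry, and hybrid optimizers are not reproduced, and a derivative-free Nelder-Mead search merely stands in for them.

```python
# Minimal sketch of a Poisson likelihood for source localization: expected counts
# follow an inverse-square response plus a constant background, and the negative
# log-likelihood is minimized over (x, y, intensity). Detector positions,
# background rate, and the response model are illustrative assumptions.
import numpy as np
from scipy.optimize import minimize
from scipy.stats import poisson

detectors = np.array([[0.0, 0.0], [250.0, 0.0], [0.0, 180.0], [250.0, 180.0]])
background = 5.0   # expected background counts per detector

def expected_counts(theta):
    x, y, intensity = theta
    d2 = np.sum((detectors - np.array([x, y])) ** 2, axis=1) + 1.0
    return background + intensity / d2

def neg_log_likelihood(theta, counts):
    lam = expected_counts(theta)
    return -np.sum(poisson.logpmf(counts, lam))

true_theta = np.array([120.0, 90.0, 5.0e5])
counts = np.random.default_rng(1).poisson(expected_counts(true_theta))

result = minimize(neg_log_likelihood, x0=[100.0, 100.0, 1.0e5], args=(counts,),
                  method="Nelder-Mead", options={"maxiter": 5000})
print("estimated (x, y, intensity):", result.x.round(1))
```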
Wavelet Filtering to Reduce Conservatism in Aeroservoelastic Robust Stability Margins
NASA Technical Reports Server (NTRS)
Brenner, Marty; Lind, Rick
1998-01-01
Wavelet analysis for filtering and system identification was used to improve the estimation of aeroservoelastic stability margins. The conservatism of the robust stability margins was reduced with parametric and nonparametric time-frequency analysis of flight data in the model validation process. Nonparametric wavelet processing of data was used to reduce the effects of external disturbances and unmodeled dynamics. Parametric estimates of modal stability were also extracted using the wavelet transform. Computation of robust stability margins for stability boundary prediction depends on uncertainty descriptions derived from the data for model validation. F-18 High Alpha Research Vehicle aeroservoelastic flight test data demonstrated improved robust stability prediction by extension of the stability boundary beyond the flight regime.
NASA Technical Reports Server (NTRS)
Klein, V.
1980-01-01
A frequency domain maximum likelihood method is developed for the estimation of airplane stability and control parameters from measured data. The model of an airplane is represented by a discrete-type steady state Kalman filter with time variables replaced by their Fourier series expansions. The likelihood function of innovations is formulated, and by its maximization with respect to unknown parameters the estimation algorithm is obtained. This algorithm is then simplified to the output error estimation method with the data in the form of transformed time histories, frequency response curves, or spectral and cross-spectral densities. The development is followed by a discussion on the equivalence of the cost function in the time and frequency domains, and on advantages and disadvantages of the frequency domain approach. The algorithm developed is applied in four examples to the estimation of longitudinal parameters of a general aviation airplane using computer generated and measured data in turbulent and still air. The cost functions in the time and frequency domains are shown to be equivalent; therefore, both approaches are complementary and not contradictory. Despite some computational advantages of parameter estimation in the frequency domain, this approach is limited to linear equations of motion with constant coefficients.
Deletion Diagnostics for Alternating Logistic Regressions
Preisser, John S.; By, Kunthel; Perin, Jamie; Qaqish, Bahjat F.
2013-01-01
Deletion diagnostics are introduced for the regression analysis of clustered binary outcomes estimated with alternating logistic regressions, an implementation of generalized estimating equations (GEE) that estimates regression coefficients in a marginal mean model and in a model for the intracluster association given by the log odds ratio. The diagnostics are developed within an estimating equations framework that recasts the estimating functions for association parameters based upon conditional residuals into equivalent functions based upon marginal residuals. Extensions of earlier work on GEE diagnostics follow directly, including computational formulae for one-step deletion diagnostics that measure the influence of a cluster of observations on the estimated regression parameters and on the overall marginal mean or association model fit. The diagnostic formulae are evaluated with simulation studies and with an application concerning an assessment of factors associated with health maintenance visits in primary care medical practices. The application and the simulations demonstrate that the proposed cluster-deletion diagnostics for alternating logistic regressions are good approximations of their exact fully iterated counterparts. PMID:22777960
Exponential series approaches for nonparametric graphical models
NASA Astrophysics Data System (ADS)
Janofsky, Eric
Markov Random Fields (MRFs) or undirected graphical models are parsimonious representations of joint probability distributions. This thesis studies high-dimensional, continuous-valued pairwise Markov Random Fields. We are particularly interested in approximating pairwise densities whose logarithm belongs to a Sobolev space. For this problem we propose the method of exponential series which approximates the log density by a finite-dimensional exponential family with the number of sufficient statistics increasing with the sample size. We consider two approaches to estimating these models. The first is regularized maximum likelihood. This involves optimizing the sum of the log-likelihood of the data and a sparsity-inducing regularizer. We then propose a variational approximation to the likelihood based on tree-reweighted, nonparametric message passing. This approximation allows for upper bounds on risk estimates, leverages parallelization and is scalable to densities on hundreds of nodes. We show how the regularized variational MLE may be estimated using a proximal gradient algorithm. We then consider estimation using regularized score matching. This approach uses an alternative scoring rule to the log-likelihood, which obviates the need to compute the normalizing constant of the distribution. For general continuous-valued exponential families, we provide parameter and edge consistency results. As a special case we detail a new approach to sparse precision matrix estimation which has statistical performance competitive with the graphical lasso and computational performance competitive with the state-of-the-art glasso algorithm. We then describe results for model selection in the nonparametric pairwise model using exponential series. The regularized score matching problem is shown to be a convex program; we provide scalable algorithms based on consensus alternating direction method of multipliers (ADMM) and coordinate-wise descent. We use simulations to compare our method to others in the literature as well as the aforementioned TRW estimator.
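For context on the comparison target mentioned above, the sketch below fits the graphical lasso baseline (via scikit-learn's GraphicalLasso) to toy Gaussian data with a single true edge. It is not the thesis's exponential-series or score-matching estimator, and the dimension, penalty, and true graph are invented.

```python
# Sketch of the graphical-lasso baseline referenced above (not the thesis's
# score-matching estimator): estimate a sparse precision matrix from Gaussian
# samples. Dimensions, penalty, and the true graph are toy values.
import numpy as np
from sklearn.covariance import GraphicalLasso

rng = np.random.default_rng(0)
p = 5
precision = np.eye(p)
precision[0, 1] = precision[1, 0] = 0.4      # one true edge
cov = np.linalg.inv(precision)
X = rng.multivariate_normal(np.zeros(p), cov, size=500)

model = GraphicalLasso(alpha=0.05).fit(X)
print(np.round(model.precision_, 2))          # near-zero entries = absent edges
```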
Challenges in Species Tree Estimation Under the Multispecies Coalescent Model
Xu, Bo; Yang, Ziheng
2016-01-01
The multispecies coalescent (MSC) model has emerged as a powerful framework for inferring species phylogenies while accounting for ancestral polymorphism and gene tree-species tree conflict. A number of methods have been developed in the past few years to estimate the species tree under the MSC. The full likelihood methods (including maximum likelihood and Bayesian inference) average over the unknown gene trees and accommodate their uncertainties properly but involve intensive computation. The approximate or summary coalescent methods are computationally fast and are applicable to genomic datasets with thousands of loci, but do not make an efficient use of information in the multilocus data. Most of them take the two-step approach of reconstructing the gene trees for multiple loci by phylogenetic methods and then treating the estimated gene trees as observed data, without accounting for their uncertainties appropriately. In this article we review the statistical nature of the species tree estimation problem under the MSC, and explore the conceptual issues and challenges of species tree estimation by focusing mainly on simple cases of three or four closely related species. We use mathematical analysis and computer simulation to demonstrate that large differences in statistical performance may exist between the two classes of methods. We illustrate that several counterintuitive behaviors may occur with the summary methods but they are due to inefficient use of information in the data by summary methods and vanish when the data are analyzed using full-likelihood methods. These include (i) unidentifiability of parameters in the model, (ii) inconsistency in the so-called anomaly zone, (iii) singularity on the likelihood surface, and (iv) deterioration of performance upon addition of more data. We discuss the challenges and strategies of species tree inference for distantly related species when the molecular clock is violated, and highlight the need for improving the computational efficiency and model realism of the likelihood methods as well as the statistical efficiency of the summary methods. PMID:27927902
Fine Mapping Causal Variants with an Approximate Bayesian Method Using Marginal Test Statistics
Chen, Wenan; Larrabee, Beth R.; Ovsyannikova, Inna G.; Kennedy, Richard B.; Haralambieva, Iana H.; Poland, Gregory A.; Schaid, Daniel J.
2015-01-01
Two recently developed fine-mapping methods, CAVIAR and PAINTOR, demonstrate better performance over other fine-mapping methods. They also have the advantage of using only the marginal test statistics and the correlation among SNPs. Both methods leverage the fact that the marginal test statistics asymptotically follow a multivariate normal distribution and are likelihood based. However, their relationship with Bayesian fine mapping, such as BIMBAM, is not clear. In this study, we first show that CAVIAR and BIMBAM are actually approximately equivalent to each other. This leads to a fine-mapping method using marginal test statistics in the Bayesian framework, which we call CAVIAR Bayes factor (CAVIARBF). Another advantage of the Bayesian framework is that it can answer both association and fine-mapping questions. We also used simulations to compare CAVIARBF with other methods under different numbers of causal variants. The results showed that both CAVIARBF and BIMBAM have better performance than PAINTOR and other methods. Compared to BIMBAM, CAVIARBF has the advantage of using only marginal test statistics and takes about one-quarter to one-fifth of the running time. We applied different methods on two independent cohorts of the same phenotype. Results showed that CAVIARBF, BIMBAM, and PAINTOR selected the same top 3 SNPs; however, CAVIARBF and BIMBAM had better consistency in selecting the top 10 ranked SNPs between the two cohorts. Software is available at https://bitbucket.org/Wenan/caviarbf. PMID:25948564
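The following sketch illustrates the kind of Bayes factor that can be computed directly from marginal z-scores and an LD correlation matrix: under the null the z-scores are modeled as N(0, R), and a causal configuration inflates the covariance to R + RWR, with W carrying a prior variance on the selected SNPs. The toy z-scores, LD matrix, and prior variance are invented, and the formula is a simplified variant of this family of approximations, not the exact CAVIARBF implementation.

```python
# Minimal sketch of a Bayes factor computed from marginal z-scores and the SNP
# correlation (LD) matrix R: null z ~ N(0, R); a causal configuration with prior
# variance s2 on the selected SNPs gives z ~ N(0, R + R W R). All inputs are toy
# values and the prior variance is an illustrative assumption.
import numpy as np
from scipy.stats import multivariate_normal

def log_bf(z, R, causal_idx, s2=25.0):
    W = np.zeros_like(R)
    W[causal_idx, causal_idx] = s2          # prior variance on causal SNPs
    null = multivariate_normal.logpdf(z, mean=np.zeros(len(z)), cov=R)
    alt = multivariate_normal.logpdf(z, mean=np.zeros(len(z)), cov=R + R @ W @ R)
    return alt - null

R = np.array([[1.0, 0.6, 0.2],
              [0.6, 1.0, 0.3],
              [0.2, 0.3, 1.0]])
z = np.array([4.5, 3.1, 0.4])
for idx in ([0], [1], [2]):
    print(f"causal set {idx}: log BF = {log_bf(z, R, idx):.2f}")
```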
RGB-D SLAM Combining Visual Odometry and Extended Information Filter
Zhang, Heng; Liu, Yanli; Tan, Jindong; Xiong, Naixue
2015-01-01
In this paper, we present a novel RGB-D SLAM system based on visual odometry and an extended information filter, which does not require any other sensors or odometry. In contrast to the graph optimization approaches, this is more suitable for online applications. A visual dead reckoning algorithm based on visual residuals is devised, which is used to estimate motion control input. In addition, we use a novel descriptor called binary robust appearance and normals descriptor (BRAND) to extract features from the RGB-D frame and use them as landmarks. Furthermore, considering both the 3D positions and the BRAND descriptors of the landmarks, our observation model avoids explicit data association between the observations and the map by marginalizing the observation likelihood over all possible associations. Experimental validation is provided, which compares the proposed RGB-D SLAM algorithm with just RGB-D visual odometry and a graph-based RGB-D SLAM algorithm using the publicly-available RGB-D dataset. The results of the experiments demonstrate that our system is quicker than the graph-based RGB-D SLAM algorithm. PMID:26263990
Modeling Bivariate Longitudinal Hormone Profiles by Hierarchical State Space Models
Liu, Ziyue; Cappola, Anne R.; Crofford, Leslie J.; Guo, Wensheng
2013-01-01
The hypothalamic-pituitary-adrenal (HPA) axis is crucial in coping with stress and maintaining homeostasis. Hormones produced by the HPA axis exhibit both complex univariate longitudinal profiles and complex relationships among different hormones. Consequently, modeling these multivariate longitudinal hormone profiles is a challenging task. In this paper, we propose a bivariate hierarchical state space model, in which each hormone profile is modeled by a hierarchical state space model, with both population-average and subject-specific components. The bivariate model is constructed by concatenating the univariate models based on the hypothesized relationship. Because of the flexible framework of state space form, the resultant models not only can handle complex individual profiles, but also can incorporate complex relationships between two hormones, including both concurrent and feedback relationship. Estimation and inference are based on marginal likelihood and posterior means and variances. Computationally efficient Kalman filtering and smoothing algorithms are used for implementation. Application of the proposed method to a study of chronic fatigue syndrome and fibromyalgia reveals that the relationships between adrenocorticotropic hormone and cortisol in the patient group are weaker than in healthy controls. PMID:24729646
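As a pointer to how the marginal likelihood arises from the state space form, the sketch below runs the Kalman filter prediction-error decomposition for a scalar local-level model. The paper's bivariate hierarchical models are far richer; all variances here are toy values.

```python
# Minimal sketch: the Kalman filter yields the marginal likelihood via the
# prediction-error decomposition, shown for a scalar local-level model only
# (state x_t = x_{t-1} + w_t, observation y_t = x_t + v_t). Variances are toy values.
import numpy as np

def local_level_loglik(y, q=0.5, r=1.0, x0=0.0, p0=10.0):
    x, p, loglik = x0, p0, 0.0
    for obs in y:
        p = p + q                     # predict
        v = obs - x                   # innovation
        s = p + r                     # innovation variance
        loglik += -0.5 * (np.log(2 * np.pi * s) + v * v / s)
        k = p / s                     # Kalman gain
        x = x + k * v                 # update
        p = (1 - k) * p
    return loglik

rng = np.random.default_rng(2)
states = np.cumsum(rng.normal(scale=np.sqrt(0.5), size=100))
y = states + rng.normal(scale=1.0, size=100)
print("marginal log-likelihood:", round(local_level_loglik(y), 2))
```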
BCM: toolkit for Bayesian analysis of Computational Models using samplers.
Thijssen, Bram; Dijkstra, Tjeerd M H; Heskes, Tom; Wessels, Lodewyk F A
2016-10-21
Computational models in biology are characterized by a large degree of uncertainty. This uncertainty can be analyzed with Bayesian statistics; however, the sampling algorithms that are frequently used for calculating Bayesian statistical estimates are computationally demanding, and each algorithm has unique advantages and disadvantages. It is typically unclear, before starting an analysis, which algorithm will perform well on a given computational model. We present BCM, a toolkit for the Bayesian analysis of Computational Models using samplers. It provides efficient, multithreaded implementations of eleven algorithms for sampling from posterior probability distributions and for calculating marginal likelihoods. BCM includes tools to simplify the process of model specification and scripts for visualizing the results. The flexible architecture allows it to be used on diverse types of biological computational models. In an example inference task using a model of the cell cycle based on ordinary differential equations, BCM is significantly more efficient than existing software packages, allowing more challenging inference problems to be solved. BCM represents an efficient one-stop-shop for computational modelers wishing to use sampler-based Bayesian statistics.
A Copula-Based Conditional Probabilistic Forecast Model for Wind Power Ramps
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hodge, Brian S; Krishnan, Venkat K; Zhang, Jie
Efficient management of wind ramping characteristics can significantly reduce wind integration costs for balancing authorities. By considering the stochastic dependence of wind power ramp (WPR) features, this paper develops a conditional probabilistic wind power ramp forecast (cp-WPRF) model based on Copula theory. The WPRs dataset is constructed by extracting ramps from a large dataset of historical wind power. Each WPR feature (e.g., rate, magnitude, duration, and start-time) is separately forecasted by considering the coupling effects among different ramp features. To accurately model the marginal distributions within the copula, a Gaussian mixture model (GMM) is adopted to characterize the WPR uncertainty and features. The Canonical Maximum Likelihood (CML) method is used to estimate the parameters of the multivariate copula. The optimal copula model is chosen from each copula family based on the Bayesian information criterion (BIC). Finally, the best conditional cp-WPRF model is determined using predictive interval (PI) based evaluation metrics. Numerical simulations on publicly available wind power data show that the developed copula-based cp-WPRF model can predict WPRs with a high level of reliability and sharpness.
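A minimal sketch of the canonical maximum likelihood idea for a Gaussian copula follows: each margin is mapped to pseudo-uniforms with its empirical CDF, the pseudo-uniforms are mapped to normal scores, and the copula correlation is estimated from those scores. The paper instead fits GMM marginals and selects among copula families by BIC; the toy ramp features below are invented.

```python
# Minimal sketch of canonical maximum likelihood (CML) for a Gaussian copula:
# margins -> pseudo-uniforms via the empirical CDF -> normal scores -> the copula
# correlation is the ML estimate of the score correlation. The simulated ramp
# features (magnitude, duration) are illustrative assumptions.
import numpy as np
from scipy.stats import norm, rankdata

rng = np.random.default_rng(3)
magnitude = rng.gamma(shape=2.0, scale=1.0, size=500)
duration = 0.5 * magnitude + rng.exponential(scale=1.0, size=500)  # correlated feature
data = np.column_stack([magnitude, duration])

u = rankdata(data, axis=0) / (data.shape[0] + 1)   # pseudo-uniforms in (0, 1)
zscores = norm.ppf(u)                              # normal scores
rho = np.corrcoef(zscores, rowvar=False)           # Gaussian-copula correlation (CML)
print(np.round(rho, 3))
```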
A case study of potential human health impacts from petroleum coke transfer facilities.
Dourson, Michael L; Chinkin, Lyle R; MacIntosh, David L; Finn, Jennifer A; Brown, Kathleen W; Reid, Stephen B; Martinez, Jeanelle M
2016-11-01
Petroleum coke or "petcoke" is a solid material created during petroleum refinement and is distributed via transfer facilities that may be located in densely populated areas. The health impacts from petcoke exposure to residents living in proximity to such facilities were evaluated for a petcoke transfer facilities located in Chicago, Illinois. Site-specific, margin of safety (MOS) and margin of exposure (MOE) analyses were conducted using estimated airborne and dermal exposures. The exposure assessment was based on a combined measurement and modeling program that included multiyear on-site air monitoring, air dispersion modeling, and analyses of soil and surfaces in residential areas adjacent to two petcoke transfer facilities located in industrial areas. Airborne particulate matter less than 10 microns (PM 10 ) were used as a marker for petcoke. Based on daily fence line monitoring, the average daily PM 10 concentration at the KCBX Terminals measured on-site was 32 μg/m 3 , with 89% of 24-hr average PM 10 concentrations below 50 μg/m 3 and 99% below 100 μg/m 3 . A dispersion model estimated that the emission sources at the KCBX Terminals produced peak PM 10 levels attributed to the petcoke facility at the most highly impacted residence of 11 μg/m 3 on an annual average basis and 54 μg/m 3 on 24-hr average basis. Chemical indicators of petcoke in soil and surface samples collected from residential neighborhoods adjacent to the facilities were equivalent to levels in corresponding samples collected at reference locations elsewhere in Chicago, a finding that is consistent with limited potential for off-site exposure indicated by the fence line monitoring and air dispersion modeling. The MOE based upon dispersion model estimates ranged from 800 to 900 for potential inhalation, the primary route of concern for particulate matter. This indicates a low likelihood of adverse health effects in the surrounding community. Implications: Handling of petroleum coke at bulk material transfer facilities has been identified as a concern for the public health of surrounding populations. The current assessment, based on measurements and modeling of two facilities located in a densely populated urban area, indicates that petcoke transport and accumulation in off-site locations is minimal. In addition, estimated human exposures, if any, are well below levels that could be anticipated to produce adverse health effects in the general population.
Holben, David H; Taylor, Christopher A
2015-09-01
Food insecurity is a preventable health threat and may precipitate central obesity and metabolic syndrome in children and adolescents in the United States. To examine (1) health by household food security status; and (2) differences in and prevalence of central obesity among persons aged 12 to 18 years in the United States. The National Health and Nutrition Examination Survey was administered to a cross-sectional sample of persons aged 12 to 18 years in 1999 to 2006. Controlling for age, race/ethnicity, and sex, differences in mean obesity and chronic disease factors across levels of food insecurity were examined (analysis of covariance [Bonferroni post hoc] and odds ratios [logistic regression analyses]), as were differences in the rates of risk factors (χ2 statistics). A total of 7435 participants were analyzed. Those from marginally food secure (n=751) and low-food secure (n=1206) (population size estimate, 26,714,182) households were significantly more likely than their high-food secure counterparts (n=4831) to be overweight (P=.036) (OR, 1.44), and those from marginally food secure households were 1.3 times more likely to be obese (P=.036). Nearly 25% of respondents from marginally food secure, low-food secure, and very low-food secure (n=647) households reported central obesity (P=.002), which was 1.4 to 1.5 times more likely than in those from high-food secure households. Participants from high-food secure households had significantly higher mean high-density lipoprotein values (P=.019). Risk factors indicative of metabolic syndrome were present in 3.1%. Household food insecurity was associated with an increased likelihood of being overweight and having central obesity. Limitations included the use of cross-sectional data and some self-reported data and the inability to control for all moderating variables in obesity and overall health status.
2015-01-01
Development of the Average Likelihood Function for Code Division Multiple Access (CDMA) Using BPSK and QPSK Symbols. This research has the purpose of establishing a foundation for new classification and estimation of CDMA signals. Keywords: DS/CDMA signals, BPSK, QPSK. (Report period: October 2013 - October 2014; published January 2015.)
Estimation of brood and nest survival: Comparative methods in the presence of heterogeneity
Manly, Bryan F.J.; Schmutz, Joel A.
2001-01-01
The Mayfield method has been widely used for estimating survival of nests and young animals, especially when data are collected at irregular observation intervals. However, this method assumes survival is constant throughout the study period, which often ignores biologically relevant variation and may lead to biased survival estimates. We examined the bias and accuracy of 1 modification to the Mayfield method that allows for temporal variation in survival, and we developed and similarly tested 2 additional methods. One of these 2 new methods is simply an iterative extension of Klett and Johnson's method, which we refer to as the Iterative Mayfield method and which bears similarity to Kaplan-Meier methods. The other method uses maximum likelihood techniques for estimation and is best applied to survival of animals in groups or families, rather than as independent individuals. We also examined how robust these estimators are to heterogeneity in the data, which can arise from such sources as dependent survival probabilities among siblings, inherent differences among families, and adoption. Testing of estimator performance with respect to bias, accuracy, and heterogeneity was done using simulations that mimicked a study of survival of emperor goose (Chen canagica) goslings. Assuming constant survival for inappropriately long periods of time or use of Klett and Johnson's methods resulted in large bias or poor accuracy (often >5% bias or root mean square error) compared to our Iterative Mayfield or maximum likelihood methods. Overall, estimator performance was slightly better with our Iterative Mayfield than our maximum likelihood method, but the maximum likelihood method provides a more rigorous framework for testing covariates and explicitly models a heterogeneity factor. We demonstrated use of all estimators with data from emperor goose goslings. We advocate that future studies use the new methods outlined here rather than the traditional Mayfield method or its previous modifications.
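For reference, the baseline that the paper improves on is the classic constant-survival Mayfield estimator, sketched below with invented exposure-day and loss counts; the Iterative Mayfield and maximum likelihood estimators developed in the paper are not shown.

```python
# Sketch of the classic constant-survival Mayfield estimator (the baseline the
# paper improves on), not the Iterative Mayfield or maximum likelihood methods
# it develops. Exposure days, losses, and the nest period are illustrative.
def mayfield_daily_survival(losses: int, exposure_days: float) -> float:
    """Daily survival rate = 1 - (losses / exposure days)."""
    return 1.0 - losses / exposure_days

dsr = mayfield_daily_survival(losses=12, exposure_days=480.0)
nest_period = 28  # days from laying to fledging, illustrative
print(f"daily survival {dsr:.3f}, period survival {dsr ** nest_period:.3f}")
```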
Kendall, W.L.; Nichols, J.D.; Hines, J.E.
1997-01-01
Statistical inference for capture-recapture studies of open animal populations typically relies on the assumption that all emigration from the studied population is permanent. However, there are many instances in which this assumption is unlikely to be met. We define two general models for the process of temporary emigration, completely random and Markovian. We then consider effects of these two types of temporary emigration on Jolly-Seber (Seber 1982) estimators and on estimators arising from the full-likelihood approach of Kendall et al. (1995) to robust design data. Capture-recapture data arising from Pollock's (1982) robust design provide the basis for obtaining unbiased estimates of demographic parameters in the presence of temporary emigration and for estimating the probability of temporary emigration. We present a likelihood-based approach to dealing with temporary emigration that permits estimation under different models of temporary emigration and yields tests for completely random and Markovian emigration. In addition, we use the relationship between capture probability estimates based on closed and open models under completely random temporary emigration to derive three ad hoc estimators for the probability of temporary emigration, two of which should be especially useful in situations where capture probabilities are heterogeneous among individual animals. Ad hoc and full-likelihood estimators are illustrated for small mammal capture-recapture data sets. We believe that these models and estimators will be useful for testing hypotheses about the process of temporary emigration, for estimating demographic parameters in the presence of temporary emigration, and for estimating probabilities of temporary emigration. These latter estimates are frequently of ecological interest as indicators of animal movement and, in some sampling situations, as direct estimates of breeding probabilities and proportions.
2015-08-01
Park, Kyong H.; Lagan, Steven J. Likelihood Estimation Method for Completely Separated and Quasi-Completely Separated Data for a Dose-Response Model (ECBC-TN-068). Research and Technology Directorate, August 2015. Approved for public release.
Zero-inflated Poisson model based likelihood ratio test for drug safety signal detection.
Huang, Lan; Zheng, Dan; Zalkikar, Jyoti; Tiwari, Ram
2017-02-01
In recent decades, numerous methods have been developed for data mining of large drug safety databases, such as the Food and Drug Administration's (FDA's) Adverse Event Reporting System, where data matrices are formed with drugs as columns and adverse events as rows. Often, a large number of cells in these data matrices have zero cell counts, and some of them are "true zeros" indicating that the drug-adverse event pairs cannot occur; these zero counts are distinguished from the other zero counts, which are modeled zero counts and simply indicate that the drug-adverse event pairs have not occurred yet or have not been reported yet. In this paper, a zero-inflated Poisson model based likelihood ratio test method is proposed to identify drug-adverse event pairs that have disproportionately high reporting rates, which are also called signals. The maximum likelihood estimates of the model parameters of the zero-inflated Poisson model based likelihood ratio test are obtained using the expectation-maximization algorithm. The zero-inflated Poisson model based likelihood ratio test is also modified to handle stratified analyses for binary and categorical covariates (e.g. gender and age) in the data. The proposed zero-inflated Poisson model based likelihood ratio test method is shown to asymptotically control the type I error and false discovery rate, and its finite sample performance for signal detection is evaluated through a simulation study. The simulation results show that the zero-inflated Poisson model based likelihood ratio test method performs similarly to the Poisson model based likelihood ratio test method when the estimated percentage of true zeros in the database is small. Both the zero-inflated Poisson model based likelihood ratio test and likelihood ratio test methods are applied to six selected drugs, from the 2006 to 2011 Adverse Event Reporting System database, with varying percentages of observed zero-count cells.
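A minimal sketch of zero-inflated Poisson maximum likelihood for a single vector of counts is shown below, using direct numerical optimization rather than the EM algorithm and stratified likelihood-ratio test procedure of the paper; the simulated counts and parameter values are illustrative.

```python
# Minimal sketch of zero-inflated Poisson (ZIP) maximum likelihood for one vector
# of counts, fit by direct numerical optimization (not the EM algorithm or the
# stratified LRT signal-detection procedure used in the paper). Values are toy data.
import numpy as np
from scipy.optimize import minimize
from scipy.stats import poisson

def zip_neg_loglik(params, counts):
    logit_pi, log_lam = params
    pi, lam = 1.0 / (1.0 + np.exp(-logit_pi)), np.exp(log_lam)
    p_zero = pi + (1.0 - pi) * np.exp(-lam)          # mixture probability of a zero
    ll = np.where(counts == 0,
                  np.log(p_zero),
                  np.log(1.0 - pi) + poisson.logpmf(counts, lam))
    return -np.sum(ll)

rng = np.random.default_rng(4)
true_pi, true_lam, n = 0.3, 2.5, 1000
counts = np.where(rng.random(n) < true_pi, 0, rng.poisson(true_lam, size=n))

fit = minimize(zip_neg_loglik, x0=[0.0, 0.0], args=(counts,), method="Nelder-Mead")
pi_hat = 1.0 / (1.0 + np.exp(-fit.x[0]))
lam_hat = np.exp(fit.x[1])
print(f"pi_hat={pi_hat:.3f}, lambda_hat={lam_hat:.3f}")
```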
Campos-Filho, N; Franco, E L
1989-02-01
A frequent procedure in matched case-control studies is to report results from the multivariate unmatched analyses if they do not differ substantially from the ones obtained after conditioning on the matching variables. Although conceptually simple, this rule requires that an extensive series of logistic regression models be evaluated by both the conditional and unconditional maximum likelihood methods. Most computer programs for logistic regression employ only one maximum likelihood method, which requires that the analyses be performed in separate steps. This paper describes a Pascal microcomputer (IBM PC) program that performs multiple logistic regression by both maximum likelihood estimation methods, which obviates the need for switching between programs to obtain relative risk estimates from both matched and unmatched analyses. The program calculates most standard statistics and allows factoring of categorical or continuous variables by two distinct methods of contrast. A built-in, descriptive statistics option allows the user to inspect the distribution of cases and controls across categories of any given variable.
Dores, C B; Milovancev, M; Russell, D S
2018-03-01
Radial sections are widely used to estimate adequacy of excision in canine cutaneous mast cell tumours (MCTs); however, this sectioning technique estimates only a small fraction of total margin circumference. This study aimed to compare histologic margin status in grade II/low grade MCTs sectioned using both radial and tangential sectioning techniques. A total of 43 circumferential margins were evaluated from 21 different tumours. Margins were first sectioned radially, followed by tangential sections. Tissues were examined by routine histopathology. Tangential margin status differed in 10 of 43 (23.3%) margins compared with their initial status on radial section. Of 39 margins, 9 (23.1%) categorized as histologic tumour-free margin (HTFM) >0 mm were positive on tangential sectioning. Tangential sections detected a significantly higher proportion of positive margins relative to radial sections (exact 2-tailed P-value = .0215). The HTFM was significantly longer in negative tangential margins than positive tangential margins (mean 10.1 vs 3.2 mm; P = .0008). A receiver operating characteristic curve comparing HTFM and tangentially negative margins found an area under the curve of 0.83 (95% confidence interval: 0.71-0.96). Although correct classification peaked at the sixth cut-point of HTFM ≥1 mm, radial sections still incorrectly classified 50% of margins as lacking tumour cells. Radial sections had 100% specificity for predicting negative tangential margins at a cut-point of 10.9 mm. These data indicate that for low grade MCTs, HTFMs >0 mm should not be considered completely excised, particularly when HTFM is <10.9 mm. This will inform future studies that use HTFM and overall excisional status as dependent variables in multivariable prognostic models. © 2017 John Wiley & Sons Ltd.
A Game Theoretical Approach to Hacktivism: Is Attack Likelihood a Product of Risks and Payoffs?
Bodford, Jessica E; Kwan, Virginia S Y
2018-02-01
The current study examines hacktivism (i.e., hacking to convey a moral, ethical, or social justice message) through a general game theoretic framework-that is, as a product of costs and benefits. Given the inherent risk of carrying out a hacktivist attack (e.g., legal action, imprisonment), it would be rational for the user to weigh these risks against perceived benefits of carrying out the attack. As such, we examined computer science students' estimations of risks, payoffs, and attack likelihood through a game theoretic design. Furthermore, this study aims at constructing a descriptive profile of potential hacktivists, exploring two predicted covariates of attack decision making, namely, peer prevalence of hacking and sex differences. Contrary to expectations, results suggest that participants' estimations of attack likelihood stemmed solely from expected payoffs, rather than subjective risks. Peer prevalence significantly predicted increased payoffs and attack likelihood, suggesting an underlying descriptive norm in social networks. Notably, we observed no sex differences in the decision to attack, nor in the factors predicting attack likelihood. Implications for policymakers and the understanding and prevention of hacktivism are discussed, as are the possible ramifications of widely communicated payoffs over potential risks in hacking communities.
The Inverse Problem for Confined Aquifer Flow: Identification and Estimation With Extensions
NASA Astrophysics Data System (ADS)
Loaiciga, Hugo A.; Mariño, Miguel A.
1987-01-01
The contributions of this work are twofold. First, a methodology for estimating the elements of parameter matrices in the governing equation of flow in a confined aquifer is developed. The estimation techniques for the distributed-parameter inverse problem pertain to linear least squares and generalized least squares methods. The linear relationship among the known heads and unknown parameters of the flow equation provides the background for developing criteria for determining the identifiability status of unknown parameters. Under conditions of exact or overidentification it is possible to develop statistically consistent parameter estimators and their asymptotic distributions. The estimation techniques, namely, two-stage least squares and three-stage least squares, are applied to a specific groundwater inverse problem and compared between themselves and with an ordinary least squares estimator. The three-stage estimator provides the closest approximation to the actual parameter values, but it also shows relatively large standard errors as compared to the ordinary and two-stage estimators. The estimation techniques provide the parameter matrices required to simulate the unsteady groundwater flow equation. Second, a nonlinear maximum likelihood estimation approach to the inverse problem is presented. The statistical properties of maximum likelihood estimators are derived, and a procedure to construct confidence intervals and perform hypothesis testing is given. The relative merits of the linear and maximum likelihood estimators are analyzed. Other topics relevant to the identification and estimation methodologies, i.e., a continuous-time solution to the flow equation, coping with noise-corrupted head measurements, and extension of the developed theory to nonlinear cases, are also discussed. A simulation study is used to evaluate the methods developed in this study.
Scanning linear estimation: improvements over region of interest (ROI) methods
NASA Astrophysics Data System (ADS)
Kupinski, Meredith K.; Clarkson, Eric W.; Barrett, Harrison H.
2013-03-01
In tomographic medical imaging, a signal activity is typically estimated by summing voxels from a reconstructed image. We introduce an alternative estimation scheme that operates on the raw projection data and offers a substantial improvement, as measured by the ensemble mean-square error (EMSE), when compared to using voxel values from a maximum-likelihood expectation-maximization (MLEM) reconstruction. The scanning-linear (SL) estimator operates on the raw projection data and is derived as a special case of maximum-likelihood estimation with a series of approximations to make the calculation tractable. The approximated likelihood accounts for background randomness, measurement noise and variability in the parameters to be estimated. When signal size and location are known, the SL estimate of signal activity is unbiased, i.e. the average estimate equals the true value. By contrast, unpredictable bias arising from the null functions of the imaging system affect standard algorithms that operate on reconstructed data. The SL method is demonstrated for two different tasks: (1) simultaneously estimating a signal’s size, location and activity; (2) for a fixed signal size and location, estimating activity. Noisy projection data are realistically simulated using measured calibration data from the multi-module multi-resolution small-animal SPECT imaging system. For both tasks, the same set of images is reconstructed using the MLEM algorithm (80 iterations), and the average and maximum values within the region of interest (ROI) are calculated for comparison. This comparison shows dramatic improvements in EMSE for the SL estimates. To show that the bias in ROI estimates affects not only absolute values but also relative differences, such as those used to monitor the response to therapy, the activity estimation task is repeated for three different signal sizes.
Schaberg, Kurt B; Evans, Mark F; Wilcox, Rebecca; Lewis, Michael R
2015-12-01
Helicobacter pylori status influences the prognosis and management of gastric extranodal marginal zone lymphoma of mucosa-associated lymphoid tissue (MALT lymphoma), so accurate determination of H pylori status is of clinical importance. The low rate of histologic H pylori positivity among gastric MALT lymphoma cases at our institution prompted investigation for possible causes. A case series of 24 patients diagnosed with gastric MALT lymphoma (with no diffuse large B-cell component) in a tertiary care setting between 1997 and 2010 was identified, and clinical records were reviewed. Immunohistochemical staining for H pylori and BCL10 was performed. This study received institutional review board approval (protocol number M13-033). Thirty-nine percent of cases (9/23) were H pylori positive by histology, and 4 additional patients had positive serologic results; overall, 57% of cases (13/23) were positive for H pylori. Treatment with antisecretory medications was associated with a lower likelihood of histologic positivity (13% among treated patients vs 75% among untreated; P = .04). Nuclear localization of BCL10 was seen in 2 cases and was not associated with H pylori status. Antisecretory medications decrease the likelihood of histologic detection of H pylori in gastric MALT lymphoma cases. Incorporation of results of serologic or other testing is needed to ensure correct classification with respect to H pylori status. Copyright © 2015 Elsevier Inc. All rights reserved.
Efficient Robust Regression via Two-Stage Generalized Empirical Likelihood
Bondell, Howard D.; Stefanski, Leonard A.
2013-01-01
Large- and finite-sample efficiency and resistance to outliers are the key goals of robust statistics. Although often not simultaneously attainable, we develop and study a linear regression estimator that comes close. Efficiency obtains from the estimator’s close connection to generalized empirical likelihood, and its favorable robustness properties are obtained by constraining the associated sum of (weighted) squared residuals. We prove maximum attainable finite-sample replacement breakdown point, and full asymptotic efficiency for normal errors. Simulation evidence shows that compared to existing robust regression estimators, the new estimator has relatively high efficiency for small sample sizes, and comparable outlier resistance. The estimator is further illustrated and compared to existing methods via application to a real data set with purported outliers. PMID:23976805
NASA Technical Reports Server (NTRS)
Grove, R. D.; Mayhew, S. C.
1973-01-01
A computer program (Langley program C1123) has been developed for estimating aircraft stability and control parameters from flight test data. These parameters are estimated by the maximum likelihood estimation procedure implemented on a real-time digital simulation system, which uses the Control Data 6600 computer. This system allows the investigator to interact with the program in order to obtain satisfactory results. Part of this system, the control and display capabilities, is described for this program. This report also describes the computer program by presenting the program variables, subroutines, flow charts, listings, and operational features. Program usage is demonstrated with a test case using pseudo or simulated flight data.
Zee, Jarcy; Xie, Sharon X.
2015-01-01
When a true survival endpoint cannot be assessed for some subjects, an alternative endpoint that measures the true endpoint with error may be collected, which often occurs when obtaining the true endpoint is too invasive or costly. We develop an estimated likelihood function for the situation where we have both uncertain endpoints for all participants and true endpoints for only a subset of participants. We propose a nonparametric maximum estimated likelihood estimator of the discrete survival function of time to the true endpoint. We show that the proposed estimator is consistent and asymptotically normal. We demonstrate through extensive simulations that the proposed estimator has little bias compared to the naïve Kaplan-Meier survival function estimator, which uses only uncertain endpoints, and is more efficient under moderate missingness than the complete-case Kaplan-Meier survival function estimator, which uses only available true endpoints. Finally, we apply the proposed method to a dataset for estimating the risk of developing Alzheimer's disease from the Alzheimer's Disease Neuroimaging Initiative. PMID:25916510
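The comparison baseline referred to above is the ordinary Kaplan-Meier estimator; a bare-bones discrete version is sketched below with invented event and censoring times. The paper's estimated-likelihood method, which additionally combines error-prone and true endpoints, is not reproduced here.

```python
# Sketch of the discrete Kaplan-Meier survival estimator used as the comparison
# baseline above. Times are event times in arbitrary units; event=0 marks
# right-censoring. All values are illustrative.
import numpy as np

def kaplan_meier(times, events):
    """Return (unique event times, survival estimates) for right-censored data."""
    times, events = np.asarray(times), np.asarray(events)
    surv, out_t, out_s = 1.0, [], []
    for t in np.unique(times[events == 1]):
        at_risk = np.sum(times >= t)
        deaths = np.sum((times == t) & (events == 1))
        surv *= 1.0 - deaths / at_risk
        out_t.append(t)
        out_s.append(surv)
    return out_t, out_s

t, s = kaplan_meier(times=[2, 3, 3, 5, 8, 8, 12], events=[1, 1, 0, 1, 1, 0, 0])
for ti, si in zip(t, s):
    print(f"S({ti}) = {si:.3f}")
```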
Maximum likelihood phase-retrieval algorithm: applications.
Nahrstedt, D A; Southwell, W H
1984-12-01
The maximum likelihood estimator approach is shown to be effective in determining the wave front aberration in systems involving laser and flow field diagnostics and optical testing. The robustness of the algorithm enables convergence even in cases of severe wave front error and real, nonsymmetrical, obscured amplitude distributions.
Vicini, C; Montevecchi, F; D'Agostino, G; DE Vito, A; Meccariello, G
2015-06-01
The primary goal of surgical oncology is to obtain a tumour resection with disease-free margins. Transoral robotic surgery (TORS) for surgical treatment of head-neck cancer is commensurate with standard treatments. However, the likelihood of positive margins after TORS is up to 20.2% in a recent US survey. The aim of this study is to evaluate the efficacy and the feasibility of narrow-band imaging (NBI) during TORS in order to improve the ability to achieve disease-free margins during tumour excision. The present study was conducted at the ENT, Head-Neck Surgery and Oral Surgery Unit, Department of Special Surgery, Morgagni Pierantoni Hospital, Azienda USL Romagna. From March 2008 to January 2015, 333 TORS procedures were carried out for malignant and benign diseases. For the present study, we retrospectively evaluated 58 biopsy-proven squamous cell carcinoma patients who underwent TORS procedures. Patients were divided into 2 groups: (1) 32 who underwent TORS with intra-operative NBI evaluation (NBI-TORS); (2) 21 who underwent TORS with standard intra-operative white-light imaging (WLI-TORS). Frozen section analysis of margins on surgical specimens showed a higher rate of negative superficial lateral margins in the NBI-TORS group compared with the WLI-TORS group (87.9% vs. 57.9%, respectively, p = 0.02). The sensitivity and specificity of intra-operative use of NBI were 72.5% and 66.7%, respectively, with a negative predictive value of 87.9%. Tumour margin enhancement provided by NBI, together with the magnification and 3-dimensional view of the surgical field, might increase the capability to achieve an oncologically-safe resection in challenging anatomical areas where minimal curative resection is strongly recommended for function preservation.
Cramer-Rao Bound, MUSIC, and Maximum Likelihood. Effects of Temporal Phase Difference
1990-11-01
Technical Report 1373, November 1990. Cramer-Rao Bound, MUSIC, and Maximum Likelihood: Effects of Temporal Phase Difference (C. V. Tran). The report compares the Cramer-Rao bound with the MUSIC and maximum likelihood (ML) asymptotic variances for two-source direction-of-arrival estimation, including examples for two equipowered signals impinging on a 5-element uniform linear array (|p| = 0.50 and |p| = 1.00, SNR = 20 dB).
Fojas, Christina L; Kim, Jieun; Minsky-Rowland, Jocelyn D; Algee-Hewitt, Bridget F B
2018-01-01
Skeletal age estimation is an integral part of the biological profile. Recent work shows how multiple-trait approaches better capture senescence as it occurs at different rates among individuals. Furthermore, a Bayesian statistical framework of analysis provides more useful age estimates. The component-scoring method of Transition Analysis (TA) may resolve many of the functional and statistical limitations of traditional phase-aging methods and is applicable to both paleodemography and forensic casework. The present study contributes to TA-research by validating TA for multiple, differently experienced observers using a collection of modern forensic skeletal cases. Five researchers independently applied TA to a random sample of 58 documented individuals from the William M. Bass Forensic Skeletal Collection, for whom knowledge of chronological age was withheld. Resulting scores were input into the ADBOU software and maximum likelihood estimates (MLEs) and 95% confidence intervals (CIs) were produced using the forensic prior. Krippendorff's alpha was used to evaluate interrater reliability and agreement. Inaccuracy and bias were measured to gauge the magnitude and direction of difference between estimated ages and chronological ages among the five observers. The majority of traits had moderate to excellent agreement among observers (≥0.6). The superior surface morphology had the least congruence (0.4), while the ventral symphyseal margin had the most (0.9) among scores. Inaccuracy was the lowest for individuals younger than 30 and the greatest for individuals over 60. Consistent over-estimation of individuals younger than 30 and under-estimation of individuals over 40 years old occurred. Individuals in their 30s showed a mixed pattern of under- and over-estimation among observers. These results support the use of the TA method by researchers of varying experience levels. Further, they validate its use on forensic cases, given the low error overall. © 2017 Wiley Periodicals, Inc.
Bayesian experimental design for models with intractable likelihoods.
Drovandi, Christopher C; Pettitt, Anthony N
2013-12-01
In this paper we present a methodology for designing experiments for efficiently estimating the parameters of models with computationally intractable likelihoods. The approach combines a commonly used methodology for robust experimental design, based on Markov chain Monte Carlo sampling, with approximate Bayesian computation (ABC) to ensure that no likelihood evaluations are required. The utility function considered for precise parameter estimation is based upon the precision of the ABC posterior distribution, which we form efficiently via the ABC rejection algorithm based on pre-computed model simulations. Our focus is on stochastic models and, in particular, we investigate the methodology for Markov process models of epidemics and macroparasite population evolution. The macroparasite example involves a multivariate process and we assess the loss of information from not observing all variables. © 2013, The International Biometric Society.
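As a rough illustration of the ABC rejection step that this design approach relies on, the following Python sketch pre-simulates a toy stochastic model over prior draws and scores a candidate design by the average precision of the resulting ABC posterior; the simulator, prior, tolerance, and utility here are hypothetical stand-ins rather than the epidemic and macroparasite models used in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_counts(beta, n_obs, design_time):
    # Toy stochastic simulator standing in for the paper's models;
    # the summary statistic depends on the design variable (observation time).
    return rng.poisson(beta * design_time, size=n_obs).mean()

# Pre-computed prior draws, reused for every candidate design (ABC rejection).
n_sim = 20_000
prior_beta = rng.gamma(2.0, 1.0, size=n_sim)

def expected_utility(design_time, n_datasets=50, tol=0.5, n_obs=10):
    """Average ABC-posterior precision over prior-predictive data sets."""
    summaries = np.array([simulate_counts(b, n_obs, design_time) for b in prior_beta])
    utilities = []
    for _ in range(n_datasets):
        beta_true = rng.gamma(2.0, 1.0)                    # prior-predictive draw
        y_obs = simulate_counts(beta_true, n_obs, design_time)
        accepted = prior_beta[np.abs(summaries - y_obs) <= tol]
        # precision of the ABC posterior; zero utility if too few acceptances
        utilities.append(1.0 / np.var(accepted) if accepted.size > 50 else 0.0)
    return float(np.mean(utilities))

for d in (0.5, 2.0, 5.0):                                  # candidate designs
    print(f"observation time {d}: expected utility {expected_utility(d):.2f}")
```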
Margin and sensitivity methods for security analysis of electric power systems
NASA Astrophysics Data System (ADS)
Greene, Scott L.
Reliable operation of large-scale electric power networks requires that system voltages and currents stay within design limits. Operation beyond those limits can lead to equipment failures and blackouts. Security margins measure the amount by which system loads or power transfers can change before a security violation, such as an overloaded transmission line, is encountered. This thesis shows how to efficiently compute security margins defined by limiting events and instabilities, and the sensitivity of those margins with respect to assumptions, system parameters, operating policy, and transactions. Security margins to voltage collapse blackouts, oscillatory instability, generator limits, voltage constraints and line overloads are considered. The usefulness of computing the sensitivities of these margins with respect to interarea transfers, loading parameters, generator dispatch, transmission line parameters, and VAR support is established for networks as large as 1500 buses. The sensitivity formulas presented apply to a range of power system models. Conventional sensitivity formulas such as line distribution factors, outage distribution factors, participation factors and penalty factors are shown to be special cases of the general sensitivity formulas derived in this thesis. The sensitivity formulas readily accommodate sparse matrix techniques. Margin sensitivity methods are shown to work effectively for avoiding voltage collapse blackouts caused by either saddle node bifurcation of equilibria or immediate instability due to generator reactive power limits. Extremely fast contingency analysis for voltage collapse can be implemented with margin sensitivity based rankings. Interarea transfer can be limited by voltage limits, line limits, or voltage stability. The sensitivity formulas presented in this thesis apply to security margins defined by any limit criteria. A method to compute transfer margins by directly locating intermediate events reduces the total number of loadflow iterations required by each margin computation and provides sensitivity information at minimal additional cost. Estimates of the effect of simultaneous transfers on the transfer margins agree well with the exact computations for a network model derived from a portion of the U.S. grid. The accuracy of the estimates over a useful range of conditions and the ease of obtaining the estimates suggest that the sensitivity computations will be of practical value.
Li, Dongming; Sun, Changming; Yang, Jinhua; Liu, Huan; Peng, Jiaqi; Zhang, Lijuan
2017-01-01
An adaptive optics (AO) system provides real-time compensation for atmospheric turbulence. However, an AO image is usually of poor contrast because of the nature of the imaging process, meaning that the image contains information coming from both out-of-focus and in-focus planes of the object, which also brings about a loss in quality. In this paper, we present a robust multi-frame adaptive optics image restoration algorithm via maximum likelihood estimation. Our proposed algorithm uses a maximum likelihood method with image regularization as the basic principle, and constructs the joint log likelihood function for multi-frame AO images based on a Poisson distribution model. To begin with, a frame selection method based on image variance is applied to the observed multi-frame AO images to select images with better quality to improve the convergence of a blind deconvolution algorithm. Then, by combining the imaging conditions and the AO system properties, a point spread function estimation model is built. Finally, we develop our iterative solutions for AO image restoration addressing the joint deconvolution issue. We conduct a number of experiments to evaluate the performances of our proposed algorithm. Experimental results show that our algorithm produces accurate AO image restoration results and outperforms the current state-of-the-art blind deconvolution methods. PMID:28383503
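The ML solution of a joint Poisson likelihood with known point spread functions is the multi-frame Richardson-Lucy iteration; the sketch below illustrates that core update together with a variance-based frame selection step. It is only a simplified stand-in for the paper's blind, regularized algorithm: the PSFs here are assumed known and normalized, and no regularization term is included.

```python
import numpy as np
from scipy.signal import fftconvolve

def select_frames(frames, keep=0.5):
    """Frame selection by image variance (a simple stand-in for the paper's
    selection step): keep the sharper, higher-variance frames."""
    order = np.argsort([f.var() for f in frames])[::-1]
    k = max(1, int(len(frames) * keep))
    return [frames[i] for i in order[:k]]

def multiframe_poisson_ml(frames, psfs, n_iter=50, eps=1e-12):
    """Multi-frame Richardson-Lucy iteration: the ML estimate of the object
    under a joint Poisson likelihood, assuming known, normalised PSFs."""
    obj = np.full(frames[0].shape, float(frames[0].mean()))
    for _ in range(n_iter):
        correction = np.zeros_like(obj)
        for y, h in zip(frames, psfs):
            pred = fftconvolve(obj, h, mode="same") + eps          # model image
            correction += fftconvolve(y / pred, h[::-1, ::-1], mode="same")
        obj *= correction / len(frames)                            # multiplicative EM update
    return obj
```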
Genome-Wide Analysis of Gene-Gene and Gene-Environment Interactions Using Closed-Form Wald Tests.
Yu, Zhaoxia; Demetriou, Michael; Gillen, Daniel L
2015-09-01
Despite the successful discovery of hundreds of variants for complex human traits using genome-wide association studies, the degree to which genes and environmental risk factors jointly affect disease risk is largely unknown. One obstacle toward this goal is that the computational effort required for testing gene-gene and gene-environment interactions is enormous. As a result, numerous computationally efficient tests were recently proposed. However, the validity of these methods often relies on unrealistic assumptions such as additive main effects, main effects at only one variable, no linkage disequilibrium between the two single-nucleotide polymorphisms (SNPs) in a pair, or gene-environment independence. Here, we derive closed-form and consistent estimates for interaction parameters and propose to use Wald tests for testing interactions. The Wald tests are asymptotically equivalent to the likelihood ratio tests (LRTs), largely considered to be the gold standard tests but generally too computationally demanding for genome-wide interaction analysis. Simulation studies show that the proposed Wald tests perform very similarly to the LRTs but are much more computationally efficient. Applying the proposed tests to a genome-wide study of multiple sclerosis, we identify interactions within the major histocompatibility complex region. In this application, we find that (1) focusing on pairs where both SNPs are marginally significant leads to more significant interactions when compared to focusing on pairs where at least one SNP is marginally significant; and (2) parsimonious parameterization of interaction effects might decrease, rather than increase, statistical power. © 2015 WILEY PERIODICALS, INC.
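For a single SNP-environment pair the relationship between the Wald and likelihood ratio tests can be illustrated directly; the sketch below (simulated data, statsmodels logistic fits) computes both statistics for the interaction term. The paper's contribution is closed-form estimates that avoid refitting millions of such models; this example only shows the equivalence the abstract refers to.

```python
import numpy as np
import statsmodels.api as sm
from scipy.stats import chi2

rng = np.random.default_rng(1)
n = 5000
g = rng.binomial(2, 0.3, n)            # SNP genotype coded 0/1/2
e = rng.binomial(1, 0.4, n)            # binary environmental exposure
logit = -1.0 + 0.2 * g + 0.3 * e + 0.4 * g * e
y = rng.binomial(1, 1 / (1 + np.exp(-logit)))

X = sm.add_constant(np.column_stack([g, e, g * e]))
full = sm.Logit(y, X).fit(disp=0)          # model with interaction
reduced = sm.Logit(y, X[:, :3]).fit(disp=0)  # main effects only

beta, se = full.params[3], full.bse[3]
wald = (beta / se) ** 2                    # Wald statistic, 1 df
lrt = 2 * (full.llf - reduced.llf)         # likelihood ratio statistic
print(f"Wald p = {chi2.sf(wald, 1):.3g}, LRT p = {chi2.sf(lrt, 1):.3g}")
```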
GRID-BASED EXPLORATION OF COSMOLOGICAL PARAMETER SPACE WITH SNAKE
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mikkelsen, K.; Næss, S. K.; Eriksen, H. K., E-mail: kristin.mikkelsen@astro.uio.no
2013-11-10
We present a fully parallelized grid-based parameter estimation algorithm for investigating multidimensional likelihoods called Snake, and apply it to cosmological parameter estimation. The basic idea is to map out the likelihood grid-cell by grid-cell according to decreasing likelihood, and stop when a certain threshold has been reached. This approach improves vastly on the 'curse of dimensionality' problem plaguing standard grid-based parameter estimation simply by disregarding grid cells with negligible likelihood. The main advantages of this method compared to standard Metropolis-Hastings Markov Chain Monte Carlo methods include (1) trivial extraction of arbitrary conditional distributions; (2) direct access to Bayesian evidences; (3) better sampling of the tails of the distribution; and (4) nearly perfect parallelization scaling. The main disadvantage is, as in the case of brute-force grid-based evaluation, a dependency on the number of parameters, N_par. One of the main goals of the present paper is to determine how large N_par can be, while still maintaining reasonable computational efficiency; we find that N_par = 12 is well within the capabilities of the method. The performance of the code is tested by comparing cosmological parameters estimated using Snake and the WMAP-7 data with those obtained using CosmoMC, the current standard code in the field. We find fully consistent results, with similar computational expenses, but shorter wall time due to the perfect parallelization scheme.
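A minimal sketch of the grid-cell-by-grid-cell exploration idea: starting from a seed cell, repeatedly expand the highest-likelihood frontier cell and stop once every remaining frontier cell falls a fixed log-likelihood threshold below the peak. The toy likelihood, step size, and threshold are illustrative only, and the parallelization and evidence accumulation of the actual Snake code are omitted.

```python
import heapq
import numpy as np

def snake_explore(loglike, start, step, threshold):
    """Map the likelihood grid cell by cell in order of decreasing likelihood,
    stopping when all frontier cells are `threshold` below the current peak."""
    start = tuple(start)
    best = loglike(np.array(start) * step)
    visited = {start: best}
    frontier = [(-best, start)]                      # max-heap via negated log-likelihood
    while frontier:
        neg_ll, cell = heapq.heappop(frontier)
        if -neg_ll < best - threshold:               # everything left is negligible
            break
        for dim in range(len(cell)):
            for d in (-1, 1):
                nb = list(cell); nb[dim] += d; nb = tuple(nb)
                if nb not in visited:
                    ll = loglike(np.array(nb) * step)
                    visited[nb] = ll
                    best = max(best, ll)
                    heapq.heappush(frontier, (-ll, nb))
    return visited                                   # {grid cell: log-likelihood}

# 2-parameter Gaussian toy likelihood, grid step 0.2, keep cells within 12.5 of the peak
cells = snake_explore(lambda p: -0.5 * np.sum(p**2), start=(0, 0), step=0.2, threshold=12.5)
print(len(cells), "cells mapped")
```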
Confidence in outcome estimates from systematic reviews used in informed consent.
Fritz, Robert; Bauer, Janet G; Spackman, Sue S; Bains, Amanjyot K; Jetton-Rangel, Jeanette
2016-12-01
Evidence-based dentistry now guides informed consent, in which clinicians are obliged to provide patients with the most current, best evidence, or best estimates of outcomes, of regimens, therapies, treatments, procedures, materials, and equipment or devices when developing personal oral health care treatment plans. Yet, clinicians require that the estimates provided from systematic reviews be verified as to their validity and reliability, and contextualized as to performance competency, so that clinicians may have confidence in explaining outcomes to patients in clinical practice. The purpose of this paper was to describe types of informed estimates from which clinicians may have confidence in their capacity to assist patients in competent decision-making, one of the most important concepts of informed consent. Using systematic review methodology, researchers provide clinicians with valid best estimates of outcomes regarding a subject of interest from best evidence. Best evidence is verified through critical appraisals using acceptable sampling methodology, either by scoring instruments (Timmer analysis) or checklist (GRADE), a Cochrane Collaboration standard that allows transparency in open reviews. These valid best estimates are then tested for reliability using large databases. Finally, valid and reliable best estimates are assessed for meaning using quantification of margins and uncertainties. Through manufacturer and researcher specifications, quantification of margins and uncertainties develops a performance competency continuum by which valid, reliable best estimates may be contextualized for their performance competency: at a lowest margin performance competency (structural failure), high margin performance competency (estimated true value of success), or clinically determined critical values (clinical failure). Informed consent may be achieved when clinicians are confident of their ability to provide useful and accurate best estimates of outcomes regarding regimens, therapies, treatments, and equipment or devices to patients in their clinical practices and when developing personal oral health care treatment plans. Copyright © 2016 Elsevier Inc. All rights reserved.
WEIGHTED LIKELIHOOD ESTIMATION UNDER TWO-PHASE SAMPLING
Saegusa, Takumi; Wellner, Jon A.
2013-01-01
We develop asymptotic theory for weighted likelihood estimators (WLE) under two-phase stratified sampling without replacement. We also consider several variants of WLEs involving estimated weights and calibration. A set of empirical process tools is developed, including a Glivenko–Cantelli theorem, a theorem for rates of convergence of M-estimators, and a Donsker theorem for the inverse probability weighted empirical processes under two-phase sampling and sampling without replacement at the second phase. Using these general results, we derive asymptotic distributions of the WLE of a finite-dimensional parameter in a general semiparametric model where an estimator of a nuisance parameter is estimable at either regular or nonregular rates. We illustrate these results and methods in the Cox model with right censoring and interval censoring. We compare the methods via their asymptotic variances under both sampling without replacement and the more usual (and easier to analyze) assumption of Bernoulli sampling at the second phase. PMID:24563559
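A small numerical illustration of the weighted likelihood idea: the phase-2 log-likelihood terms are weighted by inverse selection probabilities, here for a logistic model with outcome-stratified selection. Bernoulli selection and known design weights are used for simplicity, in place of the without-replacement sampling and the estimated or calibrated weights analyzed in the paper; all names and parameter values are illustrative.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(2)
n = 4000
x = rng.normal(size=n)                          # expensive covariate (phase 2 only)
y = rng.binomial(1, 1 / (1 + np.exp(-(-0.5 + 1.0 * x))))

# Phase-2 selection stratified by outcome, with known selection probabilities.
pi = np.where(y == 1, 0.8, 0.2)
selected = rng.random(n) < pi                   # Bernoulli stand-in for the design

def neg_weighted_loglik(theta):
    a, b = theta
    eta = a + b * x[selected]
    ll = y[selected] * eta - np.logaddexp(0.0, eta)   # Bernoulli log-likelihood
    return -np.sum(ll / pi[selected])                 # inverse-probability weights

theta_hat = minimize(neg_weighted_loglik, x0=[0.0, 0.0]).x
print("WLE (intercept, slope):", theta_hat.round(3))
```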
Maximum Likelihood Estimation of Nonlinear Structural Equation Models with Ignorable Missing Data
ERIC Educational Resources Information Center
Lee, Sik-Yum; Song, Xin-Yuan; Lee, John C. K.
2003-01-01
The existing maximum likelihood theory and its computer software in structural equation modeling are established on the basis of linear relationships among latent variables with fully observed data. However, in social and behavioral sciences, nonlinear relationships among the latent variables are important for establishing more meaningful models…
Multilevel and Latent Variable Modeling with Composite Links and Exploded Likelihoods
ERIC Educational Resources Information Center
Rabe-Hesketh, Sophia; Skrondal, Anders
2007-01-01
Composite links and exploded likelihoods are powerful yet simple tools for specifying a wide range of latent variable models. Applications considered include survival or duration models, models for rankings, small area estimation with census information, models for ordinal responses, item response models with guessing, randomized response models,…
Emura, Takeshi; Konno, Yoshihiko; Michimae, Hirofumi
2015-07-01
Doubly truncated data consist of samples whose observed values fall between the left- and right-truncation limits. With such samples, the distribution function of interest is estimated using the nonparametric maximum likelihood estimator (NPMLE), which is obtained through a self-consistency algorithm. Owing to the complicated asymptotic distribution of the NPMLE, the bootstrap method has been suggested for statistical inference. This paper proposes a closed-form estimator for the asymptotic covariance function of the NPMLE, which is a computationally attractive alternative to bootstrapping. Furthermore, we develop various statistical inference procedures, such as confidence intervals, goodness-of-fit tests, and confidence bands, to demonstrate the usefulness of the proposed covariance estimator. Simulations are performed to compare the proposed method with both the bootstrap and jackknife methods. The methods are illustrated using the childhood cancer dataset.
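A sketch of the self-consistency (EM-type) iteration for the NPMLE under double truncation, written from the fixed-point equation f_j ∝ d_j / Σ_i [I(u_i ≤ t_j ≤ v_i) / F_i]; the simulated data and convergence settings are illustrative only, and the paper's closed-form covariance estimator is not reproduced here.

```python
import numpy as np

def npmle_doubly_truncated(x, u, v, n_iter=500, tol=1e-10):
    """Self-consistency iteration for the NPMLE of a discrete distribution
    observed under double truncation: u_i <= x_i <= v_i for every sample."""
    t = np.unique(x)                                           # support points
    d = np.array([(x == tj).sum() for tj in t], dtype=float)   # multiplicities
    J = (u[:, None] <= t[None, :]) & (t[None, :] <= v[:, None])  # inclusion matrix
    f = np.full(t.size, 1.0 / t.size)
    for _ in range(n_iter):
        F = J @ f                                  # P(u_i <= X <= v_i) under current f
        f_new = d / (J / F[:, None]).sum(axis=0)
        f_new /= f_new.sum()
        if np.max(np.abs(f_new - f)) < tol:
            return t, f_new
        f = f_new
    return t, f                                    # support and estimated probabilities

# Example: only values falling inside their truncation window are observed.
rng = np.random.default_rng(9)
xs, us, vs = [], [], []
while len(xs) < 300:
    x0 = rng.integers(1, 21)                       # latent lifetime on 1..20
    u0 = rng.integers(1, 11); v0 = u0 + 8          # truncation window
    if u0 <= x0 <= v0:
        xs.append(x0); us.append(u0); vs.append(v0)
t, f = npmle_doubly_truncated(np.array(xs), np.array(us), np.array(vs))
```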
Estimating residual fault hitting rates by recapture sampling
NASA Technical Reports Server (NTRS)
Lee, Larry; Gupta, Rajan
1988-01-01
For the recapture debugging design introduced by Nayak (1988) the problem of estimating the hitting rates of the faults remaining in the system is considered. In the context of a conditional likelihood, moment estimators are derived and are shown to be asymptotically normal and fully efficient. Fixed sample properties of the moment estimators are compared, through simulation, with those of the conditional maximum likelihood estimators. Properties of the conditional model are investigated such as the asymptotic distribution of linear functions of the fault hitting frequencies and a representation of the full data vector in terms of a sequence of independent random vectors. It is assumed that the residual hitting rates follow a log linear rate model and that the testing process is truncated when the gaps between the detection of new errors exceed a fixed amount of time.
Galili, Tal; Meilijson, Isaac
2016-01-02
The Rao-Blackwell theorem offers a procedure for converting a crude unbiased estimator of a parameter θ into a "better" one, in fact unique and optimal if the improvement is based on a minimal sufficient statistic that is complete. In contrast, behind every minimal sufficient statistic that is not complete, there is an improvable Rao-Blackwell improvement. This is illustrated via a simple example based on the uniform distribution, in which a rather natural Rao-Blackwell improvement is uniformly improvable. Furthermore, in this example the maximum likelihood estimator is inefficient, and an unbiased generalized Bayes estimator performs exceptionally well. Counterexamples of this sort can be useful didactic tools for explaining the true nature of a methodology and possible consequences when some of the assumptions are violated. [Received December 2014. Revised September 2015.].
New robust statistical procedures for the polytomous logistic regression models.
Castilla, Elena; Ghosh, Abhik; Martin, Nirian; Pardo, Leandro
2018-05-17
This article derives a new family of estimators, namely the minimum density power divergence estimators, as a robust generalization of the maximum likelihood estimator for the polytomous logistic regression model. Based on these estimators, a family of Wald-type test statistics for linear hypotheses is introduced. Robustness properties of both the proposed estimators and the test statistics are theoretically studied through the classical influence function analysis. Appropriate real-life examples are presented to justify the requirement of suitable robust statistical procedures in place of the likelihood-based inference for the polytomous logistic regression model. The validity of the theoretical results established in the article is further confirmed empirically through suitable simulation studies. Finally, an approach for the data-driven selection of the robustness tuning parameter is proposed with empirical justifications. © 2018, The International Biometric Society.
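The density power divergence objective itself is straightforward to minimize numerically; the sketch below does so for a binary logistic model as a stand-in for the polytomous case treated in the paper, with the tuning constant alpha and the contaminated toy data chosen purely for illustration (alpha approaching 0 recovers ordinary maximum likelihood).

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit

def mdpde_logistic(X, y, alpha=0.3):
    """Minimum density power divergence estimator for a binary logistic model."""
    X1 = np.column_stack([np.ones(len(y)), X])
    def objective(beta):
        p1 = expit(X1 @ beta)
        p_obs = np.where(y == 1, p1, 1 - p1)
        # empirical DPD objective: E[sum_y p(y)^(1+a)] - (1 + 1/a) E[p(y_obs)^a]
        return np.mean(p1**(1 + alpha) + (1 - p1)**(1 + alpha)
                       - (1 + 1 / alpha) * p_obs**alpha)
    return minimize(objective, x0=np.zeros(X1.shape[1])).x

# Outlier-contaminated toy data: the MDPDE is less affected than plain ML.
rng = np.random.default_rng(3)
x = rng.normal(size=500)
y = rng.binomial(1, expit(0.5 + 2.0 * x))
y[:25] = 1 - y[:25]                      # mislabelled observations
print("MDPDE coefficients:", mdpde_logistic(x[:, None], y).round(2))
```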
Fibre tip pH sensor for tumor detection during surgery
NASA Astrophysics Data System (ADS)
Henderson, Matthew R.; Schartner, Erik P.; Callen, David F.; Gill, P. Grantley; Monro, Tanya M.
2015-05-01
Surgery on tumours commonly involves a lumpectomy method, where a section of tissue containing the tumour is removed, to improve cosmetic outcomes and quality of life. Following surgery, the margins of the removed section are checked by pathology tests to ensure that the entire tumour has been removed. Unfortunately, approximately 15-20% of margins show incomplete removal and require a subsequent operation to remove the remaining tumour. Tumour detection during surgery could allow the removed section to be enlarged appropriately, reducing the likelihood of requiring subsequent surgery. A change in the extracellular pH in the vicinity of a tumour, when compared to normal tissue, has been shown previously in the literature. We have fabricated an optical fibre tip pH sensor by embedding a fluorophore within a photopolymerised acrylamide polymer on the tip of a 200 micron diameter silica fibre. Preliminary measurements of human melanoma samples have shown a significant difference in the measured pH values between tumour and normal tissue. This demonstration paves the way to highly accurate margin detection during surgery.
Dang, Cuong Cao; Lefort, Vincent; Le, Vinh Sy; Le, Quang Si; Gascuel, Olivier
2011-10-01
Amino acid replacement rate matrices are an essential basis of protein studies (e.g. in phylogenetics and alignment). A number of general purpose matrices have been proposed (e.g. JTT, WAG, LG) since the seminal work of Margaret Dayhoff and co-workers. However, it has been shown that matrices specific to certain protein groups (e.g. mitochondrial) or life domains (e.g. viruses) differ significantly from general average matrices, and thus perform better when applied to the data to which they are dedicated. This Web server implements the maximum-likelihood estimation procedure that was used to estimate LG, and provides a number of tools and facilities. Users upload a set of multiple protein alignments from their domain of interest and receive the resulting matrix by email, along with statistics and comparisons with other matrices. A non-parametric bootstrap is performed optionally to assess the variability of replacement rate estimates. Maximum-likelihood trees, inferred using the estimated rate matrix, are also computed optionally for each input alignment. Finely tuned procedures and up-to-date ML software (PhyML 3.0, XRATE) are combined to perform all these heavy calculations on our clusters. The server is available at http://www.atgc-montpellier.fr/ReplacementMatrix/; contact: olivier.gascuel@lirmm.fr. Supplementary data are available at http://www.atgc-montpellier.fr/ReplacementMatrix/
West, R. Derek; Gunther, Jacob H.; Moon, Todd K.
2016-12-01
In this study, we derive a comprehensive forward model for the data collected by stripmap synthetic aperture radar (SAR) that is linear in the ground reflectivity parameters. It is also shown that if the noise model is additive, then the forward model fits into the linear statistical model framework, and the ground reflectivity parameters can be estimated by statistical methods. We derive the maximum likelihood (ML) estimates for the ground reflectivity parameters in the case of additive white Gaussian noise. Furthermore, we show that obtaining the ML estimates of the ground reflectivity requires two steps. The first step amounts to a cross-correlation of the data with a model of the data acquisition parameters, and it is shown that this step has essentially the same processing as the so-called convolution back-projection algorithm. The second step is a complete system inversion that is capable of mitigating the sidelobes of the spatially variant impulse responses remaining after the correlation processing. We also state the Cramer-Rao lower bound (CRLB) for the ML ground reflectivity estimates. We show that the CRLB is linked to the SAR system parameters, the flight path of the SAR sensor, and the image reconstruction grid. We demonstrate the ML image formation and the CRLB bound for synthetically generated data.
Fast Component Pursuit for Large-Scale Inverse Covariance Estimation.
Han, Lei; Zhang, Yu; Zhang, Tong
2016-08-01
The maximum likelihood estimation (MLE) for the Gaussian graphical model, which is also known as the inverse covariance estimation problem, has gained increasing interest recently. Most existing works assume that inverse covariance estimators contain sparse structure and then construct models with the ℓ1 regularization. In this paper, different from existing works, we study the inverse covariance estimation problem from another perspective by efficiently modeling the low-rank structure in the inverse covariance, which is assumed to be a combination of a low-rank part and a diagonal matrix. One motivation for this assumption is that the low-rank structure is common in many applications including climate and financial analysis, and another is that such an assumption can reduce the computational complexity when computing its inverse. Specifically, we propose an efficient COmponent Pursuit (COP) method to obtain the low-rank part, where each component can be sparse. For optimization, the COP method greedily learns a rank-one component in each iteration by maximizing the log-likelihood. Moreover, the COP algorithm enjoys several appealing properties, including the existence of an efficient solution in each iteration and a theoretical guarantee on the convergence of this greedy approach. Experiments on large-scale synthetic and real-world datasets including thousands of millions of variables show that the COP method is faster than the state-of-the-art techniques for the inverse covariance estimation problem when achieving comparable log-likelihood on test data.
A general methodology for maximum likelihood inference from band-recovery data
Conroy, M.J.; Williams, B.K.
1984-01-01
A numerical procedure is described for obtaining maximum likelihood estimates and associated maximum likelihood inference from band-recovery data. The method is used to illustrate previously developed one-age-class band-recovery models, and is extended to new models, including the analysis with a covariate for survival rates and variable-time-period recovery models. Extensions to R-age-class band-recovery, mark-recapture models, and twice-yearly marking are discussed. A FORTRAN program provides computations for these models.
Program for Weibull Analysis of Fatigue Data
NASA Technical Reports Server (NTRS)
Krantz, Timothy L.
2005-01-01
A Fortran computer program has been written for performing statistical analyses of fatigue-test data that are assumed to be adequately represented by a two-parameter Weibull distribution. This program calculates the following: (1) Maximum-likelihood estimates of the Weibull distribution; (2) Data for contour plots of relative likelihood for two parameters; (3) Data for contour plots of joint confidence regions; (4) Data for the profile likelihood of the Weibull-distribution parameters; (5) Data for the profile likelihood of any percentile of the distribution; and (6) Likelihood-based confidence intervals for parameters and/or percentiles of the distribution. The program can account for tests that are suspended without failure (the statistical term for such suspension of tests is "censoring"). The analytical approach followed in this program for the software is valid for type-I censoring, which is the removal of unfailed units at pre-specified times. Confidence regions and intervals are calculated by use of the likelihood-ratio method.
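The core calculation, the censored Weibull log-likelihood and its maximization, can be reproduced outside Fortran; the sketch below fits shape and scale to hypothetical fatigue lives with type-I (time-censored) suspensions. All data values are invented for illustration.

```python
import numpy as np
from scipy.optimize import minimize

def weibull_loglik(params, t, failed):
    """Two-parameter Weibull log-likelihood with right (type-I) censoring:
    failures contribute the density, suspensions the survival function."""
    shape, scale = params
    if shape <= 0 or scale <= 0:
        return -np.inf
    z = t / scale
    logf = np.log(shape / scale) + (shape - 1) * np.log(z) - z**shape
    logS = -z**shape
    return np.sum(np.where(failed, logf, logS))

# Hypothetical fatigue lives (cycles); tests suspended at 2e6 cycles are censored.
t = np.array([0.4, 0.7, 0.9, 1.1, 1.3, 1.6, 2.0, 2.0, 2.0]) * 1e6
failed = np.array([1, 1, 1, 1, 1, 1, 0, 0, 0], dtype=bool)

res = minimize(lambda p: -weibull_loglik(p, t, failed), x0=[1.5, 1.5e6],
               method="Nelder-Mead")
shape_hat, scale_hat = res.x
print(f"shape = {shape_hat:.2f}, scale = {scale_hat:.3g} cycles")
```

Likelihood-ratio confidence intervals of the kind the program tabulates can be obtained from the same function by profiling one parameter and comparing 2*(maximum log-likelihood minus profile log-likelihood) with the appropriate chi-square quantile.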
Benedict, Matthew N.; Mundy, Michael B.; Henry, Christopher S.; Chia, Nicholas; Price, Nathan D.
2014-01-01
Genome-scale metabolic models provide a powerful means to harness information from genomes to deepen biological insights. With exponentially increasing sequencing capacity, there is an enormous need for automated reconstruction techniques that can provide more accurate models in a short time frame. Current methods for automated metabolic network reconstruction rely on gene and reaction annotations to build draft metabolic networks and algorithms to fill gaps in these networks. However, automated reconstruction is hampered by database inconsistencies, incorrect annotations, and gap filling largely without considering genomic information. Here we develop an approach for applying genomic information to predict alternative functions for genes and estimate their likelihoods from sequence homology. We show that computed likelihood values were significantly higher for annotations found in manually curated metabolic networks than those that were not. We then apply these alternative functional predictions to estimate reaction likelihoods, which are used in a new gap filling approach called likelihood-based gap filling to predict more genomically consistent solutions. To validate the likelihood-based gap filling approach, we applied it to models where essential pathways were removed, finding that likelihood-based gap filling identified more biologically relevant solutions than parsimony-based gap filling approaches. We also demonstrate that models gap filled using likelihood-based gap filling provide greater coverage and genomic consistency with metabolic gene functions compared to parsimony-based approaches. Interestingly, despite these findings, we found that likelihoods did not significantly affect consistency of gap filled models with Biolog and knockout lethality data. This indicates that the phenotype data alone cannot necessarily be used to discriminate between alternative solutions for gap filling and therefore, that the use of other information is necessary to obtain a more accurate network. All described workflows are implemented as part of the DOE Systems Biology Knowledgebase (KBase) and are publicly available via API or command-line web interface. PMID:25329157
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pražnikar, Jure (University of Primorska); Turk, Dušan, E-mail: dusan.turk@ijs.si
2014-12-01
The maximum-likelihood free-kick target, which calculates model error estimates from the work set and a randomly displaced model, proved superior in the accuracy and consistency of refinement of crystal structures compared with the maximum-likelihood cross-validation target, which calculates error estimates from the test set and the unperturbed model. The refinement of a molecular model is a computational procedure by which the atomic model is fitted to the diffraction data. The commonly used target in the refinement of macromolecular structures is the maximum-likelihood (ML) function, which relies on the assessment of model errors. The current ML functions rely on cross-validation. They utilize phase-error estimates that are calculated from a small fraction of diffraction data, called the test set, that are not used to fit the model. An approach has been developed that uses the work set to calculate the phase-error estimates in the ML refinement from simulating the model errors via the random displacement of atomic coordinates. It is called ML free-kick refinement as it uses the ML formulation of the target function and is based on the idea of freeing the model from the model bias imposed by the chemical energy restraints used in refinement. This approach for the calculation of error estimates is superior to the cross-validation approach: it reduces the phase error and increases the accuracy of molecular models, is more robust, provides clearer maps and may use a smaller portion of data for the test set for the calculation of R_free or may leave it out completely.
Script-theory virtual case: A novel tool for education and research.
Hayward, Jake; Cheung, Amandy; Velji, Alkarim; Altarejos, Jenny; Gill, Peter; Scarfe, Andrew; Lewis, Melanie
2016-11-01
Context/Setting: The script theory of diagnostic reasoning proposes that clinicians evaluate cases in the context of an "illness script," iteratively testing internal hypotheses against new information eventually reaching a diagnosis. We present a novel tool for teaching diagnostic reasoning to undergraduate medical students based on an adaptation of script theory. We developed a virtual patient case that used clinically authentic audio and video, interactive three-dimensional (3D) body images, and a simulated electronic medical record. Next, we used interactive slide bars to record respondents' likelihood estimates of diagnostic possibilities at various stages of the case. Responses were dynamically compared to data from expert clinicians and peers. Comparative frequency distributions were presented to the learner and final diagnostic likelihood estimates were analyzed. Detailed student feedback was collected. Over two academic years, 322 students participated. Student diagnostic likelihood estimates were similar year to year, but were consistently different from expert clinician estimates. Student feedback was overwhelmingly positive: students found the case was novel, innovative, clinically authentic, and a valuable learning experience. We demonstrate the successful implementation of a novel approach to teaching diagnostic reasoning. Future study may delineate reasoning processes associated with differences between novice and expert responses.
Shen, Yi; Dai, Wei; Richards, Virginia M
2015-03-01
A MATLAB toolbox for the efficient estimation of the threshold, slope, and lapse rate of the psychometric function is described. The toolbox enables the efficient implementation of the updated maximum-likelihood (UML) procedure. The toolbox uses an object-oriented architecture for organizing the experimental variables and computational algorithms, which provides experimenters with flexibility in experimental design and data management. Descriptions of the UML procedure and the UML Toolbox are provided, followed by toolbox use examples. Finally, guidelines and recommendations of parameter configurations are given.
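The underlying likelihood is easy to state outside MATLAB; the following Python sketch fits the threshold, slope, and lapse rate of a logistic psychometric function to simulated two-alternative forced-choice data by batch maximum likelihood. It is not the UML procedure itself, which updates these estimates trial by trial and uses them to place the next stimulus; all parameter values are illustrative.

```python
import numpy as np
from scipy.optimize import minimize

def psychometric(x, alpha, beta, lam, gamma=0.5):
    """Logistic psychometric function with guess rate `gamma` and lapse rate `lam`."""
    return gamma + (1 - gamma - lam) / (1 + np.exp(-beta * (x - alpha)))

def fit_psychometric(x, correct):
    # Negative log-likelihood of Bernoulli trials under the psychometric model.
    def nll(p):
        alpha, beta, lam = p
        if beta <= 0 or not 0 <= lam <= 0.1:
            return np.inf
        pc = np.clip(psychometric(x, alpha, beta, lam), 1e-9, 1 - 1e-9)
        return -np.sum(np.where(correct, np.log(pc), np.log(1 - pc)))
    return minimize(nll, x0=[np.median(x), 1.0, 0.02], method="Nelder-Mead").x

# Simulated 2AFC responses at a handful of stimulus levels.
rng = np.random.default_rng(4)
levels = np.repeat(np.linspace(-3, 3, 7), 40)
resp = rng.random(levels.size) < psychometric(levels, alpha=0.5, beta=1.5, lam=0.02)
print("threshold, slope, lapse:", fit_psychometric(levels, resp).round(2))
```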
Wald Sequential Probability Ratio Test for Analysis of Orbital Conjunction Data
NASA Technical Reports Server (NTRS)
Carpenter, J. Russell; Markley, F. Landis; Gold, Dara
2013-01-01
We propose a Wald Sequential Probability Ratio Test for analysis of commonly available predictions associated with spacecraft conjunctions. Such predictions generally consist of a relative state and relative state error covariance at the time of closest approach, under the assumption that prediction errors are Gaussian. We show that under these circumstances, the likelihood ratio of the Wald test reduces to an especially simple form, involving the current best estimate of collision probability, and a similar estimate of collision probability that is based on prior assumptions about the likelihood of collision.
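The sequential test itself is compact; the sketch below accumulates per-update log-likelihood ratios and stops at Wald's thresholds. In the conjunction setting described in the abstract each increment would be built from the current and prior-based collision probability estimates; the numeric increments and error rates in the example are hypothetical.

```python
import numpy as np

def wald_sprt(loglik_ratios, alpha=1e-4, beta=1e-2):
    """Wald sequential probability ratio test: accumulate log-likelihood ratios
    (here, collision vs. no-collision hypotheses) and stop at Wald's thresholds."""
    upper = np.log((1 - beta) / alpha)      # accept H1 (collision hypothesis)
    lower = np.log(beta / (1 - alpha))      # accept H0 (miss hypothesis)
    llr = 0.0
    for k, step in enumerate(loglik_ratios, start=1):
        llr += step
        if llr >= upper:
            return "accept H1", k
        if llr <= lower:
            return "accept H0", k
    return "continue", len(loglik_ratios)

# Hypothetical per-update log-likelihood ratios from successive conjunction screenings;
# positive values favour the collision hypothesis.
print(wald_sprt(np.array([0.5, 1.2, 2.0, 3.5, 4.0])))
```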
NASA Astrophysics Data System (ADS)
Orans, Ren
1990-10-01
Existing procedures used to develop marginal costs for electric utilities were not designed for applications in an increasingly competitive market for electric power. The utility's value of receiving power, or the costs of selling power, however, depend on the exact location of the buyer or seller, the magnitude of the power and the period of time over which the power is used. Yet no electric utility in the United States has disaggregate marginal costs that reflect differences in costs due to the time, size or location of the load associated with their power or energy transactions. The existing marginal costing methods used by electric utilities were developed in response to the Public Utilities Regulatory Policy Act (PURPA) in 1978. The "ratemaking standards" (Title 1) established by PURPA were primarily concerned with the appropriate segmentation of total revenues to various classes-of-service, designing time-of-use rating periods, and the promotion of efficient long-term resource planning. By design, the methods were very simple and inexpensive to implement. Now, more than a decade later, the costing issues facing electric utilities are becoming increasingly complex, and the benefits of developing more specific marginal costs will outweigh the costs of developing this information in many cases. This research develops a framework for estimating total marginal costs that vary by the size, timing, and the location of changes in loads within an electric distribution system. To complement the existing work at the Electric Power Research Institute (EPRI) and Pacific Gas and Electric Company (PGandE) on estimating disaggregate generation and transmission capacity costs, this dissertation focuses on the estimation of distribution capacity costs. While the costing procedure is suitable for the estimation of total (generation, transmission and distribution) marginal costs, the empirical work focuses on the geographic disaggregation of marginal costs related to electric utility distribution investment. The study makes use of data from an actual distribution planning area, located within PGandE's service territory, to demonstrate the important characteristics of this new costing approach. The most significant result of this empirical work is that geographic differences in the cost of capacity in distribution systems can be as much as four times larger than the current system average utility estimates. Furthermore, lumpy capital investment patterns can lead to significant cost differences over time.
Bayesian Methods for the Physical Sciences. Learning from Examples in Astronomy and Physics.
NASA Astrophysics Data System (ADS)
Andreon, Stefano; Weaver, Brian
2015-05-01
Chapter 1: This chapter presents some basic steps for performing a good statistical analysis, all summarized in about one page. Chapter 2: This short chapter introduces the basics of probability theory in an intuitive fashion using simple examples. It also illustrates, again with examples, how to propagate errors and the difference between marginal and profile likelihoods. Chapter 3: This chapter introduces the computational tools and methods that we use for sampling from the posterior distribution. Since all numerical computations, and Bayesian ones are no exception, may end in errors, we also provide a few tips to check that the numerical computation is sampling from the posterior distribution. Chapter 4: Many of the concepts of building, running, and summarizing the results of a Bayesian analysis are described with this step-by-step guide using a basic (Gaussian) model. The chapter also introduces examples using Poisson and Binomial likelihoods, and how to combine repeated independent measurements. Chapter 5: All statistical analyses make assumptions, and Bayesian analyses are no exception. This chapter emphasizes that results depend on data and priors (assumptions). We illustrate this concept with examples where the prior plays greatly different roles, from major to negligible. We also provide some advice on how to look for information useful for sculpting the prior. Chapter 6: In this chapter we consider examples for which we want to estimate more than a single parameter. These common problems include estimating location and spread. We also consider examples that require the modeling of two populations (one we are interested in and a nuisance population) or averaging incompatible measurements. We also introduce quite complex examples dealing with upper limits and with a larger-than-expected scatter. Chapter 7: Rarely is a sample randomly selected from the population we wish to study. Often, samples are affected by selection effects, e.g., easier-to-collect events or objects are over-represented in samples and difficult-to-collect ones are under-represented if not missing altogether. In this chapter we show how to account for non-random data collection to infer the properties of the population from the studied sample. Chapter 8: In this chapter we introduce regression models, i.e., how to fit (regress) one or more quantities against each other through a functional relationship and estimate any unknown parameters that dictate this relationship. Questions of interest include: how to deal with samples affected by selection effects? How does a rich data structure influence the fitted parameters? And what about non-linear multiple-predictor fits, upper/lower limits, measurement errors of different amplitudes and an intrinsic variety in the studied populations or an extra source of variability? A number of examples illustrate how to answer these questions and how to predict the value of an unavailable quantity by exploiting the existence of a trend with another, available, quantity. Chapter 9: This chapter provides some advice on how the careful scientist should perform model checking and sensitivity analysis, i.e., how to answer the following questions: is the considered model at odds with the current available data (the fitted data), for example because it is over-simplified compared to some specific complexity pointed out by the data? Furthermore, are the data informative about the quantity being measured or are results sensibly dependent on details of the fitted model?
And, finally, what about if assumptions are uncertain? A number of examples illustrate how to answer these questions. Chapter 10: This chapter compares the performance of Bayesian methods against simple, non-Bayesian alternatives, such as maximum likelihood, minimal chi square, ordinary and weighted least square, bivariate correlated errors and intrinsic scatter, and robust estimates of location and scale. Performances are evaluated in terms of quality of the prediction, accuracy of the estimates, and fairness and noisiness of the quoted errors. We also focus on three failures of maximum likelihood methods occurring with small samples, with mixtures, and with regressions with errors in the predictor quantity.
Proportion estimation using prior cluster purities
NASA Technical Reports Server (NTRS)
Terrell, G. R. (Principal Investigator)
1980-01-01
The prior distribution of CLASSY component purities is studied, and this information incorporated into maximum likelihood crop proportion estimators. The method is tested on Transition Year spring small grain segments.
Hurdle models for multilevel zero-inflated data via h-likelihood.
Molas, Marek; Lesaffre, Emmanuel
2010-12-30
Count data often exhibit overdispersion. One type of overdispersion arises when there is an excess of zeros in comparison with the standard Poisson distribution. Zero-inflated Poisson and hurdle models have been proposed to perform a valid likelihood-based analysis to account for the surplus of zeros. Further, data often arise in clustered, longitudinal or multiple-membership settings. The proper analysis needs to reflect the design of a study. Typically random effects are used to account for dependencies in the data. We examine the h-likelihood estimation and inference framework for hurdle models with random effects for complex designs. We extend the h-likelihood procedures to fit hurdle models, thereby extending h-likelihood to truncated distributions. Two applications of the methodology are presented. Copyright © 2010 John Wiley & Sons, Ltd.
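For reference, the fixed-effects hurdle Poisson likelihood (without the random effects that the h-likelihood machinery adds) can be written and maximized directly; the sketch below uses a Bernoulli hurdle for zeros and a zero-truncated Poisson for positives, with a deliberately crude simulation for illustration only.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import gammaln

def hurdle_poisson_nll(params, y):
    """Negative log-likelihood of an intercept-only hurdle Poisson model."""
    logit_p, log_mu = params
    p = 1 / (1 + np.exp(-logit_p))          # P(y > 0)
    mu = np.exp(log_mu)
    zero = y == 0
    ll_zero = np.log(1 - p)
    # zero-truncated Poisson: f(y) = exp(-mu) mu^y / (y! (1 - exp(-mu)))
    ll_pos = (np.log(p) - mu + y[~zero] * np.log(mu)
              - gammaln(y[~zero] + 1) - np.log1p(-np.exp(-mu)))
    return -(zero.sum() * ll_zero + ll_pos.sum())

rng = np.random.default_rng(5)
n = 1000
at_hurdle = rng.random(n) < 0.6              # 60% of observations clear the hurdle
pos = rng.poisson(2.5, n)
pos[pos == 0] = 1                            # crude zero-truncation for the sketch
y = np.where(at_hurdle, pos, 0)

res = minimize(hurdle_poisson_nll, x0=[0.0, 0.0], args=(y,))
p_hat = 1 / (1 + np.exp(-res.x[0])); mu_hat = np.exp(res.x[1])
print(f"P(y>0) = {p_hat:.2f}, truncated-Poisson rate = {mu_hat:.2f}")
```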
Verifiable Adaptive Control with Analytical Stability Margins by Optimal Control Modification
NASA Technical Reports Server (NTRS)
Nguyen, Nhan T.
2010-01-01
This paper presents a verifiable model-reference adaptive control method based on an optimal control formulation for linear uncertain systems. A predictor model is formulated to enable a parameter estimation of the system parametric uncertainty. The adaptation is based on both the tracking error and predictor error. Using a singular perturbation argument, it can be shown that the closed-loop system tends to a linear time invariant model asymptotically under an assumption of fast adaptation. A stability margin analysis is given to estimate a lower bound of the time delay margin using a matrix measure method. Using this analytical method, the free design parameter n of the optimal control modification adaptive law can be determined to meet a specification of stability margin for verification purposes.
User's manual for MMLE3, a general FORTRAN program for maximum likelihood parameter estimation
NASA Technical Reports Server (NTRS)
Maine, R. E.; Iliff, K. W.
1980-01-01
A user's manual for the FORTRAN IV computer program MMLE3 is described. It is a maximum likelihood parameter estimation program capable of handling general bilinear dynamic equations of arbitrary order with measurement noise and/or state noise (process noise). The theory and use of the program are described. The basic MMLE3 program is quite general and, therefore, applicable to a wide variety of problems. The basic program can interact with a set of user-written, problem-specific routines to simplify the use of the program on specific systems. A set of user routines for the aircraft stability and control derivative estimation problem is provided with the program.
Multilevel modeling of single-case data: A comparison of maximum likelihood and Bayesian estimation.
Moeyaert, Mariola; Rindskopf, David; Onghena, Patrick; Van den Noortgate, Wim
2017-12-01
The focus of this article is to describe Bayesian estimation, including construction of prior distributions, and to compare parameter recovery under the Bayesian framework (using weakly informative priors) and the maximum likelihood (ML) framework in the context of multilevel modeling of single-case experimental data. Bayesian estimation results were found similar to ML estimation results in terms of the treatment effect estimates, regardless of the functional form and degree of information included in the prior specification in the Bayesian framework. In terms of the variance component estimates, both the ML and Bayesian estimation procedures result in biased and less precise variance estimates when the number of participants is small (i.e., 3). By increasing the number of participants to 5 or 7, the relative bias is close to 5% and more precise estimates are obtained for all approaches, except for the inverse-Wishart prior using the identity matrix. When a more informative prior was added, more precise estimates for the fixed effects and random effects were obtained, even when only 3 participants were included. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
Dorazio, R.M.; Rago, P.J.
1991-01-01
We simulated mark–recapture experiments to evaluate a method for estimating fishing mortality and migration rates of populations stratified at release and recovery. When fish released in two or more strata were recovered from different recapture strata in nearly the same proportions, conditional recapture probabilities were estimated outside the [0, 1] interval. The maximum likelihood estimates tended to be biased and imprecise when the patterns of recaptures produced extremely "flat" likelihood surfaces. Absence of bias was not guaranteed, however, in experiments where recapture rates could be estimated within the [0, 1] interval. Inadequate numbers of tag releases and recoveries also produced biased estimates, although the bias was easily detected by the high sampling variability of the estimates. A stratified tag–recapture experiment with sockeye salmon (Oncorhynchus nerka) was used to demonstrate procedures for analyzing data that produce biased estimates of recapture probabilities. An estimator was derived to examine the sensitivity of recapture rate estimates to assumed differences in natural and tagging mortality, tag loss, and incomplete reporting of tag recoveries.
Ridge-trench collision in Archean and Post-Archean crustal growth: Evidence from southern Chile
NASA Technical Reports Server (NTRS)
Nelson, E. P.; Forsythe, R. D.
1988-01-01
The growth of continental crust at convergent plate margins involves both continuous and episodic processes. Ridge-trench collision is one episodic process that can cause significant magmatic and tectonic effects on convergent plate margins. Because the sites of ridge collision (ridge-trench triple junctions) generally migrate along convergent plate boundaries, the effects of ridge collision will be highly diachronous in Andean-type orogenic belts and may not be adequately recognized in the geologic record. The Chile margin triple junction (CMTJ, 46 deg S), where the actively spreading Chile rise is colliding with the sediment-filled Peru-Chile trench, is geometrically and kinematically the simplest modern example of ridge collision. The south Chile margin illustrates the importance of the ridge-collision tectonic setting in crustal evolution at convergent margins. Similarities between ridge-collision features in southern Chile and features of Archean greenstone belts raise the question of the importance of ridge collision in Archean crustal growth. Archean plate tectonic processes were probably different than today; these differences may have affected the nature and importance of ridge collision during Archean crustal growth. In conclusion, it is suggested that smaller plates, greater ridge length, and/or faster spreading all point to the likelihood that ridge collision played a greater role in crustal growth and development of the greenstone-granite terranes during the Archean. However, the effects of modern ridge collision, and the processes involved, are not well enough known to develop specific models for Archean ridge collision.
Fine Mapping Causal Variants with an Approximate Bayesian Method Using Marginal Test Statistics.
Chen, Wenan; Larrabee, Beth R; Ovsyannikova, Inna G; Kennedy, Richard B; Haralambieva, Iana H; Poland, Gregory A; Schaid, Daniel J
2015-07-01
Two recently developed fine-mapping methods, CAVIAR and PAINTOR, demonstrate better performance over other fine-mapping methods. They also have the advantage of using only the marginal test statistics and the correlation among SNPs. Both methods leverage the fact that the marginal test statistics asymptotically follow a multivariate normal distribution and are likelihood based. However, their relationship with Bayesian fine mapping, such as BIMBAM, is not clear. In this study, we first show that CAVIAR and BIMBAM are actually approximately equivalent to each other. This leads to a fine-mapping method using marginal test statistics in the Bayesian framework, which we call CAVIAR Bayes factor (CAVIARBF). Another advantage of the Bayesian framework is that it can answer both association and fine-mapping questions. We also used simulations to compare CAVIARBF with other methods under different numbers of causal variants. The results showed that both CAVIARBF and BIMBAM have better performance than PAINTOR and other methods. Compared to BIMBAM, CAVIARBF has the advantage of using only marginal test statistics and takes about one-quarter to one-fifth of the running time. We applied different methods on two independent cohorts of the same phenotype. Results showed that CAVIARBF, BIMBAM, and PAINTOR selected the same top 3 SNPs; however, CAVIARBF and BIMBAM had better consistency in selecting the top 10 ranked SNPs between the two cohorts. Software is available at https://bitbucket.org/Wenan/caviarbf. Copyright © 2015 by the Genetics Society of America.
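For a single SNP, the kind of Bayes factor that can be computed from marginal statistics alone is a Wakefield-style approximation; the sketch below implements that single-SNP case under an assumed N(0, w) prior on the effect size, whereas CAVIARBF extends the calculation to sets of SNPs using the LD (correlation) matrix, which this sketch omits. The summary statistics shown are hypothetical.

```python
import numpy as np

def single_snp_abf(z, v, w=0.04):
    """Approximate Bayes factor (H1 vs H0) from a marginal association statistic:
    z = beta_hat / se, v = se^2, and a N(0, w) prior on the effect size."""
    r = w / (v + w)
    return np.sqrt(1 - r) * np.exp(0.5 * z**2 * r)

# Example: summary statistics for three hypothetical SNPs.
z = np.array([1.1, 4.2, 2.0])
se = np.array([0.10, 0.11, 0.09])
print(single_snp_abf(z, se**2).round(2))
```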
Quasar microlensing models with constraints on the Quasar light curves
NASA Astrophysics Data System (ADS)
Tie, S. S.; Kochanek, C. S.
2018-01-01
Quasar microlensing analyses implicitly generate a model of the variability of the source quasar. The implied source variability may be unrealistic, yet its likelihood is generally not evaluated. We used the damped random walk (DRW) model for quasar variability to evaluate the likelihood of the source variability and applied the revised algorithm to a microlensing analysis of the lensed quasar RX J1131-1231. We compared estimates of the size of the quasar disc and the average stellar mass of the lens galaxy with and without applying the DRW likelihoods for the source variability model and found no significant effect on the estimated physical parameters. The most likely explanation is that unrealistic source light-curve models are generally associated with poor microlensing fits that already make a negligible contribution to the probability distributions of the derived parameters.
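Evaluating the DRW likelihood of a proposed source light curve amounts to a Gaussian-process likelihood with an exponential covariance; a direct (O(n^3)) sketch is shown below with hypothetical light-curve values and parameters (many production codes use faster recursive evaluations instead).

```python
import numpy as np

def drw_loglike(t, y, yerr, sigma, tau, mean):
    """Gaussian log-likelihood of a light curve under the damped random walk
    (Ornstein-Uhlenbeck) model: cov(t_i, t_j) = sigma^2 exp(-|t_i - t_j| / tau)."""
    dt = np.abs(t[:, None] - t[None, :])
    cov = sigma**2 * np.exp(-dt / tau) + np.diag(yerr**2)
    resid = y - mean
    _, logdet = np.linalg.slogdet(cov)
    alpha = np.linalg.solve(cov, resid)
    return -0.5 * (resid @ alpha + logdet + len(y) * np.log(2 * np.pi))

# Hypothetical sparsely sampled light curve (arbitrary magnitudes and errors).
rng = np.random.default_rng(6)
t = np.sort(rng.uniform(0, 1000, 60))
y = 19.0 + 0.2 * rng.standard_normal(60)
print(drw_loglike(t, y, yerr=np.full(60, 0.05), sigma=0.2, tau=200.0, mean=19.0))
```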
Exploring the Subtleties of Inverse Probability Weighting and Marginal Structural Models.
Breskin, Alexander; Cole, Stephen R; Westreich, Daniel
2018-05-01
Since being introduced to epidemiology in 2000, marginal structural models have become a commonly used method for causal inference in a wide range of epidemiologic settings. In this brief report, we aim to explore three subtleties of marginal structural models. First, we distinguish marginal structural models from the inverse probability weighting estimator, and we emphasize that marginal structural models are not only for longitudinal exposures. Second, we explore the meaning of the word "marginal" in "marginal structural model." Finally, we show that the specification of a marginal structural model can have important implications for the interpretation of its parameters. Each of these concepts has important implications for the use and understanding of marginal structural models, and thus providing detailed explanations of them may lead to better practices for the field of epidemiology.
Estimation of Multinomial Probabilities.
1978-11-01
… (1971) and Alam (1978) have shown that the maximum likelihood estimator is admissible with respect to the quadratic loss. Steinhaus (1957) and Trybula …
Fitting power-laws in empirical data with estimators that work for all exponents
Hanel, Rudolf; Corominas-Murtra, Bernat; Liu, Bo; Thurner, Stefan
2017-01-01
Most standard methods based on maximum likelihood (ML) estimates of power-law exponents can only be reliably used to identify exponents smaller than minus one. The argument that power laws are otherwise not normalizable depends on the underlying sample space the data is drawn from, and is true only for sample spaces that are unbounded from above. Power laws obtained from bounded sample spaces (as is the case for practically all data-related problems) are always free of such limitations, and maximum likelihood estimates can be obtained for arbitrary powers without restrictions. Here we first derive the appropriate ML estimator for arbitrary exponents of power-law distributions on bounded discrete sample spaces. We then show that an almost identical estimator also works perfectly for continuous data. We implemented this ML estimator and compare its performance with that of previous approaches. We present a general recipe for how to use these estimators and provide the associated computer codes. PMID:28245249
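A minimal sketch of the idea, under assumed toy data (not the authors' published code): on a bounded support {x_min, …, x_max} the normalization sum is finite for any exponent, so the negative log-likelihood can simply be minimized over an unrestricted range of exponents.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def powerlaw_mle(data, x_min, x_max):
    """ML estimate of lam for p(x) proportional to x**(-lam) on {x_min..x_max}."""
    support = np.arange(x_min, x_max + 1)

    def neg_loglike(lam):
        log_z = np.log(np.sum(support.astype(float) ** (-lam)))  # finite for any lam on bounded support
        return lam * np.sum(np.log(data)) + len(data) * log_z

    res = minimize_scalar(neg_loglike, bounds=(-5.0, 5.0), method="bounded")
    return res.x

# toy data drawn from a bounded power law with exponent 0.5 (< 1, still estimable here)
rng = np.random.default_rng(1)
support = np.arange(1, 101)
probs = support.astype(float) ** -0.5
probs /= probs.sum()
sample = rng.choice(support, size=2000, p=probs)
print(powerlaw_mle(sample, 1, 100))
```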
Fang, Yun; Wu, Hulin; Zhu, Li-Xing
2011-07-01
We propose a two-stage estimation method for random coefficient ordinary differential equation (ODE) models. A maximum pseudo-likelihood estimator (MPLE) is derived based on a mixed-effects modeling approach and its asymptotic properties for population parameters are established. The proposed method does not require repeatedly solving ODEs and is computationally efficient, although this comes at the price of some loss in estimation efficiency. However, the method offers an alternative approach when the exact likelihood approach fails due to model complexity and a high-dimensional parameter space, and it can also serve as a way to obtain starting estimates for more accurate estimation methods. In addition, the proposed method does not need to specify the initial values of state variables and preserves all the advantages of the mixed-effects modeling approach. The finite sample properties of the proposed estimator are studied via Monte Carlo simulations and the methodology is also illustrated with application to an AIDS clinical data set.
Cohn, Timothy A.
2005-01-01
This paper presents an adjusted maximum likelihood estimator (AMLE) that can be used to estimate fluvial transport of contaminants, like phosphorus, that are subject to censoring because of analytical detection limits. The AMLE is a generalization of the widely accepted minimum variance unbiased estimator (MVUE), and Monte Carlo experiments confirm that it shares essentially all of the MVUE's desirable properties, including high efficiency and negligible bias. In particular, the AMLE exhibits substantially less bias than alternative censored-data estimators such as the MLE (Tobit) or the MLE followed by a jackknife. As with the MLE and the MVUE, the AMLE comes close to achieving the theoretical Fréchet-Cramér-Rao bounds on its variance. This paper also presents a statistical framework, applicable to both censored and complete data, for understanding and estimating the components of uncertainty associated with load estimates. This can serve to lower the cost and improve the efficiency of both traditional and real-time water quality monitoring.
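For orientation, a minimal Tobit-style censored maximum likelihood fit is sketched below with simulated data; it illustrates the censored-data likelihood that the AMLE adjusts, not the AMLE itself, and all variable names and values are invented for the example.

```python
import numpy as np
from scipy.stats import norm
from scipy.optimize import minimize

def censored_lognormal_mle(y, censored, x):
    """ML fit of a log-linear model when some responses fall below a detection limit.
    For censored rows, y holds the log detection limit and the likelihood
    contribution is the normal CDF below it (the Tobit-style term)."""
    X1 = np.column_stack([np.ones(len(y)), x])

    def neg_loglike(theta):
        beta, log_sigma = theta[:-1], theta[-1]
        sigma = np.exp(log_sigma)
        resid = y - X1 @ beta
        ll_obs = norm.logpdf(resid[~censored], scale=sigma)    # fully observed values
        ll_cens = norm.logcdf(resid[censored], scale=sigma)    # "below detection limit" values
        return -(ll_obs.sum() + ll_cens.sum())

    theta0 = np.zeros(X1.shape[1] + 1)
    return minimize(neg_loglike, theta0, method="Nelder-Mead").x

# toy data: log-concentration vs. log-flow with a detection limit at -0.5
rng = np.random.default_rng(7)
logq = rng.normal(0, 1, 200)
logc = 0.5 + 0.8 * logq + rng.normal(0, 0.4, 200)
dl = -0.5
censored = logc < dl
y = np.where(censored, dl, logc)
print(censored_lognormal_mle(y, censored, logq))
```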
Unclassified Publications of Lincoln Laboratory, 1 January - 31 December 1990. Volume 16
1990-12-31
Using the β-binomial distribution to characterize forest health
S.J. Zarnoch; R.L. Anderson; R.M. Sheffield
1995-01-01
The β-binomial distribution is suggested as a model for describing and analyzing the dichotomous data obtained from programs monitoring the health of forests in the United States. Maximum likelihood estimation of the parameters is given as well as asymptotic likelihood ratio tests. The procedure is illustrated with data on dogwood anthracnose infection (caused...
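The following hedged sketch (using simulated counts, not the forest-health data) shows one way such a beta-binomial maximum likelihood fit could be carried out numerically.

```python
import numpy as np
from scipy.stats import betabinom
from scipy.optimize import minimize

def fit_betabinom(counts, n):
    """ML estimates of (alpha, beta) for beta-binomial counts out of n trials each."""
    def neg_loglike(log_params):
        a, b = np.exp(log_params)          # keep the shape parameters positive
        return -np.sum(betabinom.logpmf(counts, n, a, b))
    res = minimize(neg_loglike, x0=np.log([1.0, 1.0]), method="Nelder-Mead")
    return np.exp(res.x)

# toy data: infected trees out of n=10 per plot, overdispersed relative to a binomial
rng = np.random.default_rng(2)
p = rng.beta(2.0, 5.0, size=200)
counts = rng.binomial(10, p)
print(fit_betabinom(counts, 10))
```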
Power and Sample Size Calculations for Logistic Regression Tests for Differential Item Functioning
ERIC Educational Resources Information Center
Li, Zhushan
2014-01-01
Logistic regression is a popular method for detecting uniform and nonuniform differential item functioning (DIF) effects. Theoretical formulas for the power and sample size calculations are derived for likelihood ratio tests and Wald tests based on the asymptotic distribution of the maximum likelihood estimators for the logistic regression model.…
Bias and Efficiency in Structural Equation Modeling: Maximum Likelihood versus Robust Methods
ERIC Educational Resources Information Center
Zhong, Xiaoling; Yuan, Ke-Hai
2011-01-01
In the structural equation modeling literature, the normal-distribution-based maximum likelihood (ML) method is most widely used, partly because the resulting estimator is claimed to be asymptotically unbiased and most efficient. However, this may not hold when data deviate from normal distribution. Outlying cases or nonnormally distributed data,…
A time series intervention analysis (TSIA) of dendrochronological data to infer the tree growth-climate-disturbance relations and forest disturbance history is described. Maximum likelihood is used to estimate the parameters of a structural time series model with components for ...
An improved method for bivariate meta-analysis when within-study correlations are unknown.
Hong, Chuan; D Riley, Richard; Chen, Yong
2018-03-01
Multivariate meta-analysis, which jointly analyzes multiple and possibly correlated outcomes in a single analysis, has become increasingly popular in recent years. An attractive feature of multivariate meta-analysis is its ability to account for the dependence between multiple estimates from the same study. However, standard inference procedures for multivariate meta-analysis require knowledge of the within-study correlations, which are usually unavailable. This limits standard inference approaches in practice. Riley et al. proposed a working model and an overall synthesis correlation parameter to account for the marginal correlation between outcomes, where the only data needed are those required for a separate univariate random-effects meta-analysis. As within-study correlations are not required, the Riley method is applicable to a wide variety of evidence synthesis situations. However, the standard variance estimator of the Riley method is not entirely correct under many important settings. As a consequence, the coverage of a function of pooled estimates may not reach the nominal level even when the number of studies in the multivariate meta-analysis is large. In this paper, we improve the Riley method by proposing a robust variance estimator, which is asymptotically correct even when the model is misspecified (i.e., when the likelihood function is incorrect). Simulation studies of a bivariate meta-analysis, in a variety of settings, show that a function of pooled estimates has improved performance when using the proposed robust variance estimator. In terms of the individual pooled estimates themselves, the standard variance estimator and the robust variance estimator give similar results to the original method, with appropriate coverage. The proposed robust variance estimator performs well when the number of studies is relatively large. Therefore, we recommend the use of the robust method for meta-analyses with a relatively large number of studies (e.g., m≥50). When the sample size is relatively small, we recommend the use of the robust method under the working independence assumption. We illustrate the proposed method through 2 meta-analyses. Copyright © 2017 John Wiley & Sons, Ltd.
NASA Astrophysics Data System (ADS)
Salvati, Paola; Bianchi, Cinzia; Hussin, Haydar; Guzzetti, Fausto
2013-04-01
Landslide and flood events in Italy cause wide and severe damage to buildings and infrastructure, and frequently result in the loss of human life. The cost estimates of past natural disasters generally refer to the amount of public money used for the restoration of the direct damage, and most commonly do not account for all disaster impacts. Other cost components, including indirect losses, are difficult to quantify and, among these, the cost of human lives. The value of a specific human life can be identified with the value of a statistical life (VSL), defined as the value that an individual places on a marginal change in their likelihood of death. This is different from the value of an actual life. Based on information on fatal car accidents in Italy, we evaluate the cost that society suffers for the loss of life due to landslide and flood events. Using a catalogue of fatal landslide and flood events, for which information about the gender and age of the fatalities is known, we determine the cost that society suffers for the loss of their lives. For this purpose, we calculate the economic value in terms of the total income that the working-age population involved in the fatal events would have earned over the course of their lives. For the computation, we use the pro-capita income calculated as the ratio between the GDP and the population of Italy for each year since 1980. Problems occur for children and retired people, whom we decided not to include in our estimates.
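A trivial worked sketch of the income-based computation described above, using an illustrative per-capita income and retirement age rather than the values used in the study:

```python
def lost_income(age_at_death, gdp_per_capita, retirement_age=65):
    """Illustrative societal cost of one working-age fatality: the per-capita income
    the person would have earned until retirement (constant income assumed)."""
    working_years = max(retirement_age - age_at_death, 0)
    return working_years * gdp_per_capita

# e.g. a 40-year-old fatality with an assumed per-capita GDP of 28,000 EUR/year
print(lost_income(40, 28_000))   # 700,000 EUR over 25 remaining working years
```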
NASA Astrophysics Data System (ADS)
Lange, J.; O'Shaughnessy, R.; Boyle, M.; Calderón Bustillo, J.; Campanelli, M.; Chu, T.; Clark, J. A.; Demos, N.; Fong, H.; Healy, J.; Hemberger, D. A.; Hinder, I.; Jani, K.; Khamesra, B.; Kidder, L. E.; Kumar, P.; Laguna, P.; Lousto, C. O.; Lovelace, G.; Ossokine, S.; Pfeiffer, H.; Scheel, M. A.; Shoemaker, D. M.; Szilagyi, B.; Teukolsky, S.; Zlochower, Y.
2017-11-01
We present and assess a Bayesian method to interpret gravitational wave signals from binary black holes. Our method directly compares gravitational wave data to numerical relativity (NR) simulations. In this study, we present a detailed investigation of the systematic and statistical parameter estimation errors of this method. This procedure bypasses approximations used in semianalytical models for compact binary coalescence. In this work, we use the full posterior parameter distribution for only generic nonprecessing binaries, drawing inferences away from the set of NR simulations used, via interpolation of a single scalar quantity (the marginalized log likelihood, ln L ) evaluated by comparing data to nonprecessing binary black hole simulations. We also compare the data to generic simulations, and discuss the effectiveness of this procedure for generic sources. We specifically assess the impact of higher order modes, repeating our interpretation with both l ≤2 as well as l ≤3 harmonic modes. Using the l ≤3 higher modes, we gain more information from the signal and can better constrain the parameters of the gravitational wave signal. We assess and quantify several sources of systematic error that our procedure could introduce, including simulation resolution and duration; most are negligible. We show through examples that our method can recover the parameters for equal mass, zero spin, GW150914-like, and unequal mass, precessing spin sources. Our study of this new parameter estimation method demonstrates that we can quantify and understand the systematic and statistical error. This method allows us to use higher order modes from numerical relativity simulations to better constrain the black hole binary parameters.
Berglund, Lars; Garmo, Hans; Lindbäck, Johan; Svärdsudd, Kurt; Zethelius, Björn
2008-09-30
The least-squares estimator of the slope in a simple linear regression model is biased towards zero when the predictor is measured with random error. A corrected slope may be estimated by adding data from a reliability study, which comprises a subset of subjects from the main study. The precision of this corrected slope depends on the design of the reliability study and the choice of estimator. Previous work has assumed that the reliability study constitutes a random sample from the main study. A more efficient design is to use subjects with extreme values on their first measurement. Previously, we published a variance formula for the corrected slope, when the correction factor is the slope in the regression of the second measurement on the first. In this paper we show that both designs are improved by maximum likelihood estimation (MLE). The precision gain is explained by the inclusion of data from all subjects for estimation of the predictor's variance and by the use of the second measurement for estimation of the covariance between response and predictor. The gain from MLE increases with a stronger true relationship between response and predictor and with lower precision in the predictor measurements. We present a real data example on the relationship between fasting insulin, a surrogate marker, and true insulin sensitivity measured by a gold-standard euglycaemic insulin clamp, and simulations, where the behavior of profile-likelihood-based confidence intervals is examined. MLE was shown to be a robust estimator for non-normal distributions and efficient in small-sample situations. Copyright (c) 2008 John Wiley & Sons, Ltd.
Likelihood Ratios for Glaucoma Diagnosis Using Spectral Domain Optical Coherence Tomography
Lisboa, Renato; Mansouri, Kaweh; Zangwill, Linda M.; Weinreb, Robert N.; Medeiros, Felipe A.
2014-01-01
Purpose To present a methodology for calculating likelihood ratios for glaucoma diagnosis for continuous retinal nerve fiber layer (RNFL) thickness measurements from spectral-domain optical coherence tomography (spectral-domain OCT). Design Observational cohort study. Methods 262 eyes of 187 patients with glaucoma and 190 eyes of 100 control subjects were included in the study. Subjects were recruited from the Diagnostic Innovations in Glaucoma Study. Eyes with preperimetric and perimetric glaucomatous damage were included in the glaucoma group. The control group was composed of healthy eyes with normal visual fields from subjects recruited from the general population. All eyes underwent RNFL imaging with Spectralis spectral-domain OCT. Likelihood ratios for glaucoma diagnosis were estimated for specific global RNFL thickness measurements using a methodology based on estimating the tangents to the receiver operating characteristic (ROC) curve. Results Likelihood ratios could be determined for continuous values of average RNFL thickness. Average RNFL thickness values lower than 86 μm were associated with positive LRs, i.e., LRs greater than 1, whereas RNFL thickness values higher than 86 μm were associated with negative LRs, i.e., LRs smaller than 1. A modified Fagan nomogram was provided to assist calculation of the post-test probability of disease from the calculated likelihood ratios and the pretest probability of disease. Conclusion The methodology allowed calculation of likelihood ratios for specific RNFL thickness values. By avoiding arbitrary categorization of test results, it potentially allows for an improved integration of test results into diagnostic clinical decision-making. PMID:23972303
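As an informal illustration of the idea, the likelihood ratio at a given test value equals the ratio of the densities of that value in diseased and healthy eyes (equivalently, the slope of the ROC tangent). The sketch below uses kernel density estimates and invented RNFL-like values; it is not the authors' tangent-based procedure.

```python
import numpy as np
from scipy.stats import gaussian_kde

def likelihood_ratio(x, diseased_values, healthy_values):
    """LR for a continuous test result: ratio of the estimated densities at x."""
    f_d = gaussian_kde(diseased_values)
    f_h = gaussian_kde(healthy_values)
    return f_d(x)[0] / f_h(x)[0]

def post_test_probability(pretest_p, lr):
    """Update a pretest probability with a likelihood ratio (Bayes' rule on odds)."""
    odds = pretest_p / (1 - pretest_p) * lr
    return odds / (1 + odds)

# toy RNFL-thickness-like data (microns); thinner values in the glaucoma group
rng = np.random.default_rng(3)
glaucoma = rng.normal(70, 12, 300)
healthy = rng.normal(97, 10, 300)
lr = likelihood_ratio(80.0, glaucoma, healthy)
print(lr, post_test_probability(0.20, lr))
```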
A comprehensive assessment of collision likelihood in Geosynchronous Earth Orbit
NASA Astrophysics Data System (ADS)
Oltrogge, D. L.; Alfano, S.; Law, C.; Cacioni, A.; Kelso, T. S.
2018-06-01
Knowing the likelihood of collision for satellites operating in Geosynchronous Earth Orbit (GEO) is of extreme importance and interest to the global community and the operators of GEO spacecraft. Yet for all of its importance, a comprehensive assessment of GEO collision likelihood is difficult to do and has never been done. In this paper, we employ six independent and diverse assessment methods to estimate GEO collision likelihood. Taken in aggregate, this comprehensive assessment offers new insights into GEO collision likelihood, with estimates that agree to within a factor of 3.5 of each other. These results are then compared to four collision and seven encounter rate estimates previously published. Collectively, these new findings indicate that collision likelihood in GEO is as much as four orders of magnitude higher than previously published by other researchers. Results indicate that a collision is likely to occur every 4 years for one satellite out of the entire GEO active satellite population against a 1 cm RSO catalogue, and every 50 years against a 20 cm RSO catalogue. Further, previous assertions that collision relative velocities are low (i.e., <1 km/s) in GEO are disproven, with some GEO relative velocities as high as 4 km/s identified. These new findings indicate that unless operators successfully mitigate this collision risk, the GEO orbital arc is and will remain at high risk of collision, with the potential for serious follow-on collision threats from post-collision debris when a substantial GEO collision occurs.
Optimism following a tornado disaster.
Suls, Jerry; Rose, Jason P; Windschitl, Paul D; Smith, Andrew R
2013-05-01
Effects of exposure to a severe weather disaster on perceived future vulnerability were assessed in college students, local residents contacted through random-digit dialing, and community residents of affected versus unaffected neighborhoods. Students and community residents reported being less vulnerable than their peers at 1 month, 6 months, and 1 year after the disaster. In Studies 1 and 2, absolute risk estimates were more optimistic with time, whereas comparative vulnerability was stable. Residents of affected neighborhoods (Study 3), surprisingly, reported less comparative vulnerability and lower "gut-level" numerical likelihood estimates at 6 months, but later their estimates resembled those of the unaffected residents. Likelihood estimates (10%-12%), however, exceeded the 1% risk calculated by storm experts, and gut-level estimates were more optimistic than statistical-level estimates. Although people believed they had approximately a 1-in-10 chance of injury from future tornadoes (i.e., an overestimate), they thought their risk was lower than their peers'.
Zou, W; Ouyang, H
2016-02-01
We propose a multiple estimation adjustment (MEA) method to correct effect overestimation due to selection bias from a hypothesis-generating study (HGS) in pharmacogenetics. MEA uses a hierarchical Bayesian approach to model individual effect estimates from maximum likelihood estimation (MLE) in a region jointly and shrinks them toward the regional effect. Unlike many methods that model a fixed selection scheme, MEA capitalizes on local multiplicity independent of selection. We compared mean square errors (MSEs) in simulated HGSs from naive MLE, MEA, and a conditional likelihood adjustment (CLA) method that models threshold selection bias. We observed that MEA effectively reduced the MSE from MLE on null effects with or without selection, and had a clear advantage over CLA on extreme MLE estimates from null effects under lenient threshold selection in small samples, which are common among 'top' associations from a pharmacogenetics HGS.
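A simplified sketch of the shrinkage idea follows: an empirical-Bayes-style approximation with an assumed between-variant variance, not the authors' full hierarchical model, and with invented estimates and standard errors.

```python
import numpy as np

def shrink_to_region(estimates, std_errors, tau2):
    """Shrink per-variant MLEs toward the regional mean effect.
    tau2 is an assumed between-variant variance within the region."""
    region_mean = np.average(estimates, weights=1.0 / (std_errors**2 + tau2))
    shrink = tau2 / (tau2 + std_errors**2)      # less shrinkage for precise estimates
    return shrink * estimates + (1 - shrink) * region_mean

est = np.array([0.90, 0.10, -0.05, 0.20])   # one extreme 'top' estimate among near-null effects
se = np.array([0.35, 0.15, 0.15, 0.15])
print(shrink_to_region(est, se, tau2=0.02))  # the extreme estimate is pulled toward the regional mean
```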
Estimation of parameters of dose volume models and their confidence limits
NASA Astrophysics Data System (ADS)
van Luijk, P.; Delvigne, T. C.; Schilstra, C.; Schippers, J. M.
2003-07-01
Predictions of the normal-tissue complication probability (NTCP) for the ranking of treatment plans are based on fits of dose-volume models to clinical and/or experimental data. In the literature several different fit methods are used. In this work, frequently used methods and techniques for fitting NTCP models to dose-response data to establish dose-volume effects are discussed. The techniques are tested for their usability with dose-volume data and NTCP models. Different methods to estimate the confidence intervals of the model parameters are part of this study. From a critical-volume (CV) model with biologically realistic parameters a primary dataset was generated, serving as the reference for this study and describable by the NTCP model. The CV model was fitted to this dataset. From the resulting parameters and the CV model, 1000 secondary datasets were generated by Monte Carlo simulation. All secondary datasets were fitted to obtain 1000 parameter sets of the CV model. Thus the 'real' spread in fit results due to statistical spreading in the data was obtained and compared with estimates of the confidence intervals obtained by different methods applied to the primary dataset. The confidence limits of the parameters of one dataset were estimated using three methods: the covariance matrix, the jackknife method, and direct inspection of the likelihood landscape. These results were compared with the spread of the parameters obtained from the secondary parameter sets. For the estimation of confidence intervals on NTCP predictions, three methods were tested. Firstly, propagation of errors using the covariance matrix was used. Secondly, the meaning of the width of a bundle of curves that resulted from parameters that were within the one-standard-deviation region in the likelihood space was investigated. Thirdly, many parameter sets and their likelihoods were used to create a likelihood-weighted probability distribution of the NTCP. It is concluded that for the type of dose-response data used here, only a full likelihood analysis will produce reliable results. The often-used approximations, such as the usage of the covariance matrix, produce inconsistent confidence limits on both the parameter sets and the resulting NTCP values.
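As an aside, confidence limits read directly from the likelihood landscape typically mean a profile-likelihood interval: all parameter values whose negative log-likelihood lies within a chi-square cutoff of the minimum. A toy sketch of that construction (with an invented quadratic likelihood, not an NTCP fit):

```python
import numpy as np

def likelihood_interval(neg_loglike, param_grid, level_delta=1.92):
    """Profile-likelihood confidence limits: parameter values whose negative
    log-likelihood lies within level_delta (~ chi2_1(0.95)/2) of the minimum."""
    nll = np.array([neg_loglike(p) for p in param_grid])
    inside = param_grid[nll <= nll.min() + level_delta]
    return inside.min(), inside.max()

# toy: quadratic negative log-likelihood centered on a true value of 2.0
grid = np.linspace(0.0, 4.0, 401)
print(likelihood_interval(lambda p: 5 * (p - 2.0) ** 2, grid))
```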
TH-AB-202-04: Auto-Adaptive Margin Generation for MLC-Tracked Radiotherapy
DOE Office of Scientific and Technical Information (OSTI.GOV)
Glitzner, M; Lagendijk, J; Raaymakers, B
Purpose: To develop an auto-adaptive margin generator for MLC tracking. The generator is able to estimate errors arising in image-guided radiotherapy, particularly on an MR-Linac, which depend on the latencies of the machine and of the image processing, as well as on patient motion characteristics. From the estimated error distribution, a segment margin is generated, able to compensate errors up to a user-defined confidence. Method: In every tracking control cycle (TCC, 40 ms), the desired aperture D(t) is compared to the actual aperture A(t), a delayed and imperfect representation of D(t). Thus an error e(t)=A(t)-D(t) is measured every TCC. Applying kernel density estimation (KDE), the cumulative distribution function (CDF) of e(t) is estimated. With CDF confidence limits, upper and lower error limits are extracted for motion axes along and perpendicular to the leaf-travel direction and applied as margins. To test the dosimetric impact, two representative motion traces were extracted from fast liver MRI (10 Hz). The traces were applied onto a 4D motion platform and continuously tracked by an Elekta Agility 160 MLC using an artificially imposed tracking delay. Gafchromic film was used to detect dose exposition for the static, tracked, and error-compensated tracking cases. The margin generator was parameterized to cover 90% of all tracking errors. Dosimetric impact was rated by calculating the ratio of underexposed points (>5% underdosage) to the total number of points inside the FWHM of the static exposure. Results: Without imposing adaptive margins, tracking experiments showed a ratio of underexposed points of 17.5% and 14.3% for the two motion cases with imaging delays of 200 ms and 300 ms, respectively. Activating the margin generator yielded total suppression (<1%) of underdosed points. Conclusion: We showed that auto-adaptive error compensation using machine error statistics is possible for MLC tracking. The error compensation margins are calculated on-line, without the need to assume motion or machine models. Further strategies to reduce consequential overdosages are currently under investigation. This work was funded by the SoRTS consortium, which includes the industry partners Elekta, Philips and Technolution.
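A minimal sketch of the underlying idea, with simulated tracking errors rather than measured ones: estimate the error distribution with a KDE, form its CDF, and read off the margins that cover a chosen confidence level. The values below are illustrative assumptions, not the abstract's data.

```python
import numpy as np
from scipy.stats import gaussian_kde

def error_margins(errors, confidence=0.90):
    """Lower/upper margins covering a given fraction of observed tracking errors,
    read off the KDE-smoothed cumulative distribution of the errors."""
    kde = gaussian_kde(errors)
    grid = np.linspace(errors.min() - 3 * errors.std(),
                       errors.max() + 3 * errors.std(), 2000)
    cdf = np.cumsum(kde(grid))
    cdf /= cdf[-1]
    tail = (1.0 - confidence) / 2.0
    lower = grid[np.searchsorted(cdf, tail)]
    upper = grid[np.searchsorted(cdf, 1.0 - tail)]
    return lower, upper

# toy per-cycle errors (mm) between desired and actual aperture: a lag plus random motion
rng = np.random.default_rng(4)
errors = rng.normal(0.5, 1.2, 1000)
print(error_margins(errors, confidence=0.90))
```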
Task Performance with List-Mode Data
NASA Astrophysics Data System (ADS)
Caucci, Luca
This dissertation investigates the application of list-mode data to detection, estimation, and image reconstruction problems, with an emphasis on emission tomography in medical imaging. We begin by introducing a theoretical framework for list-mode data and we use it to define two observers that operate on list-mode data. These observers are applied to the problem of detecting a signal (known in shape and location) buried in a random lumpy background. We then consider maximum-likelihood methods for the estimation of numerical parameters from list-mode data, and we characterize the performance of these estimators via the so-called Fisher information matrix. Reconstruction from PET list-mode data is then considered. In a process we called "double maximum-likelihood" reconstruction, we consider a simple PET imaging system and we use maximum-likelihood methods to first estimate a parameter vector for each pair of gamma-ray photons that is detected by the hardware. The collection of these parameter vectors forms a list, which is then fed to another maximum-likelihood algorithm for volumetric reconstruction over a grid of voxels. Efficient parallel implementation of the algorithms discussed above is then presented. In this work, we take advantage of two low-cost, mass-produced computing platforms that have recently appeared on the market, and we provide some details on implementing our algorithms on these devices. We conclude this dissertation work by elaborating on a possible application of list-mode data to X-ray digital mammography. We argue that today's CMOS detectors and computing platforms have become fast enough to make X-ray digital mammography list-mode data acquisition and processing feasible.
Fan, L; Liu, S-Y; Li, Q-C; Yu, H; Xiao, X-S
2012-01-01
Objective To evaluate different features between benign and malignant pulmonary focal ground-glass opacity (fGGO) on multidetector CT (MDCT). Methods 82 pathologically or clinically confirmed fGGOs were retrospectively analysed with regard to demographic data, lesion size and location, attenuation value and MDCT features including shape, margin, interface, internal characteristics and adjacent structure. Differences between benign and malignant fGGOs were analysed using a χ2 test, Fisher's exact test or Mann–Whitney U-test. Morphological characteristics were analysed by binary logistic regression analysis to estimate the likelihood of malignancy. Results There were 21 benign and 61 malignant lesions. No statistical differences were found between benign and malignant fGGOs in terms of demographic data, size, location and attenuation value. The frequency of lobulation (p=0.000), spiculation (p=0.008), spine-like process (p=0.004), well-defined but coarse interface (p=0.000), bronchus cut-off (p=0.003), other air-containing space (p=0.000), pleural indentation (p=0.000) and vascular convergence (p=0.006) was significantly higher in malignant fGGOs than that in benign fGGOs. Binary logistic regression analysis showed that lobulation, interface and pleural indentation were important indicators for malignant diagnosis of fGGO, with the corresponding odds ratios of 8.122, 3.139 and 9.076, respectively. In addition, a well-defined but coarse interface was the most important indicator of malignancy among all interface types. With all three important indicators considered, the diagnostic sensitivity, specificity and accuracy were 93.4%, 66.7% and 86.6%, respectively. Conclusion An fGGO with lobulation, a well-defined but coarse interface and pleural indentation gives a greater than average likelihood of being malignant. PMID:22128130
Gruber, Susan; Logan, Roger W; Jarrín, Inmaculada; Monge, Susana; Hernán, Miguel A
2015-01-15
Inverse probability weights used to fit marginal structural models are typically estimated using logistic regression. However, a data-adaptive procedure may be able to better exploit information available in measured covariates. By combining predictions from multiple algorithms, ensemble learning offers an alternative to logistic regression modeling to further reduce bias in estimated marginal structural model parameters. We describe the application of two ensemble learning approaches to estimating stabilized weights: super learning (SL), an ensemble machine learning approach that relies on V-fold cross validation, and an ensemble learner (EL) that creates a single partition of the data into training and validation sets. Longitudinal data from two multicenter cohort studies in Spain (CoRIS and CoRIS-MD) were analyzed to estimate the mortality hazard ratio for initiation versus no initiation of combined antiretroviral therapy among HIV positive subjects. Both ensemble approaches produced hazard ratio estimates further away from the null, and with tighter confidence intervals, than logistic regression modeling. Computation time for EL was less than half that of SL. We conclude that ensemble learning using a library of diverse candidate algorithms offers an alternative to parametric modeling of inverse probability weights when fitting marginal structural models. With large datasets, EL provides a rich search over the solution space in less time than SL with comparable results. Copyright © 2014 John Wiley & Sons, Ltd.
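The sketch below illustrates the general recipe for stabilized weights at a single time point, with a toy data set and a gradient-boosting classifier standing in for the ensemble learner; it is not the SL/EL implementation used in the study.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import GradientBoostingClassifier

def stabilized_weights(A, L, model):
    """Stabilized inverse-probability-of-treatment weights for one time point:
    sw = Pr(A = a) / Pr(A = a | L)."""
    proba = model.fit(L, A).predict_proba(L)
    p_cond = np.where(A == 1, proba[:, 1], proba[:, 0])   # Pr(A = observed a | L)
    p_marg = np.where(A == 1, A.mean(), 1 - A.mean())     # marginal Pr(A = observed a)
    return p_marg / p_cond

# toy confounded data: treatment probability depends on the first covariate
rng = np.random.default_rng(5)
L = rng.normal(size=(500, 3))
A = rng.binomial(1, 1 / (1 + np.exp(-L[:, 0])))
w_logit = stabilized_weights(A, L, LogisticRegression(max_iter=1000))
w_boost = stabilized_weights(A, L, GradientBoostingClassifier())
print(w_logit.mean(), w_boost.mean())   # stabilized weights should average near 1
```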
ESTIMATING PROPORTION OF AREA OCCUPIED UNDER COMPLEX SURVEY DESIGNS
Estimating the proportion of sites occupied, or proportion of area occupied (PAO), is a common problem in environmental studies. Typically, field surveys do not determine the occupancy of a site with perfect detection. Maximum likelihood estimation of site occupancy rates when...
NASA Astrophysics Data System (ADS)
Morse, Brad S.; Pohll, Greg; Huntington, Justin; Rodriguez Castillo, Ramiro
2003-06-01
In 1992, Mexican researchers discovered concentrations of arsenic in excess of World Health Organization (WHO) standards in several municipal wells in the Zimapan Valley of Mexico. This study describes a method to delineate a capture zone for one of the most highly contaminated wells to aid in future well siting. A stochastic approach was used to model the capture zone because of the high level of uncertainty in several input parameters. Two stochastic techniques were performed and compared: "standard" Monte Carlo analysis and the generalized likelihood uncertainty estimator (GLUE) methodology. The GLUE procedure differs from standard Monte Carlo analysis in that it incorporates a goodness of fit (termed a likelihood measure) in evaluating the model. This allows for more information (in this case, head data) to be used in the uncertainty analysis, resulting in smaller prediction uncertainty. Two likelihood measures are tested in this study to determine which is in better agreement with the observed heads. While the standard Monte Carlo approach does not aid in parameter estimation, the GLUE methodology indicates best-fit models when hydraulic conductivity is approximately 10^-6.5 m/s, with vertically isotropic conditions and large quantities of interbasin flow entering the basin. Probabilistic isochrones (capture zone boundaries) are then presented, and as predicted, the GLUE-derived capture zones are significantly smaller in area than those from the standard Monte Carlo approach.
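A compact, hypothetical sketch of the GLUE idea follows (toy model and data, with an informal Nash-Sutcliffe likelihood measure): weight Monte Carlo parameter sets by their goodness of fit and derive prediction bounds from the likelihood-weighted distribution.

```python
import numpy as np

def glue(simulate, observed, prior_sampler, n_samples=5000, threshold=0.5):
    """Generalized Likelihood Uncertainty Estimation: retain 'behavioural' Monte
    Carlo parameter sets and weight their predictions by a likelihood measure."""
    rng = np.random.default_rng(6)
    params = prior_sampler(rng, n_samples)
    likelihoods = np.empty(n_samples)
    preds = np.empty(n_samples)
    for i, theta in enumerate(params):
        sim = simulate(theta)
        # Nash-Sutcliffe efficiency used as an informal likelihood measure
        likelihoods[i] = 1 - np.sum((sim - observed) ** 2) / np.sum((observed - observed.mean()) ** 2)
        preds[i] = sim[-1]                         # some prediction of interest
    keep = likelihoods > threshold                 # behavioural parameter sets only
    w = likelihoods[keep] / likelihoods[keep].sum()
    order = np.argsort(preds[keep])
    cdf = np.cumsum(w[order])
    return np.interp([0.05, 0.95], cdf, preds[keep][order])   # 90% prediction bounds

# toy "model": simulated heads decline linearly with log-conductivity
observed = np.array([10.0, 9.0, 8.2, 7.5])
simulate = lambda logk: 10.0 + (logk + 6.5) * np.arange(4)
prior_sampler = lambda rng, n: rng.uniform(-8.0, -5.0, n)
print(glue(simulate, observed, prior_sampler))
```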
Silverman, Merav H.; Jedd, Kelly; Luciana, Monica
2015-01-01
Behavioral responses to, and the neural processing of, rewards change dramatically during adolescence and may contribute to observed increases in risk-taking during this developmental period. Functional MRI (fMRI) studies suggest differences between adolescents and adults in neural activation during reward processing, but findings are contradictory, and effects have been found in non-predicted directions. The current study uses an activation likelihood estimation (ALE) approach for quantitative meta-analysis of functional neuroimaging studies to: 1) confirm the network of brain regions involved in adolescents’ reward processing, 2) identify regions involved in specific stages (anticipation, outcome) and valence (positive, negative) of reward processing, and 3) identify differences in activation likelihood between adolescent and adult reward-related brain activation. Results reveal a subcortical network of brain regions involved in adolescent reward processing similar to that found in adults with major hubs including the ventral and dorsal striatum, insula, and posterior cingulate cortex (PCC). Contrast analyses find that adolescents exhibit greater likelihood of activation in the insula while processing anticipation relative to outcome and greater likelihood of activation in the putamen and amygdala during outcome relative to anticipation. While processing positive compared to negative valence, adolescents show increased likelihood for activation in the posterior cingulate cortex (PCC) and ventral striatum. Contrasting adolescent reward processing with the existing ALE of adult reward processing (Liu et al., 2011) reveals increased likelihood for activation in limbic, frontolimbic, and striatal regions in adolescents compared with adults. Unlike adolescents, adults also activate executive control regions of the frontal and parietal lobes. These findings support hypothesized elevations in motivated activity during adolescence. PMID:26254587
Prediction of Flutter Boundary Using Flutter Margin for The Discrete-Time System
NASA Astrophysics Data System (ADS)
Dwi Saputra, Angga; Wibawa Purabaya, R.
2018-04-01
Flutter testing in a wind tunnel is generally conducted at subcritical speeds to avoid damage. Hence, the flutter speed has to be predicted from the behavior of stability criteria estimated as a function of dynamic pressure or flight speed, and a reliable method for predicting the flutter boundary is therefore important. This paper summarizes the flutter testing of a cantilever wing model in a wind tunnel. The model has two degrees of freedom: bending and torsion modes. The flutter test was conducted in a subsonic wind tunnel. The dynamic responses were measured by two accelerometers mounted on the leading edge and the center of the wing tip. The measurements were repeated as the wind speed was increased. The dynamic responses were used to determine the flutter margin parameter for the discrete-time system. The flutter boundary of the model was estimated by extrapolating the flutter margin parameter against the dynamic pressure. The flutter margin parameter for the discrete-time system performs better for flutter prediction than the modal parameters: for a model with two degrees of freedom experiencing classical flutter, it gave a satisfactory prediction of the flutter boundary in the subsonic wind tunnel test.
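As a rough illustration (with invented subcritical test points, not the abstract's measurements), flutter-margin-type prediction amounts to extrapolating the margin against dynamic pressure and locating where the fitted curve reaches zero.

```python
import numpy as np

def predict_flutter_q(dyn_pressure, flutter_margin, degree=2):
    """Extrapolate subcritical flutter-margin values against dynamic pressure and
    return the pressure at which the fitted curve crosses zero (predicted flutter)."""
    coeffs = np.polyfit(dyn_pressure, flutter_margin, degree)
    roots = np.roots(coeffs)
    real = roots[np.isreal(roots)].real
    candidates = real[real > max(dyn_pressure)]        # boundary beyond tested speeds
    return candidates.min() if candidates.size else None

# toy subcritical test points: margin decays smoothly toward zero
q = np.array([100.0, 150.0, 200.0, 250.0])             # dynamic pressure (illustrative units)
fm = np.array([0.80, 0.62, 0.41, 0.22])
print(predict_flutter_q(q, fm))
```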
Testing students' e-learning via Facebook through Bayesian structural equation modeling.
Salarzadeh Jenatabadi, Hashem; Moghavvemi, Sedigheh; Wan Mohamed Radzi, Che Wan Jasimah Bt; Babashamsi, Parastoo; Arashi, Mohammad
2017-01-01
Learning is an intentional activity, with several factors affecting students' intention to use new learning technology. Researchers have investigated technology acceptance in different contexts by developing various theories/models and testing them by a number of means. Although most theories/models developed have been examined through regression or structural equation modeling, Bayesian analysis offers more accurate data analysis results. To address this gap, the unified theory of acceptance and use of technology in the context of e-learning via Facebook is re-examined in this study using Bayesian analysis. The data (S1 Data) were collected from 170 students enrolled in a business statistics course at the University of Malaya, Malaysia, and tested with the maximum likelihood and Bayesian approaches. The difference between the two methods' results indicates that performance expectancy and hedonic motivation are the strongest factors influencing the intention to use e-learning via Facebook. The Bayesian estimation model exhibited better data fit than the maximum likelihood estimator model. The results of the Bayesian and maximum likelihood estimator approaches are compared and the reasons for the discrepancy between the results are discussed.