On non-parametric maximum likelihood estimation of the bivariate survivor function.
Prentice, R L
The likelihood function for the bivariate survivor function F, under independent censorship, is maximized to obtain a non-parametric maximum likelihood estimator &Fcirc;. &Fcirc; may or may not be unique depending on the configuration of singly- and doubly-censored pairs. The likelihood function can be maximized by placing all mass on the grid formed by the uncensored failure times, or half lines beyond the failure time grid, or in the upper right quadrant beyond the grid. By accumulating the mass along lines (or regions) where the likelihood is flat, one obtains a partially maximized likelihood as a function of parameters that can be uniquely estimated. The score equations corresponding to these point mass parameters are derived, using a Lagrange multiplier technique to ensure unit total mass, and a modified Newton procedure is used to calculate the parameter estimates in some limited simulation studies. Some considerations for the further development of non-parametric bivariate survivor function estimators are briefly described.
Ning, Jing; Chen, Yong; Piao, Jin
2017-07-01
Publication bias occurs when the published research results are systematically unrepresentative of the population of studies that have been conducted, and is a potential threat to meaningful meta-analysis. The Copas selection model provides a flexible framework for correcting estimates and offers considerable insight into the publication bias. However, maximizing the observed likelihood under the Copas selection model is challenging because the observed data contain very little information on the latent variable. In this article, we study a Copas-like selection model and propose an expectation-maximization (EM) algorithm for estimation based on the full likelihood. Empirical simulation studies show that the EM algorithm and its associated inferential procedure performs well and avoids the non-convergence problem when maximizing the observed likelihood. © The Author 2017. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
ERIC Educational Resources Information Center
Casabianca, Jodi M.; Lewis, Charles
2015-01-01
Loglinear smoothing (LLS) estimates the latent trait distribution while making fewer assumptions about its form and maintaining parsimony, thus leading to more precise item response theory (IRT) item parameter estimates than standard marginal maximum likelihood (MML). This article provides the expectation-maximization algorithm for MML estimation…
An EM Algorithm for Maximum Likelihood Estimation of Process Factor Analysis Models
ERIC Educational Resources Information Center
Lee, Taehun
2010-01-01
In this dissertation, an Expectation-Maximization (EM) algorithm is developed and implemented to obtain maximum likelihood estimates of the parameters and the associated standard error estimates characterizing temporal flows for the latent variable time series following stationary vector ARMA processes, as well as the parameters defining the…
Maximal likelihood correspondence estimation for face recognition across pose.
Li, Shaoxin; Liu, Xin; Chai, Xiujuan; Zhang, Haihong; Lao, Shihong; Shan, Shiguang
2014-10-01
Due to the misalignment of image features, the performance of many conventional face recognition methods degrades considerably in across pose scenario. To address this problem, many image matching-based methods are proposed to estimate semantic correspondence between faces in different poses. In this paper, we aim to solve two critical problems in previous image matching-based correspondence learning methods: 1) fail to fully exploit face specific structure information in correspondence estimation and 2) fail to learn personalized correspondence for each probe image. To this end, we first build a model, termed as morphable displacement field (MDF), to encode face specific structure information of semantic correspondence from a set of real samples of correspondences calculated from 3D face models. Then, we propose a maximal likelihood correspondence estimation (MLCE) method to learn personalized correspondence based on maximal likelihood frontal face assumption. After obtaining the semantic correspondence encoded in the learned displacement, we can synthesize virtual frontal images of the profile faces for subsequent recognition. Using linear discriminant analysis method with pixel-intensity features, state-of-the-art performance is achieved on three multipose benchmarks, i.e., CMU-PIE, FERET, and MultiPIE databases. Owe to the rational MDF regularization and the usage of novel maximal likelihood objective, the proposed MLCE method can reliably learn correspondence between faces in different poses even in complex wild environment, i.e., labeled face in the wild database.
Quantum-state reconstruction by maximizing likelihood and entropy.
Teo, Yong Siah; Zhu, Huangjun; Englert, Berthold-Georg; Řeháček, Jaroslav; Hradil, Zdeněk
2011-07-08
Quantum-state reconstruction on a finite number of copies of a quantum system with informationally incomplete measurements, as a rule, does not yield a unique result. We derive a reconstruction scheme where both the likelihood and the von Neumann entropy functionals are maximized in order to systematically select the most-likely estimator with the largest entropy, that is, the least-bias estimator, consistent with a given set of measurement data. This is equivalent to the joint consideration of our partial knowledge and ignorance about the ensemble to reconstruct its identity. An interesting structure of such estimators will also be explored.
Deterministic quantum annealing expectation-maximization algorithm
NASA Astrophysics Data System (ADS)
Miyahara, Hideyuki; Tsumura, Koji; Sughiyama, Yuki
2017-11-01
Maximum likelihood estimation (MLE) is one of the most important methods in machine learning, and the expectation-maximization (EM) algorithm is often used to obtain maximum likelihood estimates. However, EM heavily depends on initial configurations and fails to find the global optimum. On the other hand, in the field of physics, quantum annealing (QA) was proposed as a novel optimization approach. Motivated by QA, we propose a quantum annealing extension of EM, which we call the deterministic quantum annealing expectation-maximization (DQAEM) algorithm. We also discuss its advantage in terms of the path integral formulation. Furthermore, by employing numerical simulations, we illustrate how DQAEM works in MLE and show that DQAEM moderate the problem of local optima in EM.
Hudson, H M; Ma, J; Green, P
1994-01-01
Many algorithms for medical image reconstruction adopt versions of the expectation-maximization (EM) algorithm. In this approach, parameter estimates are obtained which maximize a complete data likelihood or penalized likelihood, in each iteration. Implicitly (and sometimes explicitly) penalized algorithms require smoothing of the current reconstruction in the image domain as part of their iteration scheme. In this paper, we discuss alternatives to EM which adapt Fisher's method of scoring (FS) and other methods for direct maximization of the incomplete data likelihood. Jacobi and Gauss-Seidel methods for non-linear optimization provide efficient algorithms applying FS in tomography. One approach uses smoothed projection data in its iterations. We investigate the convergence of Jacobi and Gauss-Seidel algorithms with clinical tomographic projection data.
Balakrishnan, Narayanaswamy; Pal, Suvra
2016-08-01
Recently, a flexible cure rate survival model has been developed by assuming the number of competing causes of the event of interest to follow the Conway-Maxwell-Poisson distribution. This model includes some of the well-known cure rate models discussed in the literature as special cases. Data obtained from cancer clinical trials are often right censored and expectation maximization algorithm can be used in this case to efficiently estimate the model parameters based on right censored data. In this paper, we consider the competing cause scenario and assuming the time-to-event to follow the Weibull distribution, we derive the necessary steps of the expectation maximization algorithm for estimating the parameters of different cure rate survival models. The standard errors of the maximum likelihood estimates are obtained by inverting the observed information matrix. The method of inference developed here is examined by means of an extensive Monte Carlo simulation study. Finally, we illustrate the proposed methodology with a real data on cancer recurrence. © The Author(s) 2013.
Estimation and classification by sigmoids based on mutual information
NASA Technical Reports Server (NTRS)
Baram, Yoram
1994-01-01
An estimate of the probability density function of a random vector is obtained by maximizing the mutual information between the input and the output of a feedforward network of sigmoidal units with respect to the input weights. Classification problems can be solved by selecting the class associated with the maximal estimated density. Newton's s method, applied to an estimated density, yields a recursive maximum likelihood estimator, consisting of a single internal layer of sigmoids, for a random variable or a random sequence. Applications to the diamond classification and to the prediction of a sun-spot process are demonstrated.
Fuzzy multinomial logistic regression analysis: A multi-objective programming approach
NASA Astrophysics Data System (ADS)
Abdalla, Hesham A.; El-Sayed, Amany A.; Hamed, Ramadan
2017-05-01
Parameter estimation for multinomial logistic regression is usually based on maximizing the likelihood function. For large well-balanced datasets, Maximum Likelihood (ML) estimation is a satisfactory approach. Unfortunately, ML can fail completely or at least produce poor results in terms of estimated probabilities and confidence intervals of parameters, specially for small datasets. In this study, a new approach based on fuzzy concepts is proposed to estimate parameters of the multinomial logistic regression. The study assumes that the parameters of multinomial logistic regression are fuzzy. Based on the extension principle stated by Zadeh and Bárdossy's proposition, a multi-objective programming approach is suggested to estimate these fuzzy parameters. A simulation study is used to evaluate the performance of the new approach versus Maximum likelihood (ML) approach. Results show that the new proposed model outperforms ML in cases of small datasets.
Maximum Likelihood Estimations and EM Algorithms with Length-biased Data
Qin, Jing; Ning, Jing; Liu, Hao; Shen, Yu
2012-01-01
SUMMARY Length-biased sampling has been well recognized in economics, industrial reliability, etiology applications, epidemiological, genetic and cancer screening studies. Length-biased right-censored data have a unique data structure different from traditional survival data. The nonparametric and semiparametric estimations and inference methods for traditional survival data are not directly applicable for length-biased right-censored data. We propose new expectation-maximization algorithms for estimations based on full likelihoods involving infinite dimensional parameters under three settings for length-biased data: estimating nonparametric distribution function, estimating nonparametric hazard function under an increasing failure rate constraint, and jointly estimating baseline hazards function and the covariate coefficients under the Cox proportional hazards model. Extensive empirical simulation studies show that the maximum likelihood estimators perform well with moderate sample sizes and lead to more efficient estimators compared to the estimating equation approaches. The proposed estimates are also more robust to various right-censoring mechanisms. We prove the strong consistency properties of the estimators, and establish the asymptotic normality of the semi-parametric maximum likelihood estimators under the Cox model using modern empirical processes theory. We apply the proposed methods to a prevalent cohort medical study. Supplemental materials are available online. PMID:22323840
Maximum likelihood estimates, from censored data, for mixed-Weibull distributions
NASA Astrophysics Data System (ADS)
Jiang, Siyuan; Kececioglu, Dimitri
1992-06-01
A new algorithm for estimating the parameters of mixed-Weibull distributions from censored data is presented. The algorithm follows the principle of maximum likelihood estimate (MLE) through the expectation and maximization (EM) algorithm, and it is derived for both postmortem and nonpostmortem time-to-failure data. It is concluded that the concept of the EM algorithm is easy to understand and apply (only elementary statistics and calculus are required). The log-likelihood function cannot decrease after an EM sequence; this important feature was observed in all of the numerical calculations. The MLEs of the nonpostmortem data were obtained successfully for mixed-Weibull distributions with up to 14 parameters in a 5-subpopulation, mixed-Weibull distribution. Numerical examples indicate that some of the log-likelihood functions of the mixed-Weibull distributions have multiple local maxima; therefore, the algorithm should start at several initial guesses of the parameter set.
A comparison of abundance estimates from extended batch-marking and Jolly–Seber-type experiments
Cowen, Laura L E; Besbeas, Panagiotis; Morgan, Byron J T; Schwarz, Carl J
2014-01-01
Little attention has been paid to the use of multi-sample batch-marking studies, as it is generally assumed that an individual's capture history is necessary for fully efficient estimates. However, recently, Huggins et al. (2010) present a pseudo-likelihood for a multi-sample batch-marking study where they used estimating equations to solve for survival and capture probabilities and then derived abundance estimates using a Horvitz–Thompson-type estimator. We have developed and maximized the likelihood for batch-marking studies. We use data simulated from a Jolly–Seber-type study and convert this to what would have been obtained from an extended batch-marking study. We compare our abundance estimates obtained from the Crosbie–Manly–Arnason–Schwarz (CMAS) model with those of the extended batch-marking model to determine the efficiency of collecting and analyzing batch-marking data. We found that estimates of abundance were similar for all three estimators: CMAS, Huggins, and our likelihood. Gains are made when using unique identifiers and employing the CMAS model in terms of precision; however, the likelihood typically had lower mean square error than the pseudo-likelihood method of Huggins et al. (2010). When faced with designing a batch-marking study, researchers can be confident in obtaining unbiased abundance estimators. Furthermore, they can design studies in order to reduce mean square error by manipulating capture probabilities and sample size. PMID:24558576
Williamson, Ross S.; Sahani, Maneesh; Pillow, Jonathan W.
2015-01-01
Stimulus dimensionality-reduction methods in neuroscience seek to identify a low-dimensional space of stimulus features that affect a neuron’s probability of spiking. One popular method, known as maximally informative dimensions (MID), uses an information-theoretic quantity known as “single-spike information” to identify this space. Here we examine MID from a model-based perspective. We show that MID is a maximum-likelihood estimator for the parameters of a linear-nonlinear-Poisson (LNP) model, and that the empirical single-spike information corresponds to the normalized log-likelihood under a Poisson model. This equivalence implies that MID does not necessarily find maximally informative stimulus dimensions when spiking is not well described as Poisson. We provide several examples to illustrate this shortcoming, and derive a lower bound on the information lost when spiking is Bernoulli in discrete time bins. To overcome this limitation, we introduce model-based dimensionality reduction methods for neurons with non-Poisson firing statistics, and show that they can be framed equivalently in likelihood-based or information-theoretic terms. Finally, we show how to overcome practical limitations on the number of stimulus dimensions that MID can estimate by constraining the form of the non-parametric nonlinearity in an LNP model. We illustrate these methods with simulations and data from primate visual cortex. PMID:25831448
Pal, Suvra; Balakrishnan, Narayanaswamy
2018-05-01
In this paper, we develop likelihood inference based on the expectation maximization algorithm for the Box-Cox transformation cure rate model assuming the lifetimes to follow a Weibull distribution. A simulation study is carried out to demonstrate the performance of the proposed estimation method. Through Monte Carlo simulations, we also study the effect of model misspecification on the estimate of cure rate. Finally, we analyze a well-known data on melanoma with the model and the inferential method developed here.
He, Ye; Lin, Huazhen; Tu, Dongsheng
2018-06-04
In this paper, we introduce a single-index threshold Cox proportional hazard model to select and combine biomarkers to identify patients who may be sensitive to a specific treatment. A penalized smoothed partial likelihood is proposed to estimate the parameters in the model. A simple, efficient, and unified algorithm is presented to maximize this likelihood function. The estimators based on this likelihood function are shown to be consistent and asymptotically normal. Under mild conditions, the proposed estimators also achieve the oracle property. The proposed approach is evaluated through simulation analyses and application to the analysis of data from two clinical trials, one involving patients with locally advanced or metastatic pancreatic cancer and one involving patients with resectable lung cancer. Copyright © 2018 John Wiley & Sons, Ltd.
Reyes-Valdés, M H; Stelly, D M
1995-01-01
Frequencies of meiotic configurations in cytogenetic stocks are dependent on chiasma frequencies in segments defined by centromeres, breakpoints, and telomeres. The expectation maximization algorithm is proposed as a general method to perform maximum likelihood estimations of the chiasma frequencies in the intervals between such locations. The estimates can be translated via mapping functions into genetic maps of cytogenetic landmarks. One set of observational data was analyzed to exemplify application of these methods, results of which were largely concordant with other comparable data. The method was also tested by Monte Carlo simulation of frequencies of meiotic configurations from a monotelodisomic translocation heterozygote, assuming six different sample sizes. The estimate averages were always close to the values given initially to the parameters. The maximum likelihood estimation procedures can be extended readily to other kinds of cytogenetic stocks and allow the pooling of diverse cytogenetic data to collectively estimate lengths of segments, arms, and chromosomes. Images Fig. 1 PMID:7568226
ERIC Educational Resources Information Center
Enders, Craig K.; Peugh, James L.
2004-01-01
Two methods, direct maximum likelihood (ML) and the expectation maximization (EM) algorithm, can be used to obtain ML parameter estimates for structural equation models with missing data (MD). Although the 2 methods frequently produce identical parameter estimates, it may be easier to satisfy missing at random assumptions using EM. However, no…
Maximum likelihood solution for inclination-only data in paleomagnetism
NASA Astrophysics Data System (ADS)
Arason, P.; Levi, S.
2010-08-01
We have developed a new robust maximum likelihood method for estimating the unbiased mean inclination from inclination-only data. In paleomagnetic analysis, the arithmetic mean of inclination-only data is known to introduce a shallowing bias. Several methods have been introduced to estimate the unbiased mean inclination of inclination-only data together with measures of the dispersion. Some inclination-only methods were designed to maximize the likelihood function of the marginal Fisher distribution. However, the exact analytical form of the maximum likelihood function is fairly complicated, and all the methods require various assumptions and approximations that are often inappropriate. For some steep and dispersed data sets, these methods provide estimates that are significantly displaced from the peak of the likelihood function to systematically shallower inclination. The problem locating the maximum of the likelihood function is partly due to difficulties in accurately evaluating the function for all values of interest, because some elements of the likelihood function increase exponentially as precision parameters increase, leading to numerical instabilities. In this study, we succeeded in analytically cancelling exponential elements from the log-likelihood function, and we are now able to calculate its value anywhere in the parameter space and for any inclination-only data set. Furthermore, we can now calculate the partial derivatives of the log-likelihood function with desired accuracy, and locate the maximum likelihood without the assumptions required by previous methods. To assess the reliability and accuracy of our method, we generated large numbers of random Fisher-distributed data sets, for which we calculated mean inclinations and precision parameters. The comparisons show that our new robust Arason-Levi maximum likelihood method is the most reliable, and the mean inclination estimates are the least biased towards shallow values.
Simple Penalties on Maximum-Likelihood Estimates of Genetic Parameters to Reduce Sampling Variation
Meyer, Karin
2016-01-01
Multivariate estimates of genetic parameters are subject to substantial sampling variation, especially for smaller data sets and more than a few traits. A simple modification of standard, maximum-likelihood procedures for multivariate analyses to estimate genetic covariances is described, which can improve estimates by substantially reducing their sampling variances. This is achieved by maximizing the likelihood subject to a penalty. Borrowing from Bayesian principles, we propose a mild, default penalty—derived assuming a Beta distribution of scale-free functions of the covariance components to be estimated—rather than laboriously attempting to determine the stringency of penalization from the data. An extensive simulation study is presented, demonstrating that such penalties can yield very worthwhile reductions in loss, i.e., the difference from population values, for a wide range of scenarios and without distorting estimates of phenotypic covariances. Moreover, mild default penalties tend not to increase loss in difficult cases and, on average, achieve reductions in loss of similar magnitude to computationally demanding schemes to optimize the degree of penalization. Pertinent details required for the adaptation of standard algorithms to locate the maximum of the likelihood function are outlined. PMID:27317681
A more powerful exact test of noninferiority from binary matched-pairs data.
Lloyd, Chris J; Moldovan, Max V
2008-08-15
Assessing the therapeutic noninferiority of one medical treatment compared with another is often based on the difference in response rates from a matched binary pairs design. This paper develops a new exact unconditional test for noninferiority that is more powerful than available alternatives. There are two new elements presented in this paper. First, we introduce the likelihood ratio statistic as an alternative to the previously proposed score statistic of Nam (Biometrics 1997; 53:1422-1430). Second, we eliminate the nuisance parameter by estimation followed by maximization as an alternative to the partial maximization of Berger and Boos (Am. Stat. Assoc. 1994; 89:1012-1016) or traditional full maximization. Based on an extensive numerical study, we recommend tests based on the score statistic, the nuisance parameter being controlled by estimation followed by maximization. 2008 John Wiley & Sons, Ltd
NASA Astrophysics Data System (ADS)
Hui, Z.; Cheng, P.; Ziggah, Y. Y.; Nie, Y.
2018-04-01
Filtering is a key step for most applications of airborne LiDAR point clouds. Although lots of filtering algorithms have been put forward in recent years, most of them suffer from parameters setting or thresholds adjusting, which will be time-consuming and reduce the degree of automation of the algorithm. To overcome this problem, this paper proposed a threshold-free filtering algorithm based on expectation-maximization. The proposed algorithm is developed based on an assumption that point clouds are seen as a mixture of Gaussian models. The separation of ground points and non-ground points from point clouds can be replaced as a separation of a mixed Gaussian model. Expectation-maximization (EM) is applied for realizing the separation. EM is used to calculate maximum likelihood estimates of the mixture parameters. Using the estimated parameters, the likelihoods of each point belonging to ground or object can be computed. After several iterations, point clouds can be labelled as the component with a larger likelihood. Furthermore, intensity information was also utilized to optimize the filtering results acquired using the EM method. The proposed algorithm was tested using two different datasets used in practice. Experimental results showed that the proposed method can filter non-ground points effectively. To quantitatively evaluate the proposed method, this paper adopted the dataset provided by the ISPRS for the test. The proposed algorithm can obtain a 4.48 % total error which is much lower than most of the eight classical filtering algorithms reported by the ISPRS.
ERIC Educational Resources Information Center
Song, Hairong; Ferrer, Emilio
2009-01-01
This article presents a state-space modeling (SSM) technique for fitting process factor analysis models directly to raw data. The Kalman smoother via the expectation-maximization algorithm to obtain maximum likelihood parameter estimates is used. To examine the finite sample properties of the estimates in SSM when common factors are involved, a…
Maximum likelihood estimation for periodic autoregressive moving average models
Vecchia, A.V.
1985-01-01
A useful class of models for seasonal time series that cannot be filtered or standardized to achieve second-order stationarity is that of periodic autoregressive moving average (PARMA) models, which are extensions of ARMA models that allow periodic (seasonal) parameters. An approximation to the exact likelihood for Gaussian PARMA processes is developed, and a straightforward algorithm for its maximization is presented. The algorithm is tested on several periodic ARMA(1, 1) models through simulation studies and is compared to moment estimation via the seasonal Yule-Walker equations. Applicability of the technique is demonstrated through an analysis of a seasonal stream-flow series from the Rio Caroni River in Venezuela.
Aralis, Hilary; Brookmeyer, Ron
2017-01-01
Multistate models provide an important method for analyzing a wide range of life history processes including disease progression and patient recovery following medical intervention. Panel data consisting of the states occupied by an individual at a series of discrete time points are often used to estimate transition intensities of the underlying continuous-time process. When transition intensities depend on the time elapsed in the current state and back transitions between states are possible, this intermittent observation process presents difficulties in estimation due to intractability of the likelihood function. In this manuscript, we present an iterative stochastic expectation-maximization algorithm that relies on a simulation-based approximation to the likelihood function and implement this algorithm using rejection sampling. In a simulation study, we demonstrate the feasibility and performance of the proposed procedure. We then demonstrate application of the algorithm to a study of dementia, the Nun Study, consisting of intermittently-observed elderly subjects in one of four possible states corresponding to intact cognition, impaired cognition, dementia, and death. We show that the proposed stochastic expectation-maximization algorithm substantially reduces bias in model parameter estimates compared to an alternative approach used in the literature, minimal path estimation. We conclude that in estimating intermittently observed semi-Markov models, the proposed approach is a computationally feasible and accurate estimation procedure that leads to substantial improvements in back transition estimates.
A quantum framework for likelihood ratios
NASA Astrophysics Data System (ADS)
Bond, Rachael L.; He, Yang-Hui; Ormerod, Thomas C.
The ability to calculate precise likelihood ratios is fundamental to science, from Quantum Information Theory through to Quantum State Estimation. However, there is no assumption-free statistical methodology to achieve this. For instance, in the absence of data relating to covariate overlap, the widely used Bayes’ theorem either defaults to the marginal probability driven “naive Bayes’ classifier”, or requires the use of compensatory expectation-maximization techniques. This paper takes an information-theoretic approach in developing a new statistical formula for the calculation of likelihood ratios based on the principles of quantum entanglement, and demonstrates that Bayes’ theorem is a special case of a more general quantum mechanical expression.
Maximum Likelihood and Minimum Distance Applied to Univariate Mixture Distributions.
ERIC Educational Resources Information Center
Wang, Yuh-Yin Wu; Schafer, William D.
This Monte-Carlo study compared modified Newton (NW), expectation-maximization algorithm (EM), and minimum Cramer-von Mises distance (MD), used to estimate parameters of univariate mixtures of two components. Data sets were fixed at size 160 and manipulated by mean separation, variance ratio, component proportion, and non-normality. Results…
Zou, W; Ouyang, H
2016-02-01
We propose a multiple estimation adjustment (MEA) method to correct effect overestimation due to selection bias from a hypothesis-generating study (HGS) in pharmacogenetics. MEA uses a hierarchical Bayesian approach to model individual effect estimates from maximal likelihood estimation (MLE) in a region jointly and shrinks them toward the regional effect. Unlike many methods that model a fixed selection scheme, MEA capitalizes on local multiplicity independent of selection. We compared mean square errors (MSEs) in simulated HGSs from naive MLE, MEA and a conditional likelihood adjustment (CLA) method that model threshold selection bias. We observed that MEA effectively reduced MSE from MLE on null effects with or without selection, and had a clear advantage over CLA on extreme MLE estimates from null effects under lenient threshold selection in small samples, which are common among 'top' associations from a pharmacogenetics HGS.
Stochastic control system parameter identifiability
NASA Technical Reports Server (NTRS)
Lee, C. H.; Herget, C. J.
1975-01-01
The parameter identification problem of general discrete time, nonlinear, multiple input/multiple output dynamic systems with Gaussian white distributed measurement errors is considered. The knowledge of the system parameterization was assumed to be known. Concepts of local parameter identifiability and local constrained maximum likelihood parameter identifiability were established. A set of sufficient conditions for the existence of a region of parameter identifiability was derived. A computation procedure employing interval arithmetic was provided for finding the regions of parameter identifiability. If the vector of the true parameters is locally constrained maximum likelihood (CML) identifiable, then with probability one, the vector of true parameters is a unique maximal point of the maximum likelihood function in the region of parameter identifiability and the constrained maximum likelihood estimation sequence will converge to the vector of true parameters.
Models and analysis for multivariate failure time data
NASA Astrophysics Data System (ADS)
Shih, Joanna Huang
The goal of this research is to develop and investigate models and analytic methods for multivariate failure time data. We compare models in terms of direct modeling of the margins, flexibility of dependency structure, local vs. global measures of association, and ease of implementation. In particular, we study copula models, and models produced by right neutral cumulative hazard functions and right neutral hazard functions. We examine the changes of association over time for families of bivariate distributions induced from these models by displaying their density contour plots, conditional density plots, correlation curves of Doksum et al, and local cross ratios of Oakes. We know that bivariate distributions with same margins might exhibit quite different dependency structures. In addition to modeling, we study estimation procedures. For copula models, we investigate three estimation procedures. the first procedure is full maximum likelihood. The second procedure is two-stage maximum likelihood. At stage 1, we estimate the parameters in the margins by maximizing the marginal likelihood. At stage 2, we estimate the dependency structure by fixing the margins at the estimated ones. The third procedure is two-stage partially parametric maximum likelihood. It is similar to the second procedure, but we estimate the margins by the Kaplan-Meier estimate. We derive asymptotic properties for these three estimation procedures and compare their efficiency by Monte-Carlo simulations and direct computations. For models produced by right neutral cumulative hazards and right neutral hazards, we derive the likelihood and investigate the properties of the maximum likelihood estimates. Finally, we develop goodness of fit tests for the dependency structure in the copula models. We derive a test statistic and its asymptotic properties based on the test of homogeneity of Zelterman and Chen (1988), and a graphical diagnostic procedure based on the empirical Bayes approach. We study the performance of these two methods using actual and computer generated data.
Effects of Missing Data Methods in Structural Equation Modeling with Nonnormal Longitudinal Data
ERIC Educational Resources Information Center
Shin, Tacksoo; Davison, Mark L.; Long, Jeffrey D.
2009-01-01
The purpose of this study is to investigate the effects of missing data techniques in longitudinal studies under diverse conditions. A Monte Carlo simulation examined the performance of 3 missing data methods in latent growth modeling: listwise deletion (LD), maximum likelihood estimation using the expectation and maximization algorithm with a…
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kane, V.E.
1982-01-01
A class of goodness-of-fit estimators is found to provide a useful alternative in certain situations to the standard maximum likelihood method which has some undesirable estimation characteristics for estimation from the three-parameter lognormal distribution. The class of goodness-of-fit tests considered include the Shapiro-Wilk and Filliben tests which reduce to a weighted linear combination of the order statistics that can be maximized in estimation problems. The weighted order statistic estimators are compared to the standard procedures in Monte Carlo simulations. Robustness of the procedures are examined and example data sets analyzed.
Deterministic annealing for density estimation by multivariate normal mixtures
NASA Astrophysics Data System (ADS)
Kloppenburg, Martin; Tavan, Paul
1997-03-01
An approach to maximum-likelihood density estimation by mixtures of multivariate normal distributions for large high-dimensional data sets is presented. Conventionally that problem is tackled by notoriously unstable expectation-maximization (EM) algorithms. We remove these instabilities by the introduction of soft constraints, enabling deterministic annealing. Our developments are motivated by the proof that algorithmically stable fuzzy clustering methods that are derived from statistical physics analogs are special cases of EM procedures.
NASA Technical Reports Server (NTRS)
Klein, V.
1980-01-01
A frequency domain maximum likelihood method is developed for the estimation of airplane stability and control parameters from measured data. The model of an airplane is represented by a discrete-type steady state Kalman filter with time variables replaced by their Fourier series expansions. The likelihood function of innovations is formulated, and by its maximization with respect to unknown parameters the estimation algorithm is obtained. This algorithm is then simplified to the output error estimation method with the data in the form of transformed time histories, frequency response curves, or spectral and cross-spectral densities. The development is followed by a discussion on the equivalence of the cost function in the time and frequency domains, and on advantages and disadvantages of the frequency domain approach. The algorithm developed is applied in four examples to the estimation of longitudinal parameters of a general aviation airplane using computer generated and measured data in turbulent and still air. The cost functions in the time and frequency domains are shown to be equivalent; therefore, both approaches are complementary and not contradictory. Despite some computational advantages of parameter estimation in the frequency domain, this approach is limited to linear equations of motion with constant coefficients.
Bayesian model selection: Evidence estimation based on DREAM simulation and bridge sampling
NASA Astrophysics Data System (ADS)
Volpi, Elena; Schoups, Gerrit; Firmani, Giovanni; Vrugt, Jasper A.
2017-04-01
Bayesian inference has found widespread application in Earth and Environmental Systems Modeling, providing an effective tool for prediction, data assimilation, parameter estimation, uncertainty analysis and hypothesis testing. Under multiple competing hypotheses, the Bayesian approach also provides an attractive alternative to traditional information criteria (e.g. AIC, BIC) for model selection. The key variable for Bayesian model selection is the evidence (or marginal likelihood) that is the normalizing constant in the denominator of Bayes theorem; while it is fundamental for model selection, the evidence is not required for Bayesian inference. It is computed for each hypothesis (model) by averaging the likelihood function over the prior parameter distribution, rather than maximizing it as by information criteria; the larger a model evidence the more support it receives among a collection of hypothesis as the simulated values assign relatively high probability density to the observed data. Hence, the evidence naturally acts as an Occam's razor, preferring simpler and more constrained models against the selection of over-fitted ones by information criteria that incorporate only the likelihood maximum. Since it is not particularly easy to estimate the evidence in practice, Bayesian model selection via the marginal likelihood has not yet found mainstream use. We illustrate here the properties of a new estimator of the Bayesian model evidence, which provides robust and unbiased estimates of the marginal likelihood; the method is coined Gaussian Mixture Importance Sampling (GMIS). GMIS uses multidimensional numerical integration of the posterior parameter distribution via bridge sampling (a generalization of importance sampling) of a mixture distribution fitted to samples of the posterior distribution derived from the DREAM algorithm (Vrugt et al., 2008; 2009). Some illustrative examples are presented to show the robustness and superiority of the GMIS estimator with respect to other commonly used approaches in the literature.
NASA Astrophysics Data System (ADS)
Pickard, William F.
2004-10-01
The classical PERT inverse statistics problem requires estimation of the mean, \\skew1\\bar{m} , and standard deviation, s, of a unimodal distribution given estimates of its mode, m, and of the smallest, a, and largest, b, values likely to be encountered. After placing the problem in historical perspective and showing that it is ill-posed because it is underdetermined, this paper offers an approach to resolve the ill-posedness: (a) by interpreting a and b modes of order statistic distributions; (b) by requiring also an estimate of the number of samples, N, considered in estimating the set {m, a, b}; and (c) by maximizing a suitable likelihood, having made the traditional assumption that the underlying distribution is beta. Exact formulae relating the four parameters of the beta distribution to {m, a, b, N} and the assumed likelihood function are then used to compute the four underlying parameters of the beta distribution; and from them, \\skew1\\bar{m} and s are computed using exact formulae.
Semiparametric time-to-event modeling in the presence of a latent progression event.
Rice, John D; Tsodikov, Alex
2017-06-01
In cancer research, interest frequently centers on factors influencing a latent event that must precede a terminal event. In practice it is often impossible to observe the latent event precisely, making inference about this process difficult. To address this problem, we propose a joint model for the unobserved time to the latent and terminal events, with the two events linked by the baseline hazard. Covariates enter the model parametrically as linear combinations that multiply, respectively, the hazard for the latent event and the hazard for the terminal event conditional on the latent one. We derive the partial likelihood estimators for this problem assuming the latent event is observed, and propose a profile likelihood-based method for estimation when the latent event is unobserved. The baseline hazard in this case is estimated nonparametrically using the EM algorithm, which allows for closed-form Breslow-type estimators at each iteration, bringing improved computational efficiency and stability compared with maximizing the marginal likelihood directly. We present simulation studies to illustrate the finite-sample properties of the method; its use in practice is demonstrated in the analysis of a prostate cancer data set. © 2016, The International Biometric Society.
Wynant, Willy; Abrahamowicz, Michal
2016-11-01
Standard optimization algorithms for maximizing likelihood may not be applicable to the estimation of those flexible multivariable models that are nonlinear in their parameters. For applications where the model's structure permits separating estimation of mutually exclusive subsets of parameters into distinct steps, we propose the alternating conditional estimation (ACE) algorithm. We validate the algorithm, in simulations, for estimation of two flexible extensions of Cox's proportional hazards model where the standard maximum partial likelihood estimation does not apply, with simultaneous modeling of (1) nonlinear and time-dependent effects of continuous covariates on the hazard, and (2) nonlinear interaction and main effects of the same variable. We also apply the algorithm in real-life analyses to estimate nonlinear and time-dependent effects of prognostic factors for mortality in colon cancer. Analyses of both simulated and real-life data illustrate good statistical properties of the ACE algorithm and its ability to yield new potentially useful insights about the data structure. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Wang, Shijun; Liu, Peter; Turkbey, Baris; Choyke, Peter; Pinto, Peter; Summers, Ronald M
2012-01-01
In this paper, we propose a new pharmacokinetic model for parameter estimation of dynamic contrast-enhanced (DCE) MRI by using Gaussian process inference. Our model is based on the Tofts dual-compartment model for the description of tracer kinetics and the observed time series from DCE-MRI is treated as a Gaussian stochastic process. The parameter estimation is done through a maximum likelihood approach and we propose a variant of the coordinate descent method to solve this likelihood maximization problem. The new model was shown to outperform a baseline method on simulated data. Parametric maps generated on prostate DCE data with the new model also provided better enhancement of tumors, lower intensity on false positives, and better boundary delineation when compared with the baseline method. New statistical parameter maps from the process model were also found to be informative, particularly when paired with the PK parameter maps.
Methodology and method and apparatus for signaling with capacity optimized constellations
NASA Technical Reports Server (NTRS)
Barsoum, Maged F. (Inventor); Jones, Christopher R. (Inventor)
2011-01-01
Communication systems having transmitter, includes a coder configured to receive user bits and output encoded bits at an expanded output encoded bit rate, a mapper configured to map encoded bits to symbols in a symbol constellation, a modulator configured to generate a signal for transmission via the communication channel using symbols generated by the mapper. In addition, the receiver includes a demodulator configured to demodulate the received signal via the communication channel, a demapper configured to estimate likelihoods from the demodulated signal, a decoder that is configured to estimate decoded bits from the likelihoods generated by the demapper. Furthermore, the symbol constellation is a capacity optimized geometrically spaced symbol constellation that provides a given capacity at a reduced signal-to-noise ratio compared to a signal constellation that maximizes d.sub.min.
NASA Technical Reports Server (NTRS)
Rodriguez, G.; Scheid, R. E., Jr.
1986-01-01
This paper outlines methods for modeling, identification and estimation for static determination of flexible structures. The shape estimation schemes are based on structural models specified by (possibly interconnected) elliptic partial differential equations. The identification techniques provide approximate knowledge of parameters in elliptic systems. The techniques are based on the method of maximum-likelihood that finds parameter values such that the likelihood functional associated with the system model is maximized. The estimation methods are obtained by means of a function-space approach that seeks to obtain the conditional mean of the state given the data and a white noise characterization of model errors. The solutions are obtained in a batch-processing mode in which all the data is processed simultaneously. After methods for computing the optimal estimates are developed, an analysis of the second-order statistics of the estimates and of the related estimation error is conducted. In addition to outlining the above theoretical results, the paper presents typical flexible structure simulations illustrating performance of the shape determination methods.
NASA Astrophysics Data System (ADS)
Aslan, Serdar; Taylan Cemgil, Ali; Akın, Ata
2016-08-01
Objective. In this paper, we aimed for the robust estimation of the parameters and states of the hemodynamic model by using blood oxygen level dependent signal. Approach. In the fMRI literature, there are only a few successful methods that are able to make a joint estimation of the states and parameters of the hemodynamic model. In this paper, we implemented a maximum likelihood based method called the particle smoother expectation maximization (PSEM) algorithm for the joint state and parameter estimation. Main results. Former sequential Monte Carlo methods were only reliable in the hemodynamic state estimates. They were claimed to outperform the local linearization (LL) filter and the extended Kalman filter (EKF). The PSEM algorithm is compared with the most successful method called square-root cubature Kalman smoother (SCKS) for both state and parameter estimation. SCKS was found to be better than the dynamic expectation maximization (DEM) algorithm, which was shown to be a better estimator than EKF, LL and particle filters. Significance. PSEM was more accurate than SCKS for both the state and the parameter estimation. Hence, PSEM seems to be the most accurate method for the system identification and state estimation for the hemodynamic model inversion literature. This paper do not compare its results with Tikhonov-regularized Newton—CKF (TNF-CKF), a recent robust method which works in filtering sense.
Fast Component Pursuit for Large-Scale Inverse Covariance Estimation.
Han, Lei; Zhang, Yu; Zhang, Tong
2016-08-01
The maximum likelihood estimation (MLE) for the Gaussian graphical model, which is also known as the inverse covariance estimation problem, has gained increasing interest recently. Most existing works assume that inverse covariance estimators contain sparse structure and then construct models with the ℓ 1 regularization. In this paper, different from existing works, we study the inverse covariance estimation problem from another perspective by efficiently modeling the low-rank structure in the inverse covariance, which is assumed to be a combination of a low-rank part and a diagonal matrix. One motivation for this assumption is that the low-rank structure is common in many applications including the climate and financial analysis, and another one is that such assumption can reduce the computational complexity when computing its inverse. Specifically, we propose an efficient COmponent Pursuit (COP) method to obtain the low-rank part, where each component can be sparse. For optimization, the COP method greedily learns a rank-one component in each iteration by maximizing the log-likelihood. Moreover, the COP algorithm enjoys several appealing properties including the existence of an efficient solution in each iteration and the theoretical guarantee on the convergence of this greedy approach. Experiments on large-scale synthetic and real-world datasets including thousands of millions variables show that the COP method is faster than the state-of-the-art techniques for the inverse covariance estimation problem when achieving comparable log-likelihood on test data.
NASA Astrophysics Data System (ADS)
Dang, H.; Wang, A. S.; Sussman, Marc S.; Siewerdsen, J. H.; Stayman, J. W.
2014-09-01
Sequential imaging studies are conducted in many clinical scenarios. Prior images from previous studies contain a great deal of patient-specific anatomical information and can be used in conjunction with subsequent imaging acquisitions to maintain image quality while enabling radiation dose reduction (e.g., through sparse angular sampling, reduction in fluence, etc). However, patient motion between images in such sequences results in misregistration between the prior image and current anatomy. Existing prior-image-based approaches often include only a simple rigid registration step that can be insufficient for capturing complex anatomical motion, introducing detrimental effects in subsequent image reconstruction. In this work, we propose a joint framework that estimates the 3D deformation between an unregistered prior image and the current anatomy (based on a subsequent data acquisition) and reconstructs the current anatomical image using a model-based reconstruction approach that includes regularization based on the deformed prior image. This framework is referred to as deformable prior image registration, penalized-likelihood estimation (dPIRPLE). Central to this framework is the inclusion of a 3D B-spline-based free-form-deformation model into the joint registration-reconstruction objective function. The proposed framework is solved using a maximization strategy whereby alternating updates to the registration parameters and image estimates are applied allowing for improvements in both the registration and reconstruction throughout the optimization process. Cadaver experiments were conducted on a cone-beam CT testbench emulating a lung nodule surveillance scenario. Superior reconstruction accuracy and image quality were demonstrated using the dPIRPLE algorithm as compared to more traditional reconstruction methods including filtered backprojection, penalized-likelihood estimation (PLE), prior image penalized-likelihood estimation (PIPLE) without registration, and prior image penalized-likelihood estimation with rigid registration of a prior image (PIRPLE) over a wide range of sampling sparsity and exposure levels.
A Variance Distribution Model of Surface EMG Signals Based on Inverse Gamma Distribution.
Hayashi, Hideaki; Furui, Akira; Kurita, Yuichi; Tsuji, Toshio
2017-11-01
Objective: This paper describes the formulation of a surface electromyogram (EMG) model capable of representing the variance distribution of EMG signals. Methods: In the model, EMG signals are handled based on a Gaussian white noise process with a mean of zero for each variance value. EMG signal variance is taken as a random variable that follows inverse gamma distribution, allowing the representation of noise superimposed onto this variance. Variance distribution estimation based on marginal likelihood maximization is also outlined in this paper. The procedure can be approximated using rectified and smoothed EMG signals, thereby allowing the determination of distribution parameters in real time at low computational cost. Results: A simulation experiment was performed to evaluate the accuracy of distribution estimation using artificially generated EMG signals, with results demonstrating that the proposed model's accuracy is higher than that of maximum-likelihood-based estimation. Analysis of variance distribution using real EMG data also suggested a relationship between variance distribution and signal-dependent noise. Conclusion: The study reported here was conducted to examine the performance of a proposed surface EMG model capable of representing variance distribution and a related distribution parameter estimation method. Experiments using artificial and real EMG data demonstrated the validity of the model. Significance: Variance distribution estimated using the proposed model exhibits potential in the estimation of muscle force. Objective: This paper describes the formulation of a surface electromyogram (EMG) model capable of representing the variance distribution of EMG signals. Methods: In the model, EMG signals are handled based on a Gaussian white noise process with a mean of zero for each variance value. EMG signal variance is taken as a random variable that follows inverse gamma distribution, allowing the representation of noise superimposed onto this variance. Variance distribution estimation based on marginal likelihood maximization is also outlined in this paper. The procedure can be approximated using rectified and smoothed EMG signals, thereby allowing the determination of distribution parameters in real time at low computational cost. Results: A simulation experiment was performed to evaluate the accuracy of distribution estimation using artificially generated EMG signals, with results demonstrating that the proposed model's accuracy is higher than that of maximum-likelihood-based estimation. Analysis of variance distribution using real EMG data also suggested a relationship between variance distribution and signal-dependent noise. Conclusion: The study reported here was conducted to examine the performance of a proposed surface EMG model capable of representing variance distribution and a related distribution parameter estimation method. Experiments using artificial and real EMG data demonstrated the validity of the model. Significance: Variance distribution estimated using the proposed model exhibits potential in the estimation of muscle force.
Yang, Defu; Wang, Lin; Chen, Dongmei; Yan, Chenggang; He, Xiaowei; Liang, Jimin; Chen, Xueli
2018-05-17
The reconstruction of bioluminescence tomography (BLT) is severely ill-posed due to the insufficient measurements and diffuses nature of the light propagation. Predefined permissible source region (PSR) combined with regularization terms is one common strategy to reduce such ill-posedness. However, the region of PSR is usually hard to determine and can be easily affected by subjective consciousness. Hence, we theoretically developed a filtered maximum likelihood expectation maximization (fMLEM) method for BLT. Our method can avoid predefining the PSR and provide a robust and accurate result for global reconstruction. In the method, the simplified spherical harmonics approximation (SP N ) was applied to characterize diffuse light propagation in medium, and the statistical estimation-based MLEM algorithm combined with a filter function was used to solve the inverse problem. We systematically demonstrated the performance of our method by the regular geometry- and digital mouse-based simulations and a liver cancer-based in vivo experiment. Graphical abstract The filtered MLEM-based global reconstruction method for BLT.
Likelihood-based modification of experimental crystal structure electron density maps
Terwilliger, Thomas C [Sante Fe, NM
2005-04-16
A maximum-likelihood method for improves an electron density map of an experimental crystal structure. A likelihood of a set of structure factors {F.sub.h } is formed for the experimental crystal structure as (1) the likelihood of having obtained an observed set of structure factors {F.sub.h.sup.OBS } if structure factor set {F.sub.h } was correct, and (2) the likelihood that an electron density map resulting from {F.sub.h } is consistent with selected prior knowledge about the experimental crystal structure. The set of structure factors {F.sub.h } is then adjusted to maximize the likelihood of {F.sub.h } for the experimental crystal structure. An improved electron density map is constructed with the maximized structure factors.
Ye, Xin; Garikapati, Venu M.; You, Daehyun; ...
2017-11-08
Most multinomial choice models (e.g., the multinomial logit model) adopted in practice assume an extreme-value Gumbel distribution for the random components (error terms) of utility functions. This distributional assumption offers a closed-form likelihood expression when the utility maximization principle is applied to model choice behaviors. As a result, model coefficients can be easily estimated using the standard maximum likelihood estimation method. However, maximum likelihood estimators are consistent and efficient only if distributional assumptions on the random error terms are valid. It is therefore critical to test the validity of underlying distributional assumptions on the error terms that form the basismore » of parameter estimation and policy evaluation. In this paper, a practical yet statistically rigorous method is proposed to test the validity of the distributional assumption on the random components of utility functions in both the multinomial logit (MNL) model and multiple discrete-continuous extreme value (MDCEV) model. Based on a semi-nonparametric approach, a closed-form likelihood function that nests the MNL or MDCEV model being tested is derived. The proposed method allows traditional likelihood ratio tests to be used to test violations of the standard Gumbel distribution assumption. Simulation experiments are conducted to demonstrate that the proposed test yields acceptable Type-I and Type-II error probabilities at commonly available sample sizes. The test is then applied to three real-world discrete and discrete-continuous choice models. For all three models, the proposed test rejects the validity of the standard Gumbel distribution in most utility functions, calling for the development of robust choice models that overcome adverse effects of violations of distributional assumptions on the error terms in random utility functions.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ye, Xin; Garikapati, Venu M.; You, Daehyun
Most multinomial choice models (e.g., the multinomial logit model) adopted in practice assume an extreme-value Gumbel distribution for the random components (error terms) of utility functions. This distributional assumption offers a closed-form likelihood expression when the utility maximization principle is applied to model choice behaviors. As a result, model coefficients can be easily estimated using the standard maximum likelihood estimation method. However, maximum likelihood estimators are consistent and efficient only if distributional assumptions on the random error terms are valid. It is therefore critical to test the validity of underlying distributional assumptions on the error terms that form the basismore » of parameter estimation and policy evaluation. In this paper, a practical yet statistically rigorous method is proposed to test the validity of the distributional assumption on the random components of utility functions in both the multinomial logit (MNL) model and multiple discrete-continuous extreme value (MDCEV) model. Based on a semi-nonparametric approach, a closed-form likelihood function that nests the MNL or MDCEV model being tested is derived. The proposed method allows traditional likelihood ratio tests to be used to test violations of the standard Gumbel distribution assumption. Simulation experiments are conducted to demonstrate that the proposed test yields acceptable Type-I and Type-II error probabilities at commonly available sample sizes. The test is then applied to three real-world discrete and discrete-continuous choice models. For all three models, the proposed test rejects the validity of the standard Gumbel distribution in most utility functions, calling for the development of robust choice models that overcome adverse effects of violations of distributional assumptions on the error terms in random utility functions.« less
Estimation of correlation functions by stochastic approximation.
NASA Technical Reports Server (NTRS)
Habibi, A.; Wintz, P. A.
1972-01-01
Consideration of the autocorrelation function of a zero-mean stationary random process. The techniques are applicable to processes with nonzero mean provided the mean is estimated first and subtracted. Two recursive techniques are proposed, both of which are based on the method of stochastic approximation and assume a functional form for the correlation function that depends on a number of parameters that are recursively estimated from successive records. One technique uses a standard point estimator of the correlation function to provide estimates of the parameters that minimize the mean-square error between the point estimates and the parametric function. The other technique provides estimates of the parameters that maximize a likelihood function relating the parameters of the function to the random process. Examples are presented.
Scanning linear estimation: improvements over region of interest (ROI) methods
NASA Astrophysics Data System (ADS)
Kupinski, Meredith K.; Clarkson, Eric W.; Barrett, Harrison H.
2013-03-01
In tomographic medical imaging, a signal activity is typically estimated by summing voxels from a reconstructed image. We introduce an alternative estimation scheme that operates on the raw projection data and offers a substantial improvement, as measured by the ensemble mean-square error (EMSE), when compared to using voxel values from a maximum-likelihood expectation-maximization (MLEM) reconstruction. The scanning-linear (SL) estimator operates on the raw projection data and is derived as a special case of maximum-likelihood estimation with a series of approximations to make the calculation tractable. The approximated likelihood accounts for background randomness, measurement noise and variability in the parameters to be estimated. When signal size and location are known, the SL estimate of signal activity is unbiased, i.e. the average estimate equals the true value. By contrast, unpredictable bias arising from the null functions of the imaging system affect standard algorithms that operate on reconstructed data. The SL method is demonstrated for two different tasks: (1) simultaneously estimating a signal’s size, location and activity; (2) for a fixed signal size and location, estimating activity. Noisy projection data are realistically simulated using measured calibration data from the multi-module multi-resolution small-animal SPECT imaging system. For both tasks, the same set of images is reconstructed using the MLEM algorithm (80 iterations), and the average and maximum values within the region of interest (ROI) are calculated for comparison. This comparison shows dramatic improvements in EMSE for the SL estimates. To show that the bias in ROI estimates affects not only absolute values but also relative differences, such as those used to monitor the response to therapy, the activity estimation task is repeated for three different signal sizes.
Dugué, Audrey Emmanuelle; Pulido, Marina; Chabaud, Sylvie; Belin, Lisa; Gal, Jocelyn
2016-12-01
We describe how to estimate progression-free survival while dealing with interval-censored data in the setting of clinical trials in oncology. Three procedures with SAS and R statistical software are described: one allowing for a nonparametric maximum likelihood estimation of the survival curve using the EM-ICM (Expectation and Maximization-Iterative Convex Minorant) algorithm as described by Wellner and Zhan in 1997; a sensitivity analysis procedure in which the progression time is assigned (i) at the midpoint, (ii) at the upper limit (reflecting the standard analysis when the progression time is assigned at the first radiologic exam showing progressive disease), or (iii) at the lower limit of the censoring interval; and finally, two multiple imputations are described considering a uniform or the nonparametric maximum likelihood estimation (NPMLE) distribution. Clin Cancer Res; 22(23); 5629-35. ©2016 AACR. ©2016 American Association for Cancer Research.
A simple, remote, video based breathing monitor.
Regev, Nir; Wulich, Dov
2017-07-01
Breathing monitors have become the all-important cornerstone of a wide variety of commercial and personal safety applications, ranging from elderly care to baby monitoring. Many such monitors exist in the market, some, with vital signs monitoring capabilities, but none remote. This paper presents a simple, yet efficient, real time method of extracting the subject's breathing sinus rhythm. Points of interest are detected on the subject's body, and the corresponding optical flow is estimated and tracked using the well known Lucas-Kanade algorithm on a frame by frame basis. A generalized likelihood ratio test is then utilized on each of the many interest points to detect which is moving in harmonic fashion. Finally, a spectral estimation algorithm based on Pisarenko harmonic decomposition tracks the harmonic frequency in real time, and a fusion maximum likelihood algorithm optimally estimates the breathing rate using all points considered. The results show a maximal error of 1 BPM between the true breathing rate and the algorithm's calculated rate, based on experiments on two babies and three adults.
Depaoli, Sarah
2013-06-01
Growth mixture modeling (GMM) represents a technique that is designed to capture change over time for unobserved subgroups (or latent classes) that exhibit qualitatively different patterns of growth. The aim of the current article was to explore the impact of latent class separation (i.e., how similar growth trajectories are across latent classes) on GMM performance. Several estimation conditions were compared: maximum likelihood via the expectation maximization (EM) algorithm and the Bayesian framework implementing diffuse priors, "accurate" informative priors, weakly informative priors, data-driven informative priors, priors reflecting partial-knowledge of parameters, and "inaccurate" (but informative) priors. The main goal was to provide insight about the optimal estimation condition under different degrees of latent class separation for GMM. Results indicated that optimal parameter recovery was obtained though the Bayesian approach using "accurate" informative priors, and partial-knowledge priors showed promise for the recovery of the growth trajectory parameters. Maximum likelihood and the remaining Bayesian estimation conditions yielded poor parameter recovery for the latent class proportions and the growth trajectories. (PsycINFO Database Record (c) 2013 APA, all rights reserved).
Weak value amplification considered harmful
NASA Astrophysics Data System (ADS)
Ferrie, Christopher; Combes, Joshua
2014-03-01
We show using statistically rigorous arguments that the technique of weak value amplification does not perform better than standard statistical techniques for the tasks of parameter estimation and signal detection. We show that using all data and considering the joint distribution of all measurement outcomes yields the optimal estimator. Moreover, we show estimation using the maximum likelihood technique with weak values as small as possible produces better performance for quantum metrology. In doing so, we identify the optimal experimental arrangement to be the one which reveals the maximal eigenvalue of the square of system observables. We also show these conclusions do not change in the presence of technical noise.
Accounting for informatively missing data in logistic regression by means of reassessment sampling.
Lin, Ji; Lyles, Robert H
2015-05-20
We explore the 'reassessment' design in a logistic regression setting, where a second wave of sampling is applied to recover a portion of the missing data on a binary exposure and/or outcome variable. We construct a joint likelihood function based on the original model of interest and a model for the missing data mechanism, with emphasis on non-ignorable missingness. The estimation is carried out by numerical maximization of the joint likelihood function with close approximation of the accompanying Hessian matrix, using sharable programs that take advantage of general optimization routines in standard software. We show how likelihood ratio tests can be used for model selection and how they facilitate direct hypothesis testing for whether missingness is at random. Examples and simulations are presented to demonstrate the performance of the proposed method. Copyright © 2015 John Wiley & Sons, Ltd.
Zero-inflated Poisson model based likelihood ratio test for drug safety signal detection.
Huang, Lan; Zheng, Dan; Zalkikar, Jyoti; Tiwari, Ram
2017-02-01
In recent decades, numerous methods have been developed for data mining of large drug safety databases, such as Food and Drug Administration's (FDA's) Adverse Event Reporting System, where data matrices are formed by drugs such as columns and adverse events as rows. Often, a large number of cells in these data matrices have zero cell counts and some of them are "true zeros" indicating that the drug-adverse event pairs cannot occur, and these zero counts are distinguished from the other zero counts that are modeled zero counts and simply indicate that the drug-adverse event pairs have not occurred yet or have not been reported yet. In this paper, a zero-inflated Poisson model based likelihood ratio test method is proposed to identify drug-adverse event pairs that have disproportionately high reporting rates, which are also called signals. The maximum likelihood estimates of the model parameters of zero-inflated Poisson model based likelihood ratio test are obtained using the expectation and maximization algorithm. The zero-inflated Poisson model based likelihood ratio test is also modified to handle the stratified analyses for binary and categorical covariates (e.g. gender and age) in the data. The proposed zero-inflated Poisson model based likelihood ratio test method is shown to asymptotically control the type I error and false discovery rate, and its finite sample performance for signal detection is evaluated through a simulation study. The simulation results show that the zero-inflated Poisson model based likelihood ratio test method performs similar to Poisson model based likelihood ratio test method when the estimated percentage of true zeros in the database is small. Both the zero-inflated Poisson model based likelihood ratio test and likelihood ratio test methods are applied to six selected drugs, from the 2006 to 2011 Adverse Event Reporting System database, with varying percentages of observed zero-count cells.
Parameter Estimation of Multiple Frequency-Hopping Signals with Two Sensors
Pan, Jin; Ma, Boyuan
2018-01-01
This paper essentially focuses on parameter estimation of multiple wideband emitting sources with time-varying frequencies, such as two-dimensional (2-D) direction of arrival (DOA) and signal sorting, with a low-cost circular synthetic array (CSA) consisting of only two rotating sensors. Our basic idea is to decompose the received data, which is a superimposition of phase measurements from multiple sources into separated groups and separately estimate the DOA associated with each source. Motivated by joint parameter estimation, we propose to adopt the expectation maximization (EM) algorithm in this paper; our method involves two steps, namely, the expectation-step (E-step) and the maximization (M-step). In the E-step, the correspondence of each signal with its emitting source is found. Then, in the M-step, the maximum-likelihood (ML) estimates of the DOA parameters are obtained. These two steps are iteratively and alternatively executed to jointly determine the DOAs and sort multiple signals. Closed-form DOA estimation formulae are developed by ML estimation based on phase data, which also realize an optimal estimation. Directional ambiguity is also addressed by another ML estimation method based on received complex responses. The Cramer-Rao lower bound is derived for understanding the estimation accuracy and performance comparison. The verification of the proposed method is demonstrated with simulations. PMID:29617323
Cox Regression Models with Functional Covariates for Survival Data.
Gellar, Jonathan E; Colantuoni, Elizabeth; Needham, Dale M; Crainiceanu, Ciprian M
2015-06-01
We extend the Cox proportional hazards model to cases when the exposure is a densely sampled functional process, measured at baseline. The fundamental idea is to combine penalized signal regression with methods developed for mixed effects proportional hazards models. The model is fit by maximizing the penalized partial likelihood, with smoothing parameters estimated by a likelihood-based criterion such as AIC or EPIC. The model may be extended to allow for multiple functional predictors, time varying coefficients, and missing or unequally-spaced data. Methods were inspired by and applied to a study of the association between time to death after hospital discharge and daily measures of disease severity collected in the intensive care unit, among survivors of acute respiratory distress syndrome.
Inverse Ising problem in continuous time: A latent variable approach
NASA Astrophysics Data System (ADS)
Donner, Christian; Opper, Manfred
2017-12-01
We consider the inverse Ising problem: the inference of network couplings from observed spin trajectories for a model with continuous time Glauber dynamics. By introducing two sets of auxiliary latent random variables we render the likelihood into a form which allows for simple iterative inference algorithms with analytical updates. The variables are (1) Poisson variables to linearize an exponential term which is typical for point process likelihoods and (2) Pólya-Gamma variables, which make the likelihood quadratic in the coupling parameters. Using the augmented likelihood, we derive an expectation-maximization (EM) algorithm to obtain the maximum likelihood estimate of network parameters. Using a third set of latent variables we extend the EM algorithm to sparse couplings via L1 regularization. Finally, we develop an efficient approximate Bayesian inference algorithm using a variational approach. We demonstrate the performance of our algorithms on data simulated from an Ising model. For data which are simulated from a more biologically plausible network with spiking neurons, we show that the Ising model captures well the low order statistics of the data and how the Ising couplings are related to the underlying synaptic structure of the simulated network.
Kimura, Akatsuki; Celani, Antonio; Nagao, Hiromichi; Stasevich, Timothy; Nakamura, Kazuyuki
2015-01-01
Construction of quantitative models is a primary goal of quantitative biology, which aims to understand cellular and organismal phenomena in a quantitative manner. In this article, we introduce optimization procedures to search for parameters in a quantitative model that can reproduce experimental data. The aim of optimization is to minimize the sum of squared errors (SSE) in a prediction or to maximize likelihood. A (local) maximum of likelihood or (local) minimum of the SSE can efficiently be identified using gradient approaches. Addition of a stochastic process enables us to identify the global maximum/minimum without becoming trapped in local maxima/minima. Sampling approaches take advantage of increasing computational power to test numerous sets of parameters in order to determine the optimum set. By combining Bayesian inference with gradient or sampling approaches, we can estimate both the optimum parameters and the form of the likelihood function related to the parameters. Finally, we introduce four examples of research that utilize parameter optimization to obtain biological insights from quantified data: transcriptional regulation, bacterial chemotaxis, morphogenesis, and cell cycle regulation. With practical knowledge of parameter optimization, cell and developmental biologists can develop realistic models that reproduce their observations and thus, obtain mechanistic insights into phenomena of interest.
The Maximum Likelihood Solution for Inclination-only Data
NASA Astrophysics Data System (ADS)
Arason, P.; Levi, S.
2006-12-01
The arithmetic means of inclination-only data are known to introduce a shallowing bias. Several methods have been proposed to estimate unbiased means of the inclination along with measures of the precision. Most of the inclination-only methods were designed to maximize the likelihood function of the marginal Fisher distribution. However, the exact analytical form of the maximum likelihood function is fairly complicated, and all these methods require various assumptions and approximations that are inappropriate for many data sets. For some steep and dispersed data sets, the estimates provided by these methods are significantly displaced from the peak of the likelihood function to systematically shallower inclinations. The problem in locating the maximum of the likelihood function is partly due to difficulties in accurately evaluating the function for all values of interest. This is because some elements of the log-likelihood function increase exponentially as precision parameters increase, leading to numerical instabilities. In this study we succeeded in analytically cancelling exponential elements from the likelihood function, and we are now able to calculate its value for any location in the parameter space and for any inclination-only data set, with full accuracy. Furtermore, we can now calculate the partial derivatives of the likelihood function with desired accuracy. Locating the maximum likelihood without the assumptions required by previous methods is now straight forward. The information to separate the mean inclination from the precision parameter will be lost for very steep and dispersed data sets. It is worth noting that the likelihood function always has a maximum value. However, for some dispersed and steep data sets with few samples, the likelihood function takes its highest value on the boundary of the parameter space, i.e. at inclinations of +/- 90 degrees, but with relatively well defined dispersion. Our simulations indicate that this occurs quite frequently for certain data sets, and relatively small perturbations in the data will drive the maxima to the boundary. We interpret this to indicate that, for such data sets, the information needed to separate the mean inclination and the precision parameter is permanently lost. To assess the reliability and accuracy of our method we generated large number of random Fisher-distributed data sets and used seven methods to estimate the mean inclination and precision paramenter. These comparisons are described by Levi and Arason at the 2006 AGU Fall meeting. The results of the various methods is very favourable to our new robust maximum likelihood method, which, on average, is the most reliable, and the mean inclination estimates are the least biased toward shallow values. Further information on our inclination-only analysis can be obtained from: http://www.vedur.is/~arason/paleomag
Branching-ratio approximation for the self-exciting Hawkes process
NASA Astrophysics Data System (ADS)
Hardiman, Stephen J.; Bouchaud, Jean-Philippe
2014-12-01
We introduce a model-independent approximation for the branching ratio of Hawkes self-exciting point processes. Our estimator requires knowing only the mean and variance of the event count in a sufficiently large time window, statistics that are readily obtained from empirical data. The method we propose greatly simplifies the estimation of the Hawkes branching ratio, recently proposed as a proxy for market endogeneity and formerly estimated using numerical likelihood maximization. We employ our method to support recent theoretical and experimental results indicating that the best fitting Hawkes model to describe S&P futures price changes is in fact critical (now and in the recent past) in light of the long memory of financial market activity.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kane, V.E.
1979-10-01
The standard maximum likelihood and moment estimation procedures are shown to have some undesirable characteristics for estimating the parameters in a three-parameter lognormal distribution. A class of goodness-of-fit estimators is found which provides a useful alternative to the standard methods. The class of goodness-of-fit tests considered include the Shapiro-Wilk and Shapiro-Francia tests which reduce to a weighted linear combination of the order statistics that can be maximized in estimation problems. The weighted-order statistic estimators are compared to the standard procedures in Monte Carlo simulations. Bias and robustness of the procedures are examined and example data sets analyzed including geochemical datamore » from the National Uranium Resource Evaluation Program.« less
Semiparametric Time-to-Event Modeling in the Presence of a Latent Progression Event
Rice, John D.; Tsodikov, Alex
2017-01-01
Summary In cancer research, interest frequently centers on factors influencing a latent event that must precede a terminal event. In practice it is often impossible to observe the latent event precisely, making inference about this process difficult. To address this problem, we propose a joint model for the unobserved time to the latent and terminal events, with the two events linked by the baseline hazard. Covariates enter the model parametrically as linear combinations that multiply, respectively, the hazard for the latent event and the hazard for the terminal event conditional on the latent one. We derive the partial likelihood estimators for this problem assuming the latent event is observed, and propose a profile likelihood–based method for estimation when the latent event is unobserved. The baseline hazard in this case is estimated nonparametrically using the EM algorithm, which allows for closed-form Breslow-type estimators at each iteration, bringing improved computational efficiency and stability compared with maximizing the marginal likelihood directly. We present simulation studies to illustrate the finite-sample properties of the method; its use in practice is demonstrated in the analysis of a prostate cancer data set. PMID:27556886
Automatic physical inference with information maximizing neural networks
NASA Astrophysics Data System (ADS)
Charnock, Tom; Lavaux, Guilhem; Wandelt, Benjamin D.
2018-04-01
Compressing large data sets to a manageable number of summaries that are informative about the underlying parameters vastly simplifies both frequentist and Bayesian inference. When only simulations are available, these summaries are typically chosen heuristically, so they may inadvertently miss important information. We introduce a simulation-based machine learning technique that trains artificial neural networks to find nonlinear functionals of data that maximize Fisher information: information maximizing neural networks (IMNNs). In test cases where the posterior can be derived exactly, likelihood-free inference based on automatically derived IMNN summaries produces nearly exact posteriors, showing that these summaries are good approximations to sufficient statistics. In a series of numerical examples of increasing complexity and astrophysical relevance we show that IMNNs are robustly capable of automatically finding optimal, nonlinear summaries of the data even in cases where linear compression fails: inferring the variance of Gaussian signal in the presence of noise, inferring cosmological parameters from mock simulations of the Lyman-α forest in quasar spectra, and inferring frequency-domain parameters from LISA-like detections of gravitational waveforms. In this final case, the IMNN summary outperforms linear data compression by avoiding the introduction of spurious likelihood maxima. We anticipate that the automatic physical inference method described in this paper will be essential to obtain both accurate and precise cosmological parameter estimates from complex and large astronomical data sets, including those from LSST and Euclid.
Elashoff, Robert M.; Li, Gang; Li, Ning
2009-01-01
Summary In this article we study a joint model for longitudinal measurements and competing risks survival data. Our joint model provides a flexible approach to handle possible nonignorable missing data in the longitudinal measurements due to dropout. It is also an extension of previous joint models with a single failure type, offering a possible way to model informatively censored events as a competing risk. Our model consists of a linear mixed effects submodel for the longitudinal outcome and a proportional cause-specific hazards frailty submodel (Prentice et al., 1978, Biometrics 34, 541-554) for the competing risks survival data, linked together by some latent random effects. We propose to obtain the maximum likelihood estimates of the parameters by an expectation maximization (EM) algorithm and estimate their standard errors using a profile likelihood method. The developed method works well in our simulation studies and is applied to a clinical trial for the scleroderma lung disease. PMID:18162112
Bayesian image reconstruction for improving detection performance of muon tomography.
Wang, Guobao; Schultz, Larry J; Qi, Jinyi
2009-05-01
Muon tomography is a novel technology that is being developed for detecting high-Z materials in vehicles or cargo containers. Maximum likelihood methods have been developed for reconstructing the scattering density image from muon measurements. However, the instability of maximum likelihood estimation often results in noisy images and low detectability of high-Z targets. In this paper, we propose using regularization to improve the image quality of muon tomography. We formulate the muon reconstruction problem in a Bayesian framework by introducing a prior distribution on scattering density images. An iterative shrinkage algorithm is derived to maximize the log posterior distribution. At each iteration, the algorithm obtains the maximum a posteriori update by shrinking an unregularized maximum likelihood update. Inverse quadratic shrinkage functions are derived for generalized Laplacian priors and inverse cubic shrinkage functions are derived for generalized Gaussian priors. Receiver operating characteristic studies using simulated data demonstrate that the Bayesian reconstruction can greatly improve the detection performance of muon tomography.
PREMIX: PRivacy-preserving EstiMation of Individual admiXture.
Chen, Feng; Dow, Michelle; Ding, Sijie; Lu, Yao; Jiang, Xiaoqian; Tang, Hua; Wang, Shuang
2016-01-01
In this paper we proposed a framework: PRivacy-preserving EstiMation of Individual admiXture (PREMIX) using Intel software guard extensions (SGX). SGX is a suite of software and hardware architectures to enable efficient and secure computation over confidential data. PREMIX enables multiple sites to securely collaborate on estimating individual admixture within a secure enclave inside Intel SGX. We implemented a feature selection module to identify most discriminative Single Nucleotide Polymorphism (SNP) based on informativeness and an Expectation Maximization (EM)-based Maximum Likelihood estimator to identify the individual admixture. Experimental results based on both simulation and 1000 genome data demonstrated the efficiency and accuracy of the proposed framework. PREMIX ensures a high level of security as all operations on sensitive genomic data are conducted within a secure enclave using SGX.
Estimating linear-nonlinear models using Rényi divergences
Kouh, Minjoon; Sharpee, Tatyana O.
2009-01-01
This paper compares a family of methods for characterizing neural feature selectivity using natural stimuli in the framework of the linear-nonlinear model. In this model, the spike probability depends in a nonlinear way on a small number of stimulus dimensions. The relevant stimulus dimensions can be found by optimizing a Rényi divergence that quantifies a change in the stimulus distribution associated with the arrival of single spikes. Generally, good reconstructions can be obtained based on optimization of Rényi divergence of any order, even in the limit of small numbers of spikes. However, the smallest error is obtained when the Rényi divergence of order 1 is optimized. This type of optimization is equivalent to information maximization, and is shown to saturate the Cramér-Rao bound describing the smallest error allowed for any unbiased method. We also discuss conditions under which information maximization provides a convenient way to perform maximum likelihood estimation of linear-nonlinear models from neural data. PMID:19568981
Estimating linear-nonlinear models using Renyi divergences.
Kouh, Minjoon; Sharpee, Tatyana O
2009-01-01
This article compares a family of methods for characterizing neural feature selectivity using natural stimuli in the framework of the linear-nonlinear model. In this model, the spike probability depends in a nonlinear way on a small number of stimulus dimensions. The relevant stimulus dimensions can be found by optimizing a Rényi divergence that quantifies a change in the stimulus distribution associated with the arrival of single spikes. Generally, good reconstructions can be obtained based on optimization of Rényi divergence of any order, even in the limit of small numbers of spikes. However, the smallest error is obtained when the Rényi divergence of order 1 is optimized. This type of optimization is equivalent to information maximization, and is shown to saturate the Cramer-Rao bound describing the smallest error allowed for any unbiased method. We also discuss conditions under which information maximization provides a convenient way to perform maximum likelihood estimation of linear-nonlinear models from neural data.
Efficient Maximum Likelihood Estimation for Pedigree Data with the Sum-Product Algorithm.
Engelhardt, Alexander; Rieger, Anna; Tresch, Achim; Mansmann, Ulrich
2016-01-01
We analyze data sets consisting of pedigrees with age at onset of colorectal cancer (CRC) as phenotype. The occurrence of familial clusters of CRC suggests the existence of a latent, inheritable risk factor. We aimed to compute the probability of a family possessing this risk factor as well as the hazard rate increase for these risk factor carriers. Due to the inheritability of this risk factor, the estimation necessitates a costly marginalization of the likelihood. We propose an improved EM algorithm by applying factor graphs and the sum-product algorithm in the E-step. This reduces the computational complexity from exponential to linear in the number of family members. Our algorithm is as precise as a direct likelihood maximization in a simulation study and a real family study on CRC risk. For 250 simulated families of size 19 and 21, the runtime of our algorithm is faster by a factor of 4 and 29, respectively. On the largest family (23 members) in the real data, our algorithm is 6 times faster. We introduce a flexible and runtime-efficient tool for statistical inference in biomedical event data with latent variables that opens the door for advanced analyses of pedigree data. © 2017 S. Karger AG, Basel.
NASA Astrophysics Data System (ADS)
Klein, Iris M.; Rousseau, Alain N.; Frigon, Anne; Freudiger, Daphné; Gagnon, Patrick
2016-06-01
Probable maximum snow accumulation (PMSA) is one of the key variables used to estimate the spring probable maximum flood (PMF). A robust methodology for evaluating the PMSA is imperative so the ensuing spring PMF is a reasonable estimation. This is of particular importance in times of climate change (CC) since it is known that solid precipitation in Nordic landscapes will in all likelihood change over the next century. In this paper, a PMSA methodology based on simulated data from regional climate models is developed. Moisture maximization represents the core concept of the proposed methodology; precipitable water being the key variable. Results of stationarity tests indicate that CC will affect the monthly maximum precipitable water and, thus, the ensuing ratio to maximize important snowfall events. Therefore, a non-stationary approach is used to describe the monthly maximum precipitable water. Outputs from three simulations produced by the Canadian Regional Climate Model were used to give first estimates of potential PMSA changes for southern Quebec, Canada. A sensitivity analysis of the computed PMSA was performed with respect to the number of time-steps used (so-called snowstorm duration) and the threshold for a snowstorm to be maximized or not. The developed methodology is robust and a powerful tool to estimate the relative change of the PMSA. Absolute results are in the same order of magnitude as those obtained with the traditional method and observed data; but are also found to depend strongly on the climate projection used and show spatial variability.
González, M; Gutiérrez, C; Martínez, R
2012-09-01
A two-dimensional bisexual branching process has recently been presented for the analysis of the generation-to-generation evolution of the number of carriers of a Y-linked gene. In this model, preference of females for males with a specific genetic characteristic is assumed to be determined by an allele of the gene. It has been shown that the behavior of this kind of Y-linked gene is strongly related to the reproduction law of each genotype. In practice, the corresponding offspring distributions are usually unknown, and it is necessary to develop their estimation theory in order to determine the natural selection of the gene. Here we deal with the estimation problem for the offspring distribution of each genotype of a Y-linked gene when the only observable data are each generation's total numbers of males of each genotype and of females. We set out the problem in a non parametric framework and obtain the maximum likelihood estimators of the offspring distributions using an expectation-maximization algorithm. From these estimators, we also derive the estimators for the reproduction mean of each genotype and forecast the distribution of the future population sizes. Finally, we check the accuracy of the algorithm by means of a simulation study.
Yang, Huan; Meijer, Hil G E; Buitenweg, Jan R; van Gils, Stephan A
2016-01-01
Healthy or pathological states of nociceptive subsystems determine different stimulus-response relations measured from quantitative sensory testing. In turn, stimulus-response measurements may be used to assess these states. In a recently developed computational model, six model parameters characterize activation of nerve endings and spinal neurons. However, both model nonlinearity and limited information in yes-no detection responses to electrocutaneous stimuli challenge to estimate model parameters. Here, we address the question whether and how one can overcome these difficulties for reliable parameter estimation. First, we fit the computational model to experimental stimulus-response pairs by maximizing the likelihood. To evaluate the balance between model fit and complexity, i.e., the number of model parameters, we evaluate the Bayesian Information Criterion. We find that the computational model is better than a conventional logistic model regarding the balance. Second, our theoretical analysis suggests to vary the pulse width among applied stimuli as a necessary condition to prevent structural non-identifiability. In addition, the numerically implemented profile likelihood approach reveals structural and practical non-identifiability. Our model-based approach with integration of psychophysical measurements can be useful for a reliable assessment of states of the nociceptive system.
Robust generative asymmetric GMM for brain MR image segmentation.
Ji, Zexuan; Xia, Yong; Zheng, Yuhui
2017-11-01
Accurate segmentation of brain tissues from magnetic resonance (MR) images based on the unsupervised statistical models such as Gaussian mixture model (GMM) has been widely studied during last decades. However, most GMM based segmentation methods suffer from limited accuracy due to the influences of noise and intensity inhomogeneity in brain MR images. To further improve the accuracy for brain MR image segmentation, this paper presents a Robust Generative Asymmetric GMM (RGAGMM) for simultaneous brain MR image segmentation and intensity inhomogeneity correction. First, we develop an asymmetric distribution to fit the data shapes, and thus construct a spatial constrained asymmetric model. Then, we incorporate two pseudo-likelihood quantities and bias field estimation into the model's log-likelihood, aiming to exploit the neighboring priors of within-cluster and between-cluster and to alleviate the impact of intensity inhomogeneity, respectively. Finally, an expectation maximization algorithm is derived to iteratively maximize the approximation of the data log-likelihood function to overcome the intensity inhomogeneity in the image and segment the brain MR images simultaneously. To demonstrate the performances of the proposed algorithm, we first applied the proposed algorithm to a synthetic brain MR image to show the intermediate illustrations and the estimated distribution of the proposed algorithm. The next group of experiments is carried out in clinical 3T-weighted brain MR images which contain quite serious intensity inhomogeneity and noise. Then we quantitatively compare our algorithm to state-of-the-art segmentation approaches by using Dice coefficient (DC) on benchmark images obtained from IBSR and BrainWeb with different level of noise and intensity inhomogeneity. The comparison results on various brain MR images demonstrate the superior performances of the proposed algorithm in dealing with the noise and intensity inhomogeneity. In this paper, the RGAGMM algorithm is proposed which can simply and efficiently incorporate spatial constraints into an EM framework to simultaneously segment brain MR images and estimate the intensity inhomogeneity. The proposed algorithm is flexible to fit the data shapes, and can simultaneously overcome the influence of noise and intensity inhomogeneity, and hence is capable of improving over 5% segmentation accuracy comparing with several state-of-the-art algorithms. Copyright © 2017 Elsevier B.V. All rights reserved.
Davidov, Ori; Rosen, Sophia
2011-04-01
In medical studies, endpoints are often measured for each patient longitudinally. The mixed-effects model has been a useful tool for the analysis of such data. There are situations in which the parameters of the model are subject to some restrictions or constraints. For example, in hearing loss studies, we expect hearing to deteriorate with time. This means that hearing thresholds which reflect hearing acuity will, on average, increase over time. Therefore, the regression coefficients associated with the mean effect of time on hearing ability will be constrained. Such constraints should be accounted for in the analysis. We propose maximum likelihood estimation procedures, based on the expectation-conditional maximization either algorithm, to estimate the parameters of the model while accounting for the constraints on them. The proposed methods improve, in terms of mean square error, on the unconstrained estimators. In some settings, the improvement may be substantial. Hypotheses testing procedures that incorporate the constraints are developed. Specifically, likelihood ratio, Wald, and score tests are proposed and investigated. Their empirical significance levels and power are studied using simulations. It is shown that incorporating the constraints improves the mean squared error of the estimates and the power of the tests. These improvements may be substantial. The methodology is used to analyze a hearing loss study.
Static shape control for flexible structures
NASA Technical Reports Server (NTRS)
Rodriguez, G.; Scheid, R. E., Jr.
1986-01-01
An integrated methodology is described for defining static shape control laws for large flexible structures. The techniques include modeling, identifying and estimating the control laws of distributed systems characterized in terms of infinite dimensional state and parameter spaces. The models are expressed as interconnected elliptic partial differential equations governing a range of static loads, with the capability of analyzing electromagnetic fields around antenna systems. A second-order analysis is carried out for statistical errors, and model parameters are determined by maximizing an appropriate defined likelihood functional which adjusts the model to observational data. The parameter estimates are derived from the conditional mean of the observational data, resulting in a least squares superposition of shape functions obtained from the structural model.
A Poisson Log-Normal Model for Constructing Gene Covariation Network Using RNA-seq Data.
Choi, Yoonha; Coram, Marc; Peng, Jie; Tang, Hua
2017-07-01
Constructing expression networks using transcriptomic data is an effective approach for studying gene regulation. A popular approach for constructing such a network is based on the Gaussian graphical model (GGM), in which an edge between a pair of genes indicates that the expression levels of these two genes are conditionally dependent, given the expression levels of all other genes. However, GGMs are not appropriate for non-Gaussian data, such as those generated in RNA-seq experiments. We propose a novel statistical framework that maximizes a penalized likelihood, in which the observed count data follow a Poisson log-normal distribution. To overcome the computational challenges, we use Laplace's method to approximate the likelihood and its gradients, and apply the alternating directions method of multipliers to find the penalized maximum likelihood estimates. The proposed method is evaluated and compared with GGMs using both simulated and real RNA-seq data. The proposed method shows improved performance in detecting edges that represent covarying pairs of genes, particularly for edges connecting low-abundant genes and edges around regulatory hubs.
Multi-Contrast Multi-Atlas Parcellation of Diffusion Tensor Imaging of the Human Brain
Tang, Xiaoying; Yoshida, Shoko; Hsu, John; Huisman, Thierry A. G. M.; Faria, Andreia V.; Oishi, Kenichi; Kutten, Kwame; Poretti, Andrea; Li, Yue; Miller, Michael I.; Mori, Susumu
2014-01-01
In this paper, we propose a novel method for parcellating the human brain into 193 anatomical structures based on diffusion tensor images (DTIs). This was accomplished in the setting of multi-contrast diffeomorphic likelihood fusion using multiple DTI atlases. DTI images are modeled as high dimensional fields, with each voxel exhibiting a vector valued feature comprising of mean diffusivity (MD), fractional anisotropy (FA), and fiber angle. For each structure, the probability distribution of each element in the feature vector is modeled as a mixture of Gaussians, the parameters of which are estimated from the labeled atlases. The structure-specific feature vector is then used to parcellate the test image. For each atlas, a likelihood is iteratively computed based on the structure-specific vector feature. The likelihoods from multiple atlases are then fused. The updating and fusing of the likelihoods is achieved based on the expectation-maximization (EM) algorithm for maximum a posteriori (MAP) estimation problems. We first demonstrate the performance of the algorithm by examining the parcellation accuracy of 18 structures from 25 subjects with a varying degree of structural abnormality. Dice values ranging 0.8–0.9 were obtained. In addition, strong correlation was found between the volume size of the automated and the manual parcellation. Then, we present scan-rescan reproducibility based on another dataset of 16 DTI images – an average of 3.73%, 1.91%, and 1.79% for volume, mean FA, and mean MD respectively. Finally, the range of anatomical variability in the normal population was quantified for each structure. PMID:24809486
NASA Astrophysics Data System (ADS)
Perlovsky, Leonid I.; Webb, Virgil H.; Bradley, Scott R.; Hansen, Christopher A.
1998-07-01
An advanced detection and tracking system is being developed for the U.S. Navy's Relocatable Over-the-Horizon Radar (ROTHR) to provide improved tracking performance against small aircraft typically used in drug-smuggling activities. The development is based on the Maximum Likelihood Adaptive Neural System (MLANS), a model-based neural network that combines advantages of neural network and model-based algorithmic approaches. The objective of the MLANS tracker development effort is to address user requirements for increased detection and tracking capability in clutter and improved track position, heading, and speed accuracy. The MLANS tracker is expected to outperform other approaches to detection and tracking for the following reasons. It incorporates adaptive internal models of target return signals, target tracks and maneuvers, and clutter signals, which leads to concurrent clutter suppression, detection, and tracking (track-before-detect). It is not combinatorial and thus does not require any thresholding or peak picking and can track in low signal-to-noise conditions. It incorporates superresolution spectrum estimation techniques exceeding the performance of conventional maximum likelihood and maximum entropy methods. The unique spectrum estimation method is based on the Einsteinian interpretation of the ROTHR received energy spectrum as a probability density of signal frequency. The MLANS neural architecture and learning mechanism are founded on spectrum models and maximization of the "Einsteinian" likelihood, allowing knowledge of the physical behavior of both targets and clutter to be injected into the tracker algorithms. The paper describes the addressed requirements and expected improvements, theoretical foundations, engineering methodology, and results of the development effort to date.
Pal, Suvra; Balakrishnan, N
2017-10-01
In this paper, we consider a competing cause scenario and assume the number of competing causes to follow a Conway-Maxwell Poisson distribution which can capture both over and under dispersion that is usually encountered in discrete data. Assuming the population of interest having a component cure and the form of the data to be interval censored, as opposed to the usually considered right-censored data, the main contribution is in developing the steps of the expectation maximization algorithm for the determination of the maximum likelihood estimates of the model parameters of the flexible Conway-Maxwell Poisson cure rate model with Weibull lifetimes. An extensive Monte Carlo simulation study is carried out to demonstrate the performance of the proposed estimation method. Model discrimination within the Conway-Maxwell Poisson distribution is addressed using the likelihood ratio test and information-based criteria to select a suitable competing cause distribution that provides the best fit to the data. A simulation study is also carried out to demonstrate the loss in efficiency when selecting an improper competing cause distribution which justifies the use of a flexible family of distributions for the number of competing causes. Finally, the proposed methodology and the flexibility of the Conway-Maxwell Poisson distribution are illustrated with two known data sets from the literature: smoking cessation data and breast cosmesis data.
Improving z-tracking accuracy in the two-photon single-particle tracking microscope.
Liu, C; Liu, Y-L; Perillo, E P; Jiang, N; Dunn, A K; Yeh, H-C
2015-10-12
Here, we present a method that can improve the z-tracking accuracy of the recently invented TSUNAMI (Tracking of Single particles Using Nonlinear And Multiplexed Illumination) microscope. This method utilizes a maximum likelihood estimator (MLE) to determine the particle's 3D position that maximizes the likelihood of the observed time-correlated photon count distribution. Our Monte Carlo simulations show that the MLE-based tracking scheme can improve the z-tracking accuracy of TSUNAMI microscope by 1.7 fold. In addition, MLE is also found to reduce the temporal correlation of the z-tracking error. Taking advantage of the smaller and less temporally correlated z-tracking error, we have precisely recovered the hybridization-melting kinetics of a DNA model system from thousands of short single-particle trajectories in silico . Our method can be generally applied to other 3D single-particle tracking techniques.
Monte Carlo-based Reconstruction in Water Cherenkov Detectors using Chroma
NASA Astrophysics Data System (ADS)
Seibert, Stanley; Latorre, Anthony
2012-03-01
We demonstrate the feasibility of event reconstruction---including position, direction, energy and particle identification---in water Cherenkov detectors with a purely Monte Carlo-based method. Using a fast optical Monte Carlo package we have written, called Chroma, in combination with several variance reduction techniques, we can estimate the value of a likelihood function for an arbitrary event hypothesis. The likelihood can then be maximized over the parameter space of interest using a form of gradient descent designed for stochastic functions. Although slower than more traditional reconstruction algorithms, this completely Monte Carlo-based technique is universal and can be applied to a detector of any size or shape, which is a major advantage during the design phase of an experiment. As a specific example, we focus on reconstruction results from a simulation of the 200 kiloton water Cherenkov far detector option for LBNE.
Improving z-tracking accuracy in the two-photon single-particle tracking microscope
DOE Office of Scientific and Technical Information (OSTI.GOV)
Liu, C.; Liu, Y.-L.; Perillo, E. P.
Here, we present a method that can improve the z-tracking accuracy of the recently invented TSUNAMI (Tracking of Single particles Using Nonlinear And Multiplexed Illumination) microscope. This method utilizes a maximum likelihood estimator (MLE) to determine the particle's 3D position that maximizes the likelihood of the observed time-correlated photon count distribution. Our Monte Carlo simulations show that the MLE-based tracking scheme can improve the z-tracking accuracy of TSUNAMI microscope by 1.7 fold. In addition, MLE is also found to reduce the temporal correlation of the z-tracking error. Taking advantage of the smaller and less temporally correlated z-tracking error, we havemore » precisely recovered the hybridization-melting kinetics of a DNA model system from thousands of short single-particle trajectories in silico. Our method can be generally applied to other 3D single-particle tracking techniques.« less
Spatiotemporal reconstruction of list-mode PET data.
Nichols, Thomas E; Qi, Jinyi; Asma, Evren; Leahy, Richard M
2002-04-01
We describe a method for computing a continuous time estimate of tracer density using list-mode positron emission tomography data. The rate function in each voxel is modeled as an inhomogeneous Poisson process whose rate function can be represented using a cubic B-spline basis. The rate functions are estimated by maximizing the likelihood of the arrival times of detected photon pairs over the control vertices of the spline, modified by quadratic spatial and temporal smoothness penalties and a penalty term to enforce nonnegativity. Randoms rate functions are estimated by assuming independence between the spatial and temporal randoms distributions. Similarly, scatter rate functions are estimated by assuming spatiotemporal independence and that the temporal distribution of the scatter is proportional to the temporal distribution of the trues. A quantitative evaluation was performed using simulated data and the method is also demonstrated in a human study using 11C-raclopride.
Estimating Likelihood of Fetal In Vivo Interactions Using In ...
Tox21/ToxCast efforts provide in vitro concentration-response data for thousands of compounds. Predicting whether chemical-biological interactions observed in vitro will occur in vivo is challenging. We hypothesize that using a modified model from the FDA guidance for drug interaction studies, Cmax/AC50 (i.e., maximal in vivo blood concentration over the half-maximal in in vitro activity concentration), will give a useful approximation for concentrations where in vivo interactions are likely. Further, for doses where maternal blood concentrations are likely to elicit an interaction (Cmax/AC50>0.1), where do the compounds accumulate in fetal tissues? In order to estimate these doses based on Tox21 data, in silico parameters of chemical fraction unbound in plasma and intrinsic hepatic clearance were estimated from ADMET predictor (Simulations-Plus Inc.) and used in the HTTK R-package to obtain Cmax values from a physiologically-based toxicokinetics model. In silico estimated Cmax values predicted in vivo human Cmax with median absolute error of 0.81 for 93 chemicals, giving confidence in the R-package and in silico estimates. A case example evaluating Cmax/AC50 values for peroxisome proliferator-activated receptor gamma (PPARγ) and glucocorticoid receptor revealed known compounds (glitazones and corticosteroids, respectively) highest on the list at pharmacological doses. Doses required to elicit likely interactions across all Tox21/ToxCast assays were compared to
Parameter Estimation for Thurstone Choice Models
DOE Office of Scientific and Technical Information (OSTI.GOV)
Vojnovic, Milan; Yun, Seyoung
We consider the estimation accuracy of individual strength parameters of a Thurstone choice model when each input observation consists of a choice of one item from a set of two or more items (so called top-1 lists). This model accommodates the well-known choice models such as the Luce choice model for comparison sets of two or more items and the Bradley-Terry model for pair comparisons. We provide a tight characterization of the mean squared error of the maximum likelihood parameter estimator. We also provide similar characterizations for parameter estimators defined by a rank-breaking method, which amounts to deducing one ormore » more pair comparisons from a comparison of two or more items, assuming independence of these pair comparisons, and maximizing a likelihood function derived under these assumptions. We also consider a related binary classification problem where each individual parameter takes value from a set of two possible values and the goal is to correctly classify all items within a prescribed classification error. The results of this paper shed light on how the parameter estimation accuracy depends on given Thurstone choice model and the structure of comparison sets. In particular, we found that for unbiased input comparison sets of a given cardinality, when in expectation each comparison set of given cardinality occurs the same number of times, for a broad class of Thurstone choice models, the mean squared error decreases with the cardinality of comparison sets, but only marginally according to a diminishing returns relation. On the other hand, we found that there exist Thurstone choice models for which the mean squared error of the maximum likelihood parameter estimator can decrease much faster with the cardinality of comparison sets. We report empirical evaluation of some claims and key parameters revealed by theory using both synthetic and real-world input data from some popular sport competitions and online labor platforms.« less
Linkage disequilibrium interval mapping of quantitative trait loci.
Boitard, Simon; Abdallah, Jihad; de Rochambeau, Hubert; Cierco-Ayrolles, Christine; Mangin, Brigitte
2006-03-16
For many years gene mapping studies have been performed through linkage analyses based on pedigree data. Recently, linkage disequilibrium methods based on unrelated individuals have been advocated as powerful tools to refine estimates of gene location. Many strategies have been proposed to deal with simply inherited disease traits. However, locating quantitative trait loci is statistically more challenging and considerable research is needed to provide robust and computationally efficient methods. Under a three-locus Wright-Fisher model, we derived approximate expressions for the expected haplotype frequencies in a population. We considered haplotypes comprising one trait locus and two flanking markers. Using these theoretical expressions, we built a likelihood-maximization method, called HAPim, for estimating the location of a quantitative trait locus. For each postulated position, the method only requires information from the two flanking markers. Over a wide range of simulation scenarios it was found to be more accurate than a two-marker composite likelihood method. It also performed as well as identity by descent methods, whilst being valuable in a wider range of populations. Our method makes efficient use of marker information, and can be valuable for fine mapping purposes. Its performance is increased if multiallelic markers are available. Several improvements can be developed to account for more complex evolution scenarios or provide robust confidence intervals for the location estimates.
Ng, S K; McLachlan, G J
2003-04-15
We consider a mixture model approach to the regression analysis of competing-risks data. Attention is focused on inference concerning the effects of factors on both the probability of occurrence and the hazard rate conditional on each of the failure types. These two quantities are specified in the mixture model using the logistic model and the proportional hazards model, respectively. We propose a semi-parametric mixture method to estimate the logistic and regression coefficients jointly, whereby the component-baseline hazard functions are completely unspecified. Estimation is based on maximum likelihood on the basis of the full likelihood, implemented via an expectation-conditional maximization (ECM) algorithm. Simulation studies are performed to compare the performance of the proposed semi-parametric method with a fully parametric mixture approach. The results show that when the component-baseline hazard is monotonic increasing, the semi-parametric and fully parametric mixture approaches are comparable for mildly and moderately censored samples. When the component-baseline hazard is not monotonic increasing, the semi-parametric method consistently provides less biased estimates than a fully parametric approach and is comparable in efficiency in the estimation of the parameters for all levels of censoring. The methods are illustrated using a real data set of prostate cancer patients treated with different dosages of the drug diethylstilbestrol. Copyright 2003 John Wiley & Sons, Ltd.
NASA Technical Reports Server (NTRS)
Lennington, R. K.; Malek, H.
1978-01-01
A clustering method, CLASSY, was developed, which alternates maximum likelihood iteration with a procedure for splitting, combining, and eliminating the resulting statistics. The method maximizes the fit of a mixture of normal distributions to the observed first through fourth central moments of the data and produces an estimate of the proportions, means, and covariances in this mixture. The mathematical model which is the basic for CLASSY and the actual operation of the algorithm is described. Data comparing the performances of CLASSY and ISOCLS on simulated and actual LACIE data are presented.
Fast estimation of diffusion tensors under Rician noise by the EM algorithm.
Liu, Jia; Gasbarra, Dario; Railavo, Juha
2016-01-15
Diffusion tensor imaging (DTI) is widely used to characterize, in vivo, the white matter of the central nerve system (CNS). This biological tissue contains much anatomic, structural and orientational information of fibers in human brain. Spectral data from the displacement distribution of water molecules located in the brain tissue are collected by a magnetic resonance scanner and acquired in the Fourier domain. After the Fourier inversion, the noise distribution is Gaussian in both real and imaginary parts and, as a consequence, the recorded magnitude data are corrupted by Rician noise. Statistical estimation of diffusion leads a non-linear regression problem. In this paper, we present a fast computational method for maximum likelihood estimation (MLE) of diffusivities under the Rician noise model based on the expectation maximization (EM) algorithm. By using data augmentation, we are able to transform a non-linear regression problem into the generalized linear modeling framework, reducing dramatically the computational cost. The Fisher-scoring method is used for achieving fast convergence of the tensor parameter. The new method is implemented and applied using both synthetic and real data in a wide range of b-amplitudes up to 14,000s/mm(2). Higher accuracy and precision of the Rician estimates are achieved compared with other log-normal based methods. In addition, we extend the maximum likelihood (ML) framework to the maximum a posteriori (MAP) estimation in DTI under the aforementioned scheme by specifying the priors. We will describe how close numerically are the estimators of model parameters obtained through MLE and MAP estimation. Copyright © 2015 Elsevier B.V. All rights reserved.
IMNN: Information Maximizing Neural Networks
NASA Astrophysics Data System (ADS)
Charnock, Tom; Lavaux, Guilhem; Wandelt, Benjamin D.
2018-04-01
This software trains artificial neural networks to find non-linear functionals of data that maximize Fisher information: information maximizing neural networks (IMNNs). As compressing large data sets vastly simplifies both frequentist and Bayesian inference, important information may be inadvertently missed. Likelihood-free inference based on automatically derived IMNN summaries produces summaries that are good approximations to sufficient statistics. IMNNs are robustly capable of automatically finding optimal, non-linear summaries of the data even in cases where linear compression fails: inferring the variance of Gaussian signal in the presence of noise, inferring cosmological parameters from mock simulations of the Lyman-α forest in quasar spectra, and inferring frequency-domain parameters from LISA-like detections of gravitational waveforms. In this final case, the IMNN summary outperforms linear data compression by avoiding the introduction of spurious likelihood maxima.
MXLKID: a maximum likelihood parameter identifier. [In LRLTRAN for CDC 7600
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gavel, D.T.
MXLKID (MaXimum LiKelihood IDentifier) is a computer program designed to identify unknown parameters in a nonlinear dynamic system. Using noisy measurement data from the system, the maximum likelihood identifier computes a likelihood function (LF). Identification of system parameters is accomplished by maximizing the LF with respect to the parameters. The main body of this report briefly summarizes the maximum likelihood technique and gives instructions and examples for running the MXLKID program. MXLKID is implemented LRLTRAN on the CDC7600 computer at LLNL. A detailed mathematical description of the algorithm is given in the appendices. 24 figures, 6 tables.
Dual-Particle Imaging System with Neutron Spectroscopy for Safeguard Applications
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hamel, Michael C.; Weber, Thomas M.
2017-11-01
A dual-particle imager (DPI) has been designed that is capable of detecting gamma-ray and neutron signatures from shielded SNM. The system combines liquid organic and NaI(Tl) scintillators to form a combined Compton and neutron scatter camera. Effective image reconstruction of detected particles is a crucial component for maximizing the performance of the system; however, a key deficiency exists in the widely used iterative list-mode maximum-likelihood estimation-maximization (MLEM) image reconstruction technique. For MLEM a stopping condition is required to achieve a good quality solution but these conditions fail to achieve maximum image quality. Stochastic origin ensembles (SOE) imaging is a goodmore » candidate to address this problem as it uses Markov chain Monte Carlo to reach a stochastic steady-state solution. The application of SOE to the DPI is presented in this work.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chen, Wei-Chen; Maitra, Ranjan
2011-01-01
We propose a model-based approach for clustering time series regression data in an unsupervised machine learning framework to identify groups under the assumption that each mixture component follows a Gaussian autoregressive regression model of order p. Given the number of groups, the traditional maximum likelihood approach of estimating the parameters using the expectation-maximization (EM) algorithm can be employed, although it is computationally demanding. The somewhat fast tune to the EM folk song provided by the Alternating Expectation Conditional Maximization (AECM) algorithm can alleviate the problem to some extent. In this article, we develop an alternative partial expectation conditional maximization algorithmmore » (APECM) that uses an additional data augmentation storage step to efficiently implement AECM for finite mixture models. Results on our simulation experiments show improved performance in both fewer numbers of iterations and computation time. The methodology is applied to the problem of clustering mutual funds data on the basis of their average annual per cent returns and in the presence of economic indicators.« less
Bromaghin, Jeffrey F.; Gates, Kenneth S.; Palmer, Douglas E.
2010-01-01
Many fisheries for Pacific salmon Oncorhynchus spp. are actively managed to meet escapement goal objectives. In fisheries where the demand for surplus production is high, an extensive assessment program is needed to achieve the opposing objectives of allowing adequate escapement and fully exploiting the available surplus. Knowledge of abundance is a critical element of such assessment programs. Abundance estimation using mark—recapture experiments in combination with telemetry has become common in recent years, particularly within Alaskan river systems. Fish are typically captured and marked in the lower river while migrating in aggregations of individuals from multiple populations. Recapture data are obtained using telemetry receivers that are co-located with abundance assessment projects near spawning areas, which provide large sample sizes and information on population-specific mark rates. When recapture data are obtained from multiple populations, unequal mark rates may reflect a violation of the assumption of homogeneous capture probabilities. A common analytical strategy is to test the hypothesis that mark rates are homogeneous and combine all recapture data if the test is not significant. However, mark rates are often low, and a test of homogeneity may lack sufficient power to detect meaningful differences among populations. In addition, differences among mark rates may provide information that could be exploited during parameter estimation. We present a temporally stratified mark—recapture model that permits capture probabilities and migratory timing through the capture area to vary among strata. Abundance information obtained from a subset of populations after the populations have segregated for spawning is jointly modeled with telemetry distribution data by use of a likelihood function. Maximization of the likelihood produces estimates of the abundance and timing of individual populations migrating through the capture area, thus yielding substantially more information than the total abundance estimate provided by the conventional approach. The utility of the model is illustrated with data for coho salmon O. kisutch from the Kasilof River in south-central Alaska.
Alam, M S; Bognar, J G; Cain, S; Yasuda, B J
1998-03-10
During the process of microscanning a controlled vibrating mirror typically is used to produce subpixel shifts in a sequence of forward-looking infrared (FLIR) images. If the FLIR is mounted on a moving platform, such as an aircraft, uncontrolled random vibrations associated with the platform can be used to generate the shifts. Iterative techniques such as the expectation-maximization (EM) approach by means of the maximum-likelihood algorithm can be used to generate high-resolution images from multiple randomly shifted aliased frames. In the maximum-likelihood approach the data are considered to be Poisson random variables and an EM algorithm is developed that iteratively estimates an unaliased image that is compensated for known imager-system blur while it simultaneously estimates the translational shifts. Although this algorithm yields high-resolution images from a sequence of randomly shifted frames, it requires significant computation time and cannot be implemented for real-time applications that use the currently available high-performance processors. The new image shifts are iteratively calculated by evaluation of a cost function that compares the shifted and interlaced data frames with the corresponding values in the algorithm's latest estimate of the high-resolution image. We present a registration algorithm that estimates the shifts in one step. The shift parameters provided by the new algorithm are accurate enough to eliminate the need for iterative recalculation of translational shifts. Using this shift information, we apply a simplified version of the EM algorithm to estimate a high-resolution image from a given sequence of video frames. The proposed modified EM algorithm has been found to reduce significantly the computational burden when compared with the original EM algorithm, thus making it more attractive for practical implementation. Both simulation and experimental results are presented to verify the effectiveness of the proposed technique.
Dynamic Histogram Analysis To Determine Free Energies and Rates from Biased Simulations.
Stelzl, Lukas S; Kells, Adam; Rosta, Edina; Hummer, Gerhard
2017-12-12
We present an algorithm to calculate free energies and rates from molecular simulations on biased potential energy surfaces. As input, it uses the accumulated times spent in each state or bin of a histogram and counts of transitions between them. Optimal unbiased equilibrium free energies for each of the states/bins are then obtained by maximizing the likelihood of a master equation (i.e., first-order kinetic rate model). The resulting free energies also determine the optimal rate coefficients for transitions between the states or bins on the biased potentials. Unbiased rates can be estimated, e.g., by imposing a linear free energy condition in the likelihood maximization. The resulting "dynamic histogram analysis method extended to detailed balance" (DHAMed) builds on the DHAM method. It is also closely related to the transition-based reweighting analysis method (TRAM) and the discrete TRAM (dTRAM). However, in the continuous-time formulation of DHAMed, the detailed balance constraints are more easily accounted for, resulting in compact expressions amenable to efficient numerical treatment. DHAMed produces accurate free energies in cases where the common weighted-histogram analysis method (WHAM) for umbrella sampling fails because of slow dynamics within the windows. Even in the limit of completely uncorrelated data, where WHAM is optimal in the maximum-likelihood sense, DHAMed results are nearly indistinguishable. We illustrate DHAMed with applications to ion channel conduction, RNA duplex formation, α-helix folding, and rate calculations from accelerated molecular dynamics. DHAMed can also be used to construct Markov state models from biased or replica-exchange molecular dynamics simulations. By using binless WHAM formulated as a numerical minimization problem, the bias factors for the individual states can be determined efficiently in a preprocessing step and, if needed, optimized globally afterward.
Combining Ratio Estimation for Low Density Parity Check (LDPC) Coding
NASA Technical Reports Server (NTRS)
Mahmoud, Saad; Hi, Jianjun
2012-01-01
The Low Density Parity Check (LDPC) Code decoding algorithm make use of a scaled receive signal derived from maximizing the log-likelihood ratio of the received signal. The scaling factor (often called the combining ratio) in an AWGN channel is a ratio between signal amplitude and noise variance. Accurately estimating this ratio has shown as much as 0.6 dB decoding performance gain. This presentation briefly describes three methods for estimating the combining ratio: a Pilot-Guided estimation method, a Blind estimation method, and a Simulation-Based Look-Up table. The Pilot Guided Estimation method has shown that the maximum likelihood estimates of signal amplitude is the mean inner product of the received sequence and the known sequence, the attached synchronization marker (ASM) , and signal variance is the difference of the mean of the squared received sequence and the square of the signal amplitude. This method has the advantage of simplicity at the expense of latency since several frames worth of ASMs. The Blind estimation method s maximum likelihood estimator is the average of the product of the received signal with the hyperbolic tangent of the product combining ratio and the received signal. The root of this equation can be determined by an iterative binary search between 0 and 1 after normalizing the received sequence. This method has the benefit of requiring one frame of data to estimate the combining ratio which is good for faster changing channels compared to the previous method, however it is computationally expensive. The final method uses a look-up table based on prior simulated results to determine signal amplitude and noise variance. In this method the received mean signal strength is controlled to a constant soft decision value. The magnitude of the deviation is averaged over a predetermined number of samples. This value is referenced in a look up table to determine the combining ratio that prior simulation associated with the average magnitude of the deviation. This method is more complicated than the Pilot-Guided Method due to the gain control circuitry, but does not have the real-time computation complexity of the Blind Estimation method. Each of these methods can be used to provide an accurate estimation of the combining ratio, and the final selection of the estimation method depends on other design constraints.
NASA Technical Reports Server (NTRS)
Howell, Leonard W., Jr.; Six, N. Frank (Technical Monitor)
2002-01-01
The Maximum Likelihood (ML) statistical theory required to estimate spectra information from an arbitrary number of astrophysics data sets produced by vastly different science instruments is developed in this paper. This theory and its successful implementation will facilitate the interpretation of spectral information from multiple astrophysics missions and thereby permit the derivation of superior spectral information based on the combination of data sets. The procedure is of significant value to both existing data sets and those to be produced by future astrophysics missions consisting of two or more detectors by allowing instrument developers to optimize each detector's design parameters through simulation studies in order to design and build complementary detectors that will maximize the precision with which the science objectives may be obtained. The benefits of this ML theory and its application is measured in terms of the reduction of the statistical errors (standard deviations) of the spectra information using the multiple data sets in concert as compared to the statistical errors of the spectra information when the data sets are considered separately, as well as any biases resulting from poor statistics in one or more of the individual data sets that might be reduced when the data sets are combined.
Adaptive pre-specification in randomized trials with and without pair-matching.
Balzer, Laura B; van der Laan, Mark J; Petersen, Maya L
2016-11-10
In randomized trials, adjustment for measured covariates during the analysis can reduce variance and increase power. To avoid misleading inference, the analysis plan must be pre-specified. However, it is often unclear a priori which baseline covariates (if any) should be adjusted for in the analysis. Consider, for example, the Sustainable East Africa Research in Community Health (SEARCH) trial for HIV prevention and treatment. There are 16 matched pairs of communities and many potential adjustment variables, including region, HIV prevalence, male circumcision coverage, and measures of community-level viral load. In this paper, we propose a rigorous procedure to data-adaptively select the adjustment set, which maximizes the efficiency of the analysis. Specifically, we use cross-validation to select from a pre-specified library the candidate targeted maximum likelihood estimator (TMLE) that minimizes the estimated variance. For further gains in precision, we also propose a collaborative procedure for estimating the known exposure mechanism. Our small sample simulations demonstrate the promise of the methodology to maximize study power, while maintaining nominal confidence interval coverage. We show how our procedure can be tailored to the scientific question (intervention effect for the study sample vs. for the target population) and study design (pair-matched or not). Copyright © 2016 John Wiley & Sons, Ltd. Copyright © 2016 John Wiley & Sons, Ltd.
Experimental Design for Parameter Estimation of Gene Regulatory Networks
Timmer, Jens
2012-01-01
Systems biology aims for building quantitative models to address unresolved issues in molecular biology. In order to describe the behavior of biological cells adequately, gene regulatory networks (GRNs) are intensively investigated. As the validity of models built for GRNs depends crucially on the kinetic rates, various methods have been developed to estimate these parameters from experimental data. For this purpose, it is favorable to choose the experimental conditions yielding maximal information. However, existing experimental design principles often rely on unfulfilled mathematical assumptions or become computationally demanding with growing model complexity. To solve this problem, we combined advanced methods for parameter and uncertainty estimation with experimental design considerations. As a showcase, we optimized three simulated GRNs in one of the challenges from the Dialogue for Reverse Engineering Assessment and Methods (DREAM). This article presents our approach, which was awarded the best performing procedure at the DREAM6 Estimation of Model Parameters challenge. For fast and reliable parameter estimation, local deterministic optimization of the likelihood was applied. We analyzed identifiability and precision of the estimates by calculating the profile likelihood. Furthermore, the profiles provided a way to uncover a selection of most informative experiments, from which the optimal one was chosen using additional criteria at every step of the design process. In conclusion, we provide a strategy for optimal experimental design and show its successful application on three highly nonlinear dynamic models. Although presented in the context of the GRNs to be inferred for the DREAM6 challenge, the approach is generic and applicable to most types of quantitative models in systems biology and other disciplines. PMID:22815723
Learning Time-Varying Coverage Functions
Du, Nan; Liang, Yingyu; Balcan, Maria-Florina; Song, Le
2015-01-01
Coverage functions are an important class of discrete functions that capture the law of diminishing returns arising naturally from applications in social network analysis, machine learning, and algorithmic game theory. In this paper, we propose a new problem of learning time-varying coverage functions, and develop a novel parametrization of these functions using random features. Based on the connection between time-varying coverage functions and counting processes, we also propose an efficient parameter learning algorithm based on likelihood maximization, and provide a sample complexity analysis. We applied our algorithm to the influence function estimation problem in information diffusion in social networks, and show that with few assumptions about the diffusion processes, our algorithm is able to estimate influence significantly more accurately than existing approaches on both synthetic and real world data. PMID:25960624
Learning Time-Varying Coverage Functions.
Du, Nan; Liang, Yingyu; Balcan, Maria-Florina; Song, Le
2014-12-08
Coverage functions are an important class of discrete functions that capture the law of diminishing returns arising naturally from applications in social network analysis, machine learning, and algorithmic game theory. In this paper, we propose a new problem of learning time-varying coverage functions, and develop a novel parametrization of these functions using random features. Based on the connection between time-varying coverage functions and counting processes, we also propose an efficient parameter learning algorithm based on likelihood maximization, and provide a sample complexity analysis. We applied our algorithm to the influence function estimation problem in information diffusion in social networks, and show that with few assumptions about the diffusion processes, our algorithm is able to estimate influence significantly more accurately than existing approaches on both synthetic and real world data.
Bayesian Image Segmentations by Potts Prior and Loopy Belief Propagation
NASA Astrophysics Data System (ADS)
Tanaka, Kazuyuki; Kataoka, Shun; Yasuda, Muneki; Waizumi, Yuji; Hsu, Chiou-Ting
2014-12-01
This paper presents a Bayesian image segmentation model based on Potts prior and loopy belief propagation. The proposed Bayesian model involves several terms, including the pairwise interactions of Potts models, and the average vectors and covariant matrices of Gauss distributions in color image modeling. These terms are often referred to as hyperparameters in statistical machine learning theory. In order to determine these hyperparameters, we propose a new scheme for hyperparameter estimation based on conditional maximization of entropy in the Potts prior. The algorithm is given based on loopy belief propagation. In addition, we compare our conditional maximum entropy framework with the conventional maximum likelihood framework, and also clarify how the first order phase transitions in loopy belief propagations for Potts models influence our hyperparameter estimation procedures.
Joint Inference of Population Assignment and Demographic History
Choi, Sang Chul; Hey, Jody
2011-01-01
A new approach to assigning individuals to populations using genetic data is described. Most existing methods work by maximizing Hardy–Weinberg and linkage equilibrium within populations, neither of which will apply for many demographic histories. By including a demographic model, within a likelihood framework based on coalescent theory, we can jointly study demographic history and population assignment. Genealogies and population assignments are sampled from a posterior distribution using a general isolation-with-migration model for multiple populations. A measure of partition distance between assignments facilitates not only the summary of a posterior sample of assignments, but also the estimation of the posterior density for the demographic history. It is shown that joint estimates of assignment and demographic history are possible, including estimation of population phylogeny for samples from three populations. The new method is compared to results of a widely used assignment method, using simulated and published empirical data sets. PMID:21775468
Automated model selection in covariance estimation and spatial whitening of MEG and EEG signals.
Engemann, Denis A; Gramfort, Alexandre
2015-03-01
Magnetoencephalography and electroencephalography (M/EEG) measure non-invasively the weak electromagnetic fields induced by post-synaptic neural currents. The estimation of the spatial covariance of the signals recorded on M/EEG sensors is a building block of modern data analysis pipelines. Such covariance estimates are used in brain-computer interfaces (BCI) systems, in nearly all source localization methods for spatial whitening as well as for data covariance estimation in beamformers. The rationale for such models is that the signals can be modeled by a zero mean Gaussian distribution. While maximizing the Gaussian likelihood seems natural, it leads to a covariance estimate known as empirical covariance (EC). It turns out that the EC is a poor estimate of the true covariance when the number of samples is small. To address this issue the estimation needs to be regularized. The most common approach downweights off-diagonal coefficients, while more advanced regularization methods are based on shrinkage techniques or generative models with low rank assumptions: probabilistic PCA (PPCA) and factor analysis (FA). Using cross-validation all of these models can be tuned and compared based on Gaussian likelihood computed on unseen data. We investigated these models on simulations, one electroencephalography (EEG) dataset as well as magnetoencephalography (MEG) datasets from the most common MEG systems. First, our results demonstrate that different models can be the best, depending on the number of samples, heterogeneity of sensor types and noise properties. Second, we show that the models tuned by cross-validation are superior to models with hand-selected regularization. Hence, we propose an automated solution to the often overlooked problem of covariance estimation of M/EEG signals. The relevance of the procedure is demonstrated here for spatial whitening and source localization of MEG signals. Copyright © 2015 Elsevier Inc. All rights reserved.
Estimating the variance for heterogeneity in arm-based network meta-analysis.
Piepho, Hans-Peter; Madden, Laurence V; Roger, James; Payne, Roger; Williams, Emlyn R
2018-04-19
Network meta-analysis can be implemented by using arm-based or contrast-based models. Here we focus on arm-based models and fit them using generalized linear mixed model procedures. Full maximum likelihood (ML) estimation leads to biased trial-by-treatment interaction variance estimates for heterogeneity. Thus, our objective is to investigate alternative approaches to variance estimation that reduce bias compared with full ML. Specifically, we use penalized quasi-likelihood/pseudo-likelihood and hierarchical (h) likelihood approaches. In addition, we consider a novel model modification that yields estimators akin to the residual maximum likelihood estimator for linear mixed models. The proposed methods are compared by simulation, and 2 real datasets are used for illustration. Simulations show that penalized quasi-likelihood/pseudo-likelihood and h-likelihood reduce bias and yield satisfactory coverage rates. Sum-to-zero restriction and baseline contrasts for random trial-by-treatment interaction effects, as well as a residual ML-like adjustment, also reduce bias compared with an unconstrained model when ML is used, but coverage rates are not quite as good. Penalized quasi-likelihood/pseudo-likelihood and h-likelihood are therefore recommended. Copyright © 2018 John Wiley & Sons, Ltd.
Mass and Volume Optimization of Space Flight Medical Kits
NASA Technical Reports Server (NTRS)
Keenan, A. B.; Foy, Millennia Hope; Myers, Jerry
2014-01-01
Resource allocation is a critical aspect of space mission planning. All resources, including medical resources, are subject to a number of mission constraints such a maximum mass and volume. However, unlike many resources, there is often limited understanding in how to optimize medical resources for a mission. The Integrated Medical Model (IMM) is a probabilistic model that estimates medical event occurrences and mission outcomes for different mission profiles. IMM simulates outcomes and describes the impact of medical events in terms of lost crew time, medical resource usage, and the potential for medically required evacuation. Previously published work describes an approach that uses the IMM to generate optimized medical kits that maximize benefit to the crew subject to mass and volume constraints. We improve upon the results obtained previously and extend our approach to minimize mass and volume while meeting some benefit threshold. METHODS We frame the medical kit optimization problem as a modified knapsack problem and implement an algorithm utilizing dynamic programming. Using this algorithm, optimized medical kits were generated for 3 mission scenarios with the goal of minimizing the medical kit mass and volume for a specified likelihood of evacuation or Crew Health Index (CHI) threshold. The algorithm was expanded to generate medical kits that maximize likelihood of evacuation or CHI subject to mass and volume constraints. RESULTS AND CONCLUSIONS In maximizing benefit to crew health subject to certain constraints, our algorithm generates medical kits that more closely resemble the unlimited-resource scenario than previous approaches which leverage medical risk information generated by the IMM. Our work here demonstrates that this algorithm provides an efficient and effective means to objectively allocate medical resources for spaceflight missions and provides an effective means of addressing tradeoffs in medical resource allocations and crew mission success parameters.
Hansson, Lisbeth; Khamis, Harry J
2008-12-01
Simulated data sets are used to evaluate conditional and unconditional maximum likelihood estimation in an individual case-control design with continuous covariates when there are different rates of excluded cases and different levels of other design parameters. The effectiveness of the estimation procedures is measured by method bias, variance of the estimators, root mean square error (RMSE) for logistic regression and the percentage of explained variation. Conditional estimation leads to higher RMSE than unconditional estimation in the presence of missing observations, especially for 1:1 matching. The RMSE is higher for the smaller stratum size, especially for the 1:1 matching. The percentage of explained variation appears to be insensitive to missing data, but is generally higher for the conditional estimation than for the unconditional estimation. It is particularly good for the 1:2 matching design. For minimizing RMSE, a high matching ratio is recommended; in this case, conditional and unconditional logistic regression models yield comparable levels of effectiveness. For maximizing the percentage of explained variation, the 1:2 matching design with the conditional logistic regression model is recommended.
Contributions to the Underlying Bivariate Normal Method for Factor Analyzing Ordinal Data
ERIC Educational Resources Information Center
Xi, Nuo; Browne, Michael W.
2014-01-01
A promising "underlying bivariate normal" approach was proposed by Jöreskog and Moustaki for use in the factor analysis of ordinal data. This was a limited information approach that involved the maximization of a composite likelihood function. Its advantage over full-information maximum likelihood was that very much less computation was…
NASA Astrophysics Data System (ADS)
Zeng, X.
2015-12-01
A large number of model executions are required to obtain alternative conceptual models' predictions and their posterior probabilities in Bayesian model averaging (BMA). The posterior model probability is estimated through models' marginal likelihood and prior probability. The heavy computation burden hinders the implementation of BMA prediction, especially for the elaborated marginal likelihood estimator. For overcoming the computation burden of BMA, an adaptive sparse grid (SG) stochastic collocation method is used to build surrogates for alternative conceptual models through the numerical experiment of a synthetical groundwater model. BMA predictions depend on model posterior weights (or marginal likelihoods), and this study also evaluated four marginal likelihood estimators, including arithmetic mean estimator (AME), harmonic mean estimator (HME), stabilized harmonic mean estimator (SHME), and thermodynamic integration estimator (TIE). The results demonstrate that TIE is accurate in estimating conceptual models' marginal likelihoods. The BMA-TIE has better predictive performance than other BMA predictions. TIE has high stability for estimating conceptual model's marginal likelihood. The repeated estimated conceptual model's marginal likelihoods by TIE have significant less variability than that estimated by other estimators. In addition, the SG surrogates are efficient to facilitate BMA predictions, especially for BMA-TIE. The number of model executions needed for building surrogates is 4.13%, 6.89%, 3.44%, and 0.43% of the required model executions of BMA-AME, BMA-HME, BMA-SHME, and BMA-TIE, respectively.
Balzer, Laura; Staples, Patrick; Onnela, Jukka-Pekka; DeGruttola, Victor
2017-04-01
Several cluster-randomized trials are underway to investigate the implementation and effectiveness of a universal test-and-treat strategy on the HIV epidemic in sub-Saharan Africa. We consider nesting studies of pre-exposure prophylaxis within these trials. Pre-exposure prophylaxis is a general strategy where high-risk HIV- persons take antiretrovirals daily to reduce their risk of infection from exposure to HIV. We address how to target pre-exposure prophylaxis to high-risk groups and how to maximize power to detect the individual and combined effects of universal test-and-treat and pre-exposure prophylaxis strategies. We simulated 1000 trials, each consisting of 32 villages with 200 individuals per village. At baseline, we randomized the universal test-and-treat strategy. Then, after 3 years of follow-up, we considered four strategies for targeting pre-exposure prophylaxis: (1) all HIV- individuals who self-identify as high risk, (2) all HIV- individuals who are identified by their HIV+ partner (serodiscordant couples), (3) highly connected HIV- individuals, and (4) the HIV- contacts of a newly diagnosed HIV+ individual (a ring-based strategy). We explored two possible trial designs, and all villages were followed for a total of 7 years. For each village in a trial, we used a stochastic block model to generate bipartite (male-female) networks and simulated an agent-based epidemic process on these networks. We estimated the individual and combined intervention effects with a novel targeted maximum likelihood estimator, which used cross-validation to data-adaptively select from a pre-specified library the candidate estimator that maximized the efficiency of the analysis. The universal test-and-treat strategy reduced the 3-year cumulative HIV incidence by 4.0% on average. The impact of each pre-exposure prophylaxis strategy on the 4-year cumulative HIV incidence varied by the coverage of the universal test-and-treat strategy with lower coverage resulting in a larger impact of pre-exposure prophylaxis. Offering pre-exposure prophylaxis to serodiscordant couples resulted in the largest reductions in HIV incidence (2% reduction), and the ring-based strategy had little impact (0% reduction). The joint effect was larger than either individual effect with reductions in the 7-year incidence ranging from 4.5% to 8.8%. Targeted maximum likelihood estimation, data-adaptively adjusting for baseline covariates, substantially improved power over the unadjusted analysis, while maintaining nominal confidence interval coverage. Our simulation study suggests that nesting a pre-exposure prophylaxis study within an ongoing trial can lead to combined intervention effects greater than those of universal test-and-treat alone and can provide information about the efficacy of pre-exposure prophylaxis in the presence of high coverage of treatment for HIV+ persons.
Computational Software for Fitting Seismic Data to Epidemic-Type Aftershock Sequence Models
NASA Astrophysics Data System (ADS)
Chu, A.
2014-12-01
Modern earthquake catalogs are often analyzed using spatial-temporal point process models such as the epidemic-type aftershock sequence (ETAS) models of Ogata (1998). My work introduces software to implement two of ETAS models described in Ogata (1998). To find the Maximum-Likelihood Estimates (MLEs), my software provides estimates of the homogeneous background rate parameter and the temporal and spatial parameters that govern triggering effects by applying the Expectation-Maximization (EM) algorithm introduced in Veen and Schoenberg (2008). Despite other computer programs exist for similar data modeling purpose, using EM-algorithm has the benefits of stability and robustness (Veen and Schoenberg, 2008). Spatial shapes that are very long and narrow cause difficulties in optimization convergence and problems with flat or multi-modal log-likelihood functions encounter similar issues. My program uses a robust method to preset a parameter to overcome the non-convergence computational issue. In addition to model fitting, the software is equipped with useful tools for examining modeling fitting results, for example, visualization of estimated conditional intensity, and estimation of expected number of triggered aftershocks. A simulation generator is also given with flexible spatial shapes that may be defined by the user. This open-source software has a very simple user interface. The user may execute it on a local computer, and the program also has potential to be hosted online. Java language is used for the software's core computing part and an optional interface to the statistical package R is provided.
Deep Unfolding for Topic Models.
Chien, Jen-Tzung; Lee, Chao-Hsi
2018-02-01
Deep unfolding provides an approach to integrate the probabilistic generative models and the deterministic neural networks. Such an approach is benefited by deep representation, easy interpretation, flexible learning and stochastic modeling. This study develops the unsupervised and supervised learning of deep unfolded topic models for document representation and classification. Conventionally, the unsupervised and supervised topic models are inferred via the variational inference algorithm where the model parameters are estimated by maximizing the lower bound of logarithm of marginal likelihood using input documents without and with class labels, respectively. The representation capability or classification accuracy is constrained by the variational lower bound and the tied model parameters across inference procedure. This paper aims to relax these constraints by directly maximizing the end performance criterion and continuously untying the parameters in learning process via deep unfolding inference (DUI). The inference procedure is treated as the layer-wise learning in a deep neural network. The end performance is iteratively improved by using the estimated topic parameters according to the exponentiated updates. Deep learning of topic models is therefore implemented through a back-propagation procedure. Experimental results show the merits of DUI with increasing number of layers compared with variational inference in unsupervised as well as supervised topic models.
Conditional maximum-entropy method for selecting prior distributions in Bayesian statistics
NASA Astrophysics Data System (ADS)
Abe, Sumiyoshi
2014-11-01
The conditional maximum-entropy method (abbreviated here as C-MaxEnt) is formulated for selecting prior probability distributions in Bayesian statistics for parameter estimation. This method is inspired by a statistical-mechanical approach to systems governed by dynamics with largely separated time scales and is based on three key concepts: conjugate pairs of variables, dimensionless integration measures with coarse-graining factors and partial maximization of the joint entropy. The method enables one to calculate a prior purely from a likelihood in a simple way. It is shown, in particular, how it not only yields Jeffreys's rules but also reveals new structures hidden behind them.
Model-based tomographic reconstruction of objects containing known components.
Stayman, J Webster; Otake, Yoshito; Prince, Jerry L; Khanna, A Jay; Siewerdsen, Jeffrey H
2012-10-01
The likelihood of finding manufactured components (surgical tools, implants, etc.) within a tomographic field-of-view has been steadily increasing. One reason is the aging population and proliferation of prosthetic devices, such that more people undergoing diagnostic imaging have existing implants, particularly hip and knee implants. Another reason is that use of intraoperative imaging (e.g., cone-beam CT) for surgical guidance is increasing, wherein surgical tools and devices such as screws and plates are placed within or near to the target anatomy. When these components contain metal, the reconstructed volumes are likely to contain severe artifacts that adversely affect the image quality in tissues both near and far from the component. Because physical models of such components exist, there is a unique opportunity to integrate this knowledge into the reconstruction algorithm to reduce these artifacts. We present a model-based penalized-likelihood estimation approach that explicitly incorporates known information about component geometry and composition. The approach uses an alternating maximization method that jointly estimates the anatomy and the position and pose of each of the known components. We demonstrate that the proposed method can produce nearly artifact-free images even near the boundary of a metal implant in simulated vertebral pedicle screw reconstructions and even under conditions of substantial photon starvation. The simultaneous estimation of device pose also provides quantitative information on device placement that could be valuable to quality assurance and verification of treatment delivery.
NASA Astrophysics Data System (ADS)
Bovy Jo; Hogg, David W.; Roweis, Sam T.
2011-06-01
We generalize the well-known mixtures of Gaussians approach to density estimation and the accompanying Expectation-Maximization technique for finding the maximum likelihood parameters of the mixture to the case where each data point carries an individual d-dimensional uncertainty covariance and has unique missing data properties. This algorithm reconstructs the error-deconvolved or "underlying" distribution function common to all samples, even when the individual data points are samples from different distributions, obtained by convolving the underlying distribution with the heteroskedastic uncertainty distribution of the data point and projecting out the missing data directions. We show how this basic algorithm can be extended with conjugate priors on all of the model parameters and a "split-and-"erge- procedure designed to avoid local maxima of the likelihood. We demonstrate the full method by applying it to the problem of inferring the three-dimensional veloc! ity distribution of stars near the Sun from noisy two-dimensional, transverse velocity measurements from the Hipparcos satellite.
Application and performance of an ML-EM algorithm in NEXT
NASA Astrophysics Data System (ADS)
Simón, A.; Lerche, C.; Monrabal, F.; Gómez-Cadenas, J. J.; Álvarez, V.; Azevedo, C. D. R.; Benlloch-Rodríguez, J. M.; Borges, F. I. G. M.; Botas, A.; Cárcel, S.; Carrión, J. V.; Cebrián, S.; Conde, C. A. N.; Díaz, J.; Diesburg, M.; Escada, J.; Esteve, R.; Felkai, R.; Fernandes, L. M. P.; Ferrario, P.; Ferreira, A. L.; Freitas, E. D. C.; Goldschmidt, A.; González-Díaz, D.; Gutiérrez, R. M.; Hauptman, J.; Henriques, C. A. O.; Hernandez, A. I.; Hernando Morata, J. A.; Herrero, V.; Jones, B. J. P.; Labarga, L.; Laing, A.; Lebrun, P.; Liubarsky, I.; López-March, N.; Losada, M.; Martín-Albo, J.; Martínez-Lema, G.; Martínez, A.; McDonald, A. D.; Monteiro, C. M. B.; Mora, F. J.; Moutinho, L. M.; Muñoz Vidal, J.; Musti, M.; Nebot-Guinot, M.; Novella, P.; Nygren, D. R.; Palmeiro, B.; Para, A.; Pérez, J.; Querol, M.; Renner, J.; Ripoll, L.; Rodríguez, J.; Rogers, L.; Santos, F. P.; dos Santos, J. M. F.; Sofka, C.; Sorel, M.; Stiegler, T.; Toledo, J. F.; Torrent, J.; Tsamalaidze, Z.; Veloso, J. F. C. A.; Webb, R.; White, J. T.; Yahlali, N.
2017-08-01
The goal of the NEXT experiment is the observation of neutrinoless double beta decay in 136Xe using a gaseous xenon TPC with electroluminescent amplification and specialized photodetector arrays for calorimetry and tracking. The NEXT Collaboration is exploring a number of reconstruction algorithms to exploit the full potential of the detector. This paper describes one of them: the Maximum Likelihood Expectation Maximization (ML-EM) method, a generic iterative algorithm to find maximum-likelihood estimates of parameters that has been applied to solve many different types of complex inverse problems. In particular, we discuss a bi-dimensional version of the method in which the photosensor signals integrated over time are used to reconstruct a transverse projection of the event. First results show that, when applied to detector simulation data, the algorithm achieves nearly optimal energy resolution (better than 0.5% FWHM at the Q value of 136Xe) for events distributed over the full active volume of the TPC.
Liu, Xiang; Peng, Yingwei; Tu, Dongsheng; Liang, Hua
2012-10-30
Survival data with a sizable cure fraction are commonly encountered in cancer research. The semiparametric proportional hazards cure model has been recently used to analyze such data. As seen in the analysis of data from a breast cancer study, a variable selection approach is needed to identify important factors in predicting the cure status and risk of breast cancer recurrence. However, no specific variable selection method for the cure model is available. In this paper, we present a variable selection approach with penalized likelihood for the cure model. The estimation can be implemented easily by combining the computational methods for penalized logistic regression and the penalized Cox proportional hazards models with the expectation-maximization algorithm. We illustrate the proposed approach on data from a breast cancer study. We conducted Monte Carlo simulations to evaluate the performance of the proposed method. We used and compared different penalty functions in the simulation studies. Copyright © 2012 John Wiley & Sons, Ltd.
Estimation of mating system parameters in plant populations using marker loci with null alleles.
Ross, H A
1986-06-01
An Expectation-Maximization (EM)-algorithm procedure is presented that extends Cheliak et al. (1983) method of maximum-likelihood estimation of mating system parameters of mixed mating system models. The extension permits the estimation of the rate of self-fertilization (s) and allele frequencies (Pi) at loci in outcrossing pollen, at marker loci having recessive null alleles. The algorithm makes use of maternal and filial genotypic arrays obtained by the electrophoretic analysis of cohorts of progeny. The genotypes of maternal plants must be known. Explicit equations are given for cases when the genotype of the maternal gamete inherited by a seed can (gymnosperms) or cannot (angiosperms) be determined. The procedure can accommodate any number of codominant alleles, but only one recessive null allele at each locus. An example, using actual data from Pinus banksiana, is presented to illustrate the application of this EM algorithm to the estimation of mating system parameters using marker loci having both codominant and recessive alleles.
Chakraborty, Arindom
2016-12-01
A common objective in longitudinal studies is to characterize the relationship between a longitudinal response process and a time-to-event data. Ordinal nature of the response and possible missing information on covariates add complications to the joint model. In such circumstances, some influential observations often present in the data may upset the analysis. In this paper, a joint model based on ordinal partial mixed model and an accelerated failure time model is used, to account for the repeated ordered response and time-to-event data, respectively. Here, we propose an influence function-based robust estimation method. Monte Carlo expectation maximization method-based algorithm is used for parameter estimation. A detailed simulation study has been done to evaluate the performance of the proposed method. As an application, a data on muscular dystrophy among children is used. Robust estimates are then compared with classical maximum likelihood estimates. © The Author(s) 2014.
Rigby, Robert A; Stasinopoulos, D Mikis
2004-10-15
The Box-Cox power exponential (BCPE) distribution, developed in this paper, provides a model for a dependent variable Y exhibiting both skewness and kurtosis (leptokurtosis or platykurtosis). The distribution is defined by a power transformation Y(nu) having a shifted and scaled (truncated) standard power exponential distribution with parameter tau. The distribution has four parameters and is denoted BCPE (mu,sigma,nu,tau). The parameters, mu, sigma, nu and tau, may be interpreted as relating to location (median), scale (approximate coefficient of variation), skewness (transformation to symmetry) and kurtosis (power exponential parameter), respectively. Smooth centile curves are obtained by modelling each of the four parameters of the distribution as a smooth non-parametric function of an explanatory variable. A Fisher scoring algorithm is used to fit the non-parametric model by maximizing a penalized likelihood. The first and expected second and cross derivatives of the likelihood, with respect to mu, sigma, nu and tau, required for the algorithm, are provided. The centiles of the BCPE distribution are easy to calculate, so it is highly suited to centile estimation. This application of the BCPE distribution to smooth centile estimation provides a generalization of the LMS method of the centile estimation to data exhibiting kurtosis (as well as skewness) different from that of a normal distribution and is named here the LMSP method of centile estimation. The LMSP method of centile estimation is applied to modelling the body mass index of Dutch males against age. 2004 John Wiley & Sons, Ltd.
Determining the accuracy of maximum likelihood parameter estimates with colored residuals
NASA Technical Reports Server (NTRS)
Morelli, Eugene A.; Klein, Vladislav
1994-01-01
An important part of building high fidelity mathematical models based on measured data is calculating the accuracy associated with statistical estimates of the model parameters. Indeed, without some idea of the accuracy of parameter estimates, the estimates themselves have limited value. In this work, an expression based on theoretical analysis was developed to properly compute parameter accuracy measures for maximum likelihood estimates with colored residuals. This result is important because experience from the analysis of measured data reveals that the residuals from maximum likelihood estimation are almost always colored. The calculations involved can be appended to conventional maximum likelihood estimation algorithms. Simulated data runs were used to show that the parameter accuracy measures computed with this technique accurately reflect the quality of the parameter estimates from maximum likelihood estimation without the need for analysis of the output residuals in the frequency domain or heuristically determined multiplication factors. The result is general, although the application studied here is maximum likelihood estimation of aerodynamic model parameters from flight test data.
On the existence of maximum likelihood estimates for presence-only data
Hefley, Trevor J.; Hooten, Mevin B.
2015-01-01
It is important to identify conditions for which maximum likelihood estimates are unlikely to be identifiable from presence-only data. In data sets where the maximum likelihood estimates do not exist, penalized likelihood and Bayesian methods will produce coefficient estimates, but these are sensitive to the choice of estimation procedure and prior or penalty term. When sample size is small or it is thought that habitat preferences are strong, we propose a suite of estimation procedures researchers can consider using.
NASA Technical Reports Server (NTRS)
1979-01-01
The computer program Linear SCIDNT which evaluates rotorcraft stability and control coefficients from flight or wind tunnel test data is described. It implements the maximum likelihood method to maximize the likelihood function of the parameters based on measured input/output time histories. Linear SCIDNT may be applied to systems modeled by linear constant-coefficient differential equations. This restriction in scope allows the application of several analytical results which simplify the computation and improve its efficiency over the general nonlinear case.
Estimating Function Approaches for Spatial Point Processes
NASA Astrophysics Data System (ADS)
Deng, Chong
Spatial point pattern data consist of locations of events that are often of interest in biological and ecological studies. Such data are commonly viewed as a realization from a stochastic process called spatial point process. To fit a parametric spatial point process model to such data, likelihood-based methods have been widely studied. However, while maximum likelihood estimation is often too computationally intensive for Cox and cluster processes, pairwise likelihood methods such as composite likelihood, Palm likelihood usually suffer from the loss of information due to the ignorance of correlation among pairs. For many types of correlated data other than spatial point processes, when likelihood-based approaches are not desirable, estimating functions have been widely used for model fitting. In this dissertation, we explore the estimating function approaches for fitting spatial point process models. These approaches, which are based on the asymptotic optimal estimating function theories, can be used to incorporate the correlation among data and yield more efficient estimators. We conducted a series of studies to demonstrate that these estmating function approaches are good alternatives to balance the trade-off between computation complexity and estimating efficiency. First, we propose a new estimating procedure that improves the efficiency of pairwise composite likelihood method in estimating clustering parameters. Our approach combines estimating functions derived from pairwise composite likeli-hood estimation and estimating functions that account for correlations among the pairwise contributions. Our method can be used to fit a variety of parametric spatial point process models and can yield more efficient estimators for the clustering parameters than pairwise composite likelihood estimation. We demonstrate its efficacy through a simulation study and an application to the longleaf pine data. Second, we further explore the quasi-likelihood approach on fitting second-order intensity function of spatial point processes. However, the original second-order quasi-likelihood is barely feasible due to the intense computation and high memory requirement needed to solve a large linear system. Motivated by the existence of geometric regular patterns in the stationary point processes, we find a lower dimension representation of the optimal weight function and propose a reduced second-order quasi-likelihood approach. Through a simulation study, we show that the proposed method not only demonstrates superior performance in fitting the clustering parameter but also merits in the relaxation of the constraint of the tuning parameter, H. Third, we studied the quasi-likelihood type estimating funciton that is optimal in a certain class of first-order estimating functions for estimating the regression parameter in spatial point process models. Then, by using a novel spectral representation, we construct an implementation that is computationally much more efficient and can be applied to more general setup than the original quasi-likelihood method.
Retinal artery-vein classification via topology estimation
Estrada, Rolando; Allingham, Michael J.; Mettu, Priyatham S.; Cousins, Scott W.; Tomasi, Carlo; Farsiu, Sina
2015-01-01
We propose a novel, graph-theoretic framework for distinguishing arteries from veins in a fundus image. We make use of the underlying vessel topology to better classify small and midsized vessels. We extend our previously proposed tree topology estimation framework by incorporating expert, domain-specific features to construct a simple, yet powerful global likelihood model. We efficiently maximize this model by iteratively exploring the space of possible solutions consistent with the projected vessels. We tested our method on four retinal datasets and achieved classification accuracies of 91.0%, 93.5%, 91.7%, and 90.9%, outperforming existing methods. Our results show the effectiveness of our approach, which is capable of analyzing the entire vasculature, including peripheral vessels, in wide field-of-view fundus photographs. This topology-based method is a potentially important tool for diagnosing diseases with retinal vascular manifestation. PMID:26068204
Finite mixture model: A maximum likelihood estimation approach on time series data
NASA Astrophysics Data System (ADS)
Yen, Phoong Seuk; Ismail, Mohd Tahir; Hamzah, Firdaus Mohamad
2014-09-01
Recently, statistician emphasized on the fitting of finite mixture model by using maximum likelihood estimation as it provides asymptotic properties. In addition, it shows consistency properties as the sample sizes increases to infinity. This illustrated that maximum likelihood estimation is an unbiased estimator. Moreover, the estimate parameters obtained from the application of maximum likelihood estimation have smallest variance as compared to others statistical method as the sample sizes increases. Thus, maximum likelihood estimation is adopted in this paper to fit the two-component mixture model in order to explore the relationship between rubber price and exchange rate for Malaysia, Thailand, Philippines and Indonesia. Results described that there is a negative effect among rubber price and exchange rate for all selected countries.
Remontet, L; Bossard, N; Belot, A; Estève, J
2007-05-10
Relative survival provides a measure of the proportion of patients dying from the disease under study without requiring the knowledge of the cause of death. We propose an overall strategy based on regression models to estimate the relative survival and model the effects of potential prognostic factors. The baseline hazard was modelled until 10 years follow-up using parametric continuous functions. Six models including cubic regression splines were considered and the Akaike Information Criterion was used to select the final model. This approach yielded smooth and reliable estimates of mortality hazard and allowed us to deal with sparse data taking into account all the available information. Splines were also used to model simultaneously non-linear effects of continuous covariates and time-dependent hazard ratios. This led to a graphical representation of the hazard ratio that can be useful for clinical interpretation. Estimates of these models were obtained by likelihood maximization. We showed that these estimates could be also obtained using standard algorithms for Poisson regression. Copyright 2006 John Wiley & Sons, Ltd.
NASA Technical Reports Server (NTRS)
Walker, H. F.
1976-01-01
Likelihood equations determined by the two types of samples which are necessary conditions for a maximum-likelihood estimate were considered. These equations suggest certain successive approximations iterative procedures for obtaining maximum likelihood estimates. The procedures, which are generalized steepest ascent (deflected gradient) procedures, contain those of Hosmer as a special case.
Discovering network behind infectious disease outbreak
NASA Astrophysics Data System (ADS)
Maeno, Yoshiharu
2010-11-01
Stochasticity and spatial heterogeneity are of great interest recently in studying the spread of an infectious disease. The presented method solves an inverse problem to discover the effectively decisive topology of a heterogeneous network and reveal the transmission parameters which govern the stochastic spreads over the network from a dataset on an infectious disease outbreak in the early growth phase. Populations in a combination of epidemiological compartment models and a meta-population network model are described by stochastic differential equations. Probability density functions are derived from the equations and used for the maximal likelihood estimation of the topology and parameters. The method is tested with computationally synthesized datasets and the WHO dataset on the SARS outbreak.
CARHTA GENE: multipopulation integrated genetic and radiation hybrid mapping.
de Givry, Simon; Bouchez, Martin; Chabrier, Patrick; Milan, Denis; Schiex, Thomas
2005-04-15
CAR(H)(T)A GENE: is an integrated genetic and radiation hybrid (RH) mapping tool which can deal with multiple populations, including mixtures of genetic and RH data. CAR(H)(T)A GENE: performs multipoint maximum likelihood estimations with accelerated expectation-maximization algorithms for some pedigrees and has sophisticated algorithms for marker ordering. Dedicated heuristics for framework mapping are also included. CAR(H)(T)A GENE: can be used as a C++ library, through a shell command and a graphical interface. The XML output for companion tools is integrated. The program is available free of charge from www.inra.fr/bia/T/CarthaGene for Linux, Windows and Solaris machines (with Open Source). tschiex@toulouse.inra.fr.
Joint reconstruction of activity and attenuation in Time-of-Flight PET: A Quantitative Analysis.
Rezaei, Ahmadreza; Deroose, Christophe M; Vahle, Thomas; Boada, Fernando; Nuyts, Johan
2018-03-01
Joint activity and attenuation reconstruction methods from time of flight (TOF) positron emission tomography (PET) data provide an effective solution to attenuation correction when no (or incomplete/inaccurate) information on the attenuation is available. One of the main barriers limiting their use in clinical practice is the lack of validation of these methods on a relatively large patient database. In this contribution, we aim at validating the activity reconstructions of the maximum likelihood activity reconstruction and attenuation registration (MLRR) algorithm on a whole-body patient data set. Furthermore, a partial validation (since the scale problem of the algorithm is avoided for now) of the maximum likelihood activity and attenuation reconstruction (MLAA) algorithm is also provided. We present a quantitative comparison of the joint reconstructions to the current clinical gold-standard maximum likelihood expectation maximization (MLEM) reconstruction with CT-based attenuation correction. Methods: The whole-body TOF-PET emission data of each patient data set is processed as a whole to reconstruct an activity volume covering all the acquired bed positions, which helps to reduce the problem of a scale per bed position in MLAA to a global scale for the entire activity volume. Three reconstruction algorithms are used: MLEM, MLRR and MLAA. A maximum likelihood (ML) scaling of the single scatter simulation (SSS) estimate to the emission data is used for scatter correction. The reconstruction results are then analyzed in different regions of interest. Results: The joint reconstructions of the whole-body patient data set provide better quantification in case of PET and CT misalignments caused by patient and organ motion. Our quantitative analysis shows a difference of -4.2% (±2.3%) and -7.5% (±4.6%) between the joint reconstructions of MLRR and MLAA compared to MLEM, averaged over all regions of interest, respectively. Conclusion: Joint activity and attenuation estimation methods provide a useful means to estimate the tracer distribution in cases where CT-based attenuation images are subject to misalignments or are not available. With an accurate estimate of the scatter contribution in the emission measurements, the joint TOF-PET reconstructions are within clinical acceptable accuracy. Copyright © 2018 by the Society of Nuclear Medicine and Molecular Imaging, Inc.
NASA Astrophysics Data System (ADS)
Wang, Xun; Quost, Benjamin; Chazot, Jean-Daniel; Antoni, Jérôme
2016-01-01
This paper considers the problem of identifying multiple sound sources from acoustical measurements obtained by an array of microphones. The problem is solved via maximum likelihood. In particular, an expectation-maximization (EM) approach is used to estimate the sound source locations and strengths, the pressure measured by a microphone being interpreted as a mixture of latent signals emitted by the sources. This work also considers two kinds of uncertainties pervading the sound propagation and measurement process: uncertain microphone locations and uncertain wavenumber. These uncertainties are transposed to the data in the belief functions framework. Then, the source locations and strengths can be estimated using a variant of the EM algorithm, known as the Evidential EM (E2M) algorithm. Eventually, both simulation and real experiments are shown to illustrate the advantage of using the EM in the case without uncertainty and the E2M in the case of uncertain measurement.
Zhou, Xiaofan; Shen, Xing-Xing; Hittinger, Chris Todd
2018-01-01
Abstract The sizes of the data matrices assembled to resolve branches of the tree of life have increased dramatically, motivating the development of programs for fast, yet accurate, inference. For example, several different fast programs have been developed in the very popular maximum likelihood framework, including RAxML/ExaML, PhyML, IQ-TREE, and FastTree. Although these programs are widely used, a systematic evaluation and comparison of their performance using empirical genome-scale data matrices has so far been lacking. To address this question, we evaluated these four programs on 19 empirical phylogenomic data sets with hundreds to thousands of genes and up to 200 taxa with respect to likelihood maximization, tree topology, and computational speed. For single-gene tree inference, we found that the more exhaustive and slower strategies (ten searches per alignment) outperformed faster strategies (one tree search per alignment) using RAxML, PhyML, or IQ-TREE. Interestingly, single-gene trees inferred by the three programs yielded comparable coalescent-based species tree estimations. For concatenation-based species tree inference, IQ-TREE consistently achieved the best-observed likelihoods for all data sets, and RAxML/ExaML was a close second. In contrast, PhyML often failed to complete concatenation-based analyses, whereas FastTree was the fastest but generated lower likelihood values and more dissimilar tree topologies in both types of analyses. Finally, data matrix properties, such as the number of taxa and the strength of phylogenetic signal, sometimes substantially influenced the programs’ relative performance. Our results provide real-world gene and species tree phylogenetic inference benchmarks to inform the design and execution of large-scale phylogenomic data analyses. PMID:29177474
Impacts of Maximizing Tendencies on Experience-Based Decisions.
Rim, Hye Bin
2017-06-01
Previous research on risky decisions has suggested that people tend to make different choices depending on whether they acquire the information from personally repeated experiences or from statistical summary descriptions. This phenomenon, called as a description-experience gap, was expected to be moderated by the individual difference in maximizing tendencies, a desire towards maximizing decisional outcome. Specifically, it was hypothesized that maximizers' willingness to engage in extensive information searching would lead maximizers to make experience-based decisions as payoff distributions were given explicitly. A total of 262 participants completed four decision problems. Results showed that maximizers, compared to non-maximizers, drew more samples before making a choice but reported lower confidence levels on both the accuracy of knowledge gained from experiences and the likelihood of satisfactory outcomes. Additionally, maximizers exhibited smaller description-experience gaps than non-maximizers as expected. The implications of the findings and unanswered questions for future research were discussed.
Convergence optimization of parametric MLEM reconstruction for estimation of Patlak plot parameters.
Angelis, Georgios I; Thielemans, Kris; Tziortzi, Andri C; Turkheimer, Federico E; Tsoumpas, Charalampos
2011-07-01
In dynamic positron emission tomography data many researchers have attempted to exploit kinetic models within reconstruction such that parametric images are estimated directly from measurements. This work studies a direct parametric maximum likelihood expectation maximization algorithm applied to [(18)F]DOPA data using reference-tissue input function. We use a modified version for direct reconstruction with a gradually descending scheme of subsets (i.e. 18-6-1) initialized with the FBP parametric image for faster convergence and higher accuracy. The results compared with analytic reconstructions show quantitative robustness (i.e. minimal bias) and clinical reproducibility within six human acquisitions in the region of clinical interest. Bland-Altman plots for all the studies showed sufficient quantitative agreement between the direct reconstructed parametric maps and the indirect FBP (--0.035x+0.48E--5). Copyright © 2011 Elsevier Ltd. All rights reserved.
Wang, Chaolong; Schroeder, Kari B.; Rosenberg, Noah A.
2012-01-01
Allelic dropout is a commonly observed source of missing data in microsatellite genotypes, in which one or both allelic copies at a locus fail to be amplified by the polymerase chain reaction. Especially for samples with poor DNA quality, this problem causes a downward bias in estimates of observed heterozygosity and an upward bias in estimates of inbreeding, owing to mistaken classifications of heterozygotes as homozygotes when one of the two copies drops out. One general approach for avoiding allelic dropout involves repeated genotyping of homozygous loci to minimize the effects of experimental error. Existing computational alternatives often require replicate genotyping as well. These approaches, however, are costly and are suitable only when enough DNA is available for repeated genotyping. In this study, we propose a maximum-likelihood approach together with an expectation-maximization algorithm to jointly estimate allelic dropout rates and allele frequencies when only one set of nonreplicated genotypes is available. Our method considers estimates of allelic dropout caused by both sample-specific factors and locus-specific factors, and it allows for deviation from Hardy–Weinberg equilibrium owing to inbreeding. Using the estimated parameters, we correct the bias in the estimation of observed heterozygosity through the use of multiple imputations of alleles in cases where dropout might have occurred. With simulated data, we show that our method can (1) effectively reproduce patterns of missing data and heterozygosity observed in real data; (2) correctly estimate model parameters, including sample-specific dropout rates, locus-specific dropout rates, and the inbreeding coefficient; and (3) successfully correct the downward bias in estimating the observed heterozygosity. We find that our method is fairly robust to violations of model assumptions caused by population structure and by genotyping errors from sources other than allelic dropout. Because the data sets imputed under our model can be investigated in additional subsequent analyses, our method will be useful for preparing data for applications in diverse contexts in population genetics and molecular ecology. PMID:22851645
Jeon, Jihyoun; Hsu, Li; Gorfine, Malka
2012-07-01
Frailty models are useful for measuring unobserved heterogeneity in risk of failures across clusters, providing cluster-specific risk prediction. In a frailty model, the latent frailties shared by members within a cluster are assumed to act multiplicatively on the hazard function. In order to obtain parameter and frailty variate estimates, we consider the hierarchical likelihood (H-likelihood) approach (Ha, Lee and Song, 2001. Hierarchical-likelihood approach for frailty models. Biometrika 88, 233-243) in which the latent frailties are treated as "parameters" and estimated jointly with other parameters of interest. We find that the H-likelihood estimators perform well when the censoring rate is low, however, they are substantially biased when the censoring rate is moderate to high. In this paper, we propose a simple and easy-to-implement bias correction method for the H-likelihood estimators under a shared frailty model. We also extend the method to a multivariate frailty model, which incorporates complex dependence structure within clusters. We conduct an extensive simulation study and show that the proposed approach performs very well for censoring rates as high as 80%. We also illustrate the method with a breast cancer data set. Since the H-likelihood is the same as the penalized likelihood function, the proposed bias correction method is also applicable to the penalized likelihood estimators.
Generalized expectation-maximization segmentation of brain MR images
NASA Astrophysics Data System (ADS)
Devalkeneer, Arnaud A.; Robe, Pierre A.; Verly, Jacques G.; Phillips, Christophe L. M.
2006-03-01
Manual segmentation of medical images is unpractical because it is time consuming, not reproducible, and prone to human error. It is also very difficult to take into account the 3D nature of the images. Thus, semi- or fully-automatic methods are of great interest. Current segmentation algorithms based on an Expectation- Maximization (EM) procedure present some limitations. The algorithm by Ashburner et al., 2005, does not allow multichannel inputs, e.g. two MR images of different contrast, and does not use spatial constraints between adjacent voxels, e.g. Markov random field (MRF) constraints. The solution of Van Leemput et al., 1999, employs a simplified model (mixture coefficients are not estimated and only one Gaussian is used by tissue class, with three for the image background). We have thus implemented an algorithm that combines the features of these two approaches: multichannel inputs, intensity bias correction, multi-Gaussian histogram model, and Markov random field (MRF) constraints. Our proposed method classifies tissues in three iterative main stages by way of a Generalized-EM (GEM) algorithm: (1) estimation of the Gaussian parameters modeling the histogram of the images, (2) correction of image intensity non-uniformity, and (3) modification of prior classification knowledge by MRF techniques. The goal of the GEM algorithm is to maximize the log-likelihood across the classes and voxels. Our segmentation algorithm was validated on synthetic data (with the Dice metric criterion) and real data (by a neurosurgeon) and compared to the original algorithms by Ashburner et al. and Van Leemput et al. Our combined approach leads to more robust and accurate segmentation.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wang, Yan; Mohanty, Soumya D.; Jenet, Fredrick A., E-mail: ywang12@hust.edu.cn
2015-12-20
Supermassive black hole binaries are one of the primary targets of gravitational wave (GW) searches using pulsar timing arrays (PTAs). GW signals from such systems are well represented by parameterized models, allowing the standard Generalized Likelihood Ratio Test (GLRT) to be used for their detection and estimation. However, there is a dichotomy in how the GLRT can be implemented for PTAs: there are two possible ways in which one can split the set of signal parameters for semi-analytical and numerical extremization. The straightforward extension of the method used for continuous signals in ground-based GW searches, where the so-called pulsar phasemore » parameters are maximized numerically, was addressed in an earlier paper. In this paper, we report the first study of the performance of the second approach where the pulsar phases are maximized semi-analytically. This approach is scalable since the number of parameters left over for numerical optimization does not depend on the size of the PTA. Our results show that for the same array size (9 pulsars), the new method performs somewhat worse in parameter estimation, but not in detection, than the previous method where the pulsar phases were maximized numerically. The origin of the performance discrepancy is likely to be in the ill-posedness that is intrinsic to any network analysis method. However, the scalability of the new method allows the ill-posedness to be mitigated by simply adding more pulsars to the array. This is shown explicitly by taking a larger array of pulsars.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
The purpose of the computer program is to generate system matrices that model data acquisition process in dynamic single photon emission computed tomography (SPECT). The application is for the reconstruction of dynamic data from projection measurements that provide the time evolution of activity uptake and wash out in an organ of interest. The measurement of the time activity in the blood and organ tissue provide time-activity curves (TACs) that are used to estimate kinetic parameters. The program provides a correct model of the in vivo spatial and temporal distribution of radioactive in organs. The model accounts for the attenuation ofmore » the internal emitting radioactivity, it accounts for the vary point response of the collimators, and correctly models the time variation of the activity in the organs. One important application where the software is being used in a measuring the arterial input function (AIF) in a dynamic SPECT study where the data are acquired from a slow camera rotation. Measurement of the arterial input function (AIF) is essential to deriving quantitative estimates of regional myocardial blood flow using kinetic models. A study was performed to evaluate whether a slowly rotating SPECT system could provide accurate AIF's for myocardial perfusion imaging (MPI). Methods: Dynamic cardiac SPECT was first performed in human subjects at rest using a Phillips Precedence SPECT/CT scanner. Dynamic measurements of Tc-99m-tetrofosmin in the myocardium were obtained using an infusion time of 2 minutes. Blood input, myocardium tissue and liver TACs were estimated using spatiotemporal splines. These were fit to a one-compartment perfusion model to obtain wash-in rate parameters K1. Results: The spatiotemporal 4D ML-EM reconstructions gave more accurate reconstructions that did standard frame-by-frame 3D ML-EM reconstructions. From additional computer simulations and phantom studies, it was determined that a 1 minute infusion with a SPECT system rotation speed providing 180 degrees of projection data every 54s can produce measurements of blood pool and myocardial TACs. This has important application in the circulation of coronary flow reserve using rest/stress dynamic cardiac SPECT. They system matrices are used in maximum likelihood and maximum a posterior formulations in estimation theory where through iterative algorithms (conjugate gradient, expectation maximization, or maximum a posteriori probability algorithms) the solution is determined that maximizes a likelihood or a posteriori probability function.« less
Approximated mutual information training for speech recognition using myoelectric signals.
Guo, Hua J; Chan, A D C
2006-01-01
A new training algorithm called the approximated maximum mutual information (AMMI) is proposed to improve the accuracy of myoelectric speech recognition using hidden Markov models (HMMs). Previous studies have demonstrated that automatic speech recognition can be performed using myoelectric signals from articulatory muscles of the face. Classification of facial myoelectric signals can be performed using HMMs that are trained using the maximum likelihood (ML) algorithm; however, this algorithm maximizes the likelihood of the observations in the training sequence, which is not directly associated with optimal classification accuracy. The AMMI training algorithm attempts to maximize the mutual information, thereby training the HMMs to optimize their parameters for discrimination. Our results show that AMMI training consistently reduces the error rates compared to these by the ML training, increasing the accuracy by approximately 3% on average.
ERIC Educational Resources Information Center
Mahmud, Jumailiyah; Sutikno, Muzayanah; Naga, Dali S.
2016-01-01
The aim of this study is to determine variance difference between maximum likelihood and expected A posteriori estimation methods viewed from number of test items of aptitude test. The variance presents an accuracy generated by both maximum likelihood and Bayes estimation methods. The test consists of three subtests, each with 40 multiple-choice…
A Comparison of a Bayesian and a Maximum Likelihood Tailored Testing Procedure.
ERIC Educational Resources Information Center
McKinley, Robert L.; Reckase, Mark D.
A study was conducted to compare tailored testing procedures based on a Bayesian ability estimation technique and on a maximum likelihood ability estimation technique. The Bayesian tailored testing procedure selected items so as to minimize the posterior variance of the ability estimate distribution, while the maximum likelihood tailored testing…
Accuracy of maximum likelihood estimates of a two-state model in single-molecule FRET
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gopich, Irina V.
2015-01-21
Photon sequences from single-molecule Förster resonance energy transfer (FRET) experiments can be analyzed using a maximum likelihood method. Parameters of the underlying kinetic model (FRET efficiencies of the states and transition rates between conformational states) are obtained by maximizing the appropriate likelihood function. In addition, the errors (uncertainties) of the extracted parameters can be obtained from the curvature of the likelihood function at the maximum. We study the standard deviations of the parameters of a two-state model obtained from photon sequences with recorded colors and arrival times. The standard deviations can be obtained analytically in a special case when themore » FRET efficiencies of the states are 0 and 1 and in the limiting cases of fast and slow conformational dynamics. These results are compared with the results of numerical simulations. The accuracy and, therefore, the ability to predict model parameters depend on how fast the transition rates are compared to the photon count rate. In the limit of slow transitions, the key parameters that determine the accuracy are the number of transitions between the states and the number of independent photon sequences. In the fast transition limit, the accuracy is determined by the small fraction of photons that are correlated with their neighbors. The relative standard deviation of the relaxation rate has a “chevron” shape as a function of the transition rate in the log-log scale. The location of the minimum of this function dramatically depends on how well the FRET efficiencies of the states are separated.« less
Accuracy of maximum likelihood estimates of a two-state model in single-molecule FRET
Gopich, Irina V.
2015-01-01
Photon sequences from single-molecule Förster resonance energy transfer (FRET) experiments can be analyzed using a maximum likelihood method. Parameters of the underlying kinetic model (FRET efficiencies of the states and transition rates between conformational states) are obtained by maximizing the appropriate likelihood function. In addition, the errors (uncertainties) of the extracted parameters can be obtained from the curvature of the likelihood function at the maximum. We study the standard deviations of the parameters of a two-state model obtained from photon sequences with recorded colors and arrival times. The standard deviations can be obtained analytically in a special case when the FRET efficiencies of the states are 0 and 1 and in the limiting cases of fast and slow conformational dynamics. These results are compared with the results of numerical simulations. The accuracy and, therefore, the ability to predict model parameters depend on how fast the transition rates are compared to the photon count rate. In the limit of slow transitions, the key parameters that determine the accuracy are the number of transitions between the states and the number of independent photon sequences. In the fast transition limit, the accuracy is determined by the small fraction of photons that are correlated with their neighbors. The relative standard deviation of the relaxation rate has a “chevron” shape as a function of the transition rate in the log-log scale. The location of the minimum of this function dramatically depends on how well the FRET efficiencies of the states are separated. PMID:25612692
Modeling Adversaries in Counterterrorism Decisions Using Prospect Theory.
Merrick, Jason R W; Leclerc, Philip
2016-04-01
Counterterrorism decisions have been an intense area of research in recent years. Both decision analysis and game theory have been used to model such decisions, and more recently approaches have been developed that combine the techniques of the two disciplines. However, each of these approaches assumes that the attacker is maximizing its utility. Experimental research shows that human beings do not make decisions by maximizing expected utility without aid, but instead deviate in specific ways such as loss aversion or likelihood insensitivity. In this article, we modify existing methods for counterterrorism decisions. We keep expected utility as the defender's paradigm to seek for the rational decision, but we use prospect theory to solve for the attacker's decision to descriptively model the attacker's loss aversion and likelihood insensitivity. We study the effects of this approach in a critical decision, whether to screen containers entering the United States for radioactive materials. We find that the defender's optimal decision is sensitive to the attacker's levels of loss aversion and likelihood insensitivity, meaning that understanding such descriptive decision effects is important in making such decisions. © 2014 Society for Risk Analysis.
Yang, Li; Wang, Guobao; Qi, Jinyi
2016-04-01
Detecting cancerous lesions is a major clinical application of emission tomography. In a previous work, we studied penalized maximum-likelihood (PML) image reconstruction for lesion detection in static PET. Here we extend our theoretical analysis of static PET reconstruction to dynamic PET. We study both the conventional indirect reconstruction and direct reconstruction for Patlak parametric image estimation. In indirect reconstruction, Patlak parametric images are generated by first reconstructing a sequence of dynamic PET images, and then performing Patlak analysis on the time activity curves (TACs) pixel-by-pixel. In direct reconstruction, Patlak parametric images are estimated directly from raw sinogram data by incorporating the Patlak model into the image reconstruction procedure. PML reconstruction is used in both the indirect and direct reconstruction methods. We use a channelized Hotelling observer (CHO) to assess lesion detectability in Patlak parametric images. Simplified expressions for evaluating the lesion detectability have been derived and applied to the selection of the regularization parameter value to maximize detection performance. The proposed method is validated using computer-based Monte Carlo simulations. Good agreements between the theoretical predictions and the Monte Carlo results are observed. Both theoretical predictions and Monte Carlo simulation results show the benefit of the indirect and direct methods under optimized regularization parameters in dynamic PET reconstruction for lesion detection, when compared with the conventional static PET reconstruction.
Robust statistical reconstruction for charged particle tomography
Schultz, Larry Joe; Klimenko, Alexei Vasilievich; Fraser, Andrew Mcleod; Morris, Christopher; Orum, John Christopher; Borozdin, Konstantin N; Sossong, Michael James; Hengartner, Nicolas W
2013-10-08
Systems and methods for charged particle detection including statistical reconstruction of object volume scattering density profiles from charged particle tomographic data to determine the probability distribution of charged particle scattering using a statistical multiple scattering model and determine a substantially maximum likelihood estimate of object volume scattering density using expectation maximization (ML/EM) algorithm to reconstruct the object volume scattering density. The presence of and/or type of object occupying the volume of interest can be identified from the reconstructed volume scattering density profile. The charged particle tomographic data can be cosmic ray muon tomographic data from a muon tracker for scanning packages, containers, vehicles or cargo. The method can be implemented using a computer program which is executable on a computer.
Estimation for general birth-death processes
Crawford, Forrest W.; Minin, Vladimir N.; Suchard, Marc A.
2013-01-01
Birth-death processes (BDPs) are continuous-time Markov chains that track the number of “particles” in a system over time. While widely used in population biology, genetics and ecology, statistical inference of the instantaneous particle birth and death rates remains largely limited to restrictive linear BDPs in which per-particle birth and death rates are constant. Researchers often observe the number of particles at discrete times, necessitating data augmentation procedures such as expectation-maximization (EM) to find maximum likelihood estimates. For BDPs on finite state-spaces, there are powerful matrix methods for computing the conditional expectations needed for the E-step of the EM algorithm. For BDPs on infinite state-spaces, closed-form solutions for the E-step are available for some linear models, but most previous work has resorted to time-consuming simulation. Remarkably, we show that the E-step conditional expectations can be expressed as convolutions of computable transition probabilities for any general BDP with arbitrary rates. This important observation, along with a convenient continued fraction representation of the Laplace transforms of the transition probabilities, allows for novel and efficient computation of the conditional expectations for all BDPs, eliminating the need for truncation of the state-space or costly simulation. We use this insight to derive EM algorithms that yield maximum likelihood estimation for general BDPs characterized by various rate models, including generalized linear models. We show that our Laplace convolution technique outperforms competing methods when they are available and demonstrate a technique to accelerate EM algorithm convergence. We validate our approach using synthetic data and then apply our methods to cancer cell growth and estimation of mutation parameters in microsatellite evolution. PMID:25328261
Estimation for general birth-death processes.
Crawford, Forrest W; Minin, Vladimir N; Suchard, Marc A
2014-04-01
Birth-death processes (BDPs) are continuous-time Markov chains that track the number of "particles" in a system over time. While widely used in population biology, genetics and ecology, statistical inference of the instantaneous particle birth and death rates remains largely limited to restrictive linear BDPs in which per-particle birth and death rates are constant. Researchers often observe the number of particles at discrete times, necessitating data augmentation procedures such as expectation-maximization (EM) to find maximum likelihood estimates. For BDPs on finite state-spaces, there are powerful matrix methods for computing the conditional expectations needed for the E-step of the EM algorithm. For BDPs on infinite state-spaces, closed-form solutions for the E-step are available for some linear models, but most previous work has resorted to time-consuming simulation. Remarkably, we show that the E-step conditional expectations can be expressed as convolutions of computable transition probabilities for any general BDP with arbitrary rates. This important observation, along with a convenient continued fraction representation of the Laplace transforms of the transition probabilities, allows for novel and efficient computation of the conditional expectations for all BDPs, eliminating the need for truncation of the state-space or costly simulation. We use this insight to derive EM algorithms that yield maximum likelihood estimation for general BDPs characterized by various rate models, including generalized linear models. We show that our Laplace convolution technique outperforms competing methods when they are available and demonstrate a technique to accelerate EM algorithm convergence. We validate our approach using synthetic data and then apply our methods to cancer cell growth and estimation of mutation parameters in microsatellite evolution.
The recursive maximum likelihood proportion estimator: User's guide and test results
NASA Technical Reports Server (NTRS)
Vanrooy, D. L.
1976-01-01
Implementation of the recursive maximum likelihood proportion estimator is described. A user's guide to programs as they currently exist on the IBM 360/67 at LARS, Purdue is included, and test results on LANDSAT data are described. On Hill County data, the algorithm yields results comparable to the standard maximum likelihood proportion estimator.
NASA Technical Reports Server (NTRS)
Walker, H. F.
1976-01-01
Likelihood equations determined by the two types of samples which are necessary conditions for a maximum-likelihood estimate are considered. These equations, suggest certain successive-approximations iterative procedures for obtaining maximum-likelihood estimates. These are generalized steepest ascent (deflected gradient) procedures. It is shown that, with probability 1 as N sub 0 approaches infinity (regardless of the relative sizes of N sub 0 and N sub 1, i=1,...,m), these procedures converge locally to the strongly consistent maximum-likelihood estimates whenever the step size is between 0 and 2. Furthermore, the value of the step size which yields optimal local convergence rates is bounded from below by a number which always lies between 1 and 2.
Arribas-Gil, Ana; De la Cruz, Rolando; Lebarbier, Emilie; Meza, Cristian
2015-06-01
We propose a classification method for longitudinal data. The Bayes classifier is classically used to determine a classification rule where the underlying density in each class needs to be well modeled and estimated. This work is motivated by a real dataset of hormone levels measured at the early stages of pregnancy that can be used to predict normal versus abnormal pregnancy outcomes. The proposed model, which is a semiparametric linear mixed-effects model (SLMM), is a particular case of the semiparametric nonlinear mixed-effects class of models (SNMM) in which finite dimensional (fixed effects and variance components) and infinite dimensional (an unknown function) parameters have to be estimated. In SNMM's maximum likelihood estimation is performed iteratively alternating parametric and nonparametric procedures. However, if one can make the assumption that the random effects and the unknown function interact in a linear way, more efficient estimation methods can be used. Our contribution is the proposal of a unified estimation procedure based on a penalized EM-type algorithm. The Expectation and Maximization steps are explicit. In this latter step, the unknown function is estimated in a nonparametric fashion using a lasso-type procedure. A simulation study and an application on real data are performed. © 2015, The International Biometric Society.
Chen, Rui; Hyrien, Ollivier
2011-01-01
This article deals with quasi- and pseudo-likelihood estimation in a class of continuous-time multi-type Markov branching processes observed at discrete points in time. “Conventional” and conditional estimation are discussed for both approaches. We compare their properties and identify situations where they lead to asymptotically equivalent estimators. Both approaches possess robustness properties, and coincide with maximum likelihood estimation in some cases. Quasi-likelihood functions involving only linear combinations of the data may be unable to estimate all model parameters. Remedial measures exist, including the resort either to non-linear functions of the data or to conditioning the moments on appropriate sigma-algebras. The method of pseudo-likelihood may also resolve this issue. We investigate the properties of these approaches in three examples: the pure birth process, the linear birth-and-death process, and a two-type process that generalizes the previous two examples. Simulations studies are conducted to evaluate performance in finite samples. PMID:21552356
Huang, Chiung-Yu; Qin, Jing
2013-01-01
The Canadian Study of Health and Aging (CSHA) employed a prevalent cohort design to study survival after onset of dementia, where patients with dementia were sampled and the onset time of dementia was determined retrospectively. The prevalent cohort sampling scheme favors individuals who survive longer. Thus, the observed survival times are subject to length bias. In recent years, there has been a rising interest in developing estimation procedures for prevalent cohort survival data that not only account for length bias but also actually exploit the incidence distribution of the disease to improve efficiency. This article considers semiparametric estimation of the Cox model for the time from dementia onset to death under a stationarity assumption with respect to the disease incidence. Under the stationarity condition, the semiparametric maximum likelihood estimation is expected to be fully efficient yet difficult to perform for statistical practitioners, as the likelihood depends on the baseline hazard function in a complicated way. Moreover, the asymptotic properties of the semiparametric maximum likelihood estimator are not well-studied. Motivated by the composite likelihood method (Besag 1974), we develop a composite partial likelihood method that retains the simplicity of the popular partial likelihood estimator and can be easily performed using standard statistical software. When applied to the CSHA data, the proposed method estimates a significant difference in survival between the vascular dementia group and the possible Alzheimer’s disease group, while the partial likelihood method for left-truncated and right-censored data yields a greater standard error and a 95% confidence interval covering 0, thus highlighting the practical value of employing a more efficient methodology. To check the assumption of stable disease for the CSHA data, we also present new graphical and numerical tests in the article. The R code used to obtain the maximum composite partial likelihood estimator for the CSHA data is available in the online Supplementary Material, posted on the journal web site. PMID:24000265
Bias Correction for the Maximum Likelihood Estimate of Ability. Research Report. ETS RR-05-15
ERIC Educational Resources Information Center
Zhang, Jinming
2005-01-01
Lord's bias function and the weighted likelihood estimation method are effective in reducing the bias of the maximum likelihood estimate of an examinee's ability under the assumption that the true item parameters are known. This paper presents simulation studies to determine the effectiveness of these two methods in reducing the bias when the item…
Fast maximum likelihood estimation of mutation rates using a birth-death process.
Wu, Xiaowei; Zhu, Hongxiao
2015-02-07
Since fluctuation analysis was first introduced by Luria and Delbrück in 1943, it has been widely used to make inference about spontaneous mutation rates in cultured cells. Under certain model assumptions, the probability distribution of the number of mutants that appear in a fluctuation experiment can be derived explicitly, which provides the basis of mutation rate estimation. It has been shown that, among various existing estimators, the maximum likelihood estimator usually demonstrates some desirable properties such as consistency and lower mean squared error. However, its application in real experimental data is often hindered by slow computation of likelihood due to the recursive form of the mutant-count distribution. We propose a fast maximum likelihood estimator of mutation rates, MLE-BD, based on a birth-death process model with non-differential growth assumption. Simulation studies demonstrate that, compared with the conventional maximum likelihood estimator derived from the Luria-Delbrück distribution, MLE-BD achieves substantial improvement on computational speed and is applicable to arbitrarily large number of mutants. In addition, it still retains good accuracy on point estimation. Published by Elsevier Ltd.
A long-term earthquake rate model for the central and eastern United States from smoothed seismicity
Moschetti, Morgan P.
2015-01-01
I present a long-term earthquake rate model for the central and eastern United States from adaptive smoothed seismicity. By employing pseudoprospective likelihood testing (L-test), I examined the effects of fixed and adaptive smoothing methods and the effects of catalog duration and composition on the ability of the models to forecast the spatial distribution of recent earthquakes. To stabilize the adaptive smoothing method for regions of low seismicity, I introduced minor modifications to the way that the adaptive smoothing distances are calculated. Across all smoothed seismicity models, the use of adaptive smoothing and the use of earthquakes from the recent part of the catalog optimizes the likelihood for tests with M≥2.7 and M≥4.0 earthquake catalogs. The smoothed seismicity models optimized by likelihood testing with M≥2.7 catalogs also produce the highest likelihood values for M≥4.0 likelihood testing, thus substantiating the hypothesis that the locations of moderate-size earthquakes can be forecast by the locations of smaller earthquakes. The likelihood test does not, however, maximize the fraction of earthquakes that are better forecast than a seismicity rate model with uniform rates in all cells. In this regard, fixed smoothing models perform better than adaptive smoothing models. The preferred model of this study is the adaptive smoothed seismicity model, based on its ability to maximize the joint likelihood of predicting the locations of recent small-to-moderate-size earthquakes across eastern North America. The preferred rate model delineates 12 regions where the annual rate of M≥5 earthquakes exceeds 2×10−3. Although these seismic regions have been previously recognized, the preferred forecasts are more spatially concentrated than the rates from fixed smoothed seismicity models, with rate increases of up to a factor of 10 near clusters of high seismic activity.
An alternative method to measure the likelihood of a financial crisis in an emerging market
NASA Astrophysics Data System (ADS)
Özlale, Ümit; Metin-Özcan, Kıvılcım
2007-07-01
This paper utilizes an early warning system in order to measure the likelihood of a financial crisis in an emerging market economy. We introduce a methodology, where we can both obtain a likelihood series and analyze the time-varying effects of several macroeconomic variables on this likelihood. Since the issue is analyzed in a non-linear state space framework, the extended Kalman filter emerges as the optimal estimation algorithm. Taking the Turkish economy as our laboratory, the results indicate that both the derived likelihood measure and the estimated time-varying parameters are meaningful and can successfully explain the path that the Turkish economy had followed between 2000 and 2006. The estimated parameters also suggest that overvalued domestic currency, current account deficit and the increase in the default risk increase the likelihood of having an economic crisis in the economy. Overall, the findings in this paper suggest that the estimation methodology introduced in this paper can also be applied to other emerging market economies as well.
Maximum likelihood estimation of signal-to-noise ratio and combiner weight
NASA Technical Reports Server (NTRS)
Kalson, S.; Dolinar, S. J.
1986-01-01
An algorithm for estimating signal to noise ratio and combiner weight parameters for a discrete time series is presented. The algorithm is based upon the joint maximum likelihood estimate of the signal and noise power. The discrete-time series are the sufficient statistics obtained after matched filtering of a biphase modulated signal in additive white Gaussian noise, before maximum likelihood decoding is performed.
Changren Weng; Thomas L. Kubisiak; C. Dana Nelson; James P. Geaghan; Michael Stine
1999-01-01
Single marker regression and single marker maximum likelihood estimation were tied to detect quantitative trait loci (QTLs) controlling the early height growth of longleaf pine and slash pine using a ((longleaf pine x slash pine) x slash pine) BC, population consisting of 83 progeny. Maximum likelihood estimation was found to be more power than regression and could...
Wang, Zhu; Shuangge, Ma; Wang, Ching-Yun
2017-01-01
In health services and outcome research, count outcomes are frequently encountered and often have a large proportion of zeros. The zero-inflated negative binomial (ZINB) regression model has important applications for this type of data. With many possible candidate risk factors, this paper proposes new variable selection methods for the ZINB model. We consider maximum likelihood function plus a penalty including the least absolute shrinkage and selection operator (LASSO), smoothly clipped absolute deviation (SCAD) and minimax concave penalty (MCP). An EM (expectation-maximization) algorithm is proposed for estimating the model parameters and conducting variable selection simultaneously. This algorithm consists of estimating penalized weighted negative binomial models and penalized logistic models via the coordinated descent algorithm. Furthermore, statistical properties including the standard error formulae are provided. A simulation study shows that the new algorithm not only has more accurate or at least comparable estimation, also is more robust than the traditional stepwise variable selection. The proposed methods are applied to analyze the health care demand in Germany using an open-source R package mpath. PMID:26059498
Investigating the Impact of Uncertainty about Item Parameters on Ability Estimation
ERIC Educational Resources Information Center
Zhang, Jinming; Xie, Minge; Song, Xiaolan; Lu, Ting
2011-01-01
Asymptotic expansions of the maximum likelihood estimator (MLE) and weighted likelihood estimator (WLE) of an examinee's ability are derived while item parameter estimators are treated as covariates measured with error. The asymptotic formulae present the amount of bias of the ability estimators due to the uncertainty of item parameter estimators.…
Maximum likelihood estimation of finite mixture model for economic data
NASA Astrophysics Data System (ADS)
Phoong, Seuk-Yen; Ismail, Mohd Tahir
2014-06-01
Finite mixture model is a mixture model with finite-dimension. This models are provides a natural representation of heterogeneity in a finite number of latent classes. In addition, finite mixture models also known as latent class models or unsupervised learning models. Recently, maximum likelihood estimation fitted finite mixture models has greatly drawn statistician's attention. The main reason is because maximum likelihood estimation is a powerful statistical method which provides consistent findings as the sample sizes increases to infinity. Thus, the application of maximum likelihood estimation is used to fit finite mixture model in the present paper in order to explore the relationship between nonlinear economic data. In this paper, a two-component normal mixture model is fitted by maximum likelihood estimation in order to investigate the relationship among stock market price and rubber price for sampled countries. Results described that there is a negative effect among rubber price and stock market price for Malaysia, Thailand, Philippines and Indonesia.
A Solution to Separation and Multicollinearity in Multiple Logistic Regression
Shen, Jianzhao; Gao, Sujuan
2010-01-01
In dementia screening tests, item selection for shortening an existing screening test can be achieved using multiple logistic regression. However, maximum likelihood estimates for such logistic regression models often experience serious bias or even non-existence because of separation and multicollinearity problems resulting from a large number of highly correlated items. Firth (1993, Biometrika, 80(1), 27–38) proposed a penalized likelihood estimator for generalized linear models and it was shown to reduce bias and the non-existence problems. The ridge regression has been used in logistic regression to stabilize the estimates in cases of multicollinearity. However, neither solves the problems for each other. In this paper, we propose a double penalized maximum likelihood estimator combining Firth’s penalized likelihood equation with a ridge parameter. We present a simulation study evaluating the empirical performance of the double penalized likelihood estimator in small to moderate sample sizes. We demonstrate the proposed approach using a current screening data from a community-based dementia study. PMID:20376286
A Solution to Separation and Multicollinearity in Multiple Logistic Regression.
Shen, Jianzhao; Gao, Sujuan
2008-10-01
In dementia screening tests, item selection for shortening an existing screening test can be achieved using multiple logistic regression. However, maximum likelihood estimates for such logistic regression models often experience serious bias or even non-existence because of separation and multicollinearity problems resulting from a large number of highly correlated items. Firth (1993, Biometrika, 80(1), 27-38) proposed a penalized likelihood estimator for generalized linear models and it was shown to reduce bias and the non-existence problems. The ridge regression has been used in logistic regression to stabilize the estimates in cases of multicollinearity. However, neither solves the problems for each other. In this paper, we propose a double penalized maximum likelihood estimator combining Firth's penalized likelihood equation with a ridge parameter. We present a simulation study evaluating the empirical performance of the double penalized likelihood estimator in small to moderate sample sizes. We demonstrate the proposed approach using a current screening data from a community-based dementia study.
Maximum Likelihood Estimation with Emphasis on Aircraft Flight Data
NASA Technical Reports Server (NTRS)
Iliff, K. W.; Maine, R. E.
1985-01-01
Accurate modeling of flexible space structures is an important field that is currently under investigation. Parameter estimation, using methods such as maximum likelihood, is one of the ways that the model can be improved. The maximum likelihood estimator has been used to extract stability and control derivatives from flight data for many years. Most of the literature on aircraft estimation concentrates on new developments and applications, assuming familiarity with basic estimation concepts. Some of these basic concepts are presented. The maximum likelihood estimator and the aircraft equations of motion that the estimator uses are briefly discussed. The basic concepts of minimization and estimation are examined for a simple computed aircraft example. The cost functions that are to be minimized during estimation are defined and discussed. Graphic representations of the cost functions are given to help illustrate the minimization process. Finally, the basic concepts are generalized, and estimation from flight data is discussed. Specific examples of estimation of structural dynamics are included. Some of the major conclusions for the computed example are also developed for the analysis of flight data.
Assessment of the Maximal Split-Half Coefficient to Estimate Reliability
ERIC Educational Resources Information Center
Thompson, Barry L.; Green, Samuel B.; Yang, Yanyun
2010-01-01
The maximal split-half coefficient is computed by calculating all possible split-half reliability estimates for a scale and then choosing the maximal value as the reliability estimate. Osburn compared the maximal split-half coefficient with 10 other internal consistency estimates of reliability and concluded that it yielded the most consistently…
NASA Technical Reports Server (NTRS)
Peters, B. C., Jr.; Walker, H. F.
1975-01-01
New results and insights concerning a previously published iterative procedure for obtaining maximum-likelihood estimates of the parameters for a mixture of normal distributions were discussed. It was shown that the procedure converges locally to the consistent maximum likelihood estimate as long as a specified parameter is bounded between two limits. Bound values were given to yield optimal local convergence.
NASA Technical Reports Server (NTRS)
Peters, B. C., Jr.; Walker, H. F.
1975-01-01
A general iterative procedure is given for determining the consistent maximum likelihood estimates of normal distributions. In addition, a local maximum of the log-likelihood function, Newtons's method, a method of scoring, and modifications of these procedures are discussed.
Loh, Jane; Kennedy, Mary Clare; Wood, Evan; Kerr, Thomas; Marshall, Brandon; Parashar, Surita; Montaner, Julio; Milloy, M-J
2016-11-01
Homelessness is common among people who use drugs (PWUD) and, for those living with HIV/AIDS, an important contributor to sub-optimal HIV treatment outcomes. This study aims to investigate the relationship between the duration of homelessness and the likelihood of plasma HIV-1 RNA viral load (VL) non-detectability among a cohort of HIV-positive PWUD. We used data from the ACCESS study, a long-running prospective cohort study of HIV-positive PWUD linked to comprehensive HIV clinical records including systematic plasma HIV-1 RNA VL monitoring. We estimated the longitudinal relationship between the duration of homelessness and the likelihood of exhibiting a non-detectable VL (i.e., <500 copies/mL plasma) using generalized linear mixed-effects modelling. Between May 1996 and June 2014, 922 highly active antiretroviral therapy-exposed participants were recruited and contributed 8188 observations. Of these, 4800 (59%) were characterized by non-detectable VL. Participants reported they were homeless in 910 (11%) interviews (median: six months, interquartile range: 6-12 months). A longer duration of homelessness was associated with lower odds of VL non-detectability (adjusted odds ratio = 0.71 per six-month period of homelessness, 95% confidence interval: 0.60-0.83) after adjustment for age, ancestry, drug use patterns, engagement in addiction treatment, and other potential confounders. Longer durations of episodes of homelessness in this cohort of HIV-positive illicit drug users were associated with a lower likelihood of plasma VL non-detectability. Our findings suggest that interventions that seek to promptly house homeless individuals, such as Housing First approaches, might assist in maximizing the clinical and public health benefits of antiretroviral therapy among people living with HIV/AIDS.
Multistage degradation modeling for BLDC motor based on Wiener process
NASA Astrophysics Data System (ADS)
Yuan, Qingyang; Li, Xiaogang; Gao, Yuankai
2018-05-01
Brushless DC motors are widely used, and their working temperatures, regarding as degradation processes, are nonlinear and multistage. It is necessary to establish a nonlinear degradation model. In this research, our study was based on accelerated degradation data of motors, which are their working temperatures. A multistage Wiener model was established by using the transition function to modify linear model. The normal weighted average filter (Gauss filter) was used to improve the results of estimation for the model parameters. Then, to maximize likelihood function for parameter estimation, we used numerical optimization method- the simplex method for cycle calculation. Finally, the modeling results show that the degradation mechanism changes during the degradation of the motor with high speed. The effectiveness and rationality of model are verified by comparison of the life distribution with widely used nonlinear Wiener model, as well as a comparison of QQ plots for residual. Finally, predictions for motor life are gained by life distributions in different times calculated by multistage model.
ERIC Educational Resources Information Center
Magis, David; Raiche, Gilles
2012-01-01
This paper focuses on two estimators of ability with logistic item response theory models: the Bayesian modal (BM) estimator and the weighted likelihood (WL) estimator. For the BM estimator, Jeffreys' prior distribution is considered, and the corresponding estimator is referred to as the Jeffreys modal (JM) estimator. It is established that under…
Estimation Methods for Non-Homogeneous Regression - Minimum CRPS vs Maximum Likelihood
NASA Astrophysics Data System (ADS)
Gebetsberger, Manuel; Messner, Jakob W.; Mayr, Georg J.; Zeileis, Achim
2017-04-01
Non-homogeneous regression models are widely used to statistically post-process numerical weather prediction models. Such regression models correct for errors in mean and variance and are capable to forecast a full probability distribution. In order to estimate the corresponding regression coefficients, CRPS minimization is performed in many meteorological post-processing studies since the last decade. In contrast to maximum likelihood estimation, CRPS minimization is claimed to yield more calibrated forecasts. Theoretically, both scoring rules used as an optimization score should be able to locate a similar and unknown optimum. Discrepancies might result from a wrong distributional assumption of the observed quantity. To address this theoretical concept, this study compares maximum likelihood and minimum CRPS estimation for different distributional assumptions. First, a synthetic case study shows that, for an appropriate distributional assumption, both estimation methods yield to similar regression coefficients. The log-likelihood estimator is slightly more efficient. A real world case study for surface temperature forecasts at different sites in Europe confirms these results but shows that surface temperature does not always follow the classical assumption of a Gaussian distribution. KEYWORDS: ensemble post-processing, maximum likelihood estimation, CRPS minimization, probabilistic temperature forecasting, distributional regression models
Multiple-hit parameter estimation in monolithic detectors.
Hunter, William C J; Barrett, Harrison H; Lewellen, Tom K; Miyaoka, Robert S
2013-02-01
We examine a maximum-a-posteriori method for estimating the primary interaction position of gamma rays with multiple interaction sites (hits) in a monolithic detector. In assessing the performance of a multiple-hit estimator over that of a conventional one-hit estimator, we consider a few different detector and readout configurations of a 50-mm-wide square cerium-doped lutetium oxyorthosilicate block. For this study, we use simulated data from SCOUT, a Monte-Carlo tool for photon tracking and modeling scintillation- camera output. With this tool, we determine estimate bias and variance for a multiple-hit estimator and compare these with similar metrics for a one-hit maximum-likelihood estimator, which assumes full energy deposition in one hit. We also examine the effect of event filtering on these metrics; for this purpose, we use a likelihood threshold to reject signals that are not likely to have been produced under the assumed likelihood model. Depending on detector design, we observe a 1%-12% improvement of intrinsic resolution for a 1-or-2-hit estimator as compared with a 1-hit estimator. We also observe improved differentiation of photopeak events using a 1-or-2-hit estimator as compared with the 1-hit estimator; more than 6% of photopeak events that were rejected by likelihood filtering for the 1-hit estimator were accurately identified as photopeak events and positioned without loss of resolution by a 1-or-2-hit estimator; for PET, this equates to at least a 12% improvement in coincidence-detection efficiency with likelihood filtering applied.
An evaluation of percentile and maximum likelihood estimators of weibull paremeters
Stanley J. Zarnoch; Tommy R. Dell
1985-01-01
Two methods of estimating the three-parameter Weibull distribution were evaluated by computer simulation and field data comparison. Maximum likelihood estimators (MLB) with bias correction were calculated with the computer routine FITTER (Bailey 1974); percentile estimators (PCT) were those proposed by Zanakis (1979). The MLB estimators had superior smaller bias and...
The Equivalence of Two Methods of Parameter Estimation for the Rasch Model.
ERIC Educational Resources Information Center
Blackwood, Larry G.; Bradley, Edwin L.
1989-01-01
Two methods of estimating parameters in the Rasch model are compared. The equivalence of likelihood estimations from the model of G. J. Mellenbergh and P. Vijn (1981) and from usual unconditional maximum likelihood (UML) estimation is demonstrated. Mellenbergh and Vijn's model is a convenient method of calculating UML estimates. (SLD)
ERIC Educational Resources Information Center
Yang, Xiangdong; Poggio, John C.; Glasnapp, Douglas R.
2006-01-01
The effects of five ability estimators, that is, maximum likelihood estimator, weighted likelihood estimator, maximum a posteriori, expected a posteriori, and Owen's sequential estimator, on the performances of the item response theory-based adaptive classification procedure on multiple categories were studied via simulations. The following…
Efficient Exploration of the Space of Reconciled Gene Trees
Szöllősi, Gergely J.; Rosikiewicz, Wojciech; Boussau, Bastien; Tannier, Eric; Daubin, Vincent
2013-01-01
Gene trees record the combination of gene-level events, such as duplication, transfer and loss (DTL), and species-level events, such as speciation and extinction. Gene tree–species tree reconciliation methods model these processes by drawing gene trees into the species tree using a series of gene and species-level events. The reconstruction of gene trees based on sequence alone almost always involves choosing between statistically equivalent or weakly distinguishable relationships that could be much better resolved based on a putative species tree. To exploit this potential for accurate reconstruction of gene trees, the space of reconciled gene trees must be explored according to a joint model of sequence evolution and gene tree–species tree reconciliation. Here we present amalgamated likelihood estimation (ALE), a probabilistic approach to exhaustively explore all reconciled gene trees that can be amalgamated as a combination of clades observed in a sample of gene trees. We implement the ALE approach in the context of a reconciliation model (Szöllősi et al. 2013), which allows for the DTL of genes. We use ALE to efficiently approximate the sum of the joint likelihood over amalgamations and to find the reconciled gene tree that maximizes the joint likelihood among all such trees. We demonstrate using simulations that gene trees reconstructed using the joint likelihood are substantially more accurate than those reconstructed using sequence alone. Using realistic gene tree topologies, branch lengths, and alignment sizes, we demonstrate that ALE produces more accurate gene trees even if the model of sequence evolution is greatly simplified. Finally, examining 1099 gene families from 36 cyanobacterial genomes we find that joint likelihood-based inference results in a striking reduction in apparent phylogenetic discord, with respectively. 24%, 59%, and 46% reductions in the mean numbers of duplications, transfers, and losses per gene family. The open source implementation of ALE is available from https://github.com/ssolo/ALE.git. [amalgamation; gene tree reconciliation; gene tree reconstruction; lateral gene transfer; phylogeny.] PMID:23925510
SMURC: High-Dimension Small-Sample Multivariate Regression With Covariance Estimation.
Bayar, Belhassen; Bouaynaya, Nidhal; Shterenberg, Roman
2017-03-01
We consider a high-dimension low sample-size multivariate regression problem that accounts for correlation of the response variables. The system is underdetermined as there are more parameters than samples. We show that the maximum likelihood approach with covariance estimation is senseless because the likelihood diverges. We subsequently propose a normalization of the likelihood function that guarantees convergence. We call this method small-sample multivariate regression with covariance (SMURC) estimation. We derive an optimization problem and its convex approximation to compute SMURC. Simulation results show that the proposed algorithm outperforms the regularized likelihood estimator with known covariance matrix and the sparse conditional Gaussian graphical model. We also apply SMURC to the inference of the wing-muscle gene network of the Drosophila melanogaster (fruit fly).
SubspaceEM: A Fast Maximum-a-posteriori Algorithm for Cryo-EM Single Particle Reconstruction
Dvornek, Nicha C.; Sigworth, Fred J.; Tagare, Hemant D.
2015-01-01
Single particle reconstruction methods based on the maximum-likelihood principle and the expectation-maximization (E–M) algorithm are popular because of their ability to produce high resolution structures. However, these algorithms are computationally very expensive, requiring a network of computational servers. To overcome this computational bottleneck, we propose a new mathematical framework for accelerating maximum-likelihood reconstructions. The speedup is by orders of magnitude and the proposed algorithm produces similar quality reconstructions compared to the standard maximum-likelihood formulation. Our approach uses subspace approximations of the cryo-electron microscopy (cryo-EM) data and projection images, greatly reducing the number of image transformations and comparisons that are computed. Experiments using simulated and actual cryo-EM data show that speedup in overall execution time compared to traditional maximum-likelihood reconstruction reaches factors of over 300. PMID:25839831
Statistical estimation via convex optimization for trending and performance monitoring
NASA Astrophysics Data System (ADS)
Samar, Sikandar
This thesis presents an optimization-based statistical estimation approach to find unknown trends in noisy data. A Bayesian framework is used to explicitly take into account prior information about the trends via trend models and constraints. The main focus is on convex formulation of the Bayesian estimation problem, which allows efficient computation of (globally) optimal estimates. There are two main parts of this thesis. The first part formulates trend estimation in systems described by known detailed models as a convex optimization problem. Statistically optimal estimates are then obtained by maximizing a concave log-likelihood function subject to convex constraints. We consider the problem of increasing problem dimension as more measurements become available, and introduce a moving horizon framework to enable recursive estimation of the unknown trend by solving a fixed size convex optimization problem at each horizon. We also present a distributed estimation framework, based on the dual decomposition method, for a system formed by a network of complex sensors with local (convex) estimation. Two specific applications of the convex optimization-based Bayesian estimation approach are described in the second part of the thesis. Batch estimation for parametric diagnostics in a flight control simulation of a space launch vehicle is shown to detect incipient fault trends despite the natural masking properties of feedback in the guidance and control loops. Moving horizon approach is used to estimate time varying fault parameters in a detailed nonlinear simulation model of an unmanned aerial vehicle. An excellent performance is demonstrated in the presence of winds and turbulence.
A unified partial likelihood approach for X-chromosome association on time-to-event outcomes.
Xu, Wei; Hao, Meiling
2018-02-01
The expression of X-chromosome undergoes three possible biological processes: X-chromosome inactivation (XCI), escape of the X-chromosome inactivation (XCI-E), and skewed X-chromosome inactivation (XCI-S). Although these expressions are included in various predesigned genetic variation chip platforms, the X-chromosome has generally been excluded from the majority of genome-wide association studies analyses; this is most likely due to the lack of a standardized method in handling X-chromosomal genotype data. To analyze the X-linked genetic association for time-to-event outcomes with the actual process unknown, we propose a unified approach of maximizing the partial likelihood over all of the potential biological processes. The proposed method can be used to infer the true biological process and derive unbiased estimates of the genetic association parameters. A partial likelihood ratio test statistic that has been proved asymptotically chi-square distributed can be used to assess the X-chromosome genetic association. Furthermore, if the X-chromosome expression pertains to the XCI-S process, we can infer the correct skewed direction and magnitude of inactivation, which can elucidate significant findings regarding the genetic mechanism. A population-level model and a more general subject-level model have been developed to model the XCI-S process. Finite sample performance of this novel method is examined via extensive simulation studies. An application is illustrated with implementation of the method on a cancer genetic study with survival outcome. © 2017 WILEY PERIODICALS, INC.
Estimating parameter of Rayleigh distribution by using Maximum Likelihood method and Bayes method
NASA Astrophysics Data System (ADS)
Ardianti, Fitri; Sutarman
2018-01-01
In this paper, we use Maximum Likelihood estimation and Bayes method under some risk function to estimate parameter of Rayleigh distribution to know the best method. The prior knowledge which used in Bayes method is Jeffrey’s non-informative prior. Maximum likelihood estimation and Bayes method under precautionary loss function, entropy loss function, loss function-L 1 will be compared. We compare these methods by bias and MSE value using R program. After that, the result will be displayed in tables to facilitate the comparisons.
Consistency of Rasch Model Parameter Estimation: A Simulation Study.
ERIC Educational Resources Information Center
van den Wollenberg, Arnold L.; And Others
1988-01-01
The unconditional--simultaneous--maximum likelihood (UML) estimation procedure for the one-parameter logistic model produces biased estimators. The UML method is inconsistent and is not a good alternative to conditional maximum likelihood method, at least with small numbers of items. The minimum Chi-square estimation procedure produces unbiased…
Regression estimators for generic health-related quality of life and quality-adjusted life years.
Basu, Anirban; Manca, Andrea
2012-01-01
To develop regression models for outcomes with truncated supports, such as health-related quality of life (HRQoL) data, and account for features typical of such data such as a skewed distribution, spikes at 1 or 0, and heteroskedasticity. Regression estimators based on features of the Beta distribution. First, both a single equation and a 2-part model are presented, along with estimation algorithms based on maximum-likelihood, quasi-likelihood, and Bayesian Markov-chain Monte Carlo methods. A novel Bayesian quasi-likelihood estimator is proposed. Second, a simulation exercise is presented to assess the performance of the proposed estimators against ordinary least squares (OLS) regression for a variety of HRQoL distributions that are encountered in practice. Finally, the performance of the proposed estimators is assessed by using them to quantify the treatment effect on QALYs in the EVALUATE hysterectomy trial. Overall model fit is studied using several goodness-of-fit tests such as Pearson's correlation test, link and reset tests, and a modified Hosmer-Lemeshow test. The simulation results indicate that the proposed methods are more robust in estimating covariate effects than OLS, especially when the effects are large or the HRQoL distribution has a large spike at 1. Quasi-likelihood techniques are more robust than maximum likelihood estimators. When applied to the EVALUATE trial, all but the maximum likelihood estimators produce unbiased estimates of the treatment effect. One and 2-part Beta regression models provide flexible approaches to regress the outcomes with truncated supports, such as HRQoL, on covariates, after accounting for many idiosyncratic features of the outcomes distribution. This work will provide applied researchers with a practical set of tools to model outcomes in cost-effectiveness analysis.
2010-06-01
GMKPF represents a better and more flexible alternative to the Gaussian Maximum Likelihood (GML), and Exponential Maximum Likelihood ( EML ...accurate results relative to GML and EML when the network delays are modeled in terms of a single non-Gaussian/non-exponential distribution or as a...to the Gaussian Maximum Likelihood (GML), and Exponential Maximum Likelihood ( EML ) estimators for clock offset estimation in non-Gaussian or non
Maximum-likelihood estimation of parameterized wavefronts from multifocal data
Sakamoto, Julia A.; Barrett, Harrison H.
2012-01-01
A method for determining the pupil phase distribution of an optical system is demonstrated. Coefficients in a wavefront expansion were estimated using likelihood methods, where the data consisted of multiple irradiance patterns near focus. Proof-of-principle results were obtained in both simulation and experiment. Large-aberration wavefronts were handled in the numerical study. Experimentally, we discuss the handling of nuisance parameters. Fisher information matrices, Cramér-Rao bounds, and likelihood surfaces are examined. ML estimates were obtained by simulated annealing to deal with numerous local extrema in the likelihood function. Rapid processing techniques were employed to reduce the computational time. PMID:22772282
A unified framework for group independent component analysis for multi-subject fMRI data
Guo, Ying; Pagnoni, Giuseppe
2008-01-01
Independent component analysis (ICA) is becoming increasingly popular for analyzing functional magnetic resonance imaging (fMRI) data. While ICA has been successfully applied to single-subject analysis, the extension of ICA to group inferences is not straightforward and remains an active topic of research. Current group ICA models, such as the GIFT (Calhoun et al., 2001) and tensor PICA (Beckmann and Smith, 2005), make different assumptions about the underlying structure of the group spatio-temporal processes and are thus estimated using algorithms tailored for the assumed structure, potentially leading to diverging results. To our knowledge, there are currently no methods for assessing the validity of different model structures in real fMRI data and selecting the most appropriate one among various choices. In this paper, we propose a unified framework for estimating and comparing group ICA models with varying spatio-temporal structures. We consider a class of group ICA models that can accommodate different group structures and include existing models, such as the GIFT and tensor PICA, as special cases. We propose a maximum likelihood (ML) approach with a modified Expectation-Maximization (EM) algorithm for the estimation of the proposed class of models. Likelihood ratio tests (LRT) are presented to compare between different group ICA models. The LRT can be used to perform model comparison and selection, to assess the goodness-of-fit of a model in a particular data set, and to test group differences in the fMRI signal time courses between subject subgroups. Simulation studies are conducted to evaluate the performance of the proposed method under varying structures of group spatio-temporal processes. We illustrate our group ICA method using data from an fMRI study that investigates changes in neural processing associated with the regular practice of Zen meditation. PMID:18650105
A maximum likelihood method for high resolution proton radiography/proton CT
NASA Astrophysics Data System (ADS)
Collins-Fekete, Charles-Antoine; Brousmiche, Sébastien; Portillo, Stephen K. N.; Beaulieu, Luc; Seco, Joao
2016-12-01
Multiple Coulomb scattering (MCS) is the largest contributor to blurring in proton imaging. In this work, we developed a maximum likelihood least squares estimator that improves proton radiography’s spatial resolution. The water equivalent thickness (WET) through projections defined from the source to the detector pixels were estimated such that they maximizes the likelihood of the energy loss of every proton crossing the volume. The length spent in each projection was calculated through the optimized cubic spline path estimate. The proton radiographies were produced using Geant4 simulations. Three phantoms were studied here: a slanted cube in a tank of water to measure 2D spatial resolution, a voxelized head phantom for clinical performance evaluation as well as a parametric Catphan phantom (CTP528) for 3D spatial resolution. Two proton beam configurations were used: a parallel and a conical beam. Proton beams of 200 and 330 MeV were simulated to acquire the radiography. Spatial resolution is increased from 2.44 lp cm-1 to 4.53 lp cm-1 in the 200 MeV beam and from 3.49 lp cm-1 to 5.76 lp cm-1 in the 330 MeV beam. Beam configurations do not affect the reconstructed spatial resolution as investigated between a radiography acquired with the parallel (3.49 lp cm-1 to 5.76 lp cm-1) or conical beam (from 3.49 lp cm-1 to 5.56 lp cm-1). The improved images were then used as input in a photon tomography algorithm. The proton CT reconstruction of the Catphan phantom shows high spatial resolution (from 2.79 to 5.55 lp cm-1 for the parallel beam and from 3.03 to 5.15 lp cm-1 for the conical beam) and the reconstruction of the head phantom, although qualitative, shows high contrast in the gradient region. The proposed formulation of the optimization demonstrates serious potential to increase the spatial resolution (up by 65 % ) in proton radiography and greatly accelerate proton computed tomography reconstruction.
A maximum likelihood method for high resolution proton radiography/proton CT.
Collins-Fekete, Charles-Antoine; Brousmiche, Sébastien; Portillo, Stephen K N; Beaulieu, Luc; Seco, Joao
2016-12-07
Multiple Coulomb scattering (MCS) is the largest contributor to blurring in proton imaging. In this work, we developed a maximum likelihood least squares estimator that improves proton radiography's spatial resolution. The water equivalent thickness (WET) through projections defined from the source to the detector pixels were estimated such that they maximizes the likelihood of the energy loss of every proton crossing the volume. The length spent in each projection was calculated through the optimized cubic spline path estimate. The proton radiographies were produced using Geant4 simulations. Three phantoms were studied here: a slanted cube in a tank of water to measure 2D spatial resolution, a voxelized head phantom for clinical performance evaluation as well as a parametric Catphan phantom (CTP528) for 3D spatial resolution. Two proton beam configurations were used: a parallel and a conical beam. Proton beams of 200 and 330 MeV were simulated to acquire the radiography. Spatial resolution is increased from 2.44 lp cm -1 to 4.53 lp cm -1 in the 200 MeV beam and from 3.49 lp cm -1 to 5.76 lp cm -1 in the 330 MeV beam. Beam configurations do not affect the reconstructed spatial resolution as investigated between a radiography acquired with the parallel (3.49 lp cm -1 to 5.76 lp cm -1 ) or conical beam (from 3.49 lp cm -1 to 5.56 lp cm -1 ). The improved images were then used as input in a photon tomography algorithm. The proton CT reconstruction of the Catphan phantom shows high spatial resolution (from 2.79 to 5.55 lp cm -1 for the parallel beam and from 3.03 to 5.15 lp cm -1 for the conical beam) and the reconstruction of the head phantom, although qualitative, shows high contrast in the gradient region. The proposed formulation of the optimization demonstrates serious potential to increase the spatial resolution (up by 65[Formula: see text]) in proton radiography and greatly accelerate proton computed tomography reconstruction.
NASA Astrophysics Data System (ADS)
Chen, Siyue; Leung, Henry; Dondo, Maxwell
2014-05-01
As computer network security threats increase, many organizations implement multiple Network Intrusion Detection Systems (NIDS) to maximize the likelihood of intrusion detection and provide a comprehensive understanding of intrusion activities. However, NIDS trigger a massive number of alerts on a daily basis. This can be overwhelming for computer network security analysts since it is a slow and tedious process to manually analyse each alert produced. Thus, automated and intelligent clustering of alerts is important to reveal the structural correlation of events by grouping alerts with common features. As the nature of computer network attacks, and therefore alerts, is not known in advance, unsupervised alert clustering is a promising approach to achieve this goal. We propose a joint optimization technique for feature selection and clustering to aggregate similar alerts and to reduce the number of alerts that analysts have to handle individually. More precisely, each identified feature is assigned a binary value, which reflects the feature's saliency. This value is treated as a hidden variable and incorporated into a likelihood function for clustering. Since computing the optimal solution of the likelihood function directly is analytically intractable, we use the Expectation-Maximisation (EM) algorithm to iteratively update the hidden variable and use it to maximize the expected likelihood. Our empirical results, using a labelled Defense Advanced Research Projects Agency (DARPA) 2000 reference dataset, show that the proposed method gives better results than the EM clustering without feature selection in terms of the clustering accuracy.
Algorithms of maximum likelihood data clustering with applications
NASA Astrophysics Data System (ADS)
Giada, Lorenzo; Marsili, Matteo
2002-12-01
We address the problem of data clustering by introducing an unsupervised, parameter-free approach based on maximum likelihood principle. Starting from the observation that data sets belonging to the same cluster share a common information, we construct an expression for the likelihood of any possible cluster structure. The likelihood in turn depends only on the Pearson's coefficient of the data. We discuss clustering algorithms that provide a fast and reliable approximation to maximum likelihood configurations. Compared to standard clustering methods, our approach has the advantages that (i) it is parameter free, (ii) the number of clusters need not be fixed in advance and (iii) the interpretation of the results is transparent. In order to test our approach and compare it with standard clustering algorithms, we analyze two very different data sets: time series of financial market returns and gene expression data. We find that different maximization algorithms produce similar cluster structures whereas the outcome of standard algorithms has a much wider variability.
Profile-Likelihood Approach for Estimating Generalized Linear Mixed Models with Factor Structures
ERIC Educational Resources Information Center
Jeon, Minjeong; Rabe-Hesketh, Sophia
2012-01-01
In this article, the authors suggest a profile-likelihood approach for estimating complex models by maximum likelihood (ML) using standard software and minimal programming. The method works whenever setting some of the parameters of the model to known constants turns the model into a standard model. An important class of models that can be…
Multiple-Hit Parameter Estimation in Monolithic Detectors
Barrett, Harrison H.; Lewellen, Tom K.; Miyaoka, Robert S.
2014-01-01
We examine a maximum-a-posteriori method for estimating the primary interaction position of gamma rays with multiple interaction sites (hits) in a monolithic detector. In assessing the performance of a multiple-hit estimator over that of a conventional one-hit estimator, we consider a few different detector and readout configurations of a 50-mm-wide square cerium-doped lutetium oxyorthosilicate block. For this study, we use simulated data from SCOUT, a Monte-Carlo tool for photon tracking and modeling scintillation- camera output. With this tool, we determine estimate bias and variance for a multiple-hit estimator and compare these with similar metrics for a one-hit maximum-likelihood estimator, which assumes full energy deposition in one hit. We also examine the effect of event filtering on these metrics; for this purpose, we use a likelihood threshold to reject signals that are not likely to have been produced under the assumed likelihood model. Depending on detector design, we observe a 1%–12% improvement of intrinsic resolution for a 1-or-2-hit estimator as compared with a 1-hit estimator. We also observe improved differentiation of photopeak events using a 1-or-2-hit estimator as compared with the 1-hit estimator; more than 6% of photopeak events that were rejected by likelihood filtering for the 1-hit estimator were accurately identified as photopeak events and positioned without loss of resolution by a 1-or-2-hit estimator; for PET, this equates to at least a 12% improvement in coincidence-detection efficiency with likelihood filtering applied. PMID:23193231
Unified framework to evaluate panmixia and migration direction among multiple sampling locations.
Beerli, Peter; Palczewski, Michal
2010-05-01
For many biological investigations, groups of individuals are genetically sampled from several geographic locations. These sampling locations often do not reflect the genetic population structure. We describe a framework using marginal likelihoods to compare and order structured population models, such as testing whether the sampling locations belong to the same randomly mating population or comparing unidirectional and multidirectional gene flow models. In the context of inferences employing Markov chain Monte Carlo methods, the accuracy of the marginal likelihoods depends heavily on the approximation method used to calculate the marginal likelihood. Two methods, modified thermodynamic integration and a stabilized harmonic mean estimator, are compared. With finite Markov chain Monte Carlo run lengths, the harmonic mean estimator may not be consistent. Thermodynamic integration, in contrast, delivers considerably better estimates of the marginal likelihood. The choice of prior distributions does not influence the order and choice of the better models when the marginal likelihood is estimated using thermodynamic integration, whereas with the harmonic mean estimator the influence of the prior is pronounced and the order of the models changes. The approximation of marginal likelihood using thermodynamic integration in MIGRATE allows the evaluation of complex population genetic models, not only of whether sampling locations belong to a single panmictic population, but also of competing complex structured population models.
Profile-likelihood Confidence Intervals in Item Response Theory Models.
Chalmers, R Philip; Pek, Jolynn; Liu, Yang
2017-01-01
Confidence intervals (CIs) are fundamental inferential devices which quantify the sampling variability of parameter estimates. In item response theory, CIs have been primarily obtained from large-sample Wald-type approaches based on standard error estimates, derived from the observed or expected information matrix, after parameters have been estimated via maximum likelihood. An alternative approach to constructing CIs is to quantify sampling variability directly from the likelihood function with a technique known as profile-likelihood confidence intervals (PL CIs). In this article, we introduce PL CIs for item response theory models, compare PL CIs to classical large-sample Wald-type CIs, and demonstrate important distinctions among these CIs. CIs are then constructed for parameters directly estimated in the specified model and for transformed parameters which are often obtained post-estimation. Monte Carlo simulation results suggest that PL CIs perform consistently better than Wald-type CIs for both non-transformed and transformed parameters.
Expected versus Observed Information in SEM with Incomplete Normal and Nonnormal Data
ERIC Educational Resources Information Center
Savalei, Victoria
2010-01-01
Maximum likelihood is the most common estimation method in structural equation modeling. Standard errors for maximum likelihood estimates are obtained from the associated information matrix, which can be estimated from the sample using either expected or observed information. It is known that, with complete data, estimates based on observed or…
NASA Technical Reports Server (NTRS)
Peters, B. C., Jr.; Walker, H. F.
1978-01-01
This paper addresses the problem of obtaining numerically maximum-likelihood estimates of the parameters for a mixture of normal distributions. In recent literature, a certain successive-approximations procedure, based on the likelihood equations, was shown empirically to be effective in numerically approximating such maximum-likelihood estimates; however, the reliability of this procedure was not established theoretically. Here, we introduce a general iterative procedure, of the generalized steepest-ascent (deflected-gradient) type, which is just the procedure known in the literature when the step-size is taken to be 1. We show that, with probability 1 as the sample size grows large, this procedure converges locally to the strongly consistent maximum-likelihood estimate whenever the step-size lies between 0 and 2. We also show that the step-size which yields optimal local convergence rates for large samples is determined in a sense by the 'separation' of the component normal densities and is bounded below by a number between 1 and 2.
NASA Technical Reports Server (NTRS)
Peters, B. C., Jr.; Walker, H. F.
1976-01-01
The problem of obtaining numerically maximum likelihood estimates of the parameters for a mixture of normal distributions is addressed. In recent literature, a certain successive approximations procedure, based on the likelihood equations, is shown empirically to be effective in numerically approximating such maximum-likelihood estimates; however, the reliability of this procedure was not established theoretically. Here, a general iterative procedure is introduced, of the generalized steepest-ascent (deflected-gradient) type, which is just the procedure known in the literature when the step-size is taken to be 1. With probability 1 as the sample size grows large, it is shown that this procedure converges locally to the strongly consistent maximum-likelihood estimate whenever the step-size lies between 0 and 2. The step-size which yields optimal local convergence rates for large samples is determined in a sense by the separation of the component normal densities and is bounded below by a number between 1 and 2.
Bootstrap Standard Errors for Maximum Likelihood Ability Estimates When Item Parameters Are Unknown
ERIC Educational Resources Information Center
Patton, Jeffrey M.; Cheng, Ying; Yuan, Ke-Hai; Diao, Qi
2014-01-01
When item parameter estimates are used to estimate the ability parameter in item response models, the standard error (SE) of the ability estimate must be corrected to reflect the error carried over from item calibration. For maximum likelihood (ML) ability estimates, a corrected asymptotic SE is available, but it requires a long test and the…
NASA Astrophysics Data System (ADS)
Sutawanir
2015-12-01
Mortality tables play important role in actuarial studies such as life annuities, premium determination, premium reserve, valuation pension plan, pension funding. Some known mortality tables are CSO mortality table, Indonesian Mortality Table, Bowers mortality table, Japan Mortality table. For actuary applications some tables are constructed with different environment such as single decrement, double decrement, and multiple decrement. There exist two approaches in mortality table construction : mathematics approach and statistical approach. Distribution model and estimation theory are the statistical concepts that are used in mortality table construction. This article aims to discuss the statistical approach in mortality table construction. The distributional assumptions are uniform death distribution (UDD) and constant force (exponential). Moment estimation and maximum likelihood are used to estimate the mortality parameter. Moment estimation methods are easier to manipulate compared to maximum likelihood estimation (mle). However, the complete mortality data are not used in moment estimation method. Maximum likelihood exploited all available information in mortality estimation. Some mle equations are complicated and solved using numerical methods. The article focus on single decrement estimation using moment and maximum likelihood estimation. Some extension to double decrement will introduced. Simple dataset will be used to illustrated the mortality estimation, and mortality table.
Improving and Evaluating Nested Sampling Algorithm for Marginal Likelihood Estimation
NASA Astrophysics Data System (ADS)
Ye, M.; Zeng, X.; Wu, J.; Wang, D.; Liu, J.
2016-12-01
With the growing impacts of climate change and human activities on the cycle of water resources, an increasing number of researches focus on the quantification of modeling uncertainty. Bayesian model averaging (BMA) provides a popular framework for quantifying conceptual model and parameter uncertainty. The ensemble prediction is generated by combining each plausible model's prediction, and each model is attached with a model weight which is determined by model's prior weight and marginal likelihood. Thus, the estimation of model's marginal likelihood is crucial for reliable and accurate BMA prediction. Nested sampling estimator (NSE) is a new proposed method for marginal likelihood estimation. The process of NSE is accomplished by searching the parameters' space from low likelihood area to high likelihood area gradually, and this evolution is finished iteratively via local sampling procedure. Thus, the efficiency of NSE is dominated by the strength of local sampling procedure. Currently, Metropolis-Hasting (M-H) algorithm is often used for local sampling. However, M-H is not an efficient sampling algorithm for high-dimensional or complicated parameter space. For improving the efficiency of NSE, it could be ideal to incorporate the robust and efficient sampling algorithm - DREAMzs into the local sampling of NSE. The comparison results demonstrated that the improved NSE could improve the efficiency of marginal likelihood estimation significantly. However, both improved and original NSEs suffer from heavy instability. In addition, the heavy computation cost of huge number of model executions is overcome by using an adaptive sparse grid surrogates.
NASA Technical Reports Server (NTRS)
Thadani, S. G.
1977-01-01
The Maximum Likelihood Estimation of Signature Transformation (MLEST) algorithm is used to obtain maximum likelihood estimates (MLE) of affine transformation. The algorithm has been evaluated for three sets of data: simulated (training and recognition segment pairs), consecutive-day (data gathered from Landsat images), and geographical-extension (large-area crop inventory experiment) data sets. For each set, MLEST signature extension runs were made to determine MLE values and the affine-transformed training segment signatures were used to classify the recognition segments. The classification results were used to estimate wheat proportions at 0 and 1% threshold values.
Maximum-likelihood soft-decision decoding of block codes using the A* algorithm
NASA Technical Reports Server (NTRS)
Ekroot, L.; Dolinar, S.
1994-01-01
The A* algorithm finds the path in a finite depth binary tree that optimizes a function. Here, it is applied to maximum-likelihood soft-decision decoding of block codes where the function optimized over the codewords is the likelihood function of the received sequence given each codeword. The algorithm considers codewords one bit at a time, making use of the most reliable received symbols first and pursuing only the partially expanded codewords that might be maximally likely. A version of the A* algorithm for maximum-likelihood decoding of block codes has been implemented for block codes up to 64 bits in length. The efficiency of this algorithm makes simulations of codes up to length 64 feasible. This article details the implementation currently in use, compares the decoding complexity with that of exhaustive search and Viterbi decoding algorithms, and presents performance curves obtained with this implementation of the A* algorithm for several codes.
ERIC Educational Resources Information Center
Jones, Douglas H.
The progress of modern mental test theory depends very much on the techniques of maximum likelihood estimation, and many popular applications make use of likelihoods induced by logistic item response models. While, in reality, item responses are nonreplicate within a single examinee and the logistic models are only ideal, practitioners make…
Computation of nonparametric convex hazard estimators via profile methods.
Jankowski, Hanna K; Wellner, Jon A
2009-05-01
This paper proposes a profile likelihood algorithm to compute the nonparametric maximum likelihood estimator of a convex hazard function. The maximisation is performed in two steps: First the support reduction algorithm is used to maximise the likelihood over all hazard functions with a given point of minimum (or antimode). Then it is shown that the profile (or partially maximised) likelihood is quasi-concave as a function of the antimode, so that a bisection algorithm can be applied to find the maximum of the profile likelihood, and hence also the global maximum. The new algorithm is illustrated using both artificial and real data, including lifetime data for Canadian males and females.
ERIC Educational Resources Information Center
Penfield, Randall D.; Bergeron, Jennifer M.
2005-01-01
This article applies a weighted maximum likelihood (WML) latent trait estimator to the generalized partial credit model (GPCM). The relevant equations required to obtain the WML estimator using the Newton-Raphson algorithm are presented, and a simulation study is described that compared the properties of the WML estimator to those of the maximum…
Rate of convergence of k-step Newton estimators to efficient likelihood estimators
Steve Verrill
2007-01-01
We make use of Cramer conditions together with the well-known local quadratic convergence of Newton?s method to establish the asymptotic closeness of k-step Newton estimators to efficient likelihood estimators. In Verrill and Johnson [2007. Confidence bounds and hypothesis tests for normal distribution coefficients of variation. USDA Forest Products Laboratory Research...
Evaluating low pass filters on SPECT reconstructed cardiac orientation estimation
NASA Astrophysics Data System (ADS)
Dwivedi, Shekhar
2009-02-01
Low pass filters can affect the quality of clinical SPECT images by smoothing. Appropriate filter and parameter selection leads to optimum smoothing that leads to a better quantification followed by correct diagnosis and accurate interpretation by the physician. This study aims at evaluating the low pass filters on SPECT reconstruction algorithms. Criteria for evaluating the filters are estimating the SPECT reconstructed cardiac azimuth and elevation angle. Low pass filters studied are butterworth, gaussian, hamming, hanning and parzen. Experiments are conducted using three reconstruction algorithms, FBP (filtered back projection), MLEM (maximum likelihood expectation maximization) and OSEM (ordered subsets expectation maximization), on four gated cardiac patient projections (two patients with stress and rest projections). Each filter is applied with varying cutoff and order for each reconstruction algorithm (only butterworth used for MLEM and OSEM). The azimuth and elevation angles are calculated from the reconstructed volume and the variation observed in the angles with varying filter parameters is reported. Our results demonstrate that behavior of hamming, hanning and parzen filter (used with FBP) with varying cutoff is similar for all the datasets. Butterworth filter (cutoff > 0.4) behaves in a similar fashion for all the datasets using all the algorithms whereas with OSEM for a cutoff < 0.4, it fails to generate cardiac orientation due to oversmoothing, and gives an unstable response with FBP and MLEM. This study on evaluating effect of low pass filter cutoff and order on cardiac orientation using three different reconstruction algorithms provides an interesting insight into optimal selection of filter parameters.
Can, Seda; van de Schoot, Rens; Hox, Joop
2015-06-01
Because variables may be correlated in the social and behavioral sciences, multicollinearity might be problematic. This study investigates the effect of collinearity manipulated in within and between levels of a two-level confirmatory factor analysis by Monte Carlo simulation. Furthermore, the influence of the size of the intraclass correlation coefficient (ICC) and estimation method; maximum likelihood estimation with robust chi-squares and standard errors and Bayesian estimation, on the convergence rate are investigated. The other variables of interest were rate of inadmissible solutions and the relative parameter and standard error bias on the between level. The results showed that inadmissible solutions were obtained when there was between level collinearity and the estimation method was maximum likelihood. In the within level multicollinearity condition, all of the solutions were admissible but the bias values were higher compared with the between level collinearity condition. Bayesian estimation appeared to be robust in obtaining admissible parameters but the relative bias was higher than for maximum likelihood estimation. Finally, as expected, high ICC produced less biased results compared to medium ICC conditions.
Valsecchi, M G; Silvestri, D; Sasieni, P
1996-12-30
We consider methodological problems in evaluating long-term survival in clinical trials. In particular we examine the use of several methods that extend the basic Cox regression analysis. In the presence of a long term observation, the proportional hazard (PH) assumption may easily be violated and a few long term survivors may have a large effect on parameter estimates. We consider both model selection and robust estimation in a data set of 474 ovarian cancer patients enrolled in a clinical trial and followed for between 7 and 12 years after randomization. Two diagnostic plots for assessing goodness-of-fit are introduced. One shows the variation in time of parameter estimates and is an alternative to PH checking based on time-dependent covariates. The other takes advantage of the martingale residual process in time to represent the lack of fit with a metric of the type 'observed minus expected' number of events. Robust estimation is carried out by maximizing a weighted partial likelihood which downweights the contribution to estimation of influential observations. This type of complementary analysis of long-term results of clinical studies is useful in assessing the soundness of the conclusions on treatment effect. In the example analysed here, the difference in survival between treatments was mostly confined to those individuals who survived at least two years beyond randomization.
An Improved Nested Sampling Algorithm for Model Selection and Assessment
NASA Astrophysics Data System (ADS)
Zeng, X.; Ye, M.; Wu, J.; WANG, D.
2017-12-01
Multimodel strategy is a general approach for treating model structure uncertainty in recent researches. The unknown groundwater system is represented by several plausible conceptual models. Each alternative conceptual model is attached with a weight which represents the possibility of this model. In Bayesian framework, the posterior model weight is computed as the product of model prior weight and marginal likelihood (or termed as model evidence). As a result, estimating marginal likelihoods is crucial for reliable model selection and assessment in multimodel analysis. Nested sampling estimator (NSE) is a new proposed algorithm for marginal likelihood estimation. The implementation of NSE comprises searching the parameters' space from low likelihood area to high likelihood area gradually, and this evolution is finished iteratively via local sampling procedure. Thus, the efficiency of NSE is dominated by the strength of local sampling procedure. Currently, Metropolis-Hasting (M-H) algorithm and its variants are often used for local sampling in NSE. However, M-H is not an efficient sampling algorithm for high-dimensional or complex likelihood function. For improving the performance of NSE, it could be feasible to integrate more efficient and elaborated sampling algorithm - DREAMzs into the local sampling. In addition, in order to overcome the computation burden problem of large quantity of repeating model executions in marginal likelihood estimation, an adaptive sparse grid stochastic collocation method is used to build the surrogates for original groundwater model.
Time series modeling by a regression approach based on a latent process.
Chamroukhi, Faicel; Samé, Allou; Govaert, Gérard; Aknin, Patrice
2009-01-01
Time series are used in many domains including finance, engineering, economics and bioinformatics generally to represent the change of a measurement over time. Modeling techniques may then be used to give a synthetic representation of such data. A new approach for time series modeling is proposed in this paper. It consists of a regression model incorporating a discrete hidden logistic process allowing for activating smoothly or abruptly different polynomial regression models. The model parameters are estimated by the maximum likelihood method performed by a dedicated Expectation Maximization (EM) algorithm. The M step of the EM algorithm uses a multi-class Iterative Reweighted Least-Squares (IRLS) algorithm to estimate the hidden process parameters. To evaluate the proposed approach, an experimental study on simulated data and real world data was performed using two alternative approaches: a heteroskedastic piecewise regression model using a global optimization algorithm based on dynamic programming, and a Hidden Markov Regression Model whose parameters are estimated by the Baum-Welch algorithm. Finally, in the context of the remote monitoring of components of the French railway infrastructure, and more particularly the switch mechanism, the proposed approach has been applied to modeling and classifying time series representing the condition measurements acquired during switch operations.
Statistical fusion of continuous labels: identification of cardiac landmarks
NASA Astrophysics Data System (ADS)
Xing, Fangxu; Soleimanifard, Sahar; Prince, Jerry L.; Landman, Bennett A.
2011-03-01
Image labeling is an essential task for evaluating and analyzing morphometric features in medical imaging data. Labels can be obtained by either human interaction or automated segmentation algorithms. However, both approaches for labeling suffer from inevitable error due to noise and artifact in the acquired data. The Simultaneous Truth And Performance Level Estimation (STAPLE) algorithm was developed to combine multiple rater decisions and simultaneously estimate unobserved true labels as well as each rater's level of performance (i.e., reliability). A generalization of STAPLE for the case of continuous-valued labels has also been proposed. In this paper, we first show that with the proposed Gaussian distribution assumption, this continuous STAPLE formulation yields equivalent likelihoods for the bias parameter, meaning that the bias parameter-one of the key performance indices-is actually indeterminate. We resolve this ambiguity by augmenting the STAPLE expectation maximization formulation to include a priori probabilities on the performance level parameters, which enables simultaneous, meaningful estimation of both the rater bias and variance performance measures. We evaluate and demonstrate the efficacy of this approach in simulations and also through a human rater experiment involving the identification the intersection points of the right ventricle to the left ventricle in CINE cardiac data.
Wang, Zhu; Ma, Shuangge; Wang, Ching-Yun
2015-09-01
In health services and outcome research, count outcomes are frequently encountered and often have a large proportion of zeros. The zero-inflated negative binomial (ZINB) regression model has important applications for this type of data. With many possible candidate risk factors, this paper proposes new variable selection methods for the ZINB model. We consider maximum likelihood function plus a penalty including the least absolute shrinkage and selection operator (LASSO), smoothly clipped absolute deviation (SCAD), and minimax concave penalty (MCP). An EM (expectation-maximization) algorithm is proposed for estimating the model parameters and conducting variable selection simultaneously. This algorithm consists of estimating penalized weighted negative binomial models and penalized logistic models via the coordinated descent algorithm. Furthermore, statistical properties including the standard error formulae are provided. A simulation study shows that the new algorithm not only has more accurate or at least comparable estimation, but also is more robust than the traditional stepwise variable selection. The proposed methods are applied to analyze the health care demand in Germany using the open-source R package mpath. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Statistical Fusion of Continuous Labels: Identification of Cardiac Landmarks.
Xing, Fangxu; Soleimanifard, Sahar; Prince, Jerry L; Landman, Bennett A
2011-01-01
Image labeling is an essential task for evaluating and analyzing morphometric features in medical imaging data. Labels can be obtained by either human interaction or automated segmentation algorithms. However, both approaches for labeling suffer from inevitable error due to noise and artifact in the acquired data. The Simultaneous Truth And Performance Level Estimation (STAPLE) algorithm was developed to combine multiple rater decisions and simultaneously estimate unobserved true labels as well as each rater's level of performance (i.e., reliability). A generalization of STAPLE for the case of continuous-valued labels has also been proposed. In this paper, we first show that with the proposed Gaussian distribution assumption, this continuous STAPLE formulation yields equivalent likelihoods for the bias parameter, meaning that the bias parameter-one of the key performance indices-is actually indeterminate. We resolve this ambiguity by augmenting the STAPLE expectation maximization formulation to include a priori probabilities on the performance level parameters, which enables simultaneous, meaningful estimation of both the rater bias and variance performance measures. We evaluate and demonstrate the efficacy of this approach in simulations and also through a human rater experiment involving the identification the intersection points of the right ventricle to the left ventricle in CINE cardiac data.
Durstewitz, Daniel
2017-06-01
The computational and cognitive properties of neural systems are often thought to be implemented in terms of their (stochastic) network dynamics. Hence, recovering the system dynamics from experimentally observed neuronal time series, like multiple single-unit recordings or neuroimaging data, is an important step toward understanding its computations. Ideally, one would not only seek a (lower-dimensional) state space representation of the dynamics, but would wish to have access to its statistical properties and their generative equations for in-depth analysis. Recurrent neural networks (RNNs) are a computationally powerful and dynamically universal formal framework which has been extensively studied from both the computational and the dynamical systems perspective. Here we develop a semi-analytical maximum-likelihood estimation scheme for piecewise-linear RNNs (PLRNNs) within the statistical framework of state space models, which accounts for noise in both the underlying latent dynamics and the observation process. The Expectation-Maximization algorithm is used to infer the latent state distribution, through a global Laplace approximation, and the PLRNN parameters iteratively. After validating the procedure on toy examples, and using inference through particle filters for comparison, the approach is applied to multiple single-unit recordings from the rodent anterior cingulate cortex (ACC) obtained during performance of a classical working memory task, delayed alternation. Models estimated from kernel-smoothed spike time data were able to capture the essential computational dynamics underlying task performance, including stimulus-selective delay activity. The estimated models were rarely multi-stable, however, but rather were tuned to exhibit slow dynamics in the vicinity of a bifurcation point. In summary, the present work advances a semi-analytical (thus reasonably fast) maximum-likelihood estimation framework for PLRNNs that may enable to recover relevant aspects of the nonlinear dynamics underlying observed neuronal time series, and directly link these to computational properties.
Multiple robustness in factorized likelihood models.
Molina, J; Rotnitzky, A; Sued, M; Robins, J M
2017-09-01
We consider inference under a nonparametric or semiparametric model with likelihood that factorizes as the product of two or more variation-independent factors. We are interested in a finite-dimensional parameter that depends on only one of the likelihood factors and whose estimation requires the auxiliary estimation of one or several nuisance functions. We investigate general structures conducive to the construction of so-called multiply robust estimating functions, whose computation requires postulating several dimension-reducing models but which have mean zero at the true parameter value provided one of these models is correct.
ERIC Educational Resources Information Center
Wollack, James A.; Bolt, Daniel M.; Cohen, Allan S.; Lee, Young-Sun
2002-01-01
Compared the quality of item parameter estimates for marginal maximum likelihood (MML) and Markov Chain Monte Carlo (MCMC) with the nominal response model using simulation. The quality of item parameter recovery was nearly identical for MML and MCMC, and both methods tended to produce good estimates. (SLD)
Heesakkers, John; Gerretsen, Reza; Izeta, Ander; Sievert, Karl-Dietrich; Farag, Fawzy
2016-02-01
The diagnosis of intrinsic sphincter deficiency (ISD) in patients with stress urinary incontinence (SUI) is not well established. We explored the possibility of applying a new tool: minimally invasive circumferential sphincter surface electromyography (CSS-EMG) to assess the muscular integrity of the urethral sphincter in patients with SUI/ISD. CSS-EMG of the urethral sphincter and urodynamic studies were performed in 44 women with SUI. A urethral pressure profile (UPP) was measured in four directions. Maximal urethral closure pressure (MUCP) <40 cm/H2 O or the presence of SUI without urethral hypermobility was used to define ISD. Twenty-one patients had urodynamic SUI, 23 had no SUI and 12 patients had ISD. The mean average rectified value (ARV) of the motor unit action potential (MUAP), an indicator of the strength of urethral rhabdosphincter, was estimated. ARV measured in the 12 o'clock quadrant during maximal contraction was the only CSS-EMG parameter that had significant predictive value for ISD. With an increase in the 12 o'clock ARV value, the likelihood of ISD decreases (Odds Ratio 0.36 95% confidence interval 0.67-0.92). In the ROC curve with ARV measured in the 12 o'clock quadrant during maximal contraction, the explained area was 0.794 (P = 0.02); implying that ARV measured at the 12 o'clock quadrant during maximal contraction was able to predict ISD significantly. Myogenic changes of the urethral sphincter that contribute to ISD can be assessed with CSS-EMG. This new concept for assessing the functionality of the female urethral sphincter may assist with better understanding of the pathophysiology, the diagnosis and the treatment of SUI. © 2014 Wiley Periodicals, Inc.
Williams, Robert C; Elston, Robert C; Kumar, Pankaj; Knowler, William C; Abboud, Hanna E; Adler, Sharon; Bowden, Donald W; Divers, Jasmin; Freedman, Barry I; Igo, Robert P; Ipp, Eli; Iyengar, Sudha K; Kimmel, Paul L; Klag, Michael J; Kohn, Orly; Langefeld, Carl D; Leehey, David J; Nelson, Robert G; Nicholas, Susanne B; Pahl, Madeleine V; Parekh, Rulan S; Rotter, Jerome I; Schelling, Jeffrey R; Sedor, John R; Shah, Vallabh O; Smith, Michael W; Taylor, Kent D; Thameem, Farook; Thornley-Brown, Denyse; Winkler, Cheryl A; Guo, Xiuqing; Zager, Phillip; Hanson, Robert L
2016-05-04
The presence of population structure in a sample may confound the search for important genetic loci associated with disease. Our four samples in the Family Investigation of Nephropathy and Diabetes (FIND), European Americans, Mexican Americans, African Americans, and American Indians are part of a genome- wide association study in which population structure might be particularly important. We therefore decided to study in detail one component of this, individual genetic ancestry (IGA). From SNPs present on the Affymetrix 6.0 Human SNP array, we identified 3 sets of ancestry informative markers (AIMs), each maximized for the information in one the three contrasts among ancestral populations: Europeans (HAPMAP, CEU), Africans (HAPMAP, YRI and LWK), and Native Americans (full heritage Pima Indians). We estimate IGA and present an algorithm for their standard errors, compare IGA to principal components, emphasize the importance of balancing information in the ancestry informative markers (AIMs), and test the association of IGA with diabetic nephropathy in the combined sample. A fixed parental allele maximum likelihood algorithm was applied to the FIND to estimate IGA in four samples: 869 American Indians; 1385 African Americans; 1451 Mexican Americans; and 826 European Americans. When the information in the AIMs is unbalanced, the estimates are incorrect with large error. Individual genetic admixture is highly correlated with principle components for capturing population structure. It takes ~700 SNPs to reduce the average standard error of individual admixture below 0.01. When the samples are combined, the resulting population structure creates associations between IGA and diabetic nephropathy. The identified set of AIMs, which include American Indian parental allele frequencies, may be particularly useful for estimating genetic admixture in populations from the Americas. Failure to balance information in maximum likelihood, poly-ancestry models creates biased estimates of individual admixture with large error. This also occurs when estimating IGA using the Bayesian clustering method as implemented in the program STRUCTURE. Odds ratios for the associations of IGA with disease are consistent with what is known about the incidence and prevalence of diabetic nephropathy in these populations.
NASA Astrophysics Data System (ADS)
Lee, H.; Sheen, D.; Kim, S.
2013-12-01
The b-value in Gutenberg-Richter relation is an important parameter widely used not only in the interpretation of regional tectonic structure but in the seismic hazard analysis. In this study, we tested four methods for estimating the stable b-value in a small number of events using Monte-Carlo method. One is the Least-Squares method (LSM) which minimizes the observation error. Others are based on the Maximum Likelihood method (MLM) which maximizes the likelihood function: Utsu's (1965) method for continuous magnitudes and an infinite maximum magnitude, Page's (1968) for continuous magnitudes and a finite maximum magnitude, and Weichert's (1980) for interval magnitude and a finite maximum magnitude. A synthetic parent population of the earthquake catalog of million events from magnitude 2.0 to 7.0 with interval of 0.1 was generated for the Monte-Carlo simulation. The sample, the number of which was increased from 25 to 1000, was extracted from the parent population randomly. The resampling procedure was applied 1000 times with different random seed numbers. The mean and the standard deviation of the b-value were estimated for each sample group that has the same number of samples. As expected, the more samples were used, the more stable b-value was obtained. However, in a small number of events, the LSM gave generally low b-value with a large standard deviation while other MLMs gave more accurate and stable values. It was found that Utsu (1965) gives the most accurate and stable b-value even in a small number of events. It was also found that the selection of the minimum magnitude could be critical for estimating the correct b-value for Utsu's (1965) method and Page's (1968) if magnitudes were binned into an interval. Therefore, we applied Utsu (1965) to estimate the b-value using two instrumental earthquake catalogs, which have events occurred around the southern part of the Korean Peninsula from 1978 to 2011. By a careful choice of the minimum magnitude, the b-values of the earthquake catalogs of the Korea Meteorological Administration and Kim (2012) are estimated to be 0.72 and 0.74, respectively.
Statistical methods for biodosimetry in the presence of both Berkson and classical measurement error
NASA Astrophysics Data System (ADS)
Miller, Austin
In radiation epidemiology, the true dose received by those exposed cannot be assessed directly. Physical dosimetry uses a deterministic function of the source term, distance and shielding to estimate dose. For the atomic bomb survivors, the physical dosimetry system is well established. The classical measurement errors plaguing the location and shielding inputs to the physical dosimetry system are well known. Adjusting for the associated biases requires an estimate for the classical measurement error variance, for which no data-driven estimate exists. In this case, an instrumental variable solution is the most viable option to overcome the classical measurement error indeterminacy. Biological indicators of dose may serve as instrumental variables. Specification of the biodosimeter dose-response model requires identification of the radiosensitivity variables, for which we develop statistical definitions and variables. More recently, researchers have recognized Berkson error in the dose estimates, introduced by averaging assumptions for many components in the physical dosimetry system. We show that Berkson error induces a bias in the instrumental variable estimate of the dose-response coefficient, and then address the estimation problem. This model is specified by developing an instrumental variable mixed measurement error likelihood function, which is then maximized using a Monte Carlo EM Algorithm. These methods produce dose estimates that incorporate information from both physical and biological indicators of dose, as well as the first instrumental variable based data-driven estimate for the classical measurement error variance.
Liu, Kevin; Warnow, Tandy J; Holder, Mark T; Nelesen, Serita M; Yu, Jiaye; Stamatakis, Alexandros P; Linder, C Randal
2012-01-01
Highly accurate estimation of phylogenetic trees for large data sets is difficult, in part because multiple sequence alignments must be accurate for phylogeny estimation methods to be accurate. Coestimation of alignments and trees has been attempted but currently only SATé estimates reasonably accurate trees and alignments for large data sets in practical time frames (Liu K., Raghavan S., Nelesen S., Linder C.R., Warnow T. 2009b. Rapid and accurate large-scale coestimation of sequence alignments and phylogenetic trees. Science. 324:1561-1564). Here, we present a modification to the original SATé algorithm that improves upon SATé (which we now call SATé-I) in terms of speed and of phylogenetic and alignment accuracy. SATé-II uses a different divide-and-conquer strategy than SATé-I and so produces smaller more closely related subsets than SATé-I; as a result, SATé-II produces more accurate alignments and trees, can analyze larger data sets, and runs more efficiently than SATé-I. Generally, SATé is a metamethod that takes an existing multiple sequence alignment method as an input parameter and boosts the quality of that alignment method. SATé-II-boosted alignment methods are significantly more accurate than their unboosted versions, and trees based upon these improved alignments are more accurate than trees based upon the original alignments. Because SATé-I used maximum likelihood (ML) methods that treat gaps as missing data to estimate trees and because we found a correlation between the quality of tree/alignment pairs and ML scores, we explored the degree to which SATé's performance depends on using ML with gaps treated as missing data to determine the best tree/alignment pair. We present two lines of evidence that using ML with gaps treated as missing data to optimize the alignment and tree produces very poor results. First, we show that the optimization problem where a set of unaligned DNA sequences is given and the output is the tree and alignment of those sequences that maximize likelihood under the Jukes-Cantor model is uninformative in the worst possible sense. For all inputs, all trees optimize the likelihood score. Second, we show that a greedy heuristic that uses GTR+Gamma ML to optimize the alignment and the tree can produce very poor alignments and trees. Therefore, the excellent performance of SATé-II and SATé-I is not because ML is used as an optimization criterion for choosing the best tree/alignment pair but rather due to the particular divide-and-conquer realignment techniques employed.
Examining the effect of initialization strategies on the performance of Gaussian mixture modeling.
Shireman, Emilie; Steinley, Douglas; Brusco, Michael J
2017-02-01
Mixture modeling is a popular technique for identifying unobserved subpopulations (e.g., components) within a data set, with Gaussian (normal) mixture modeling being the form most widely used. Generally, the parameters of these Gaussian mixtures cannot be estimated in closed form, so estimates are typically obtained via an iterative process. The most common estimation procedure is maximum likelihood via the expectation-maximization (EM) algorithm. Like many approaches for identifying subpopulations, finite mixture modeling can suffer from locally optimal solutions, and the final parameter estimates are dependent on the initial starting values of the EM algorithm. Initial values have been shown to significantly impact the quality of the solution, and researchers have proposed several approaches for selecting the set of starting values. Five techniques for obtaining starting values that are implemented in popular software packages are compared. Their performances are assessed in terms of the following four measures: (1) the ability to find the best observed solution, (2) settling on a solution that classifies observations correctly, (3) the number of local solutions found by each technique, and (4) the speed at which the start values are obtained. On the basis of these results, a set of recommendations is provided to the user.
Dahabreh, Issa J; Trikalinos, Thomas A; Lau, Joseph; Schmid, Christopher H
2017-03-01
To compare statistical methods for meta-analysis of sensitivity and specificity of medical tests (e.g., diagnostic or screening tests). We constructed a database of PubMed-indexed meta-analyses of test performance from which 2 × 2 tables for each included study could be extracted. We reanalyzed the data using univariate and bivariate random effects models fit with inverse variance and maximum likelihood methods. Analyses were performed using both normal and binomial likelihoods to describe within-study variability. The bivariate model using the binomial likelihood was also fit using a fully Bayesian approach. We use two worked examples-thoracic computerized tomography to detect aortic injury and rapid prescreening of Papanicolaou smears to detect cytological abnormalities-to highlight that different meta-analysis approaches can produce different results. We also present results from reanalysis of 308 meta-analyses of sensitivity and specificity. Models using the normal approximation produced sensitivity and specificity estimates closer to 50% and smaller standard errors compared to models using the binomial likelihood; absolute differences of 5% or greater were observed in 12% and 5% of meta-analyses for sensitivity and specificity, respectively. Results from univariate and bivariate random effects models were similar, regardless of estimation method. Maximum likelihood and Bayesian methods produced almost identical summary estimates under the bivariate model; however, Bayesian analyses indicated greater uncertainty around those estimates. Bivariate models produced imprecise estimates of the between-study correlation of sensitivity and specificity. Differences between methods were larger with increasing proportion of studies that were small or required a continuity correction. The binomial likelihood should be used to model within-study variability. Univariate and bivariate models give similar estimates of the marginal distributions for sensitivity and specificity. Bayesian methods fully quantify uncertainty and their ability to incorporate external evidence may be useful for imprecisely estimated parameters. Copyright © 2017 Elsevier Inc. All rights reserved.
Maximum likelihood estimation for Cox's regression model under nested case-control sampling.
Scheike, Thomas H; Juul, Anders
2004-04-01
Nested case-control sampling is designed to reduce the costs of large cohort studies. It is important to estimate the parameters of interest as efficiently as possible. We present a new maximum likelihood estimator (MLE) for nested case-control sampling in the context of Cox's proportional hazards model. The MLE is computed by the EM-algorithm, which is easy to implement in the proportional hazards setting. Standard errors are estimated by a numerical profile likelihood approach based on EM aided differentiation. The work was motivated by a nested case-control study that hypothesized that insulin-like growth factor I was associated with ischemic heart disease. The study was based on a population of 3784 Danes and 231 cases of ischemic heart disease where controls were matched on age and gender. We illustrate the use of the MLE for these data and show how the maximum likelihood framework can be used to obtain information additional to the relative risk estimates of covariates.
Five Methods for Estimating Angoff Cut Scores with IRT
ERIC Educational Resources Information Center
Wyse, Adam E.
2017-01-01
This article illustrates five different methods for estimating Angoff cut scores using item response theory (IRT) models. These include maximum likelihood (ML), expected a priori (EAP), modal a priori (MAP), and weighted maximum likelihood (WML) estimators, as well as the most commonly used approach based on translating ratings through the test…
NASA Astrophysics Data System (ADS)
Mahaboob, B.; Venkateswarlu, B.; Sankar, J. Ravi; Balasiddamuni, P.
2017-11-01
This paper uses matrix calculus techniques to obtain Nonlinear Least Squares Estimator (NLSE), Maximum Likelihood Estimator (MLE) and Linear Pseudo model for nonlinear regression model. David Pollard and Peter Radchenko [1] explained analytic techniques to compute the NLSE. However the present research paper introduces an innovative method to compute the NLSE using principles in multivariate calculus. This study is concerned with very new optimization techniques used to compute MLE and NLSE. Anh [2] derived NLSE and MLE of a heteroscedatistic regression model. Lemcoff [3] discussed a procedure to get linear pseudo model for nonlinear regression model. In this research article a new technique is developed to get the linear pseudo model for nonlinear regression model using multivariate calculus. The linear pseudo model of Edmond Malinvaud [4] has been explained in a very different way in this paper. David Pollard et.al used empirical process techniques to study the asymptotic of the LSE (Least-squares estimation) for the fitting of nonlinear regression function in 2006. In Jae Myung [13] provided a go conceptual for Maximum likelihood estimation in his work “Tutorial on maximum likelihood estimation
Workload, Performance, and Reliability of Digital Computing Systems.
1980-12-01
maximized is then P()(tf, ...... , ad .[[. P(ts) h( tf1 ),) (0 42.33) Note that this problem is equivalent to minimazing the function En Jn ,l~~f,) (-X) 1...have been observed at times tf1 , after observing the system since tsl. Take, for instance, the case of the distribution obtained from the simplified...represented by a set of pairs [ts,, tf1 ] i 1 ...... n the maximum likelihood values of a,y,P are these values that maximize the function p(n)( tf1 -ts 1 . . ts
Explaining the effect of event valence on unrealistic optimism.
Gold, Ron S; Brown, Mark G
2009-05-01
People typically exhibit 'unrealistic optimism' (UO): they believe they have a lower chance of experiencing negative events and a higher chance of experiencing positive events than does the average person. UO has been found to be greater for negative than positive events. This 'valence effect' has been explained in terms of motivational processes. An alternative explanation is provided by the 'numerosity model', which views the valence effect simply as a by-product of a tendency for likelihood estimates pertaining to the average member of a group to increase with the size of the group. Predictions made by the numerosity model were tested in two studies. In each, UO for a single event was assessed. In Study 1 (n = 115 students), valence was manipulated by framing the event either negatively or positively, and participants estimated their own likelihood and that of the average student at their university. In Study 2 (n = 139 students), valence was again manipulated and participants again estimated their own likelihood; additionally, group size was manipulated by having participants estimate the likelihood of the average student in a small, medium-sized, or large group. In each study, the valence effect was found, but was due to an effect on estimates of own likelihood, not the average person's likelihood. In Study 2, valence did not interact with group size. The findings contradict the numerosity model, but are in accord with the motivational explanation. Implications for health education are discussed.
Kondo, Yumi; Zhao, Yinshan; Petkau, John
2017-05-30
Identification of treatment responders is a challenge in comparative studies where treatment efficacy is measured by multiple longitudinally collected continuous and count outcomes. Existing procedures often identify responders on the basis of only a single outcome. We propose a novel multiple longitudinal outcome mixture model that assumes that, conditionally on a cluster label, each longitudinal outcome is from a generalized linear mixed effect model. We utilize a Monte Carlo expectation-maximization algorithm to obtain the maximum likelihood estimates of our high-dimensional model and classify patients according to their estimated posterior probability of being a responder. We demonstrate the flexibility of our novel procedure on two multiple sclerosis clinical trial datasets with distinct data structures. Our simulation study shows that incorporating multiple outcomes improves the responder identification performance; this can occur even if some of the outcomes are ineffective. Our general procedure facilitates the identification of responders who are comprehensively defined by multiple outcomes from various distributions. Copyright © 2017 John Wiley & Sons, Ltd. Copyright © 2017 John Wiley & Sons, Ltd.
A flexible cure rate model with dependent censoring and a known cure threshold.
Bernhardt, Paul W
2016-11-10
We propose a flexible cure rate model that accommodates different censoring distributions for the cured and uncured groups and also allows for some individuals to be observed as cured when their survival time exceeds a known threshold. We model the survival times for the uncured group using an accelerated failure time model with errors distributed according to the seminonparametric distribution, potentially truncated at a known threshold. We suggest a straightforward extension of the usual expectation-maximization algorithm approach for obtaining estimates in cure rate models to accommodate the cure threshold and dependent censoring. We additionally suggest a likelihood ratio test for testing for the presence of dependent censoring in the proposed cure rate model. We show through numerical studies that our model has desirable properties and leads to approximately unbiased parameter estimates in a variety of scenarios. To demonstrate how our method performs in practice, we analyze data from a bone marrow transplantation study and a liver transplant study. Copyright © 2016 John Wiley & Sons, Ltd. Copyright © 2016 John Wiley & Sons, Ltd.
Adolescents' Sexual Behavior and Academic Attainment
ERIC Educational Resources Information Center
Frisco, Michelle L.
2008-01-01
High school students have high ambitions but do not always make choices that maximize their likelihood of educational success. This was the motivation for investigating the relationships between high school sexual behavior and two important milestones in academic attainment: earning a high school diploma and enrolling in distinct postsecondary…
76 FR 77452 - Advisory Circular for Stall and Stick Pusher Training
Federal Register 2010, 2011, 2012, 2013, 2014
2011-12-13
... a recovery from a stall or approach. Scenario-based training that includes realistic events that... training, testing, and checking recommendations designed to maximize the likelihood that pilots will... circular was developed based on a review of recommended practices developed by major aircraft manufacturers...
NASA Astrophysics Data System (ADS)
Kargoll, Boris; Omidalizarandi, Mohammad; Loth, Ina; Paffenholz, Jens-André; Alkhatib, Hamza
2018-03-01
In this paper, we investigate a linear regression time series model of possibly outlier-afflicted observations and autocorrelated random deviations. This colored noise is represented by a covariance-stationary autoregressive (AR) process, in which the independent error components follow a scaled (Student's) t-distribution. This error model allows for the stochastic modeling of multiple outliers and for an adaptive robust maximum likelihood (ML) estimation of the unknown regression and AR coefficients, the scale parameter, and the degree of freedom of the t-distribution. This approach is meant to be an extension of known estimators, which tend to focus only on the regression model, or on the AR error model, or on normally distributed errors. For the purpose of ML estimation, we derive an expectation conditional maximization either algorithm, which leads to an easy-to-implement version of iteratively reweighted least squares. The estimation performance of the algorithm is evaluated via Monte Carlo simulations for a Fourier as well as a spline model in connection with AR colored noise models of different orders and with three different sampling distributions generating the white noise components. We apply the algorithm to a vibration dataset recorded by a high-accuracy, single-axis accelerometer, focusing on the evaluation of the estimated AR colored noise model.
Genealogical Working Distributions for Bayesian Model Testing with Phylogenetic Uncertainty
Baele, Guy; Lemey, Philippe; Suchard, Marc A.
2016-01-01
Marginal likelihood estimates to compare models using Bayes factors frequently accompany Bayesian phylogenetic inference. Approaches to estimate marginal likelihoods have garnered increased attention over the past decade. In particular, the introduction of path sampling (PS) and stepping-stone sampling (SS) into Bayesian phylogenetics has tremendously improved the accuracy of model selection. These sampling techniques are now used to evaluate complex evolutionary and population genetic models on empirical data sets, but considerable computational demands hamper their widespread adoption. Further, when very diffuse, but proper priors are specified for model parameters, numerical issues complicate the exploration of the priors, a necessary step in marginal likelihood estimation using PS or SS. To avoid such instabilities, generalized SS (GSS) has recently been proposed, introducing the concept of “working distributions” to facilitate—or shorten—the integration process that underlies marginal likelihood estimation. However, the need to fix the tree topology currently limits GSS in a coalescent-based framework. Here, we extend GSS by relaxing the fixed underlying tree topology assumption. To this purpose, we introduce a “working” distribution on the space of genealogies, which enables estimating marginal likelihoods while accommodating phylogenetic uncertainty. We propose two different “working” distributions that help GSS to outperform PS and SS in terms of accuracy when comparing demographic and evolutionary models applied to synthetic data and real-world examples. Further, we show that the use of very diffuse priors can lead to a considerable overestimation in marginal likelihood when using PS and SS, while still retrieving the correct marginal likelihood using both GSS approaches. The methods used in this article are available in BEAST, a powerful user-friendly software package to perform Bayesian evolutionary analyses. PMID:26526428
ERIC Educational Resources Information Center
Ramenzoni, Veronica; Riley, Michael A.; Davis, Tehran; Shockley, Kevin; Armstrong, Rachel
2008-01-01
Three experiments investigated the ability to perceive the maximum height to which another actor could jump to reach an object. Experiment 1 determined the accuracy of estimates for another actor's maximal reach-with-jump height and compared these estimates to estimates of the actor's standing maximal reaching height and to estimates of the…
Nagelkerke, Nico; Fidler, Vaclav
2015-01-01
The problem of discrimination and classification is central to much of epidemiology. Here we consider the estimation of a logistic regression/discrimination function from training samples, when one of the training samples is subject to misclassification or mislabeling, e.g. diseased individuals are incorrectly classified/labeled as healthy controls. We show that this leads to zero-inflated binomial model with a defective logistic regression or discrimination function, whose parameters can be estimated using standard statistical methods such as maximum likelihood. These parameters can be used to estimate the probability of true group membership among those, possibly erroneously, classified as controls. Two examples are analyzed and discussed. A simulation study explores properties of the maximum likelihood parameter estimates and the estimates of the number of mislabeled observations.
NASA Technical Reports Server (NTRS)
Cash, W.
1979-01-01
Many problems in the experimental estimation of parameters for models can be solved through use of the likelihood ratio test. Applications of the likelihood ratio, with particular attention to photon counting experiments, are discussed. The procedures presented solve a greater range of problems than those currently in use, yet are no more difficult to apply. The procedures are proved analytically, and examples from current problems in astronomy are discussed.
Cosmic shear measurement with maximum likelihood and maximum a posteriori inference
NASA Astrophysics Data System (ADS)
Hall, Alex; Taylor, Andy
2017-06-01
We investigate the problem of noise bias in maximum likelihood and maximum a posteriori estimators for cosmic shear. We derive the leading and next-to-leading order biases and compute them in the context of galaxy ellipticity measurements, extending previous work on maximum likelihood inference for weak lensing. We show that a large part of the bias on these point estimators can be removed using information already contained in the likelihood when a galaxy model is specified, without the need for external calibration. We test these bias-corrected estimators on simulated galaxy images similar to those expected from planned space-based weak lensing surveys, with promising results. We find that the introduction of an intrinsic shape prior can help with mitigation of noise bias, such that the maximum a posteriori estimate can be made less biased than the maximum likelihood estimate. Second-order terms offer a check on the convergence of the estimators, but are largely subdominant. We show how biases propagate to shear estimates, demonstrating in our simple set-up that shear biases can be reduced by orders of magnitude and potentially to within the requirements of planned space-based surveys at mild signal-to-noise ratio. We find that second-order terms can exhibit significant cancellations at low signal-to-noise ratio when Gaussian noise is assumed, which has implications for inferring the performance of shear-measurement algorithms from simplified simulations. We discuss the viability of our point estimators as tools for lensing inference, arguing that they allow for the robust measurement of ellipticity and shear.
ERIC Educational Resources Information Center
Molenaar, Peter C. M.; Nesselroade, John R.
1998-01-01
Pseudo-Maximum Likelihood (p-ML) and Asymptotically Distribution Free (ADF) estimation methods for estimating dynamic factor model parameters within a covariance structure framework were compared through a Monte Carlo simulation. Both methods appear to give consistent model parameter estimates, but only ADF gives standard errors and chi-square…
Cosmological parameters from a re-analysis of the WMAP 7 year low-resolution maps
NASA Astrophysics Data System (ADS)
Finelli, F.; De Rosa, A.; Gruppuso, A.; Paoletti, D.
2013-06-01
Cosmological parameters from Wilkinson Microwave Anisotropy Probe (WMAP) 7 year data are re-analysed by substituting a pixel-based likelihood estimator to the one delivered publicly by the WMAP team. Our pixel-based estimator handles exactly intensity and polarization in a joint manner, allowing us to use low-resolution maps and noise covariance matrices in T, Q, U at the same resolution, which in this work is 3.6°. We describe the features and the performances of the code implementing our pixel-based likelihood estimator. We perform a battery of tests on the application of our pixel-based likelihood routine to WMAP publicly available low-resolution foreground-cleaned products, in combination with the WMAP high-ℓ likelihood, reporting the differences on cosmological parameters evaluated by the full WMAP likelihood public package. The differences are not only due to the treatment of polarization, but also to the marginalization over monopole and dipole uncertainties present in the WMAP pixel likelihood code for temperature. The credible central value for the cosmological parameters change below the 1σ level with respect to the evaluation by the full WMAP 7 year likelihood code, with the largest difference in a shift to smaller values of the scalar spectral index nS.
Identical twins in forensic genetics - Epidemiology and risk based estimation of weight of evidence.
Tvedebrink, Torben; Morling, Niels
2015-12-01
The increase in the number of forensic genetic loci used for identification purposes results in infinitesimal random match probabilities. These probabilities are computed under assumptions made for rather simple population genetic models. Often, the forensic expert reports likelihood ratios, where the alternative hypothesis is assumed not to encompass close relatives. However, this approach implies that important factors present in real human populations are discarded. This approach may be very unfavourable to the defendant. In this paper, we discuss some important aspects concerning the closest familial relationship, i.e., identical (monozygotic) twins, when reporting the weight of evidence. This can be done even when the suspect has no knowledge of an identical twin or when official records hold no twin information about the suspect. The derived expressions are not original as several authors previously have published results accounting for close familial relationships. However, we revisit the discussion to increase the awareness among forensic genetic practitioners and include new information on medical and societal factors to assess the risk of not considering a monozygotic twin as the true perpetrator. If accounting for a monozygotic twin in the weight of evidence, it implies that the likelihood ratio is truncated at a maximal value depending on the prevalence of monozygotic twins and the societal efficiency of recognising a monozygotic twin. If a monozygotic twin is considered as an alternative proposition, then data relevant for the Danish society suggests that the threshold of likelihood ratios should approximately be between 150,000 and 2,000,000 in order to take the risk of an unrecognised identical, monozygotic twin into consideration. In other societies, the threshold of the likelihood ratio in crime cases may reach other, often lower, values depending on the recognition of monozygotic twins and the age of the suspect. In general, more strictly kept registries will imply larger thresholds on the likelihood ratio as the monozygotic twin explanation gets less probable. Copyright © 2015 The Chartered Society of Forensic Sciences. Published by Elsevier Ireland Ltd. All rights reserved.
NASA Technical Reports Server (NTRS)
Grove, R. D.; Bowles, R. L.; Mayhew, S. C.
1972-01-01
A maximum likelihood parameter estimation procedure and program were developed for the extraction of the stability and control derivatives of aircraft from flight test data. Nonlinear six-degree-of-freedom equations describing aircraft dynamics were used to derive sensitivity equations for quasilinearization. The maximum likelihood function with quasilinearization was used to derive the parameter change equations, the covariance matrices for the parameters and measurement noise, and the performance index function. The maximum likelihood estimator was mechanized into an iterative estimation procedure utilizing a real time digital computer and graphic display system. This program was developed for 8 measured state variables and 40 parameters. Test cases were conducted with simulated data for validation of the estimation procedure and program. The program was applied to a V/STOL tilt wing aircraft, a military fighter airplane, and a light single engine airplane. The particular nonlinear equations of motion, derivation of the sensitivity equations, addition of accelerations into the algorithm, operational features of the real time digital system, and test cases are described.
Comparison of methods for H*(10) calculation from measured LaBr3(Ce) detector spectra.
Vargas, A; Cornejo, N; Camp, A
2018-07-01
The Universitat Politecnica de Catalunya (UPC) and the Centro de Investigaciones Energéticas, Medioambientales y Tecnológicas (CIEMAT) have evaluated methods based on stripping, conversion coefficients and Maximum Likelihood Estimation using Expectation Maximization (ML-EM) in calculating the H*(10) rates from photon pulse-height spectra acquired with a spectrometric LaBr 3 (Ce)(1.5″ × 1.5″) detector. There is a good agreement between results of the different H*(10) rate calculation methods using the spectra measured at the UPC secondary standard calibration laboratory in Barcelona. From the outdoor study at ESMERALDA station in Madrid, it can be concluded that the analysed methods provide results quite similar to those obtained with the reference RSS ionization chamber. In addition, the spectrometric detectors can also facilitate radionuclide identification. Copyright © 2018 Elsevier Ltd. All rights reserved.
Managing distribution changes in time series prediction
NASA Astrophysics Data System (ADS)
Matias, J. M.; Gonzalez-Manteiga, W.; Taboada, J.; Ordonez, C.
2006-07-01
When a problem is modeled statistically, a single distribution model is usually postulated that is assumed to be valid for the entire space. Nonetheless, this practice may be somewhat unrealistic in certain application areas, in which the conditions of the process that generates the data may change; as far as we are aware, however, no techniques have been developed to tackle this problem.This article proposes a technique for modeling and predicting this change in time series with a view to improving estimates and predictions. The technique is applied, among other models, to the hypernormal distribution recently proposed. When tested on real data from a range of stock market indices the technique produces better results that when a single distribution model is assumed to be valid for the entire period of time studied.Moreover, when a global model is postulated, it is highly recommended to select the hypernormal distribution parameter in the same likelihood maximization process.
Optimal thresholds for the estimation of area rain-rate moments by the threshold method
NASA Technical Reports Server (NTRS)
Short, David A.; Shimizu, Kunio; Kedem, Benjamin
1993-01-01
Optimization of the threshold method, achieved by determination of the threshold that maximizes the correlation between an area-average rain-rate moment and the area coverage of rain rates exceeding the threshold, is demonstrated empirically and theoretically. Empirical results for a sequence of GATE radar snapshots show optimal thresholds of 5 and 27 mm/h for the first and second moments, respectively. Theoretical optimization of the threshold method by the maximum-likelihood approach of Kedem and Pavlopoulos (1991) predicts optimal thresholds near 5 and 26 mm/h for lognormally distributed rain rates with GATE-like parameters. The agreement between theory and observations suggests that the optimal threshold can be understood as arising due to sampling variations, from snapshot to snapshot, of a parent rain-rate distribution. Optimal thresholds for gamma and inverse Gaussian distributions are also derived and compared.
A Hybrid of Deep Network and Hidden Markov Model for MCI Identification with Resting-State fMRI.
Suk, Heung-Il; Lee, Seong-Whan; Shen, Dinggang
2015-10-01
In this paper, we propose a novel method for modelling functional dynamics in resting-state fMRI (rs-fMRI) for Mild Cognitive Impairment (MCI) identification. Specifically, we devise a hybrid architecture by combining Deep Auto-Encoder (DAE) and Hidden Markov Model (HMM). The roles of DAE and HMM are, respectively, to discover hierarchical non-linear relations among features, by which we transform the original features into a lower dimension space, and to model dynamic characteristics inherent in rs-fMRI, i.e. , internal state changes. By building a generative model with HMMs for each class individually, we estimate the data likelihood of a test subject as MCI or normal healthy control, based on which we identify the clinical label. In our experiments, we achieved the maximal accuracy of 81.08% with the proposed method, outperforming state-of-the-art methods in the literature.
A Hybrid of Deep Network and Hidden Markov Model for MCI Identification with Resting-State fMRI
Suk, Heung-Il; Lee, Seong-Whan; Shen, Dinggang
2015-01-01
In this paper, we propose a novel method for modelling functional dynamics in resting-state fMRI (rs-fMRI) for Mild Cognitive Impairment (MCI) identification. Specifically, we devise a hybrid architecture by combining Deep Auto-Encoder (DAE) and Hidden Markov Model (HMM). The roles of DAE and HMM are, respectively, to discover hierarchical non-linear relations among features, by which we transform the original features into a lower dimension space, and to model dynamic characteristics inherent in rs-fMRI, i.e., internal state changes. By building a generative model with HMMs for each class individually, we estimate the data likelihood of a test subject as MCI or normal healthy control, based on which we identify the clinical label. In our experiments, we achieved the maximal accuracy of 81.08% with the proposed method, outperforming state-of-the-art methods in the literature. PMID:27054199
Hock, Sabrina; Hasenauer, Jan; Theis, Fabian J
2013-01-01
Diffusion is a key component of many biological processes such as chemotaxis, developmental differentiation and tissue morphogenesis. Since recently, the spatial gradients caused by diffusion can be assessed in-vitro and in-vivo using microscopy based imaging techniques. The resulting time-series of two dimensional, high-resolutions images in combination with mechanistic models enable the quantitative analysis of the underlying mechanisms. However, such a model-based analysis is still challenging due to measurement noise and sparse observations, which result in uncertainties of the model parameters. We introduce a likelihood function for image-based measurements with log-normal distributed noise. Based upon this likelihood function we formulate the maximum likelihood estimation problem, which is solved using PDE-constrained optimization methods. To assess the uncertainty and practical identifiability of the parameters we introduce profile likelihoods for diffusion processes. As proof of concept, we model certain aspects of the guidance of dendritic cells towards lymphatic vessels, an example for haptotaxis. Using a realistic set of artificial measurement data, we estimate the five kinetic parameters of this model and compute profile likelihoods. Our novel approach for the estimation of model parameters from image data as well as the proposed identifiability analysis approach is widely applicable to diffusion processes. The profile likelihood based method provides more rigorous uncertainty bounds in contrast to local approximation methods.
Some Small Sample Results for Maximum Likelihood Estimation in Multidimensional Scaling.
ERIC Educational Resources Information Center
Ramsay, J. O.
1980-01-01
Some aspects of the small sample behavior of maximum likelihood estimates in multidimensional scaling are investigated with Monte Carlo techniques. In particular, the chi square test for dimensionality is examined and a correction for bias is proposed and evaluated. (Author/JKS)
A spatially explicit capture-recapture estimator for single-catch traps.
Distiller, Greg; Borchers, David L
2015-11-01
Single-catch traps are frequently used in live-trapping studies of small mammals. Thus far, a likelihood for single-catch traps has proven elusive and usually the likelihood for multicatch traps is used for spatially explicit capture-recapture (SECR) analyses of such data. Previous work found the multicatch likelihood to provide a robust estimator of average density. We build on a recently developed continuous-time model for SECR to derive a likelihood for single-catch traps. We use this to develop an estimator based on observed capture times and compare its performance by simulation to that of the multicatch estimator for various scenarios with nonconstant density surfaces. While the multicatch estimator is found to be a surprisingly robust estimator of average density, its performance deteriorates with high trap saturation and increasing density gradients. Moreover, it is found to be a poor estimator of the height of the detection function. By contrast, the single-catch estimators of density, distribution, and detection function parameters are found to be unbiased or nearly unbiased in all scenarios considered. This gain comes at the cost of higher variance. If there is no interest in interpreting the detection function parameters themselves, and if density is expected to be fairly constant over the survey region, then the multicatch estimator performs well with single-catch traps. However if accurate estimation of the detection function is of interest, or if density is expected to vary substantially in space, then there is merit in using the single-catch estimator when trap saturation is above about 60%. The estimator's performance is improved if care is taken to place traps so as to span the range of variables that affect animal distribution. As a single-catch likelihood with unknown capture times remains intractable for now, researchers using single-catch traps should aim to incorporate timing devices with their traps.
Maintained Individual Data Distributed Likelihood Estimation (MIDDLE)
Boker, Steven M.; Brick, Timothy R.; Pritikin, Joshua N.; Wang, Yang; von Oertzen, Timo; Brown, Donald; Lach, John; Estabrook, Ryne; Hunter, Michael D.; Maes, Hermine H.; Neale, Michael C.
2015-01-01
Maintained Individual Data Distributed Likelihood Estimation (MIDDLE) is a novel paradigm for research in the behavioral, social, and health sciences. The MIDDLE approach is based on the seemingly-impossible idea that data can be privately maintained by participants and never revealed to researchers, while still enabling statistical models to be fit and scientific hypotheses tested. MIDDLE rests on the assumption that participant data should belong to, be controlled by, and remain in the possession of the participants themselves. Distributed likelihood estimation refers to fitting statistical models by sending an objective function and vector of parameters to each participants’ personal device (e.g., smartphone, tablet, computer), where the likelihood of that individual’s data is calculated locally. Only the likelihood value is returned to the central optimizer. The optimizer aggregates likelihood values from responding participants and chooses new vectors of parameters until the model converges. A MIDDLE study provides significantly greater privacy for participants, automatic management of opt-in and opt-out consent, lower cost for the researcher and funding institute, and faster determination of results. Furthermore, if a participant opts into several studies simultaneously and opts into data sharing, these studies automatically have access to individual-level longitudinal data linked across all studies. PMID:26717128
Zhan, Tingting; Chevoneva, Inna; Iglewicz, Boris
2010-01-01
The family of weighted likelihood estimators largely overlaps with minimum divergence estimators. They are robust to data contaminations compared to MLE. We define the class of generalized weighted likelihood estimators (GWLE), provide its influence function and discuss the efficiency requirements. We introduce a new truncated cubic-inverse weight, which is both first and second order efficient and more robust than previously reported weights. We also discuss new ways of selecting the smoothing bandwidth and weighted starting values for the iterative algorithm. The advantage of the truncated cubic-inverse weight is illustrated in a simulation study of three-components normal mixtures model with large overlaps and heavy contaminations. A real data example is also provided. PMID:20835375
Bayesian structural equation modeling in sport and exercise psychology.
Stenling, Andreas; Ivarsson, Andreas; Johnson, Urban; Lindwall, Magnus
2015-08-01
Bayesian statistics is on the rise in mainstream psychology, but applications in sport and exercise psychology research are scarce. In this article, the foundations of Bayesian analysis are introduced, and we will illustrate how to apply Bayesian structural equation modeling in a sport and exercise psychology setting. More specifically, we contrasted a confirmatory factor analysis on the Sport Motivation Scale II estimated with the most commonly used estimator, maximum likelihood, and a Bayesian approach with weakly informative priors for cross-loadings and correlated residuals. The results indicated that the model with Bayesian estimation and weakly informative priors provided a good fit to the data, whereas the model estimated with a maximum likelihood estimator did not produce a well-fitting model. The reasons for this discrepancy between maximum likelihood and Bayesian estimation are discussed as well as potential advantages and caveats with the Bayesian approach.
Generalized t-statistic for two-group classification.
Komori, Osamu; Eguchi, Shinto; Copas, John B
2015-06-01
In the classic discriminant model of two multivariate normal distributions with equal variance matrices, the linear discriminant function is optimal both in terms of the log likelihood ratio and in terms of maximizing the standardized difference (the t-statistic) between the means of the two distributions. In a typical case-control study, normality may be sensible for the control sample but heterogeneity and uncertainty in diagnosis may suggest that a more flexible model is needed for the cases. We generalize the t-statistic approach by finding the linear function which maximizes a standardized difference but with data from one of the groups (the cases) filtered by a possibly nonlinear function U. We study conditions for consistency of the method and find the function U which is optimal in the sense of asymptotic efficiency. Optimality may also extend to other measures of discriminatory efficiency such as the area under the receiver operating characteristic curve. The optimal function U depends on a scalar probability density function which can be estimated non-parametrically using a standard numerical algorithm. A lasso-like version for variable selection is implemented by adding L1-regularization to the generalized t-statistic. Two microarray data sets in the study of asthma and various cancers are used as motivating examples. © 2014, The International Biometric Society.
New estimates of the CMB angular power spectra from the WMAP 5 year low-resolution data
NASA Astrophysics Data System (ADS)
Gruppuso, A.; de Rosa, A.; Cabella, P.; Paci, F.; Finelli, F.; Natoli, P.; de Gasperis, G.; Mandolesi, N.
2009-11-01
A quadratic maximum likelihood (QML) estimator is applied to the Wilkinson Microwave Anisotropy Probe (WMAP) 5 year low-resolution maps to compute the cosmic microwave background angular power spectra (APS) at large scales for both temperature and polarization. Estimates and error bars for the six APS are provided up to l = 32 and compared, when possible, to those obtained by the WMAP team, without finding any inconsistency. The conditional likelihood slices are also computed for the Cl of all the six power spectra from l = 2 to 10 through a pixel-based likelihood code. Both the codes treat the covariance for (T, Q, U) in a single matrix without employing any approximation. The inputs of both the codes (foreground-reduced maps, related covariances and masks) are provided by the WMAP team. The peaks of the likelihood slices are always consistent with the QML estimates within the error bars; however, an excellent agreement occurs when the QML estimates are used as a fiducial power spectrum instead of the best-fitting theoretical power spectrum. By the full computation of the conditional likelihood on the estimated spectra, the value of the temperature quadrupole CTTl=2 is found to be less than 2σ away from the WMAP 5 year Λ cold dark matter best-fitting value. The BB spectrum is found to be well consistent with zero, and upper limits on the B modes are provided. The parity odd signals TB and EB are found to be consistent with zero.
Donato, David I.
2012-01-01
This report presents the mathematical expressions and the computational techniques required to compute maximum-likelihood estimates for the parameters of the National Descriptive Model of Mercury in Fish (NDMMF), a statistical model used to predict the concentration of methylmercury in fish tissue. The expressions and techniques reported here were prepared to support the development of custom software capable of computing NDMMF parameter estimates more quickly and using less computer memory than is currently possible with available general-purpose statistical software. Computation of maximum-likelihood estimates for the NDMMF by numerical solution of a system of simultaneous equations through repeated Newton-Raphson iterations is described. This report explains the derivation of the mathematical expressions required for computational parameter estimation in sufficient detail to facilitate future derivations for any revised versions of the NDMMF that may be developed.
Maximum Likelihood Estimation of Nonlinear Structural Equation Models.
ERIC Educational Resources Information Center
Lee, Sik-Yum; Zhu, Hong-Tu
2002-01-01
Developed an EM type algorithm for maximum likelihood estimation of a general nonlinear structural equation model in which the E-step is completed by a Metropolis-Hastings algorithm. Illustrated the methodology with results from a simulation study and two real examples using data from previous studies. (SLD)
A New Monte Carlo Method for Estimating Marginal Likelihoods.
Wang, Yu-Bo; Chen, Ming-Hui; Kuo, Lynn; Lewis, Paul O
2018-06-01
Evaluating the marginal likelihood in Bayesian analysis is essential for model selection. Estimators based on a single Markov chain Monte Carlo sample from the posterior distribution include the harmonic mean estimator and the inflated density ratio estimator. We propose a new class of Monte Carlo estimators based on this single Markov chain Monte Carlo sample. This class can be thought of as a generalization of the harmonic mean and inflated density ratio estimators using a partition weighted kernel (likelihood times prior). We show that our estimator is consistent and has better theoretical properties than the harmonic mean and inflated density ratio estimators. In addition, we provide guidelines on choosing optimal weights. Simulation studies were conducted to examine the empirical performance of the proposed estimator. We further demonstrate the desirable features of the proposed estimator with two real data sets: one is from a prostate cancer study using an ordinal probit regression model with latent variables; the other is for the power prior construction from two Eastern Cooperative Oncology Group phase III clinical trials using the cure rate survival model with similar objectives.
Aisbett, B; Le Rossignol, P
2003-09-01
The VO2-power regression and estimated total energy demand for a 6-minute supra-maximal exercise test was predicted from a continuous incremental exercise test. Sub-maximal VO2-power co-ordinates were established from the last 40 seconds (s) of 150-second exercise stages. The precision of the estimated total energy demand was determined using the 95% confidence interval (95% CI) of the estimated total energy demand. The linearity of the individual VO2-power regression equations was determined using Pearson's correlation coefficient. The mean 95% CI of the estimated total energy demand was 5.9 +/- 2.5 mL O2 Eq x kg(-1) x min(-1), and the mean correlation coefficient was 0.9942 +/- 0.0042. The current study contends that the sub-maximal VO2-power co-ordinates from a continuous incremental exercise test can be used to estimate supra-maximal energy demand without compromising the precision of the accumulated oxygen deficit (AOD) method.
Mixture Rasch Models with Joint Maximum Likelihood Estimation
ERIC Educational Resources Information Center
Willse, John T.
2011-01-01
This research provides a demonstration of the utility of mixture Rasch models. Specifically, a model capable of estimating a mixture partial credit model using joint maximum likelihood is presented. Like the partial credit model, the mixture partial credit model has the beneficial feature of being appropriate for analysis of assessment data…
Model uncertainty estimation and risk assessment is essential to environmental management and informed decision making on pollution mitigation strategies. In this study, we apply a probabilistic methodology, which combines Bayesian Monte Carlo simulation and Maximum Likelihood e...
The Effects of Model Misspecification and Sample Size on LISREL Maximum Likelihood Estimates.
ERIC Educational Resources Information Center
Baldwin, Beatrice
The robustness of LISREL computer program maximum likelihood estimates under specific conditions of model misspecification and sample size was examined. The population model used in this study contains one exogenous variable; three endogenous variables; and eight indicator variables, two for each latent variable. Conditions of model…
Likelihood-based confidence intervals for estimating floods with given return periods
NASA Astrophysics Data System (ADS)
Martins, Eduardo Sávio P. R.; Clarke, Robin T.
1993-06-01
This paper discusses aspects of the calculation of likelihood-based confidence intervals for T-year floods, with particular reference to (1) the two-parameter gamma distribution; (2) the Gumbel distribution; (3) the two-parameter log-normal distribution, and other distributions related to the normal by Box-Cox transformations. Calculation of the confidence limits is straightforward using the Nelder-Mead algorithm with a constraint incorporated, although care is necessary to ensure convergence either of the Nelder-Mead algorithm, or of the Newton-Raphson calculation of maximum-likelihood estimates. Methods are illustrated using records from 18 gauging stations in the basin of the River Itajai-Acu, State of Santa Catarina, southern Brazil. A small and restricted simulation compared likelihood-based confidence limits with those given by use of the central limit theorem; for the same confidence probability, the confidence limits of the simulation were wider than those of the central limit theorem, which failed more frequently to contain the true quantile being estimated. The paper discusses possible applications of likelihood-based confidence intervals in other areas of hydrological analysis.
3D image reconstruction algorithms for cryo-electron-microscopy images of virus particles
NASA Astrophysics Data System (ADS)
Doerschuk, Peter C.; Johnson, John E.
2000-11-01
A statistical model for the object and the complete image formation process in cryo electron microscopy of viruses is presented. Using this model, maximum likelihood reconstructions of the 3D structure of viruses are computed using the expectation maximization algorithm and an example based on Cowpea mosaic virus is provided.
ERIC Educational Resources Information Center
Weissman, Alexander
2013-01-01
Convergence of the expectation-maximization (EM) algorithm to a global optimum of the marginal log likelihood function for unconstrained latent variable models with categorical indicators is presented. The sufficient conditions under which global convergence of the EM algorithm is attainable are provided in an information-theoretic context by…
Toward a New Diversity: Guidelines for a Staff Diversity/Affirmative Action Plan.
ERIC Educational Resources Information Center
California Community Colleges, Sacramento. Office of the Chancellor.
These guidelines for California's community colleges specify required elements of a staff diversity/affirmative action plan, recommend sound practices and activities that will maximize the likelihood of success, and provide information on, and required elements of, related issues such as sexual harassment, handicap discrimination, and AIDS in the…
The striking similarities between standard, distractor-free, and target-free recognition
Dobbins, Ian G.
2012-01-01
It is often assumed that observers seek to maximize correct responding during recognition testing by actively adjusting a decision criterion. However, early research by Wallace (Journal of Experimental Psychology: Human Learning and Memory 4:441–452, 1978) suggested that recognition rates for studied items remained similar, regardless of whether or not the tests contained distractor items. We extended these findings across three experiments, addressing whether detection rates or observer confidence changed when participants were presented standard tests (targets and distractors) versus “pure-list” tests (lists composed entirely of targets or distractors). Even when observers were made aware of the composition of the pure-list test, the endorsement rates and confidence patterns remained largely similar to those observed during standard testing, suggesting that observers are typically not striving to maximize the likelihood of success across the test. We discuss the implications for decision models that assume a likelihood ratio versus a strength decision axis, as well as the implications for prior findings demonstrating large criterion shifts using target probability manipulations. PMID:21476108
Approximate, computationally efficient online learning in Bayesian spiking neurons.
Kuhlmann, Levin; Hauser-Raspe, Michael; Manton, Jonathan H; Grayden, David B; Tapson, Jonathan; van Schaik, André
2014-03-01
Bayesian spiking neurons (BSNs) provide a probabilistic interpretation of how neurons perform inference and learning. Online learning in BSNs typically involves parameter estimation based on maximum-likelihood expectation-maximization (ML-EM) which is computationally slow and limits the potential of studying networks of BSNs. An online learning algorithm, fast learning (FL), is presented that is more computationally efficient than the benchmark ML-EM for a fixed number of time steps as the number of inputs to a BSN increases (e.g., 16.5 times faster run times for 20 inputs). Although ML-EM appears to converge 2.0 to 3.6 times faster than FL, the computational cost of ML-EM means that ML-EM takes longer to simulate to convergence than FL. FL also provides reasonable convergence performance that is robust to initialization of parameter estimates that are far from the true parameter values. However, parameter estimation depends on the range of true parameter values. Nevertheless, for a physiologically meaningful range of parameter values, FL gives very good average estimation accuracy, despite its approximate nature. The FL algorithm therefore provides an efficient tool, complementary to ML-EM, for exploring BSN networks in more detail in order to better understand their biological relevance. Moreover, the simplicity of the FL algorithm means it can be easily implemented in neuromorphic VLSI such that one can take advantage of the energy-efficient spike coding of BSNs.
Chan, Siew Foong; Deeks, Jonathan J; Macaskill, Petra; Irwig, Les
2008-01-01
To compare three predictive models based on logistic regression to estimate adjusted likelihood ratios allowing for interdependency between diagnostic variables (tests). This study was a review of the theoretical basis, assumptions, and limitations of published models; and a statistical extension of methods and application to a case study of the diagnosis of obstructive airways disease based on history and clinical examination. Albert's method includes an offset term to estimate an adjusted likelihood ratio for combinations of tests. Spiegelhalter and Knill-Jones method uses the unadjusted likelihood ratio for each test as a predictor and computes shrinkage factors to allow for interdependence. Knottnerus' method differs from the other methods because it requires sequencing of tests, which limits its application to situations where there are few tests and substantial data. Although parameter estimates differed between the models, predicted "posttest" probabilities were generally similar. Construction of predictive models using logistic regression is preferred to the independence Bayes' approach when it is important to adjust for dependency of tests errors. Methods to estimate adjusted likelihood ratios from predictive models should be considered in preference to a standard logistic regression model to facilitate ease of interpretation and application. Albert's method provides the most straightforward approach.
WE-H-207A-06: Hypoxia Quantification in Static PET Images: The Signal in the Noise
DOE Office of Scientific and Technical Information (OSTI.GOV)
Keller, H; Yeung, I; Milosevic, M
2016-06-15
Purpose: Quantification of hypoxia from PET images is of considerable clinical interest. In the absence of dynamic PET imaging the hypoxic fraction (HF) of a tumor has to be estimated from voxel values of activity concentration of a radioactive hypoxia tracer. This work is part of an effort to standardize quantification of tumor hypoxic fraction from PET images. Methods: A simple hypoxia imaging model in the tumor was developed. The distribution of the tracer activity was described as the sum of two different probability distributions, one for the normoxic (and necrotic), the other for the hypoxic voxels. The widths ofmore » the distributions arise due to variability of the transport, tumor tissue inhomogeneity, tracer binding kinetics, and due to PET image noise. Quantification of HF was performed for various levels of variability using two different methodologies: a) classification thresholds between normoxic and hypoxic voxels based on a non-hypoxic surrogate (muscle), and b) estimation of the (posterior) probability distributions based on maximizing likelihood optimization that does not require a surrogate. Data from the hypoxia imaging model and from 27 cervical cancer patients enrolled in a FAZA PET study were analyzed. Results: In the model, where the true value of HF is known, thresholds usually underestimate the value for large variability. For the patients, a significant uncertainty of the HF values (an average intra-patient range of 17%) was caused by spatial non-uniformity of image noise which is a hallmark of all PET images. Maximum likelihood estimation (MLE) is able to directly optimize for the weights of both distributions, however, may suffer from poor optimization convergence. For some patients, MLE-based HF values showed significant differences to threshold-based HF-values. Conclusion: HF-values depend critically on the magnitude of the different sources of tracer uptake variability. A measure of confidence should also be reported.« less
Ting, Chih-Chung; Yu, Chia-Chen; Maloney, Laurence T.
2015-01-01
In Bayesian decision theory, knowledge about the probabilities of possible outcomes is captured by a prior distribution and a likelihood function. The prior reflects past knowledge and the likelihood summarizes current sensory information. The two combined (integrated) form a posterior distribution that allows estimation of the probability of different possible outcomes. In this study, we investigated the neural mechanisms underlying Bayesian integration using a novel lottery decision task in which both prior knowledge and likelihood information about reward probability were systematically manipulated on a trial-by-trial basis. Consistent with Bayesian integration, as sample size increased, subjects tended to weigh likelihood information more compared with prior information. Using fMRI in humans, we found that the medial prefrontal cortex (mPFC) correlated with the mean of the posterior distribution, a statistic that reflects the integration of prior knowledge and likelihood of reward probability. Subsequent analysis revealed that both prior and likelihood information were represented in mPFC and that the neural representations of prior and likelihood in mPFC reflected changes in the behaviorally estimated weights assigned to these different sources of information in response to changes in the environment. Together, these results establish the role of mPFC in prior-likelihood integration and highlight its involvement in representing and integrating these distinct sources of information. PMID:25632152
Morris, Katherine Ann; Deterding, Nicole M
2016-09-01
Social networks offer important emotional and instrumental support following natural disasters. However, displacement may geographically disperse network members, making it difficult to provide and receive support necessary for psychological recovery after trauma. We examine the association between distance to network members and post-traumatic stress using survey data, and identify potential mechanisms underlying this association using in-depth qualitative interviews. We use longitudinal, mixed-methods data from the Resilience in Survivors of Katrina (RISK) Project to capture the long-term effects of Hurricane Katrina on low-income mothers from New Orleans. Baseline surveys occurred approximately one year before the storm and follow-up surveys and in-depth interviews were conducted five years later. We use a sequential explanatory analytic design. With logistic regression, we estimate the association of geographic network dispersion with the likelihood of post-traumatic stress. With linear regressions, we estimate the association of network dispersion with the three post-traumatic stress sub-scales. Using maximal variation sampling, we use qualitative interview data to elaborate identified statistical associations. We find network dispersion is positively associated with the likelihood of post-traumatic stress, controlling for individual-level socio-demographic characteristics, exposure to hurricane-related trauma, perceived social support, and New Orleans residency. We identify two social-psychological mechanisms present in qualitative data: respondents with distant network members report a lack of deep belonging and a lack of mattering as they are unable to fulfill obligations to important distant ties. Results indicate the importance of physical proximity to emotionally-intimate network ties for long-term psychological recovery. Copyright © 2016 Elsevier Ltd. All rights reserved.
heterogeneous mixture distributions for multi-source extreme rainfall
NASA Astrophysics Data System (ADS)
Ouarda, T.; Shin, J.; Lee, T. S.
2013-12-01
Mixture distributions have been used to model hydro-meteorological variables showing mixture distributional characteristics, e.g. bimodality. Homogeneous mixture (HOM) distributions (e.g. Normal-Normal and Gumbel-Gumbel) have been traditionally applied to hydro-meteorological variables. However, there is no reason to restrict the mixture distribution as the combination of one identical type. It might be beneficial to characterize the statistical behavior of hydro-meteorological variables from the application of heterogeneous mixture (HTM) distributions such as Normal-Gamma. In the present work, we focus on assessing the suitability of HTM distributions for the frequency analysis of hydro-meteorological variables. In the present work, in order to estimate the parameters of HTM distributions, the meta-heuristic algorithm (Genetic Algorithm) is employed to maximize the likelihood function. In the present study, a number of distributions are compared, including the Gamma-Extreme value type-one (EV1) HTM distribution, the EV1-EV1 HOM distribution, and EV1 distribution. The proposed distribution models are applied to the annual maximum precipitation data in South Korea. The Akaike Information Criterion (AIC), the root mean squared errors (RMSE) and the log-likelihood are used as measures of goodness-of-fit of the tested distributions. Results indicate that the HTM distribution (Gamma-EV1) presents the best fitness. The HTM distribution shows significant improvement in the estimation of quantiles corresponding to the 20-year return period. It is shown that extreme rainfall in the coastal region of South Korea presents strong heterogeneous mixture distributional characteristics. Results indicate that HTM distributions are a good alternative for the frequency analysis of hydro-meteorological variables when disparate statistical characteristics are presented.
NASA Technical Reports Server (NTRS)
Chittineni, C. B.
1979-01-01
The problem of estimating label imperfections and the use of the estimation in identifying mislabeled patterns is presented. Expressions for the maximum likelihood estimates of classification errors and a priori probabilities are derived from the classification of a set of labeled patterns. Expressions also are given for the asymptotic variances of probability of correct classification and proportions. Simple models are developed for imperfections in the labels and for classification errors and are used in the formulation of a maximum likelihood estimation scheme. Schemes are presented for the identification of mislabeled patterns in terms of threshold on the discriminant functions for both two-class and multiclass cases. Expressions are derived for the probability that the imperfect label identification scheme will result in a wrong decision and are used in computing thresholds. The results of practical applications of these techniques in the processing of remotely sensed multispectral data are presented.
ERIC Educational Resources Information Center
Beauducel, Andre; Herzberg, Philipp Yorck
2006-01-01
This simulation study compared maximum likelihood (ML) estimation with weighted least squares means and variance adjusted (WLSMV) estimation. The study was based on confirmatory factor analyses with 1, 2, 4, and 8 factors, based on 250, 500, 750, and 1,000 cases, and on 5, 10, 20, and 40 variables with 2, 3, 4, 5, and 6 categories. There was no…
A maximum pseudo-profile likelihood estimator for the Cox model under length-biased sampling
Huang, Chiung-Yu; Qin, Jing; Follmann, Dean A.
2012-01-01
This paper considers semiparametric estimation of the Cox proportional hazards model for right-censored and length-biased data arising from prevalent sampling. To exploit the special structure of length-biased sampling, we propose a maximum pseudo-profile likelihood estimator, which can handle time-dependent covariates and is consistent under covariate-dependent censoring. Simulation studies show that the proposed estimator is more efficient than its competitors. A data analysis illustrates the methods and theory. PMID:23843659
SU-C-207A-01: A Novel Maximum Likelihood Method for High-Resolution Proton Radiography/proton CT
DOE Office of Scientific and Technical Information (OSTI.GOV)
Collins-Fekete, C; Centre Hospitalier University de Quebec, Quebec, QC; Mass General Hospital
2016-06-15
Purpose: Multiple Coulomb scattering is the largest contributor to blurring in proton imaging. Here we tested a maximum likelihood least squares estimator (MLLSE) to improve the spatial resolution of proton radiography (pRad) and proton computed tomography (pCT). Methods: The object is discretized into voxels and the average relative stopping power through voxel columns defined from the source to the detector pixels is optimized such that it maximizes the likelihood of the proton energy loss. The length spent by individual protons in each column is calculated through an optimized cubic spline estimate. pRad images were first produced using Geant4 simulations. Anmore » anthropomorphic head phantom and the Catphan line-pair module for 3-D spatial resolution were studied and resulting images were analyzed. Both parallel and conical beam have been investigated for simulated pRad acquisition. Then, experimental data of a pediatric head phantom (CIRS) were acquired using a recently completed experimental pCT scanner. Specific filters were applied on proton angle and energy loss data to remove proton histories that underwent nuclear interactions. The MTF10% (lp/mm) was used to evaluate and compare spatial resolution. Results: Numerical simulations showed improvement in the pRad spatial resolution for the parallel (2.75 to 6.71 lp/cm) and conical beam (3.08 to 5.83 lp/cm) reconstructed with the MLLSE compared to averaging detector pixel signals. For full tomographic reconstruction, the improved pRad were used as input into a simultaneous algebraic reconstruction algorithm. The Catphan pCT reconstruction based on the MLLSE-enhanced projection showed spatial resolution improvement for the parallel (2.83 to 5.86 lp/cm) and conical beam (3.03 to 5.15 lp/cm). The anthropomorphic head pCT displayed important contrast gains in high-gradient regions. Experimental results also demonstrated significant improvement in spatial resolution of the pediatric head radiography. Conclusion: The proposed MLLSE shows promising potential to increase the spatial resolution (up to 244%) in proton imaging.« less
ERIC Educational Resources Information Center
Klein, Andreas G.; Muthen, Bengt O.
2007-01-01
In this article, a nonlinear structural equation model is introduced and a quasi-maximum likelihood method for simultaneous estimation and testing of multiple nonlinear effects is developed. The focus of the new methodology lies on efficiency, robustness, and computational practicability. Monte-Carlo studies indicate that the method is highly…
Likelihood-Based Confidence Intervals in Exploratory Factor Analysis
ERIC Educational Resources Information Center
Oort, Frans J.
2011-01-01
In exploratory or unrestricted factor analysis, all factor loadings are free to be estimated. In oblique solutions, the correlations between common factors are free to be estimated as well. The purpose of this article is to show how likelihood-based confidence intervals can be obtained for rotated factor loadings and factor correlations, by…
Estimation of Complex Generalized Linear Mixed Models for Measurement and Growth
ERIC Educational Resources Information Center
Jeon, Minjeong
2012-01-01
Maximum likelihood (ML) estimation of generalized linear mixed models (GLMMs) is technically challenging because of the intractable likelihoods that involve high dimensional integrations over random effects. The problem is magnified when the random effects have a crossed design and thus the data cannot be reduced to small independent clusters. A…
ERIC Educational Resources Information Center
Adank, Patti
2012-01-01
The role of speech production mechanisms in difficult speech comprehension is the subject of on-going debate in speech science. Two Activation Likelihood Estimation (ALE) analyses were conducted on neuroimaging studies investigating difficult speech comprehension or speech production. Meta-analysis 1 included 10 studies contrasting comprehension…
He, Xin; Frey, Eric C
2006-08-01
Previously, we have developed a decision model for three-class receiver operating characteristic (ROC) analysis based on decision theory. The proposed decision model maximizes the expected decision utility under the assumption that incorrect decisions have equal utilities under the same hypothesis (equal error utility assumption). This assumption reduced the dimensionality of the "general" three-class ROC analysis and provided a practical figure-of-merit to evaluate the three-class task performance. However, it also limits the generality of the resulting model because the equal error utility assumption will not apply for all clinical three-class decision tasks. The goal of this study was to investigate the optimality of the proposed three-class decision model with respect to several other decision criteria. In particular, besides the maximum expected utility (MEU) criterion used in the previous study, we investigated the maximum-correctness (MC) (or minimum-error), maximum likelihood (ML), and Nyman-Pearson (N-P) criteria. We found that by making assumptions for both MEU and N-P criteria, all decision criteria lead to the previously-proposed three-class decision model. As a result, this model maximizes the expected utility under the equal error utility assumption, maximizes the probability of making correct decisions, satisfies the N-P criterion in the sense that it maximizes the sensitivity of one class given the sensitivities of the other two classes, and the resulting ROC surface contains the maximum likelihood decision operating point. While the proposed three-class ROC analysis model is not optimal in the general sense due to the use of the equal error utility assumption, the range of criteria for which it is optimal increases its applicability for evaluating and comparing a range of diagnostic systems.
Baele, Guy; Lemey, Philippe; Vansteelandt, Stijn
2013-03-06
Accurate model comparison requires extensive computation times, especially for parameter-rich models of sequence evolution. In the Bayesian framework, model selection is typically performed through the evaluation of a Bayes factor, the ratio of two marginal likelihoods (one for each model). Recently introduced techniques to estimate (log) marginal likelihoods, such as path sampling and stepping-stone sampling, offer increased accuracy over the traditional harmonic mean estimator at an increased computational cost. Most often, each model's marginal likelihood will be estimated individually, which leads the resulting Bayes factor to suffer from errors associated with each of these independent estimation processes. We here assess the original 'model-switch' path sampling approach for direct Bayes factor estimation in phylogenetics, as well as an extension that uses more samples, to construct a direct path between two competing models, thereby eliminating the need to calculate each model's marginal likelihood independently. Further, we provide a competing Bayes factor estimator using an adaptation of the recently introduced stepping-stone sampling algorithm and set out to determine appropriate settings for accurately calculating such Bayes factors, with context-dependent evolutionary models as an example. While we show that modest efforts are required to roughly identify the increase in model fit, only drastically increased computation times ensure the accuracy needed to detect more subtle details of the evolutionary process. We show that our adaptation of stepping-stone sampling for direct Bayes factor calculation outperforms the original path sampling approach as well as an extension that exploits more samples. Our proposed approach for Bayes factor estimation also has preferable statistical properties over the use of individual marginal likelihood estimates for both models under comparison. Assuming a sigmoid function to determine the path between two competing models, we provide evidence that a single well-chosen sigmoid shape value requires less computational efforts in order to approximate the true value of the (log) Bayes factor compared to the original approach. We show that the (log) Bayes factors calculated using path sampling and stepping-stone sampling differ drastically from those estimated using either of the harmonic mean estimators, supporting earlier claims that the latter systematically overestimate the performance of high-dimensional models, which we show can lead to erroneous conclusions. Based on our results, we argue that highly accurate estimation of differences in model fit for high-dimensional models requires much more computational effort than suggested in recent studies on marginal likelihood estimation.
2013-01-01
Background Accurate model comparison requires extensive computation times, especially for parameter-rich models of sequence evolution. In the Bayesian framework, model selection is typically performed through the evaluation of a Bayes factor, the ratio of two marginal likelihoods (one for each model). Recently introduced techniques to estimate (log) marginal likelihoods, such as path sampling and stepping-stone sampling, offer increased accuracy over the traditional harmonic mean estimator at an increased computational cost. Most often, each model’s marginal likelihood will be estimated individually, which leads the resulting Bayes factor to suffer from errors associated with each of these independent estimation processes. Results We here assess the original ‘model-switch’ path sampling approach for direct Bayes factor estimation in phylogenetics, as well as an extension that uses more samples, to construct a direct path between two competing models, thereby eliminating the need to calculate each model’s marginal likelihood independently. Further, we provide a competing Bayes factor estimator using an adaptation of the recently introduced stepping-stone sampling algorithm and set out to determine appropriate settings for accurately calculating such Bayes factors, with context-dependent evolutionary models as an example. While we show that modest efforts are required to roughly identify the increase in model fit, only drastically increased computation times ensure the accuracy needed to detect more subtle details of the evolutionary process. Conclusions We show that our adaptation of stepping-stone sampling for direct Bayes factor calculation outperforms the original path sampling approach as well as an extension that exploits more samples. Our proposed approach for Bayes factor estimation also has preferable statistical properties over the use of individual marginal likelihood estimates for both models under comparison. Assuming a sigmoid function to determine the path between two competing models, we provide evidence that a single well-chosen sigmoid shape value requires less computational efforts in order to approximate the true value of the (log) Bayes factor compared to the original approach. We show that the (log) Bayes factors calculated using path sampling and stepping-stone sampling differ drastically from those estimated using either of the harmonic mean estimators, supporting earlier claims that the latter systematically overestimate the performance of high-dimensional models, which we show can lead to erroneous conclusions. Based on our results, we argue that highly accurate estimation of differences in model fit for high-dimensional models requires much more computational effort than suggested in recent studies on marginal likelihood estimation. PMID:23497171
Approximated maximum likelihood estimation in multifractal random walks
NASA Astrophysics Data System (ADS)
Løvsletten, O.; Rypdal, M.
2012-04-01
We present an approximated maximum likelihood method for the multifractal random walk processes of [E. Bacry , Phys. Rev. EPLEEE81539-375510.1103/PhysRevE.64.026103 64, 026103 (2001)]. The likelihood is computed using a Laplace approximation and a truncation in the dependency structure for the latent volatility. The procedure is implemented as a package in the r computer language. Its performance is tested on synthetic data and compared to an inference approach based on the generalized method of moments. The method is applied to estimate parameters for various financial stock indices.
Estimation After a Group Sequential Trial.
Milanzi, Elasma; Molenberghs, Geert; Alonso, Ariel; Kenward, Michael G; Tsiatis, Anastasios A; Davidian, Marie; Verbeke, Geert
2015-10-01
Group sequential trials are one important instance of studies for which the sample size is not fixed a priori but rather takes one of a finite set of pre-specified values, dependent on the observed data. Much work has been devoted to the inferential consequences of this design feature. Molenberghs et al (2012) and Milanzi et al (2012) reviewed and extended the existing literature, focusing on a collection of seemingly disparate, but related, settings, namely completely random sample sizes, group sequential studies with deterministic and random stopping rules, incomplete data, and random cluster sizes. They showed that the ordinary sample average is a viable option for estimation following a group sequential trial, for a wide class of stopping rules and for random outcomes with a distribution in the exponential family. Their results are somewhat surprising in the sense that the sample average is not optimal, and further, there does not exist an optimal, or even, unbiased linear estimator. However, the sample average is asymptotically unbiased, both conditionally upon the observed sample size as well as marginalized over it. By exploiting ignorability they showed that the sample average is the conventional maximum likelihood estimator. They also showed that a conditional maximum likelihood estimator is finite sample unbiased, but is less efficient than the sample average and has the larger mean squared error. Asymptotically, the sample average and the conditional maximum likelihood estimator are equivalent. This previous work is restricted, however, to the situation in which the the random sample size can take only two values, N = n or N = 2 n . In this paper, we consider the more practically useful setting of sample sizes in a the finite set { n 1 , n 2 , …, n L }. It is shown that the sample average is then a justifiable estimator , in the sense that it follows from joint likelihood estimation, and it is consistent and asymptotically unbiased. We also show why simulations can give the false impression of bias in the sample average when considered conditional upon the sample size. The consequence is that no corrections need to be made to estimators following sequential trials. When small-sample bias is of concern, the conditional likelihood estimator provides a relatively straightforward modification to the sample average. Finally, it is shown that classical likelihood-based standard errors and confidence intervals can be applied, obviating the need for technical corrections.
Two models for evaluating landslide hazards
Davis, J.C.; Chung, C.-J.; Ohlmacher, G.C.
2006-01-01
Two alternative procedures for estimating landslide hazards were evaluated using data on topographic digital elevation models (DEMs) and bedrock lithologies in an area adjacent to the Missouri River in Atchison County, Kansas, USA. The two procedures are based on the likelihood ratio model but utilize different assumptions. The empirical likelihood ratio model is based on non-parametric empirical univariate frequency distribution functions under an assumption of conditional independence while the multivariate logistic discriminant model assumes that likelihood ratios can be expressed in terms of logistic functions. The relative hazards of occurrence of landslides were estimated by an empirical likelihood ratio model and by multivariate logistic discriminant analysis. Predictor variables consisted of grids containing topographic elevations, slope angles, and slope aspects calculated from a 30-m DEM. An integer grid of coded bedrock lithologies taken from digitized geologic maps was also used as a predictor variable. Both statistical models yield relative estimates in the form of the proportion of total map area predicted to already contain or to be the site of future landslides. The stabilities of estimates were checked by cross-validation of results from random subsamples, using each of the two procedures. Cell-by-cell comparisons of hazard maps made by the two models show that the two sets of estimates are virtually identical. This suggests that the empirical likelihood ratio and the logistic discriminant analysis models are robust with respect to the conditional independent assumption and the logistic function assumption, respectively, and that either model can be used successfully to evaluate landslide hazards. ?? 2006.
Li, Xiang; Kuk, Anthony Y C; Xu, Jinfeng
2014-12-10
Human biomonitoring of exposure to environmental chemicals is important. Individual monitoring is not viable because of low individual exposure level or insufficient volume of materials and the prohibitive cost of taking measurements from many subjects. Pooling of samples is an efficient and cost-effective way to collect data. Estimation is, however, complicated as individual values within each pool are not observed but are only known up to their average or weighted average. The distribution of such averages is intractable when the individual measurements are lognormally distributed, which is a common assumption. We propose to replace the intractable distribution of the pool averages by a Gaussian likelihood to obtain parameter estimates. If the pool size is large, this method produces statistically efficient estimates, but regardless of pool size, the method yields consistent estimates as the number of pools increases. An empirical Bayes (EB) Gaussian likelihood approach, as well as its Bayesian analog, is developed to pool information from various demographic groups by using a mixed-effect formulation. We also discuss methods to estimate the underlying mean-variance relationship and to select a good model for the means, which can be incorporated into the proposed EB or Bayes framework. By borrowing strength across groups, the EB estimator is more efficient than the individual group-specific estimator. Simulation results show that the EB Gaussian likelihood estimates outperform a previous method proposed for the National Health and Nutrition Examination Surveys with much smaller bias and better coverage in interval estimation, especially after correction of bias. Copyright © 2014 John Wiley & Sons, Ltd.
Offline Performance of the Filter Bank EEW Algorithm in the 2014 M6.0 South Napa Earthquake
NASA Astrophysics Data System (ADS)
Meier, M. A.; Heaton, T. H.; Clinton, J. F.
2014-12-01
Medium size events like the M6.0 South Napa earthquake are very challenging for EEW: the damage such events produce can be severe, but it is generally confined to relatively small zones around the epicenter and the shaking duration is short. This leaves a very short window for timely EEW alerts. Algorithms that wait for several stations to trigger before sending out EEW alerts are typically not fast enough for these kind of events because their blind zone (the zone where strong ground motions start before the warnings arrive) typically covers all or most of the area that experiences strong ground motions. At the same time, single station algorithms are often too unreliable to provide useful alerts. The filter bank EEW algorithm is a new algorithm that is designed to provide maximally accurate and precise earthquake parameter estimates with minimum data input, with the goal of producing reliable EEW alerts when only a very small number of stations have been reached by the p-wave. It combines the strengths of single station and network based algorithms in that it starts parameter estimates as soon as 0.5 seconds of data are available from the first station, but then perpetually incorporates additional data from the same or from any number of other stations. The algorithm analyzes the time dependent frequency content of real time waveforms with a filter bank. It then uses an extensive training data set to find earthquake records from the past that have had similar frequency content at a given time since the p-wave onset. The source parameters of the most similar events are used to parameterize a likelihood function for the source parameters of the ongoing event, which can then be maximized to find the most likely parameter estimates. Our preliminary results show that the filter bank EEW algorithm correctly estimated the magnitude of the South Napa earthquake to be ~M6 with only 1 second worth of data at the nearest station to the epicenter. This estimate is then confirmed when updates based on more data from stations at farther distances become available. Because these early estimates saturate at ~M6.5, however, the magnitude estimate might have had to be considered a minimum bound.
Harbert, Robert S; Nixon, Kevin C
2015-08-01
• Plant distributions have long been understood to be correlated with the environmental conditions to which species are adapted. Climate is one of the major components driving species distributions. Therefore, it is expected that the plants coexisting in a community are reflective of the local environment, particularly climate.• Presented here is a method for the estimation of climate from local plant species coexistence data. The method, Climate Reconstruction Analysis using Coexistence Likelihood Estimation (CRACLE), is a likelihood-based method that employs specimen collection data at a global scale for the inference of species climate tolerance. CRACLE calculates the maximum joint likelihood of coexistence given individual species climate tolerance characterization to estimate the expected climate.• Plant distribution data for more than 4000 species were used to show that this method accurately infers expected climate profiles for 165 sites with diverse climatic conditions. Estimates differ from the WorldClim global climate model by less than 1.5°C on average for mean annual temperature and less than ∼250 mm for mean annual precipitation. This is a significant improvement upon other plant-based climate-proxy methods.• CRACLE validates long hypothesized interactions between climate and local associations of plant species. Furthermore, CRACLE successfully estimates climate that is consistent with the widely used WorldClim model and therefore may be applied to the quantitative estimation of paleoclimate in future studies. © 2015 Botanical Society of America, Inc.
Maximum-likelihood methods in wavefront sensing: stochastic models and likelihood functions
Barrett, Harrison H.; Dainty, Christopher; Lara, David
2008-01-01
Maximum-likelihood (ML) estimation in wavefront sensing requires careful attention to all noise sources and all factors that influence the sensor data. We present detailed probability density functions for the output of the image detector in a wavefront sensor, conditional not only on wavefront parameters but also on various nuisance parameters. Practical ways of dealing with nuisance parameters are described, and final expressions for likelihoods and Fisher information matrices are derived. The theory is illustrated by discussing Shack–Hartmann sensors, and computational requirements are discussed. Simulation results show that ML estimation can significantly increase the dynamic range of a Shack–Hartmann sensor with four detectors and that it can reduce the residual wavefront error when compared with traditional methods. PMID:17206255
Population Synthesis of Radio and Gamma-ray Pulsars using the Maximum Likelihood Approach
NASA Astrophysics Data System (ADS)
Billman, Caleb; Gonthier, P. L.; Harding, A. K.
2012-01-01
We present the results of a pulsar population synthesis of normal pulsars from the Galactic disk using a maximum likelihood method. We seek to maximize the likelihood of a set of parameters in a Monte Carlo population statistics code to better understand their uncertainties and the confidence region of the model's parameter space. The maximum likelihood method allows for the use of more applicable Poisson statistics in the comparison of distributions of small numbers of detected gamma-ray and radio pulsars. Our code simulates pulsars at birth using Monte Carlo techniques and evolves them to the present assuming initial spatial, kick velocity, magnetic field, and period distributions. Pulsars are spun down to the present and given radio and gamma-ray emission characteristics. We select measured distributions of radio pulsars from the Parkes Multibeam survey and Fermi gamma-ray pulsars to perform a likelihood analysis of the assumed model parameters such as initial period and magnetic field, and radio luminosity. We present the results of a grid search of the parameter space as well as a search for the maximum likelihood using a Markov Chain Monte Carlo method. We express our gratitude for the generous support of the Michigan Space Grant Consortium, of the National Science Foundation (REU and RUI), the NASA Astrophysics Theory and Fundamental Program and the NASA Fermi Guest Investigator Program.
Planning, Execution, and Assessment of Effects-Based Operations (EBO)
2006-05-01
time of execution that would maximize the likelihood of achieving a desired effect. GMU has developed a methodology, named ECAD -EA (Effective...Algorithm EBO Effects Based Operations ECAD -EA Effective Course of Action-Evolutionary Algorithm GMU George Mason University GUI Graphical...Probability Profile Generation ........................................................72 A.2.11 Running ECAD -EA (Effective Courses of Action Determination
USDA-ARS?s Scientific Manuscript database
Climate matching between the native and adventive ranges of insects used for biological control is a generally accepted strategy for both increasing the likelihood of establishing an agent, as well as improving its overall performance, thereby maximizing the potential utility of an agent across the...
ERIC Educational Resources Information Center
Köse, Alper
2014-01-01
The primary objective of this study was to examine the effect of missing data on goodness of fit statistics in confirmatory factor analysis (CFA). For this aim, four missing data handling methods; listwise deletion, full information maximum likelihood, regression imputation and expectation maximization (EM) imputation were examined in terms of…
Policy Implications Analysis: A Methodological Advancement for Policy Research and Evaluation.
ERIC Educational Resources Information Center
Madey, Doren L.; Stenner, A. Jackson
Policy Implications Analysis (PIA) is a tool designed to maximize the likelihood that an evaluation report will have an impact on decision-making. PIA was designed to help people planning and conducting evaluations tailor their information so that it has optimal potential for being used and acted upon. This paper describes the development and…
ERIC Educational Resources Information Center
Han, Kyung T.; Guo, Fanmin
2014-01-01
The full-information maximum likelihood (FIML) method makes it possible to estimate and analyze structural equation models (SEM) even when data are partially missing, enabling incomplete data to contribute to model estimation. The cornerstone of FIML is the missing-at-random (MAR) assumption. In (unidimensional) computerized adaptive testing…
Constrained Maximum Likelihood Estimation for Two-Level Mean and Covariance Structure Models
ERIC Educational Resources Information Center
Bentler, Peter M.; Liang, Jiajuan; Tang, Man-Lai; Yuan, Ke-Hai
2011-01-01
Maximum likelihood is commonly used for the estimation of model parameters in the analysis of two-level structural equation models. Constraints on model parameters could be encountered in some situations such as equal factor loadings for different factors. Linear constraints are the most common ones and they are relatively easy to handle in…
ERIC Educational Resources Information Center
Kelderman, Henk
1992-01-01
Describes algorithms used in the computer program LOGIMO for obtaining maximum likelihood estimates of the parameters in loglinear models. These algorithms are also useful for the analysis of loglinear item-response theory models. Presents modified versions of the iterative proportional fitting and Newton-Raphson algorithms. Simulated data…
ERIC Educational Resources Information Center
Kieftenbeld, Vincent; Natesan, Prathiba
2012-01-01
Markov chain Monte Carlo (MCMC) methods enable a fully Bayesian approach to parameter estimation of item response models. In this simulation study, the authors compared the recovery of graded response model parameters using marginal maximum likelihood (MML) and Gibbs sampling (MCMC) under various latent trait distributions, test lengths, and…
Joint estimation of preferential attachment and node fitness in growing complex networks
NASA Astrophysics Data System (ADS)
Pham, Thong; Sheridan, Paul; Shimodaira, Hidetoshi
2016-09-01
Complex network growth across diverse fields of science is hypothesized to be driven in the main by a combination of preferential attachment and node fitness processes. For measuring the respective influences of these processes, previous approaches make strong and untested assumptions on the functional forms of either the preferential attachment function or fitness function or both. We introduce a Bayesian statistical method called PAFit to estimate preferential attachment and node fitness without imposing such functional constraints that works by maximizing a log-likelihood function with suitably added regularization terms. We use PAFit to investigate the interplay between preferential attachment and node fitness processes in a Facebook wall-post network. While we uncover evidence for both preferential attachment and node fitness, thus validating the hypothesis that these processes together drive complex network evolution, we also find that node fitness plays the bigger role in determining the degree of a node. This is the first validation of its kind on real-world network data. But surprisingly the rate of preferential attachment is found to deviate from the conventional log-linear form when node fitness is taken into account. The proposed method is implemented in the R package PAFit.
Automatic Spike Sorting Using Tuning Information
Ventura, Valérie
2011-01-01
Current spike sorting methods focus on clustering neurons’ characteristic spike waveforms. The resulting spike-sorted data are typically used to estimate how covariates of interest modulate the firing rates of neurons. However, when these covariates do modulate the firing rates, they provide information about spikes’ identities, which thus far have been ignored for the purpose of spike sorting. This letter describes a novel approach to spike sorting, which incorporates both waveform information and tuning information obtained from the modulation of firing rates. Because it efficiently uses all the available information, this spike sorter yields lower spike misclassification rates than traditional automatic spike sorters. This theoretical result is verified empirically on several examples. The proposed method does not require additional assumptions; only its implementation is different. It essentially consists of performing spike sorting and tuning estimation simultaneously rather than sequentially, as is currently done. We used an expectation-maximization maximum likelihood algorithm to implement the new spike sorter. We present the general form of this algorithm and provide a detailed implementable version under the assumptions that neurons are independent and spike according to Poisson processes. Finally, we uncover a systematic flaw of spike sorting based on waveform information only. PMID:19548802
Automatic spike sorting using tuning information.
Ventura, Valérie
2009-09-01
Current spike sorting methods focus on clustering neurons' characteristic spike waveforms. The resulting spike-sorted data are typically used to estimate how covariates of interest modulate the firing rates of neurons. However, when these covariates do modulate the firing rates, they provide information about spikes' identities, which thus far have been ignored for the purpose of spike sorting. This letter describes a novel approach to spike sorting, which incorporates both waveform information and tuning information obtained from the modulation of firing rates. Because it efficiently uses all the available information, this spike sorter yields lower spike misclassification rates than traditional automatic spike sorters. This theoretical result is verified empirically on several examples. The proposed method does not require additional assumptions; only its implementation is different. It essentially consists of performing spike sorting and tuning estimation simultaneously rather than sequentially, as is currently done. We used an expectation-maximization maximum likelihood algorithm to implement the new spike sorter. We present the general form of this algorithm and provide a detailed implementable version under the assumptions that neurons are independent and spike according to Poisson processes. Finally, we uncover a systematic flaw of spike sorting based on waveform information only.
Learning multivariate distributions by competitive assembly of marginals.
Sánchez-Vega, Francisco; Younes, Laurent; Geman, Donald
2013-02-01
We present a new framework for learning high-dimensional multivariate probability distributions from estimated marginals. The approach is motivated by compositional models and Bayesian networks, and designed to adapt to small sample sizes. We start with a large, overlapping set of elementary statistical building blocks, or "primitives," which are low-dimensional marginal distributions learned from data. Each variable may appear in many primitives. Subsets of primitives are combined in a Lego-like fashion to construct a probabilistic graphical model; only a small fraction of the primitives will participate in any valid construction. Since primitives can be precomputed, parameter estimation and structure search are separated. Model complexity is controlled by strong biases; we adapt the primitives to the amount of training data and impose rules which restrict the merging of them into allowable compositions. The likelihood of the data decomposes into a sum of local gains, one for each primitive in the final structure. We focus on a specific subclass of networks which are binary forests. Structure optimization corresponds to an integer linear program and the maximizing composition can be computed for reasonably large numbers of variables. Performance is evaluated using both synthetic data and real datasets from natural language processing and computational biology.
Lefkimmiatis, Stamatios; Maragos, Petros; Papandreou, George
2009-08-01
We present an improved statistical model for analyzing Poisson processes, with applications to photon-limited imaging. We build on previous work, adopting a multiscale representation of the Poisson process in which the ratios of the underlying Poisson intensities (rates) in adjacent scales are modeled as mixtures of conjugate parametric distributions. Our main contributions include: 1) a rigorous and robust regularized expectation-maximization (EM) algorithm for maximum-likelihood estimation of the rate-ratio density parameters directly from the noisy observed Poisson data (counts); 2) extension of the method to work under a multiscale hidden Markov tree model (HMT) which couples the mixture label assignments in consecutive scales, thus modeling interscale coefficient dependencies in the vicinity of image edges; 3) exploration of a 2-D recursive quad-tree image representation, involving Dirichlet-mixture rate-ratio densities, instead of the conventional separable binary-tree image representation involving beta-mixture rate-ratio densities; and 4) a novel multiscale image representation, which we term Poisson-Haar decomposition, that better models the image edge structure, thus yielding improved performance. Experimental results on standard images with artificially simulated Poisson noise and on real photon-limited images demonstrate the effectiveness of the proposed techniques.
Xu, Jason; Guttorp, Peter; Kato-Maeda, Midori; Minin, Vladimir N
2015-12-01
Continuous-time birth-death-shift (BDS) processes are frequently used in stochastic modeling, with many applications in ecology and epidemiology. In particular, such processes can model evolutionary dynamics of transposable elements-important genetic markers in molecular epidemiology. Estimation of the effects of individual covariates on the birth, death, and shift rates of the process can be accomplished by analyzing patient data, but inferring these rates in a discretely and unevenly observed setting presents computational challenges. We propose a multi-type branching process approximation to BDS processes and develop a corresponding expectation maximization algorithm, where we use spectral techniques to reduce calculation of expected sufficient statistics to low-dimensional integration. These techniques yield an efficient and robust optimization routine for inferring the rates of the BDS process, and apply broadly to multi-type branching processes whose rates can depend on many covariates. After rigorously testing our methodology in simulation studies, we apply our method to study intrapatient time evolution of IS6110 transposable element, a genetic marker frequently used during estimation of epidemiological clusters of Mycobacterium tuberculosis infections. © 2015, The International Biometric Society.
Object Detection in Natural Backgrounds Predicted by Discrimination Performance and Models
NASA Technical Reports Server (NTRS)
Ahumada, A. J., Jr.; Watson, A. B.; Rohaly, A. M.; Null, Cynthia H. (Technical Monitor)
1995-01-01
In object detection, an observer looks for an object class member in a set of backgrounds. In discrimination, an observer tries to distinguish two images. Discrimination models predict the probability that an observer detects a difference between two images. We compare object detection and image discrimination with the same stimuli by: (1) making stimulus pairs of the same background with and without the target object and (2) either giving many consecutive trials with the same background (discrimination) or intermixing the stimuli (object detection). Six images of a vehicle in a natural setting were altered to remove the vehicle and mixed with the original image in various proportions. Detection observers rated the images for vehicle presence. Discrimination observers rated the images for any difference from the background image. Estimated detectabilities of the vehicles were found by maximizing the likelihood of a Thurstone category scaling model. The pattern of estimated detectabilities is similar for discrimination and object detection, and is accurately predicted by a Cortex Transform discrimination model. Predictions of a Contrast- Sensitivity- Function filter model and a Root-Mean-Square difference metric based on the digital image values are less accurate. The discrimination detectabilities averaged about twice those of object detection.
Joint estimation of preferential attachment and node fitness in growing complex networks
Pham, Thong; Sheridan, Paul; Shimodaira, Hidetoshi
2016-01-01
Complex network growth across diverse fields of science is hypothesized to be driven in the main by a combination of preferential attachment and node fitness processes. For measuring the respective influences of these processes, previous approaches make strong and untested assumptions on the functional forms of either the preferential attachment function or fitness function or both. We introduce a Bayesian statistical method called PAFit to estimate preferential attachment and node fitness without imposing such functional constraints that works by maximizing a log-likelihood function with suitably added regularization terms. We use PAFit to investigate the interplay between preferential attachment and node fitness processes in a Facebook wall-post network. While we uncover evidence for both preferential attachment and node fitness, thus validating the hypothesis that these processes together drive complex network evolution, we also find that node fitness plays the bigger role in determining the degree of a node. This is the first validation of its kind on real-world network data. But surprisingly the rate of preferential attachment is found to deviate from the conventional log-linear form when node fitness is taken into account. The proposed method is implemented in the R package PAFit. PMID:27601314
Bayesian logistic regression approaches to predict incorrect DRG assignment.
Suleiman, Mani; Demirhan, Haydar; Boyd, Leanne; Girosi, Federico; Aksakalli, Vural
2018-05-07
Episodes of care involving similar diagnoses and treatments and requiring similar levels of resource utilisation are grouped to the same Diagnosis-Related Group (DRG). In jurisdictions which implement DRG based payment systems, DRGs are a major determinant of funding for inpatient care. Hence, service providers often dedicate auditing staff to the task of checking that episodes have been coded to the correct DRG. The use of statistical models to estimate an episode's probability of DRG error can significantly improve the efficiency of clinical coding audits. This study implements Bayesian logistic regression models with weakly informative prior distributions to estimate the likelihood that episodes require a DRG revision, comparing these models with each other and to classical maximum likelihood estimates. All Bayesian approaches had more stable model parameters than maximum likelihood. The best performing Bayesian model improved overall classification per- formance by 6% compared to maximum likelihood, with a 34% gain compared to random classification, respectively. We found that the original DRG, coder and the day of coding all have a significant effect on the likelihood of DRG error. Use of Bayesian approaches has improved model parameter stability and classification accuracy. This method has already lead to improved audit efficiency in an operational capacity.
Uszko-Lencer, Nicole H M K; Mesquita, Rafael; Janssen, Eefje; Werter, Christ; Brunner-La Rocca, Hans-Peter; Pitta, Fabio; Wouters, Emiel F M; Spruit, Martijn A
2017-08-01
In-depth analyses of the measurement properties of the 6-minute walk test (6MWT) in patients with chronic heart failure (CHF) are lacking. We investigated the reliability, construct validity, and determinants of the distance covered in the 6MWT (6MWD) in CHF patients. 337 patients were studied (median age 65years, 70% male, ejection fraction 35%). Participants performed two 6MWTs on subsequent days. Demographics, anthropometrics, clinical data, ejection fraction, maximal exercise capacity, body composition, lung function, and symptoms of anxiety and depression were also assessed. Construct validity was assessed in terms of convergent, discriminant and known-groups validity. Stepwise linear regression was used. 6MWT was reliable (ICC=0.90, P<0.0001). The learning effect was 31m (95%CI 27, 35m). Older age (≥65years), lower lung diffusing capacity (<80% predicted) and higher NYHA class (NYHA III) were associated with a lower likelihood of a meaningful increase in the second test (OR 0.45-0.56, P<0.05 for all). The best 6MWD had moderate-to-good correlations with peak exercise capacity (r s =0.54-0.69) and no-to-fair correlations with body composition, lung function, ejection fraction, and symptoms of anxiety and depression (r s =0.04-0.49). Patients with higher NYHA classes had lower 6MWD. 6MWD was independently associated with maximal power output during maximal exercise, estimated glomerular filtration rate and age (51.7% of the variability). 6MWT was found to be reliable and valid in patients with mild-to-moderate CHF. Maximal exercise capacity, renal function and age were significant determinants of the best 6MWD. These findings strengthen the clinical utility of the 6MWT in CHF. Copyright © 2017 Elsevier B.V. All rights reserved.
Lin, Feng-Chang; Zhu, Jun
2012-01-01
We develop continuous-time models for the analysis of environmental or ecological monitoring data such that subjects are observed at multiple monitoring time points across space. Of particular interest are additive hazards regression models where the baseline hazard function can take on flexible forms. We consider time-varying covariates and take into account spatial dependence via autoregression in space and time. We develop statistical inference for the regression coefficients via partial likelihood. Asymptotic properties, including consistency and asymptotic normality, are established for parameter estimates under suitable regularity conditions. Feasible algorithms utilizing existing statistical software packages are developed for computation. We also consider a simpler additive hazards model with homogeneous baseline hazard and develop hypothesis testing for homogeneity. A simulation study demonstrates that the statistical inference using partial likelihood has sound finite-sample properties and offers a viable alternative to maximum likelihood estimation. For illustration, we analyze data from an ecological study that monitors bark beetle colonization of red pines in a plantation of Wisconsin.
Load estimator (LOADEST): a FORTRAN program for estimating constituent loads in streams and rivers
Runkel, Robert L.; Crawford, Charles G.; Cohn, Timothy A.
2004-01-01
LOAD ESTimator (LOADEST) is a FORTRAN program for estimating constituent loads in streams and rivers. Given a time series of streamflow, additional data variables, and constituent concentration, LOADEST assists the user in developing a regression model for the estimation of constituent load (calibration). Explanatory variables within the regression model include various functions of streamflow, decimal time, and additional user-specified data variables. The formulated regression model then is used to estimate loads over a user-specified time interval (estimation). Mean load estimates, standard errors, and 95 percent confidence intervals are developed on a monthly and(or) seasonal basis. The calibration and estimation procedures within LOADEST are based on three statistical estimation methods. The first two methods, Adjusted Maximum Likelihood Estimation (AMLE) and Maximum Likelihood Estimation (MLE), are appropriate when the calibration model errors (residuals) are normally distributed. Of the two, AMLE is the method of choice when the calibration data set (time series of streamflow, additional data variables, and concentration) contains censored data. The third method, Least Absolute Deviation (LAD), is an alternative to maximum likelihood estimation when the residuals are not normally distributed. LOADEST output includes diagnostic tests and warnings to assist the user in determining the appropriate estimation method and in interpreting the estimated loads. This report describes the development and application of LOADEST. Sections of the report describe estimation theory, input/output specifications, sample applications, and installation instructions.
Maximum Likelihood Shift Estimation Using High Resolution Polarimetric SAR Clutter Model
NASA Astrophysics Data System (ADS)
Harant, Olivier; Bombrun, Lionel; Vasile, Gabriel; Ferro-Famil, Laurent; Gay, Michel
2011-03-01
This paper deals with a Maximum Likelihood (ML) shift estimation method in the context of High Resolution (HR) Polarimetric SAR (PolSAR) clutter. Texture modeling is exposed and the generalized ML texture tracking method is extended to the merging of various sensors. Some results on displacement estimation on the Argentiere glacier in the Mont Blanc massif using dual-pol TerraSAR-X (TSX) and quad-pol RADARSAT-2 (RS2) sensors are finally discussed.
Zeng, Chan; Newcomer, Sophia R; Glanz, Jason M; Shoup, Jo Ann; Daley, Matthew F; Hambidge, Simon J; Xu, Stanley
2013-12-15
The self-controlled case series (SCCS) method is often used to examine the temporal association between vaccination and adverse events using only data from patients who experienced such events. Conditional Poisson regression models are used to estimate incidence rate ratios, and these models perform well with large or medium-sized case samples. However, in some vaccine safety studies, the adverse events studied are rare and the maximum likelihood estimates may be biased. Several bias correction methods have been examined in case-control studies using conditional logistic regression, but none of these methods have been evaluated in studies using the SCCS design. In this study, we used simulations to evaluate 2 bias correction approaches-the Firth penalized maximum likelihood method and Cordeiro and McCullagh's bias reduction after maximum likelihood estimation-with small sample sizes in studies using the SCCS design. The simulations showed that the bias under the SCCS design with a small number of cases can be large and is also sensitive to a short risk period. The Firth correction method provides finite and less biased estimates than the maximum likelihood method and Cordeiro and McCullagh's method. However, limitations still exist when the risk period in the SCCS design is short relative to the entire observation period.
Nonparametric probability density estimation by optimization theoretic techniques
NASA Technical Reports Server (NTRS)
Scott, D. W.
1976-01-01
Two nonparametric probability density estimators are considered. The first is the kernel estimator. The problem of choosing the kernel scaling factor based solely on a random sample is addressed. An interactive mode is discussed and an algorithm proposed to choose the scaling factor automatically. The second nonparametric probability estimate uses penalty function techniques with the maximum likelihood criterion. A discrete maximum penalized likelihood estimator is proposed and is shown to be consistent in the mean square error. A numerical implementation technique for the discrete solution is discussed and examples displayed. An extensive simulation study compares the integrated mean square error of the discrete and kernel estimators. The robustness of the discrete estimator is demonstrated graphically.
A quasi-likelihood approach to non-negative matrix factorization
Devarajan, Karthik; Cheung, Vincent C.K.
2017-01-01
A unified approach to non-negative matrix factorization based on the theory of generalized linear models is proposed. This approach embeds a variety of statistical models, including the exponential family, within a single theoretical framework and provides a unified view of such factorizations from the perspective of quasi-likelihood. Using this framework, a family of algorithms for handling signal-dependent noise is developed and its convergence proven using the Expectation-Maximization algorithm. In addition, a measure to evaluate the goodness-of-fit of the resulting factorization is described. The proposed methods allow modeling of non-linear effects via appropriate link functions and are illustrated using an application in biomedical signal processing. PMID:27348511
DOE Office of Scientific and Technical Information (OSTI.GOV)
Beer, M.
1980-12-01
The maximum likelihood method for the multivariate normal distribution is applied to the case of several individual eigenvalues. Correlated Monte Carlo estimates of the eigenvalue are assumed to follow this prescription and aspects of the assumption are examined. Monte Carlo cell calculations using the SAM-CE and VIM codes for the TRX-1 and TRX-2 benchmark reactors, and SAM-CE full core results are analyzed with this method. Variance reductions of a few percent to a factor of 2 are obtained from maximum likelihood estimation as compared with the simple average and the minimum variance individual eigenvalue. The numerical results verify that themore » use of sample variances and correlation coefficients in place of the corresponding population statistics still leads to nearly minimum variance estimation for a sufficient number of histories and aggregates.« less
Cosmological parameter estimation using Particle Swarm Optimization
NASA Astrophysics Data System (ADS)
Prasad, J.; Souradeep, T.
2014-03-01
Constraining parameters of a theoretical model from observational data is an important exercise in cosmology. There are many theoretically motivated models, which demand greater number of cosmological parameters than the standard model of cosmology uses, and make the problem of parameter estimation challenging. It is a common practice to employ Bayesian formalism for parameter estimation for which, in general, likelihood surface is probed. For the standard cosmological model with six parameters, likelihood surface is quite smooth and does not have local maxima, and sampling based methods like Markov Chain Monte Carlo (MCMC) method are quite successful. However, when there are a large number of parameters or the likelihood surface is not smooth, other methods may be more effective. In this paper, we have demonstrated application of another method inspired from artificial intelligence, called Particle Swarm Optimization (PSO) for estimating cosmological parameters from Cosmic Microwave Background (CMB) data taken from the WMAP satellite.
The Extended-Image Tracking Technique Based on the Maximum Likelihood Estimation
NASA Technical Reports Server (NTRS)
Tsou, Haiping; Yan, Tsun-Yee
2000-01-01
This paper describes an extended-image tracking technique based on the maximum likelihood estimation. The target image is assume to have a known profile covering more than one element of a focal plane detector array. It is assumed that the relative position between the imager and the target is changing with time and the received target image has each of its pixels disturbed by an independent additive white Gaussian noise. When a rotation-invariant movement between imager and target is considered, the maximum likelihood based image tracking technique described in this paper is a closed-loop structure capable of providing iterative update of the movement estimate by calculating the loop feedback signals from a weighted correlation between the currently received target image and the previously estimated reference image in the transform domain. The movement estimate is then used to direct the imager to closely follow the moving target. This image tracking technique has many potential applications, including free-space optical communications and astronomy where accurate and stabilized optical pointing is essential.
ERIC Educational Resources Information Center
Magis, David; Raiche, Gilles
2010-01-01
In this article the authors focus on the issue of the nonuniqueness of the maximum likelihood (ML) estimator of proficiency level in item response theory (with special attention to logistic models). The usual maximum a posteriori (MAP) method offers a good alternative within that framework; however, this article highlights some drawbacks of its…
Occupancy Modeling Species-Environment Relationships with Non-ignorable Survey Designs.
Irvine, Kathryn M; Rodhouse, Thomas J; Wright, Wilson J; Olsen, Anthony R
2018-05-26
Statistical models supporting inferences about species occurrence patterns in relation to environmental gradients are fundamental to ecology and conservation biology. A common implicit assumption is that the sampling design is ignorable and does not need to be formally accounted for in analyses. The analyst assumes data are representative of the desired population and statistical modeling proceeds. However, if datasets from probability and non-probability surveys are combined or unequal selection probabilities are used, the design may be non ignorable. We outline the use of pseudo-maximum likelihood estimation for site-occupancy models to account for such non-ignorable survey designs. This estimation method accounts for the survey design by properly weighting the pseudo-likelihood equation. In our empirical example, legacy and newer randomly selected locations were surveyed for bats to bridge a historic statewide effort with an ongoing nationwide program. We provide a worked example using bat acoustic detection/non-detection data and show how analysts can diagnose whether their design is ignorable. Using simulations we assessed whether our approach is viable for modeling datasets composed of sites contributed outside of a probability design Pseudo-maximum likelihood estimates differed from the usual maximum likelihood occu31 pancy estimates for some bat species. Using simulations we show the maximum likelihood estimator of species-environment relationships with non-ignorable sampling designs was biased, whereas the pseudo-likelihood estimator was design-unbiased. However, in our simulation study the designs composed of a large proportion of legacy or non-probability sites resulted in estimation issues for standard errors. These issues were likely a result of highly variable weights confounded by small sample sizes (5% or 10% sampling intensity and 4 revisits). Aggregating datasets from multiple sources logically supports larger sample sizes and potentially increases spatial extents for statistical inferences. Our results suggest that ignoring the mechanism for how locations were selected for data collection (e.g., the sampling design) could result in erroneous model-based conclusions. Therefore, in order to ensure robust and defensible recommendations for evidence-based conservation decision-making, the survey design information in addition to the data themselves must be available for analysts. Details for constructing the weights used in estimation and code for implementation are provided. This article is protected by copyright. All rights reserved. This article is protected by copyright. All rights reserved.
Collinear Latent Variables in Multilevel Confirmatory Factor Analysis
van de Schoot, Rens; Hox, Joop
2014-01-01
Because variables may be correlated in the social and behavioral sciences, multicollinearity might be problematic. This study investigates the effect of collinearity manipulated in within and between levels of a two-level confirmatory factor analysis by Monte Carlo simulation. Furthermore, the influence of the size of the intraclass correlation coefficient (ICC) and estimation method; maximum likelihood estimation with robust chi-squares and standard errors and Bayesian estimation, on the convergence rate are investigated. The other variables of interest were rate of inadmissible solutions and the relative parameter and standard error bias on the between level. The results showed that inadmissible solutions were obtained when there was between level collinearity and the estimation method was maximum likelihood. In the within level multicollinearity condition, all of the solutions were admissible but the bias values were higher compared with the between level collinearity condition. Bayesian estimation appeared to be robust in obtaining admissible parameters but the relative bias was higher than for maximum likelihood estimation. Finally, as expected, high ICC produced less biased results compared to medium ICC conditions. PMID:29795827
How much to trust the senses: Likelihood learning
Sato, Yoshiyuki; Kording, Konrad P.
2014-01-01
Our brain often needs to estimate unknown variables from imperfect information. Our knowledge about the statistical distributions of quantities in our environment (called priors) and currently available information from sensory inputs (called likelihood) are the basis of all Bayesian models of perception and action. While we know that priors are learned, most studies of prior-likelihood integration simply assume that subjects know about the likelihood. However, as the quality of sensory inputs change over time, we also need to learn about new likelihoods. Here, we show that human subjects readily learn the distribution of visual cues (likelihood function) in a way that can be predicted by models of statistically optimal learning. Using a likelihood that depended on color context, we found that a learned likelihood generalized to new priors. Thus, we conclude that subjects learn about likelihood. PMID:25398975
Julien, Clavel; Leandro, Aristide; Hélène, Morlon
2018-06-19
Working with high-dimensional phylogenetic comparative datasets is challenging because likelihood-based multivariate methods suffer from low statistical performances as the number of traits p approaches the number of species n and because some computational complications occur when p exceeds n. Alternative phylogenetic comparative methods have recently been proposed to deal with the large p small n scenario but their use and performances are limited. Here we develop a penalized likelihood framework to deal with high-dimensional comparative datasets. We propose various penalizations and methods for selecting the intensity of the penalties. We apply this general framework to the estimation of parameters (the evolutionary trait covariance matrix and parameters of the evolutionary model) and model comparison for the high-dimensional multivariate Brownian (BM), Early-burst (EB), Ornstein-Uhlenbeck (OU) and Pagel's lambda models. We show using simulations that our penalized likelihood approach dramatically improves the estimation of evolutionary trait covariance matrices and model parameters when p approaches n, and allows for their accurate estimation when p equals or exceeds n. In addition, we show that penalized likelihood models can be efficiently compared using Generalized Information Criterion (GIC). We implement these methods, as well as the related estimation of ancestral states and the computation of phylogenetic PCA in the R package RPANDA and mvMORPH. Finally, we illustrate the utility of the new proposed framework by evaluating evolutionary models fit, analyzing integration patterns, and reconstructing evolutionary trajectories for a high-dimensional 3-D dataset of brain shape in the New World monkeys. We find a clear support for an Early-burst model suggesting an early diversification of brain morphology during the ecological radiation of the clade. Penalized likelihood offers an efficient way to deal with high-dimensional multivariate comparative data.
Method and system for diagnostics of apparatus
NASA Technical Reports Server (NTRS)
Gorinevsky, Dimitry (Inventor)
2012-01-01
Proposed is a method, implemented in software, for estimating fault state of an apparatus outfitted with sensors. At each execution period the method processes sensor data from the apparatus to obtain a set of parity parameters, which are further used for estimating fault state. The estimation method formulates a convex optimization problem for each fault hypothesis and employs a convex solver to compute fault parameter estimates and fault likelihoods for each fault hypothesis. The highest likelihoods and corresponding parameter estimates are transmitted to a display device or an automated decision and control system. The obtained accurate estimate of fault state can be used to improve safety, performance, or maintenance processes for the apparatus.
Likelihood ratios for glaucoma diagnosis using spectral-domain optical coherence tomography.
Lisboa, Renato; Mansouri, Kaweh; Zangwill, Linda M; Weinreb, Robert N; Medeiros, Felipe A
2013-11-01
To present a methodology for calculating likelihood ratios for glaucoma diagnosis for continuous retinal nerve fiber layer (RNFL) thickness measurements from spectral-domain optical coherence tomography (spectral-domain OCT). Observational cohort study. A total of 262 eyes of 187 patients with glaucoma and 190 eyes of 100 control subjects were included in the study. Subjects were recruited from the Diagnostic Innovations Glaucoma Study. Eyes with preperimetric and perimetric glaucomatous damage were included in the glaucoma group. The control group was composed of healthy eyes with normal visual fields from subjects recruited from the general population. All eyes underwent RNFL imaging with Spectralis spectral-domain OCT. Likelihood ratios for glaucoma diagnosis were estimated for specific global RNFL thickness measurements using a methodology based on estimating the tangents to the receiver operating characteristic (ROC) curve. Likelihood ratios could be determined for continuous values of average RNFL thickness. Average RNFL thickness values lower than 86 μm were associated with positive likelihood ratios (ie, likelihood ratios greater than 1), whereas RNFL thickness values higher than 86 μm were associated with negative likelihood ratios (ie, likelihood ratios smaller than 1). A modified Fagan nomogram was provided to assist calculation of posttest probability of disease from the calculated likelihood ratios and pretest probability of disease. The methodology allowed calculation of likelihood ratios for specific RNFL thickness values. By avoiding arbitrary categorization of test results, it potentially allows for an improved integration of test results into diagnostic clinical decision making. Copyright © 2013. Published by Elsevier Inc.
ERIC Educational Resources Information Center
Wothke, Werner; Burket, George; Chen, Li-Sue; Gao, Furong; Shu, Lianghua; Chia, Mike
2011-01-01
It has been known for some time that item response theory (IRT) models may exhibit a likelihood function of a respondent's ability which may have multiple modes, flat modes, or both. These conditions, often associated with guessing of multiple-choice (MC) questions, can introduce uncertainty and bias to ability estimation by maximum likelihood…
F-8C adaptive flight control extensions. [for maximum likelihood estimation
NASA Technical Reports Server (NTRS)
Stein, G.; Hartmann, G. L.
1977-01-01
An adaptive concept which combines gain-scheduled control laws with explicit maximum likelihood estimation (MLE) identification to provide the scheduling values is described. The MLE algorithm was improved by incorporating attitude data, estimating gust statistics for setting filter gains, and improving parameter tracking during changing flight conditions. A lateral MLE algorithm was designed to improve true air speed and angle of attack estimates during lateral maneuvers. Relationships between the pitch axis sensors inherent in the MLE design were examined and used for sensor failure detection. Design details and simulation performance are presented for each of the three areas investigated.
Quantum state estimation when qubits are lost: a no-data-left-behind approach
Williams, Brian P.; Lougovski, Pavel
2017-04-06
We present an approach to Bayesian mean estimation of quantum states using hyperspherical parametrization and an experiment-specific likelihood which allows utilization of all available data, even when qubits are lost. With this method, we report the first closed-form Bayesian mean and maximum likelihood estimates for the ideal single qubit. Due to computational constraints, we utilize numerical sampling to determine the Bayesian mean estimate for a photonic two-qubit experiment in which our novel analysis reduces burdens associated with experimental asymmetries and inefficiencies. This method can be applied to quantum states of any dimension and experimental complexity.
Eisenhauer, Philipp; Heckman, James J.; Mosso, Stefano
2015-01-01
We compare the performance of maximum likelihood (ML) and simulated method of moments (SMM) estimation for dynamic discrete choice models. We construct and estimate a simplified dynamic structural model of education that captures some basic features of educational choices in the United States in the 1980s and early 1990s. We use estimates from our model to simulate a synthetic dataset and assess the ability of ML and SMM to recover the model parameters on this sample. We investigate the performance of alternative tuning parameters for SMM. PMID:26494926
Televised antismoking advertising: effects of level and duration of exposure.
Dunlop, Sally; Cotter, Trish; Perez, Donna; Wakefield, Melanie
2013-08-01
We assessed the effects of levels and duration of exposure to televised antismoking advertising on cognitive and behavioral changes. We used data from a serial cross-sectional telephone survey with weekly interviews of adult smokers and recent quitters in New South Wales, Australia (n = 13,301), between April 2005 and December 2010. We merged survey data with commercial TV ratings data to estimate individuals' exposure to antismoking advertising. Logistic regression analyses indicated that after adjustment for a wide range of potential confounders, exposure to antismoking advertising at levels between 100 and 200 gross rating points per week on average over 6 to 9 weeks was associated with an increased likelihood of having (1) salient quitting thoughts and (2) recent quit attempts. Associations between exposure for shorter periods and these outcomes were not significant. Broadcasting schedules may affect the success of antismoking ads. Campaign planners should ensure advertising exposure at adequate frequency over relatively sustained periods to maximize impact.
Multivariate longitudinal data analysis with censored and intermittent missing responses.
Lin, Tsung-I; Lachos, Victor H; Wang, Wan-Lun
2018-05-08
The multivariate linear mixed model (MLMM) has emerged as an important analytical tool for longitudinal data with multiple outcomes. However, the analysis of multivariate longitudinal data could be complicated by the presence of censored measurements because of a detection limit of the assay in combination with unavoidable missing values arising when subjects miss some of their scheduled visits intermittently. This paper presents a generalization of the MLMM approach, called the MLMM-CM, for a joint analysis of the multivariate longitudinal data with censored and intermittent missing responses. A computationally feasible expectation maximization-based procedure is developed to carry out maximum likelihood estimation within the MLMM-CM framework. Moreover, the asymptotic standard errors of fixed effects are explicitly obtained via the information-based method. We illustrate our methodology by using simulated data and a case study from an AIDS clinical trial. Experimental results reveal that the proposed method is able to provide more satisfactory performance as compared with the traditional MLMM approach. Copyright © 2018 John Wiley & Sons, Ltd.
NASA Astrophysics Data System (ADS)
Zheng, Qiang; Li, Honglun; Fan, Baode; Wu, Shuanhu; Xu, Jindong
2017-12-01
Active contour model (ACM) has been one of the most widely utilized methods in magnetic resonance (MR) brain image segmentation because of its ability of capturing topology changes. However, most of the existing ACMs only consider single-slice information in MR brain image data, i.e., the information used in ACMs based segmentation method is extracted only from one slice of MR brain image, which cannot take full advantage of the adjacent slice images' information, and cannot satisfy the local segmentation of MR brain images. In this paper, a novel ACM is proposed to solve the problem discussed above, which is based on multi-variate local Gaussian distribution and combines the adjacent slice images' information in MR brain image data to satisfy segmentation. The segmentation is finally achieved through maximizing the likelihood estimation. Experiments demonstrate the advantages of the proposed ACM over the single-slice ACM in local segmentation of MR brain image series.
Televised Antismoking Advertising: Effects of Level and Duration of Exposure
Cotter, Trish; Perez, Donna; Wakefield, Melanie
2013-01-01
Objectives. We assessed the effects of levels and duration of exposure to televised antismoking advertising on cognitive and behavioral changes. Methods. We used data from a serial cross-sectional telephone survey with weekly interviews of adult smokers and recent quitters in New South Wales, Australia (n = 13 301), between April 2005 and December 2010. We merged survey data with commercial TV ratings data to estimate individuals’ exposure to antismoking advertising. Results. Logistic regression analyses indicated that after adjustment for a wide range of potential confounders, exposure to antismoking advertising at levels between 100 and 200 gross rating points per week on average over 6 to 9 weeks was associated with an increased likelihood of having (1) salient quitting thoughts and (2) recent quit attempts. Associations between exposure for shorter periods and these outcomes were not significant. Conclusions. Broadcasting schedules may affect the success of antismoking ads. Campaign planners should ensure advertising exposure at adequate frequency over relatively sustained periods to maximize impact. PMID:23763419
Garriguet, Didier
2016-04-01
Estimates of the prevalence of adherence to physical activity guidelines in the population are generally the result of averaging individual probability of adherence based on the number of days people meet the guidelines and the number of days they are assessed. Given this number of active and inactive days (days assessed minus days active), the conditional probability of meeting the guidelines that has been used in the past is a Beta (1 + active days, 1 + inactive days) distribution assuming the probability p of a day being active is bounded by 0 and 1 and averages 50%. A change in the assumption about the distribution of p is required to better match the discrete nature of the data and to better assess the probability of adherence when the percentage of active days in the population differs from 50%. Using accelerometry data from the Canadian Health Measures Survey, the probability of adherence to physical activity guidelines is estimated using a conditional probability given the number of active and inactive days distributed as a Betabinomial(n, a + active days , β + inactive days) assuming that p is randomly distributed as Beta(a, β) where the parameters a and β are estimated by maximum likelihood. The resulting Betabinomial distribution is discrete. For children aged 6 or older, the probability of meeting physical activity guidelines 7 out of 7 days is similar to published estimates. For pre-schoolers, the Betabinomial distribution yields higher estimates of adherence to the guidelines than the Beta distribution, in line with the probability of being active on any given day. In estimating the probability of adherence to physical activity guidelines, the Betabinomial distribution has several advantages over the previously used Beta distribution. It is a discrete distribution and maximizes the richness of accelerometer data.
NASA Astrophysics Data System (ADS)
Aminah, Agustin Siti; Pawitan, Gandhi; Tantular, Bertho
2017-03-01
So far, most of the data published by Statistics Indonesia (BPS) as data providers for national statistics are still limited to the district level. Less sufficient sample size for smaller area levels to make the measurement of poverty indicators with direct estimation produced high standard error. Therefore, the analysis based on it is unreliable. To solve this problem, the estimation method which can provide a better accuracy by combining survey data and other auxiliary data is required. One method often used for the estimation is the Small Area Estimation (SAE). There are many methods used in SAE, one of them is Empirical Best Linear Unbiased Prediction (EBLUP). EBLUP method of maximum likelihood (ML) procedures does not consider the loss of degrees of freedom due to estimating β with β ^. This drawback motivates the use of the restricted maximum likelihood (REML) procedure. This paper proposed EBLUP with REML procedure for estimating poverty indicators by modeling the average of household expenditures per capita and implemented bootstrap procedure to calculate MSE (Mean Square Error) to compare the accuracy EBLUP method with the direct estimation method. Results show that EBLUP method reduced MSE in small area estimation.
On the Existence and Uniqueness of JML Estimates for the Partial Credit Model
ERIC Educational Resources Information Center
Bertoli-Barsotti, Lucio
2005-01-01
A necessary and sufficient condition is given in this paper for the existence and uniqueness of the maximum likelihood (the so-called joint maximum likelihood) estimate of the parameters of the Partial Credit Model. This condition is stated in terms of a structural property of the pattern of the data matrix that can be easily verified on the basis…
ERIC Educational Resources Information Center
Paek, Insu; Wilson, Mark
2011-01-01
This study elaborates the Rasch differential item functioning (DIF) model formulation under the marginal maximum likelihood estimation context. Also, the Rasch DIF model performance was examined and compared with the Mantel-Haenszel (MH) procedure in small sample and short test length conditions through simulations. The theoretically known…
Use of Bayes theorem to correct size-specific sampling bias in growth data.
Troynikov, V S
1999-03-01
The bayesian decomposition of posterior distribution was used to develop a likelihood function to correct bias in the estimates of population parameters from data collected randomly with size-specific selectivity. Positive distributions with time as a parameter were used for parametrization of growth data. Numerical illustrations are provided. The alternative applications of the likelihood to estimate selectivity parameters are discussed.
ATAC Autocuer Modeling Analysis.
1981-01-01
the analysis of the simple rectangular scrnentation (1) is based on detection and estimation theory (2). This approach uses the concept of maximum ...continuous wave forms. In order to develop the principles of maximum likelihood, it is con- venient to develop the principles for the "classical...the concept of maximum likelihood is significant in that it provides the optimum performance of the detection/estimation problem. With a knowledge of
Multibaseline gravitational wave radiometry
DOE Office of Scientific and Technical Information (OSTI.GOV)
Talukder, Dipongkar; Bose, Sukanta; Mitra, Sanjit
2011-03-15
We present a statistic for the detection of stochastic gravitational wave backgrounds (SGWBs) using radiometry with a network of multiple baselines. We also quantitatively compare the sensitivities of existing baselines and their network to SGWBs. We assess how the measurement accuracy of signal parameters, e.g., the sky position of a localized source, can improve when using a network of baselines, as compared to any of the single participating baselines. The search statistic itself is derived from the likelihood ratio of the cross correlation of the data across all possible baselines in a detector network and is optimal in Gaussian noise.more » Specifically, it is the likelihood ratio maximized over the strength of the SGWB and is called the maximized-likelihood ratio (MLR). One of the main advantages of using the MLR over past search strategies for inferring the presence or absence of a signal is that the former does not require the deconvolution of the cross correlation statistic. Therefore, it does not suffer from errors inherent to the deconvolution procedure and is especially useful for detecting weak sources. In the limit of a single baseline, it reduces to the detection statistic studied by Ballmer [Classical Quantum Gravity 23, S179 (2006).] and Mitra et al.[Phys. Rev. D 77, 042002 (2008).]. Unlike past studies, here the MLR statistic enables us to compare quantitatively the performances of a variety of baselines searching for a SGWB signal in (simulated) data. Although we use simulated noise and SGWB signals for making these comparisons, our method can be straightforwardly applied on real data.« less
Heersink, Daniel K; Caley, Peter; Paini, Dean R; Barry, Simon C
2016-05-01
The cost of an uncontrolled incursion of invasive alien species (IAS) arising from undetected entry through ports can be substantial, and knowledge of port-specific risks is needed to help allocate limited surveillance resources. Quantifying the establishment likelihood of such an incursion requires quantifying the ability of a species to enter, establish, and spread. Estimation of the approach rate of IAS into ports provides a measure of likelihood of entry. Data on the approach rate of IAS are typically sparse, and the combinations of risk factors relating to country of origin and port of arrival diverse. This presents challenges to making formal statistical inference on establishment likelihood. Here we demonstrate how these challenges can be overcome with judicious use of mixed-effects models when estimating the incursion likelihood into Australia of the European (Apis mellifera) and Asian (A. cerana) honeybees, along with the invasive parasites of biosecurity concern they host (e.g., Varroa destructor). Our results demonstrate how skewed the establishment likelihood is, with one-tenth of the ports accounting for 80% or more of the likelihood for both species. These results have been utilized by biosecurity agencies in the allocation of resources to the surveillance of maritime ports. © 2015 Society for Risk Analysis.
Optimal choice of word length when comparing two Markov sequences using a χ 2-statistic.
Bai, Xin; Tang, Kujin; Ren, Jie; Waterman, Michael; Sun, Fengzhu
2017-10-03
Alignment-free sequence comparison using counts of word patterns (grams, k-tuples) has become an active research topic due to the large amount of sequence data from the new sequencing technologies. Genome sequences are frequently modelled by Markov chains and the likelihood ratio test or the corresponding approximate χ 2 -statistic has been suggested to compare two sequences. However, it is not known how to best choose the word length k in such studies. We develop an optimal strategy to choose k by maximizing the statistical power of detecting differences between two sequences. Let the orders of the Markov chains for the two sequences be r 1 and r 2 , respectively. We show through both simulations and theoretical studies that the optimal k= max(r 1 ,r 2 )+1 for both long sequences and next generation sequencing (NGS) read data. The orders of the Markov chains may be unknown and several methods have been developed to estimate the orders of Markov chains based on both long sequences and NGS reads. We study the power loss of the statistics when the estimated orders are used. It is shown that the power loss is minimal for some of the estimators of the orders of Markov chains. Our studies provide guidelines on choosing the optimal word length for the comparison of Markov sequences.
Forecasting financial asset processes: stochastic dynamics via learning neural networks.
Giebel, S; Rainer, M
2010-01-01
Models for financial asset dynamics usually take into account their inherent unpredictable nature by including a suitable stochastic component into their process. Unknown (forward) values of financial assets (at a given time in the future) are usually estimated as expectations of the stochastic asset under a suitable risk-neutral measure. This estimation requires the stochastic model to be calibrated to some history of sufficient length in the past. Apart from inherent limitations, due to the stochastic nature of the process, the predictive power is also limited by the simplifying assumptions of the common calibration methods, such as maximum likelihood estimation and regression methods, performed often without weights on the historic time series, or with static weights only. Here we propose a novel method of "intelligent" calibration, using learning neural networks in order to dynamically adapt the parameters of the stochastic model. Hence we have a stochastic process with time dependent parameters, the dynamics of the parameters being themselves learned continuously by a neural network. The back propagation in training the previous weights is limited to a certain memory length (in the examples we consider 10 previous business days), which is similar to the maximal time lag of autoregressive processes. We demonstrate the learning efficiency of the new algorithm by tracking the next-day forecasts for the EURTRY and EUR-HUF exchange rates each.
An evaluation of portion size estimation aids: precision, ease of use and likelihood of future use.
Faulkner, Gemma P; Livingstone, M Barbara E; Pourshahidi, L Kirsty; Spence, Michelle; Dean, Moira; O'Brien, Sinead; Gibney, Eileen R; Wallace, Julie Mw; McCaffrey, Tracy A; Kerr, Maeve A
2016-09-01
The present study aimed to evaluate the precision, ease of use and likelihood of future use of portion size estimation aids (PSEA). A range of PSEA were used to estimate the serving sizes of a range of commonly eaten foods and rated for ease of use and likelihood of future usage. For each food, participants selected their preferred PSEA from a range of options including: quantities and measures; reference objects; measuring; and indicators on food packets. These PSEA were used to serve out various foods (e.g. liquid, amorphous, and composite dishes). Ease of use and likelihood of future use were noted. The foods were weighed to determine the precision of each PSEA. Males and females aged 18-64 years (n 120). The quantities and measures were the most precise PSEA (lowest range of weights for estimated portion sizes). However, participants preferred household measures (e.g. 200 ml disposable cup) - deemed easy to use (median rating of 5), likely to use again in future (all scored either 4 or 5 on a scale from 1='not very likely' to 5='very likely to use again') and precise (narrow range of weights for estimated portion sizes). The majority indicated they would most likely use the PSEA preparing a meal (94 %), particularly dinner (86 %) in the home (89 %; all P<0·001) for amorphous grain foods. Household measures may be precise, easy to use and acceptable aids for estimating the appropriate portion size of amorphous grain foods.
Empirical likelihood inference in randomized clinical trials.
Zhang, Biao
2017-01-01
In individually randomized controlled trials, in addition to the primary outcome, information is often available on a number of covariates prior to randomization. This information is frequently utilized to undertake adjustment for baseline characteristics in order to increase precision of the estimation of average treatment effects; such adjustment is usually performed via covariate adjustment in outcome regression models. Although the use of covariate adjustment is widely seen as desirable for making treatment effect estimates more precise and the corresponding hypothesis tests more powerful, there are considerable concerns that objective inference in randomized clinical trials can potentially be compromised. In this paper, we study an empirical likelihood approach to covariate adjustment and propose two unbiased estimating functions that automatically decouple evaluation of average treatment effects from regression modeling of covariate-outcome relationships. The resulting empirical likelihood estimator of the average treatment effect is as efficient as the existing efficient adjusted estimators 1 when separate treatment-specific working regression models are correctly specified, yet are at least as efficient as the existing efficient adjusted estimators 1 for any given treatment-specific working regression models whether or not they coincide with the true treatment-specific covariate-outcome relationships. We present a simulation study to compare the finite sample performance of various methods along with some results on analysis of a data set from an HIV clinical trial. The simulation results indicate that the proposed empirical likelihood approach is more efficient and powerful than its competitors when the working covariate-outcome relationships by treatment status are misspecified.
Development of advanced techniques for rotorcraft state estimation and parameter identification
NASA Technical Reports Server (NTRS)
Hall, W. E., Jr.; Bohn, J. G.; Vincent, J. H.
1980-01-01
An integrated methodology for rotorcraft system identification consists of rotorcraft mathematical modeling, three distinct data processing steps, and a technique for designing inputs to improve the identifiability of the data. These elements are as follows: (1) a Kalman filter smoother algorithm which estimates states and sensor errors from error corrupted data. Gust time histories and statistics may also be estimated; (2) a model structure estimation algorithm for isolating a model which adequately explains the data; (3) a maximum likelihood algorithm for estimating the parameters and estimates for the variance of these estimates; and (4) an input design algorithm, based on a maximum likelihood approach, which provides inputs to improve the accuracy of parameter estimates. Each step is discussed with examples to both flight and simulated data cases.
Empirical Likelihood in Nonignorable Covariate-Missing Data Problems.
Xie, Yanmei; Zhang, Biao
2017-04-20
Missing covariate data occurs often in regression analysis, which frequently arises in the health and social sciences as well as in survey sampling. We study methods for the analysis of a nonignorable covariate-missing data problem in an assumed conditional mean function when some covariates are completely observed but other covariates are missing for some subjects. We adopt the semiparametric perspective of Bartlett et al. (Improving upon the efficiency of complete case analysis when covariates are MNAR. Biostatistics 2014;15:719-30) on regression analyses with nonignorable missing covariates, in which they have introduced the use of two working models, the working probability model of missingness and the working conditional score model. In this paper, we study an empirical likelihood approach to nonignorable covariate-missing data problems with the objective of effectively utilizing the two working models in the analysis of covariate-missing data. We propose a unified approach to constructing a system of unbiased estimating equations, where there are more equations than unknown parameters of interest. One useful feature of these unbiased estimating equations is that they naturally incorporate the incomplete data into the data analysis, making it possible to seek efficient estimation of the parameter of interest even when the working regression function is not specified to be the optimal regression function. We apply the general methodology of empirical likelihood to optimally combine these unbiased estimating equations. We propose three maximum empirical likelihood estimators of the underlying regression parameters and compare their efficiencies with other existing competitors. We present a simulation study to compare the finite-sample performance of various methods with respect to bias, efficiency, and robustness to model misspecification. The proposed empirical likelihood method is also illustrated by an analysis of a data set from the US National Health and Nutrition Examination Survey (NHANES).
Robust analysis of semiparametric renewal process models
Lin, Feng-Chang; Truong, Young K.; Fine, Jason P.
2013-01-01
Summary A rate model is proposed for a modulated renewal process comprising a single long sequence, where the covariate process may not capture the dependencies in the sequence as in standard intensity models. We consider partial likelihood-based inferences under a semiparametric multiplicative rate model, which has been widely studied in the context of independent and identical data. Under an intensity model, gap times in a single long sequence may be used naively in the partial likelihood with variance estimation utilizing the observed information matrix. Under a rate model, the gap times cannot be treated as independent and studying the partial likelihood is much more challenging. We employ a mixing condition in the application of limit theory for stationary sequences to obtain consistency and asymptotic normality. The estimator's variance is quite complicated owing to the unknown gap times dependence structure. We adapt block bootstrapping and cluster variance estimators to the partial likelihood. Simulation studies and an analysis of a semiparametric extension of a popular model for neural spike train data demonstrate the practical utility of the rate approach in comparison with the intensity approach. PMID:24550568
Efficient estimators for likelihood ratio sensitivity indices of complex stochastic dynamics.
Arampatzis, Georgios; Katsoulakis, Markos A; Rey-Bellet, Luc
2016-03-14
We demonstrate that centered likelihood ratio estimators for the sensitivity indices of complex stochastic dynamics are highly efficient with low, constant in time variance and consequently they are suitable for sensitivity analysis in long-time and steady-state regimes. These estimators rely on a new covariance formulation of the likelihood ratio that includes as a submatrix a Fisher information matrix for stochastic dynamics and can also be used for fast screening of insensitive parameters and parameter combinations. The proposed methods are applicable to broad classes of stochastic dynamics such as chemical reaction networks, Langevin-type equations and stochastic models in finance, including systems with a high dimensional parameter space and/or disparate decorrelation times between different observables. Furthermore, they are simple to implement as a standard observable in any existing simulation algorithm without additional modifications.
Efficient estimators for likelihood ratio sensitivity indices of complex stochastic dynamics
NASA Astrophysics Data System (ADS)
Arampatzis, Georgios; Katsoulakis, Markos A.; Rey-Bellet, Luc
2016-03-01
We demonstrate that centered likelihood ratio estimators for the sensitivity indices of complex stochastic dynamics are highly efficient with low, constant in time variance and consequently they are suitable for sensitivity analysis in long-time and steady-state regimes. These estimators rely on a new covariance formulation of the likelihood ratio that includes as a submatrix a Fisher information matrix for stochastic dynamics and can also be used for fast screening of insensitive parameters and parameter combinations. The proposed methods are applicable to broad classes of stochastic dynamics such as chemical reaction networks, Langevin-type equations and stochastic models in finance, including systems with a high dimensional parameter space and/or disparate decorrelation times between different observables. Furthermore, they are simple to implement as a standard observable in any existing simulation algorithm without additional modifications.
Efficient estimators for likelihood ratio sensitivity indices of complex stochastic dynamics
DOE Office of Scientific and Technical Information (OSTI.GOV)
Arampatzis, Georgios; Katsoulakis, Markos A.; Rey-Bellet, Luc
2016-03-14
We demonstrate that centered likelihood ratio estimators for the sensitivity indices of complex stochastic dynamics are highly efficient with low, constant in time variance and consequently they are suitable for sensitivity analysis in long-time and steady-state regimes. These estimators rely on a new covariance formulation of the likelihood ratio that includes as a submatrix a Fisher information matrix for stochastic dynamics and can also be used for fast screening of insensitive parameters and parameter combinations. The proposed methods are applicable to broad classes of stochastic dynamics such as chemical reaction networks, Langevin-type equations and stochastic models in finance, including systemsmore » with a high dimensional parameter space and/or disparate decorrelation times between different observables. Furthermore, they are simple to implement as a standard observable in any existing simulation algorithm without additional modifications.« less
Maximum likelihood density modification by pattern recognition of structural motifs
Terwilliger, Thomas C.
2004-04-13
An electron density for a crystallographic structure having protein regions and solvent regions is improved by maximizing the log likelihood of a set of structures factors {F.sub.h } using a local log-likelihood function: (x)+p(.rho.(x).vertline.SOLV)p.sub.SOLV (x)+p(.rho.(x).vertline.H)p.sub.H (x)], where p.sub.PROT (x) is the probability that x is in the protein region, p(.rho.(x).vertline.PROT) is the conditional probability for .rho.(x) given that x is in the protein region, and p.sub.SOLV (x) and p(.rho.(x).vertline.SOLV) are the corresponding quantities for the solvent region, p.sub.H (x) refers to the probability that there is a structural motif at a known location, with a known orientation, in the vicinity of the point x; and p(.rho.(x).vertline.H) is the probability distribution for electron density at this point given that the structural motif actually is present. One appropriate structural motif is a helical structure within the crystallographic structure.
SPECT reconstruction using DCT-induced tight framelet regularization
NASA Astrophysics Data System (ADS)
Zhang, Jiahan; Li, Si; Xu, Yuesheng; Schmidtlein, C. R.; Lipson, Edward D.; Feiglin, David H.; Krol, Andrzej
2015-03-01
Wavelet transforms have been successfully applied in many fields of image processing. Yet, to our knowledge, they have never been directly incorporated to the objective function in Emission Computed Tomography (ECT) image reconstruction. Our aim has been to investigate if the ℓ1-norm of non-decimated discrete cosine transform (DCT) coefficients of the estimated radiotracer distribution could be effectively used as the regularization term for the penalized-likelihood (PL) reconstruction, where a regularizer is used to enforce the image smoothness in the reconstruction. In this study, the ℓ1-norm of 2D DCT wavelet decomposition was used as a regularization term. The Preconditioned Alternating Projection Algorithm (PAPA), which we proposed in earlier work to solve penalized likelihood (PL) reconstruction with non-differentiable regularizers, was used to solve this optimization problem. The DCT wavelet decompositions were performed on the transaxial reconstructed images. We reconstructed Monte Carlo simulated SPECT data obtained for a numerical phantom with Gaussian blobs as hot lesions and with a warm random lumpy background. Reconstructed images using the proposed method exhibited better noise suppression and improved lesion conspicuity, compared with images reconstructed using expectation maximization (EM) algorithm with Gaussian post filter (GPF). Also, the mean square error (MSE) was smaller, compared with EM-GPF. A critical and challenging aspect of this method was selection of optimal parameters. In summary, our numerical experiments demonstrated that the ℓ1-norm of discrete cosine transform (DCT) wavelet frame transform DCT regularizer shows promise for SPECT image reconstruction using PAPA method.
Robust multiperson detection and tracking for mobile service and social robots.
Li, Liyuan; Yan, Shuicheng; Yu, Xinguo; Tan, Yeow Kee; Li, Haizhou
2012-10-01
This paper proposes an efficient system which integrates multiple vision models for robust multiperson detection and tracking for mobile service and social robots in public environments. The core technique is a novel maximum likelihood (ML)-based algorithm which combines the multimodel detections in mean-shift tracking. First, a likelihood probability which integrates detections and similarity to local appearance is defined. Then, an expectation-maximization (EM)-like mean-shift algorithm is derived under the ML framework. In each iteration, the E-step estimates the associations to the detections, and the M-step locates the new position according to the ML criterion. To be robust to the complex crowded scenarios for multiperson tracking, an improved sequential strategy to perform the mean-shift tracking is proposed. Under this strategy, human objects are tracked sequentially according to their priority order. To balance the efficiency and robustness for real-time performance, at each stage, the first two objects from the list of the priority order are tested, and the one with the higher score is selected. The proposed method has been successfully implemented on real-world service and social robots. The vision system integrates stereo-based and histograms-of-oriented-gradients-based human detections, occlusion reasoning, and sequential mean-shift tracking. Various examples to show the advantages and robustness of the proposed system for multiperson tracking from mobile robots are presented. Quantitative evaluations on the performance of multiperson tracking are also performed. Experimental results indicate that significant improvements have been achieved by using the proposed method.
NASA Astrophysics Data System (ADS)
Pan, Zhen; Anderes, Ethan; Knox, Lloyd
2018-05-01
One of the major targets for next-generation cosmic microwave background (CMB) experiments is the detection of the primordial B-mode signal. Planning is under way for Stage-IV experiments that are projected to have instrumental noise small enough to make lensing and foregrounds the dominant source of uncertainty for estimating the tensor-to-scalar ratio r from polarization maps. This makes delensing a crucial part of future CMB polarization science. In this paper we present a likelihood method for estimating the tensor-to-scalar ratio r from CMB polarization observations, which combines the benefits of a full-scale likelihood approach with the tractability of the quadratic delensing technique. This method is a pixel space, all order likelihood analysis of the quadratic delensed B modes, and it essentially builds upon the quadratic delenser by taking into account all order lensing and pixel space anomalies. Its tractability relies on a crucial factorization of the pixel space covariance matrix of the polarization observations which allows one to compute the full Gaussian approximate likelihood profile, as a function of r , at the same computational cost of a single likelihood evaluation.
NASA Astrophysics Data System (ADS)
Nourali, Mahrouz; Ghahraman, Bijan; Pourreza-Bilondi, Mohsen; Davary, Kamran
2016-09-01
In the present study, DREAM(ZS), Differential Evolution Adaptive Metropolis combined with both formal and informal likelihood functions, is used to investigate uncertainty of parameters of the HEC-HMS model in Tamar watershed, Golestan province, Iran. In order to assess the uncertainty of 24 parameters used in HMS, three flood events were used to calibrate and one flood event was used to validate the posterior distributions. Moreover, performance of seven different likelihood functions (L1-L7) was assessed by means of DREAM(ZS)approach. Four likelihood functions, L1-L4, Nash-Sutcliffe (NS) efficiency, Normalized absolute error (NAE), Index of agreement (IOA), and Chiew-McMahon efficiency (CM), is considered as informal, whereas remaining (L5-L7) is represented in formal category. L5 focuses on the relationship between the traditional least squares fitting and the Bayesian inference, and L6, is a hetereoscedastic maximum likelihood error (HMLE) estimator. Finally, in likelihood function L7, serial dependence of residual errors is accounted using a first-order autoregressive (AR) model of the residuals. According to the results, sensitivities of the parameters strongly depend on the likelihood function, and vary for different likelihood functions. Most of the parameters were better defined by formal likelihood functions L5 and L7 and showed a high sensitivity to model performance. Posterior cumulative distributions corresponding to the informal likelihood functions L1, L2, L3, L4 and the formal likelihood function L6 are approximately the same for most of the sub-basins, and these likelihood functions depict almost a similar effect on sensitivity of parameters. 95% total prediction uncertainty bounds bracketed most of the observed data. Considering all the statistical indicators and criteria of uncertainty assessment, including RMSE, KGE, NS, P-factor and R-factor, results showed that DREAM(ZS) algorithm performed better under formal likelihood functions L5 and L7, but likelihood function L5 may result in biased and unreliable estimation of parameters due to violation of the residualerror assumptions. Thus, likelihood function L7 provides posterior distribution of model parameters credibly and therefore can be employed for further applications.
NASA Astrophysics Data System (ADS)
Ariffin, Syaiba Balqish; Midi, Habshah
2014-06-01
This article is concerned with the performance of logistic ridge regression estimation technique in the presence of multicollinearity and high leverage points. In logistic regression, multicollinearity exists among predictors and in the information matrix. The maximum likelihood estimator suffers a huge setback in the presence of multicollinearity which cause regression estimates to have unduly large standard errors. To remedy this problem, a logistic ridge regression estimator is put forward. It is evident that the logistic ridge regression estimator outperforms the maximum likelihood approach for handling multicollinearity. The effect of high leverage points are then investigated on the performance of the logistic ridge regression estimator through real data set and simulation study. The findings signify that logistic ridge regression estimator fails to provide better parameter estimates in the presence of both high leverage points and multicollinearity.
What are hierarchical models and how do we analyze them?
Royle, Andy
2016-01-01
In this chapter we provide a basic definition of hierarchical models and introduce the two canonical hierarchical models in this book: site occupancy and N-mixture models. The former is a hierarchical extension of logistic regression and the latter is a hierarchical extension of Poisson regression. We introduce basic concepts of probability modeling and statistical inference including likelihood and Bayesian perspectives. We go through the mechanics of maximizing the likelihood and characterizing the posterior distribution by Markov chain Monte Carlo (MCMC) methods. We give a general perspective on topics such as model selection and assessment of model fit, although we demonstrate these topics in practice in later chapters (especially Chapters 5, 6, 7, and 10 Chapter 5 Chapter 6 Chapter 7 Chapter 10)
NASA Astrophysics Data System (ADS)
Trifonov, A. P.; Korchagin, Yu. E.; Korol'kov, S. V.
2018-05-01
We synthesize the quasi-likelihood, maximum-likelihood, and quasioptimal algorithms for estimating the arrival time and duration of a radio signal with unknown amplitude and initial phase. The discrepancies between the hardware and software realizations of the estimation algorithm are shown. The characteristics of the synthesized-algorithm operation efficiency are obtained. Asymptotic expressions for the biases, variances, and the correlation coefficient of the arrival-time and duration estimates, which hold true for large signal-to-noise ratios, are derived. The accuracy losses of the estimates of the radio-signal arrival time and duration because of the a priori ignorance of the amplitude and initial phase are determined.
Grummer, Jared A; Bryson, Robert W; Reeder, Tod W
2014-03-01
Current molecular methods of species delimitation are limited by the types of species delimitation models and scenarios that can be tested. Bayes factors allow for more flexibility in testing non-nested species delimitation models and hypotheses of individual assignment to alternative lineages. Here, we examined the efficacy of Bayes factors in delimiting species through simulations and empirical data from the Sceloporus scalaris species group. Marginal-likelihood scores of competing species delimitation models, from which Bayes factor values were compared, were estimated with four different methods: harmonic mean estimation (HME), smoothed harmonic mean estimation (sHME), path-sampling/thermodynamic integration (PS), and stepping-stone (SS) analysis. We also performed model selection using a posterior simulation-based analog of the Akaike information criterion through Markov chain Monte Carlo analysis (AICM). Bayes factor species delimitation results from the empirical data were then compared with results from the reversible-jump MCMC (rjMCMC) coalescent-based species delimitation method Bayesian Phylogenetics and Phylogeography (BP&P). Simulation results show that HME and sHME perform poorly compared with PS and SS marginal-likelihood estimators when identifying the true species delimitation model. Furthermore, Bayes factor delimitation (BFD) of species showed improved performance when species limits are tested by reassigning individuals between species, as opposed to either lumping or splitting lineages. In the empirical data, BFD through PS and SS analyses, as well as the rjMCMC method, each provide support for the recognition of all scalaris group taxa as independent evolutionary lineages. Bayes factor species delimitation and BP&P also support the recognition of three previously undescribed lineages. In both simulated and empirical data sets, harmonic and smoothed harmonic mean marginal-likelihood estimators provided much higher marginal-likelihood estimates than PS and SS estimators. The AICM displayed poor repeatability in both simulated and empirical data sets, and produced inconsistent model rankings across replicate runs with the empirical data. Our results suggest that species delimitation through the use of Bayes factors with marginal-likelihood estimates via PS or SS analyses provide a useful and complementary alternative to existing species delimitation methods.
Liu, Xiaoming; Fu, Yun-Xin; Maxwell, Taylor J.; Boerwinkle, Eric
2010-01-01
It is known that sequencing error can bias estimation of evolutionary or population genetic parameters. This problem is more prominent in deep resequencing studies because of their large sample size n, and a higher probability of error at each nucleotide site. We propose a new method based on the composite likelihood of the observed SNP configurations to infer population mutation rate θ = 4Neμ, population exponential growth rate R, and error rate ɛ, simultaneously. Using simulation, we show the combined effects of the parameters, θ, n, ɛ, and R on the accuracy of parameter estimation. We compared our maximum composite likelihood estimator (MCLE) of θ with other θ estimators that take into account the error. The results show the MCLE performs well when the sample size is large or the error rate is high. Using parametric bootstrap, composite likelihood can also be used as a statistic for testing the model goodness-of-fit of the observed DNA sequences. The MCLE method is applied to sequence data on the ANGPTL4 gene in 1832 African American and 1045 European American individuals. PMID:19952140
INFERRING THE ECCENTRICITY DISTRIBUTION
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hogg, David W.; Bovy, Jo; Myers, Adam D., E-mail: david.hogg@nyu.ed
2010-12-20
Standard maximum-likelihood estimators for binary-star and exoplanet eccentricities are biased high, in the sense that the estimated eccentricity tends to be larger than the true eccentricity. As with most non-trivial observables, a simple histogram of estimated eccentricities is not a good estimate of the true eccentricity distribution. Here, we develop and test a hierarchical probabilistic method for performing the relevant meta-analysis, that is, inferring the true eccentricity distribution, taking as input the likelihood functions for the individual star eccentricities, or samplings of the posterior probability distributions for the eccentricities (under a given, uninformative prior). The method is a simple implementationmore » of a hierarchical Bayesian model; it can also be seen as a kind of heteroscedastic deconvolution. It can be applied to any quantity measured with finite precision-other orbital parameters, or indeed any astronomical measurements of any kind, including magnitudes, distances, or photometric redshifts-so long as the measurements have been communicated as a likelihood function or a posterior sampling.« less
Precision Parameter Estimation and Machine Learning
NASA Astrophysics Data System (ADS)
Wandelt, Benjamin D.
2008-12-01
I discuss the strategy of ``Acceleration by Parallel Precomputation and Learning'' (AP-PLe) that can vastly accelerate parameter estimation in high-dimensional parameter spaces and costly likelihood functions, using trivially parallel computing to speed up sequential exploration of parameter space. This strategy combines the power of distributed computing with machine learning and Markov-Chain Monte Carlo techniques efficiently to explore a likelihood function, posterior distribution or χ2-surface. This strategy is particularly successful in cases where computing the likelihood is costly and the number of parameters is moderate or large. We apply this technique to two central problems in cosmology: the solution of the cosmological parameter estimation problem with sufficient accuracy for the Planck data using PICo; and the detailed calculation of cosmological helium and hydrogen recombination with RICO. Since the APPLe approach is designed to be able to use massively parallel resources to speed up problems that are inherently serial, we can bring the power of distributed computing to bear on parameter estimation problems. We have demonstrated this with the CosmologyatHome project.
Methods for estimating drought streamflow probabilities for Virginia streams
Austin, Samuel H.
2014-01-01
Maximum likelihood logistic regression model equations used to estimate drought flow probabilities for Virginia streams are presented for 259 hydrologic basins in Virginia. Winter streamflows were used to estimate the likelihood of streamflows during the subsequent drought-prone summer months. The maximum likelihood logistic regression models identify probable streamflows from 5 to 8 months in advance. More than 5 million streamflow daily values collected over the period of record (January 1, 1900 through May 16, 2012) were compiled and analyzed over a minimum 10-year (maximum 112-year) period of record. The analysis yielded the 46,704 equations with statistically significant fit statistics and parameter ranges published in two tables in this report. These model equations produce summer month (July, August, and September) drought flow threshold probabilities as a function of streamflows during the previous winter months (November, December, January, and February). Example calculations are provided, demonstrating how to use the equations to estimate probable streamflows as much as 8 months in advance.
NASA Astrophysics Data System (ADS)
Alsing, Justin; Wandelt, Benjamin; Feeney, Stephen
2018-07-01
Many statistical models in cosmology can be simulated forwards but have intractable likelihood functions. Likelihood-free inference methods allow us to perform Bayesian inference from these models using only forward simulations, free from any likelihood assumptions or approximations. Likelihood-free inference generically involves simulating mock data and comparing to the observed data; this comparison in data space suffers from the curse of dimensionality and requires compression of the data to a small number of summary statistics to be tractable. In this paper, we use massive asymptotically optimal data compression to reduce the dimensionality of the data space to just one number per parameter, providing a natural and optimal framework for summary statistic choice for likelihood-free inference. Secondly, we present the first cosmological application of Density Estimation Likelihood-Free Inference (DELFI), which learns a parametrized model for joint distribution of data and parameters, yielding both the parameter posterior and the model evidence. This approach is conceptually simple, requires less tuning than traditional Approximate Bayesian Computation approaches to likelihood-free inference and can give high-fidelity posteriors from orders of magnitude fewer forward simulations. As an additional bonus, it enables parameter inference and Bayesian model comparison simultaneously. We demonstrate DELFI with massive data compression on an analysis of the joint light-curve analysis supernova data, as a simple validation case study. We show that high-fidelity posterior inference is possible for full-scale cosmological data analyses with as few as ˜104 simulations, with substantial scope for further improvement, demonstrating the scalability of likelihood-free inference to large and complex cosmological data sets.
Effects of time-shifted data on flight determined stability and control derivatives
NASA Technical Reports Server (NTRS)
Steers, S. T.; Iliff, K. W.
1975-01-01
Flight data were shifted in time by various increments to assess the effects of time shifts on estimates of stability and control derivatives produced by a maximum likelihood estimation method. Derivatives could be extracted from flight data with the maximum likelihood estimation method even if there was a considerable time shift in the data. Time shifts degraded the estimates of the derivatives, but the degradation was in a consistent rather than a random pattern. Time shifts in the control variables caused the most degradation, and the lateral-directional rotary derivatives were affected the most by time shifts in any variable.
Beyond valence in the perception of likelihood: the role of emotion specificity.
DeSteno, D; Petty, R E; Wegener, D T; Rucker, D D
2000-03-01
Positive and negative moods have been shown to increase likelihood estimates of future events matching these states in valence (e.g., E. J. Johnson & A. Tversky, 1983). In the present article, 4 studies provide evidence that this congruency bias (a) is not limited to valence but functions in an emotion-specific manner, (b) derives from the informational value of emotions, and (c) is not the inevitable outcome of likelihood assessment under heightened emotion. Specifically, Study 1 demonstrates that sadness and anger, 2 distinct, negative emotions, differentially bias likelihood estimates of sad and angering events. Studies 2 and 3 replicate this finding in addition to supporting an emotion-as-information (cf. N. Schwarz & G. L. Clore, 1983), as opposed to a memory-based, mediating process for the bias. Finally, Study 4 shows that when the source of the emotion is salient, a reversal of the bias can occur given greater cognitive effort aimed at accuracy.
Vigan, Marie; Stirnemann, Jérôme; Mentré, France
2014-05-01
Analysis of repeated time-to-event data is increasingly performed in pharmacometrics using parametric frailty models. The aims of this simulation study were (1) to assess estimation performance of Stochastic Approximation Expectation Maximization (SAEM) algorithm in MONOLIX, Adaptive Gaussian Quadrature (AGQ), and Laplace algorithm in PROC NLMIXED of SAS and (2) to evaluate properties of test of a dichotomous covariate on occurrence of events. The simulation setting is inspired from an analysis of occurrence of bone events after the initiation of treatment by imiglucerase in patients with Gaucher Disease (GD). We simulated repeated events with an exponential model and various dropout rates: no, low, or high. Several values of baseline hazard model, variability, number of subject, and effect of covariate were studied. For each scenario, 100 datasets were simulated for estimation performance and 500 for test performance. We evaluated estimation performance through relative bias and relative root mean square error (RRMSE). We studied properties of Wald and likelihood ratio test (LRT). We used these methods to analyze occurrence of bone events in patients with GD after starting an enzyme replacement therapy. SAEM with three chains and AGQ algorithms provided good estimates of parameters much better than SAEM with one chain and Laplace which often provided poor estimates. Despite a small number of repeated events, SAEM with three chains and AGQ gave small biases and RRMSE. Type I errors were closed to 5%, and power varied as expected for SAEM with three chains and AGQ. Probability of having at least one event under treatment was 19.1%.
Pointwise nonparametric maximum likelihood estimator of stochastically ordered survivor functions
Park, Yongseok; Taylor, Jeremy M. G.; Kalbfleisch, John D.
2012-01-01
In this paper, we consider estimation of survivor functions from groups of observations with right-censored data when the groups are subject to a stochastic ordering constraint. Many methods and algorithms have been proposed to estimate distribution functions under such restrictions, but none have completely satisfactory properties when the observations are censored. We propose a pointwise constrained nonparametric maximum likelihood estimator, which is defined at each time t by the estimates of the survivor functions subject to constraints applied at time t only. We also propose an efficient method to obtain the estimator. The estimator of each constrained survivor function is shown to be nonincreasing in t, and its consistency and asymptotic distribution are established. A simulation study suggests better small and large sample properties than for alternative estimators. An example using prostate cancer data illustrates the method. PMID:23843661
Closed-loop carrier phase synchronization techniques motivated by likelihood functions
NASA Technical Reports Server (NTRS)
Tsou, H.; Hinedi, S.; Simon, M.
1994-01-01
This article reexamines the notion of closed-loop carrier phase synchronization motivated by the theory of maximum a posteriori phase estimation with emphasis on the development of new structures based on both maximum-likelihood and average-likelihood functions. The criterion of performance used for comparison of all the closed-loop structures discussed is the mean-squared phase error for a fixed-loop bandwidth.
Kanada, Yoshikiyo; Sakurai, Hiroaki; Sugiura, Yoshito; Arai, Tomoaki; Koyama, Soichiro; Tanabe, Shigeo
2017-11-01
[Purpose] To create a regression formula in order to estimate 1RM for knee extensors, based on the maximal isometric muscle strength measured using a hand-held dynamometer and data regarding the body composition. [Subjects and Methods] Measurement was performed in 21 healthy males in their twenties to thirties. Single regression analysis was performed, with measurement values representing 1RM and the maximal isometric muscle strength as dependent and independent variables, respectively. Furthermore, multiple regression analysis was performed, with data regarding the body composition incorporated as another independent variable, in addition to the maximal isometric muscle strength. [Results] Through single regression analysis with the maximal isometric muscle strength as an independent variable, the following regression formula was created: 1RM (kg)=0.714 + 0.783 × maximal isometric muscle strength (kgf). On multiple regression analysis, only the total muscle mass was extracted. [Conclusion] A highly accurate regression formula to estimate 1RM was created based on both the maximal isometric muscle strength and body composition. Using a hand-held dynamometer and body composition analyzer, it was possible to measure these items in a short time, and obtain clinically useful results.
Austin, Peter C
2010-04-22
Multilevel logistic regression models are increasingly being used to analyze clustered data in medical, public health, epidemiological, and educational research. Procedures for estimating the parameters of such models are available in many statistical software packages. There is currently little evidence on the minimum number of clusters necessary to reliably fit multilevel regression models. We conducted a Monte Carlo study to compare the performance of different statistical software procedures for estimating multilevel logistic regression models when the number of clusters was low. We examined procedures available in BUGS, HLM, R, SAS, and Stata. We found that there were qualitative differences in the performance of different software procedures for estimating multilevel logistic models when the number of clusters was low. Among the likelihood-based procedures, estimation methods based on adaptive Gauss-Hermite approximations to the likelihood (glmer in R and xtlogit in Stata) or adaptive Gaussian quadrature (Proc NLMIXED in SAS) tended to have superior performance for estimating variance components when the number of clusters was small, compared to software procedures based on penalized quasi-likelihood. However, only Bayesian estimation with BUGS allowed for accurate estimation of variance components when there were fewer than 10 clusters. For all statistical software procedures, estimation of variance components tended to be poor when there were only five subjects per cluster, regardless of the number of clusters.
Approximate likelihood calculation on a phylogeny for Bayesian estimation of divergence times.
dos Reis, Mario; Yang, Ziheng
2011-07-01
The molecular clock provides a powerful way to estimate species divergence times. If information on some species divergence times is available from the fossil or geological record, it can be used to calibrate a phylogeny and estimate divergence times for all nodes in the tree. The Bayesian method provides a natural framework to incorporate different sources of information concerning divergence times, such as information in the fossil and molecular data. Current models of sequence evolution are intractable in a Bayesian setting, and Markov chain Monte Carlo (MCMC) is used to generate the posterior distribution of divergence times and evolutionary rates. This method is computationally expensive, as it involves the repeated calculation of the likelihood function. Here, we explore the use of Taylor expansion to approximate the likelihood during MCMC iteration. The approximation is much faster than conventional likelihood calculation. However, the approximation is expected to be poor when the proposed parameters are far from the likelihood peak. We explore the use of parameter transforms (square root, logarithm, and arcsine) to improve the approximation to the likelihood curve. We found that the new methods, particularly the arcsine-based transform, provided very good approximations under relaxed clock models and also under the global clock model when the global clock is not seriously violated. The approximation is poorer for analysis under the global clock when the global clock is seriously wrong and should thus not be used. The results suggest that the approximate method may be useful for Bayesian dating analysis using large data sets.
Lirio, R B; Dondériz, I C; Pérez Abalo, M C
1992-08-01
The methodology of Receiver Operating Characteristic curves based on the signal detection model is extended to evaluate the accuracy of two-stage diagnostic strategies. A computer program is developed for the maximum likelihood estimation of parameters that characterize the sensitivity and specificity of two-stage classifiers according to this extended methodology. Its use is briefly illustrated with data collected in a two-stage screening for auditory defects.
ERIC Educational Resources Information Center
Kelderman, Henk
In this paper, algorithms are described for obtaining the maximum likelihood estimates of the parameters in log-linear models. Modified versions of the iterative proportional fitting and Newton-Raphson algorithms are described that work on the minimal sufficient statistics rather than on the usual counts in the full contingency table. This is…
NASA Astrophysics Data System (ADS)
Ben Abdessalem, Anis; Dervilis, Nikolaos; Wagg, David; Worden, Keith
2018-01-01
This paper will introduce the use of the approximate Bayesian computation (ABC) algorithm for model selection and parameter estimation in structural dynamics. ABC is a likelihood-free method typically used when the likelihood function is either intractable or cannot be approached in a closed form. To circumvent the evaluation of the likelihood function, simulation from a forward model is at the core of the ABC algorithm. The algorithm offers the possibility to use different metrics and summary statistics representative of the data to carry out Bayesian inference. The efficacy of the algorithm in structural dynamics is demonstrated through three different illustrative examples of nonlinear system identification: cubic and cubic-quintic models, the Bouc-Wen model and the Duffing oscillator. The obtained results suggest that ABC is a promising alternative to deal with model selection and parameter estimation issues, specifically for systems with complex behaviours.
NASA Technical Reports Server (NTRS)
Iliff, Kenneth W.
1987-01-01
The aircraft parameter estimation problem is used to illustrate the utility of parameter estimation, which applies to many engineering and scientific fields. Maximum likelihood estimation has been used to extract stability and control derivatives from flight data for many years. This paper presents some of the basic concepts of aircraft parameter estimation and briefly surveys the literature in the field. The maximum likelihood estimator is discussed, and the basic concepts of minimization and estimation are examined for a simple simulated aircraft example. The cost functions that are to be minimized during estimation are defined and discussed. Graphic representations of the cost functions are given to illustrate the minimization process. Finally, the basic concepts are generalized, and estimation from flight data is discussed. Some of the major conclusions for the simulated example are also developed for the analysis of flight data from the F-14, highly maneuverable aircraft technology (HiMAT), and space shuttle vehicles.
Estimation of submarine mass failure probability from a sequence of deposits with age dates
Geist, Eric L.; Chaytor, Jason D.; Parsons, Thomas E.; ten Brink, Uri S.
2013-01-01
The empirical probability of submarine mass failure is quantified from a sequence of dated mass-transport deposits. Several different techniques are described to estimate the parameters for a suite of candidate probability models. The techniques, previously developed for analyzing paleoseismic data, include maximum likelihood and Type II (Bayesian) maximum likelihood methods derived from renewal process theory and Monte Carlo methods. The estimated mean return time from these methods, unlike estimates from a simple arithmetic mean of the center age dates and standard likelihood methods, includes the effects of age-dating uncertainty and of open time intervals before the first and after the last event. The likelihood techniques are evaluated using Akaike’s Information Criterion (AIC) and Akaike’s Bayesian Information Criterion (ABIC) to select the optimal model. The techniques are applied to mass transport deposits recorded in two Integrated Ocean Drilling Program (IODP) drill sites located in the Ursa Basin, northern Gulf of Mexico. Dates of the deposits were constrained by regional bio- and magnetostratigraphy from a previous study. Results of the analysis indicate that submarine mass failures in this location occur primarily according to a Poisson process in which failures are independent and return times follow an exponential distribution. However, some of the model results suggest that submarine mass failures may occur quasiperiodically at one of the sites (U1324). The suite of techniques described in this study provides quantitative probability estimates of submarine mass failure occurrence, for any number of deposits and age uncertainty distributions.
Rates of spontaneous mutation in an archaeon from geothermal environments.
Jacobs, K L; Grogan, D W
1997-01-01
To estimate the efficacy of mechanisms which may prevent or repair thermal damage to DNA in thermophilic archaea, a quantitative assay of forward mutation at extremely high temperature was developed for Sulfolobus acidocaldarius, based on the selection of pyrimidine-requiring mutants resistant to 5-fluoro-orotic acid. Maximum-likelihood analysis of spontaneous mutant distributions in wild-type cultures yielded maximal estimates of (2.8 +/- 0.7) x 10(-7) and (1.5 +/- 0.6) x 10(-7) mutational events per cell per division cycle for the pyrE and pyrF loci, respectively. To our knowledge, these results provide the first accurate measurement of the genetic fidelity maintained by archaea that populate geothermal environments. The measured rates of forward mutation at the pyrE and pyrF loci in S. acidocaldarius are close to corresponding rates reported for protein-encoding genes of Escherichia coli. The normal rate of spontaneous mutation in E. coli at 37 degrees C is known to require the functioning of several enzyme systems that repair spontaneous damage in DNA. Our results provide indirect evidence that S. acidocaldarius has cellular mechanisms, as yet unidentified, which effectively compensate for the higher chemical instability of DNA at the temperatures and pHs that prevail within growing Sulfolobus cells. PMID:9150227
NASA Astrophysics Data System (ADS)
Bragato, P. L.
2017-10-01
The strong earthquakes that occurred in Italy between 2009 and 2016 represent an abrupt acceleration of seismicity in respect of the previous 30 years. Such behavior seems to agree with the periodic rate change I observed in a previous paper. The present work improves that study by extending the data set up to the end of 2016, adopting the latest version of the historical seismic catalog of Italy, and introducing Schuster spectrum analysis for the detection of the oscillatory period and the assessment of its statistical significance. Applied to the declustered catalog of M w ≥ 6 earthquakes that occurred between 1600 and 2016, the analysis individuates a marked periodicity of 46 years, which is recognized above the 95% confidence level. Monte Carlo simulation shows that the oscillatory behavior is stable in respect of random errors on magnitude estimation. A parametric oscillatory model for the annual rate of seismicity is estimated by likelihood maximization under the hypothesis of inhomogeneous Poisson point process. According to the Akaike Information Criterion, such model outperforms the simpler homogeneous one with constant annual rate. A further element emerges form the analysis: so far, despite recent earthquakes, the Italian seismicity is still within a long-term decreasing trend established since the first half of the twentieth century.
Maximum entropy approach to statistical inference for an ocean acoustic waveguide.
Knobles, D P; Sagers, J D; Koch, R A
2012-02-01
A conditional probability distribution suitable for estimating the statistical properties of ocean seabed parameter values inferred from acoustic measurements is derived from a maximum entropy principle. The specification of the expectation value for an error function constrains the maximization of an entropy functional. This constraint determines the sensitivity factor (β) to the error function of the resulting probability distribution, which is a canonical form that provides a conservative estimate of the uncertainty of the parameter values. From the conditional distribution, marginal distributions for individual parameters can be determined from integration over the other parameters. The approach is an alternative to obtaining the posterior probability distribution without an intermediary determination of the likelihood function followed by an application of Bayes' rule. In this paper the expectation value that specifies the constraint is determined from the values of the error function for the model solutions obtained from a sparse number of data samples. The method is applied to ocean acoustic measurements taken on the New Jersey continental shelf. The marginal probability distribution for the values of the sound speed ratio at the surface of the seabed and the source levels of a towed source are examined for different geoacoustic model representations. © 2012 Acoustical Society of America
Bayesian Recurrent Neural Network for Language Modeling.
Chien, Jen-Tzung; Ku, Yuan-Chu
2016-02-01
A language model (LM) is calculated as the probability of a word sequence that provides the solution to word prediction for a variety of information systems. A recurrent neural network (RNN) is powerful to learn the large-span dynamics of a word sequence in the continuous space. However, the training of the RNN-LM is an ill-posed problem because of too many parameters from a large dictionary size and a high-dimensional hidden layer. This paper presents a Bayesian approach to regularize the RNN-LM and apply it for continuous speech recognition. We aim to penalize the too complicated RNN-LM by compensating for the uncertainty of the estimated model parameters, which is represented by a Gaussian prior. The objective function in a Bayesian classification network is formed as the regularized cross-entropy error function. The regularized model is constructed not only by calculating the regularized parameters according to the maximum a posteriori criterion but also by estimating the Gaussian hyperparameter by maximizing the marginal likelihood. A rapid approximation to a Hessian matrix is developed to implement the Bayesian RNN-LM (BRNN-LM) by selecting a small set of salient outer-products. The proposed BRNN-LM achieves a sparser model than the RNN-LM. Experiments on different corpora show the robustness of system performance by applying the rapid BRNN-LM under different conditions.
NASA Astrophysics Data System (ADS)
Faber, T. L.; Raghunath, N.; Tudorascu, D.; Votaw, J. R.
2009-02-01
Image quality is significantly degraded even by small amounts of patient motion in very high-resolution PET scanners. Existing correction methods that use known patient motion obtained from tracking devices either require multi-frame acquisitions, detailed knowledge of the scanner, or specialized reconstruction algorithms. A deconvolution algorithm has been developed that alleviates these drawbacks by using the reconstructed image to estimate the original non-blurred image using maximum likelihood estimation maximization (MLEM) techniques. A high-resolution digital phantom was created by shape-based interpolation of the digital Hoffman brain phantom. Three different sets of 20 movements were applied to the phantom. For each frame of the motion, sinograms with attenuation and three levels of noise were simulated and then reconstructed using filtered backprojection. The average of the 20 frames was considered the motion blurred image, which was restored with the deconvolution algorithm. After correction, contrast increased from a mean of 2.0, 1.8 and 1.4 in the motion blurred images, for the three increasing amounts of movement, to a mean of 2.5, 2.4 and 2.2. Mean error was reduced by an average of 55% with motion correction. In conclusion, deconvolution can be used for correction of motion blur when subject motion is known.
A joint frailty-copula model between tumour progression and death for meta-analysis.
Emura, Takeshi; Nakatochi, Masahiro; Murotani, Kenta; Rondeau, Virginie
2017-12-01
Dependent censoring often arises in biomedical studies when time to tumour progression (e.g., relapse of cancer) is censored by an informative terminal event (e.g., death). For meta-analysis combining existing studies, a joint survival model between tumour progression and death has been considered under semicompeting risks, which induces dependence through the study-specific frailty. Our paper here utilizes copulas to generalize the joint frailty model by introducing additional source of dependence arising from intra-subject association between tumour progression and death. The practical value of the new model is particularly evident for meta-analyses in which only a few covariates are consistently measured across studies and hence there exist residual dependence. The covariate effects are formulated through the Cox proportional hazards model, and the baseline hazards are nonparametrically modeled on a basis of splines. The estimator is then obtained by maximizing a penalized log-likelihood function. We also show that the present methodologies are easily modified for the competing risks or recurrent event data, and are generalized to accommodate left-truncation. Simulations are performed to examine the performance of the proposed estimator. The method is applied to a meta-analysis for assessing a recently suggested biomarker CXCL12 for survival in ovarian cancer patients. We implement our proposed methods in R joint.Cox package.
Jiang, Z; Dou, Z; Song, W L; Xu, J; Wu, Z Y
2017-11-10
Objective: To compare results of different methods: in organizing HIV viral load (VL) data with missing values mechanism. Methods We used software SPSS 17.0 to simulate complete and missing data with different missing value mechanism from HIV viral loading data collected from MSM in 16 cities in China in 2013. Maximum Likelihood Methods Using the Expectation and Maximization Algorithm (EM), regressive method, mean imputation, delete method, and Markov Chain Monte Carlo (MCMC) were used to supplement missing data respectively. The results: of different methods were compared according to distribution characteristics, accuracy and precision. Results HIV VL data could not be transferred into a normal distribution. All the methods showed good results in iterating data which is Missing Completely at Random Mechanism (MCAR). For the other types of missing data, regressive and MCMC methods were used to keep the main characteristic of the original data. The means of iterating database with different methods were all close to the original one. The EM, regressive method, mean imputation, and delete method under-estimate VL while MCMC overestimates it. Conclusion: MCMC can be used as the main imputation method for HIV virus loading missing data. The iterated data can be used as a reference for mean HIV VL estimation among the investigated population.
Pattern recognition for passive polarimetric data using nonparametric classifiers
NASA Astrophysics Data System (ADS)
Thilak, Vimal; Saini, Jatinder; Voelz, David G.; Creusere, Charles D.
2005-08-01
Passive polarization based imaging is a useful tool in computer vision and pattern recognition. A passive polarization imaging system forms a polarimetric image from the reflection of ambient light that contains useful information for computer vision tasks such as object detection (classification) and recognition. Applications of polarization based pattern recognition include material classification and automatic shape recognition. In this paper, we present two target detection algorithms for images captured by a passive polarimetric imaging system. The proposed detection algorithms are based on Bayesian decision theory. In these approaches, an object can belong to one of any given number classes and classification involves making decisions that minimize the average probability of making incorrect decisions. This minimum is achieved by assigning an object to the class that maximizes the a posteriori probability. Computing a posteriori probabilities requires estimates of class conditional probability density functions (likelihoods) and prior probabilities. A Probabilistic neural network (PNN), which is a nonparametric method that can compute Bayes optimal boundaries, and a -nearest neighbor (KNN) classifier, is used for density estimation and classification. The proposed algorithms are applied to polarimetric image data gathered in the laboratory with a liquid crystal-based system. The experimental results validate the effectiveness of the above algorithms for target detection from polarimetric data.
Fast and accurate estimation of the covariance between pairwise maximum likelihood distances.
Gil, Manuel
2014-01-01
Pairwise evolutionary distances are a model-based summary statistic for a set of molecular sequences. They represent the leaf-to-leaf path lengths of the underlying phylogenetic tree. Estimates of pairwise distances with overlapping paths covary because of shared mutation events. It is desirable to take these covariance structure into account to increase precision in any process that compares or combines distances. This paper introduces a fast estimator for the covariance of two pairwise maximum likelihood distances, estimated under general Markov models. The estimator is based on a conjecture (going back to Nei & Jin, 1989) which links the covariance to path lengths. It is proven here under a simple symmetric substitution model. A simulation shows that the estimator outperforms previously published ones in terms of the mean squared error.
Fast and accurate estimation of the covariance between pairwise maximum likelihood distances
2014-01-01
Pairwise evolutionary distances are a model-based summary statistic for a set of molecular sequences. They represent the leaf-to-leaf path lengths of the underlying phylogenetic tree. Estimates of pairwise distances with overlapping paths covary because of shared mutation events. It is desirable to take these covariance structure into account to increase precision in any process that compares or combines distances. This paper introduces a fast estimator for the covariance of two pairwise maximum likelihood distances, estimated under general Markov models. The estimator is based on a conjecture (going back to Nei & Jin, 1989) which links the covariance to path lengths. It is proven here under a simple symmetric substitution model. A simulation shows that the estimator outperforms previously published ones in terms of the mean squared error. PMID:25279263
Technical Note: Approximate Bayesian parameterization of a process-based tropical forest model
NASA Astrophysics Data System (ADS)
Hartig, F.; Dislich, C.; Wiegand, T.; Huth, A.
2014-02-01
Inverse parameter estimation of process-based models is a long-standing problem in many scientific disciplines. A key question for inverse parameter estimation is how to define the metric that quantifies how well model predictions fit to the data. This metric can be expressed by general cost or objective functions, but statistical inversion methods require a particular metric, the probability of observing the data given the model parameters, known as the likelihood. For technical and computational reasons, likelihoods for process-based stochastic models are usually based on general assumptions about variability in the observed data, and not on the stochasticity generated by the model. Only in recent years have new methods become available that allow the generation of likelihoods directly from stochastic simulations. Previous applications of these approximate Bayesian methods have concentrated on relatively simple models. Here, we report on the application of a simulation-based likelihood approximation for FORMIND, a parameter-rich individual-based model of tropical forest dynamics. We show that approximate Bayesian inference, based on a parametric likelihood approximation placed in a conventional Markov chain Monte Carlo (MCMC) sampler, performs well in retrieving known parameter values from virtual inventory data generated by the forest model. We analyze the results of the parameter estimation, examine its sensitivity to the choice and aggregation of model outputs and observed data (summary statistics), and demonstrate the application of this method by fitting the FORMIND model to field data from an Ecuadorian tropical forest. Finally, we discuss how this approach differs from approximate Bayesian computation (ABC), another method commonly used to generate simulation-based likelihood approximations. Our results demonstrate that simulation-based inference, which offers considerable conceptual advantages over more traditional methods for inverse parameter estimation, can be successfully applied to process-based models of high complexity. The methodology is particularly suitable for heterogeneous and complex data structures and can easily be adjusted to other model types, including most stochastic population and individual-based models. Our study therefore provides a blueprint for a fairly general approach to parameter estimation of stochastic process-based models.
Technical Note: Approximate Bayesian parameterization of a complex tropical forest model
NASA Astrophysics Data System (ADS)
Hartig, F.; Dislich, C.; Wiegand, T.; Huth, A.
2013-08-01
Inverse parameter estimation of process-based models is a long-standing problem in ecology and evolution. A key problem of inverse parameter estimation is to define a metric that quantifies how well model predictions fit to the data. Such a metric can be expressed by general cost or objective functions, but statistical inversion approaches are based on a particular metric, the probability of observing the data given the model, known as the likelihood. Deriving likelihoods for dynamic models requires making assumptions about the probability for observations to deviate from mean model predictions. For technical reasons, these assumptions are usually derived without explicit consideration of the processes in the simulation. Only in recent years have new methods become available that allow generating likelihoods directly from stochastic simulations. Previous applications of these approximate Bayesian methods have concentrated on relatively simple models. Here, we report on the application of a simulation-based likelihood approximation for FORMIND, a parameter-rich individual-based model of tropical forest dynamics. We show that approximate Bayesian inference, based on a parametric likelihood approximation placed in a conventional MCMC, performs well in retrieving known parameter values from virtual field data generated by the forest model. We analyze the results of the parameter estimation, examine the sensitivity towards the choice and aggregation of model outputs and observed data (summary statistics), and show results from using this method to fit the FORMIND model to field data from an Ecuadorian tropical forest. Finally, we discuss differences of this approach to Approximate Bayesian Computing (ABC), another commonly used method to generate simulation-based likelihood approximations. Our results demonstrate that simulation-based inference, which offers considerable conceptual advantages over more traditional methods for inverse parameter estimation, can successfully be applied to process-based models of high complexity. The methodology is particularly suited to heterogeneous and complex data structures and can easily be adjusted to other model types, including most stochastic population and individual-based models. Our study therefore provides a blueprint for a fairly general approach to parameter estimation of stochastic process-based models in ecology and evolution.
Psychometric Properties of IRT Proficiency Estimates
ERIC Educational Resources Information Center
Kolen, Michael J.; Tong, Ye
2010-01-01
Psychometric properties of item response theory proficiency estimates are considered in this paper. Proficiency estimators based on summed scores and pattern scores include non-Bayes maximum likelihood and test characteristic curve estimators and Bayesian estimators. The psychometric properties investigated include reliability, conditional…
Likelihoods for fixed rank nomination networks
HOFF, PETER; FOSDICK, BAILEY; VOLFOVSKY, ALEX; STOVEL, KATHERINE
2014-01-01
Many studies that gather social network data use survey methods that lead to censored, missing, or otherwise incomplete information. For example, the popular fixed rank nomination (FRN) scheme, often used in studies of schools and businesses, asks study participants to nominate and rank at most a small number of contacts or friends, leaving the existence of other relations uncertain. However, most statistical models are formulated in terms of completely observed binary networks. Statistical analyses of FRN data with such models ignore the censored and ranked nature of the data and could potentially result in misleading statistical inference. To investigate this possibility, we compare Bayesian parameter estimates obtained from a likelihood for complete binary networks with those obtained from likelihoods that are derived from the FRN scheme, and therefore accommodate the ranked and censored nature of the data. We show analytically and via simulation that the binary likelihood can provide misleading inference, particularly for certain model parameters that relate network ties to characteristics of individuals and pairs of individuals. We also compare these different likelihoods in a data analysis of several adolescent social networks. For some of these networks, the parameter estimates from the binary and FRN likelihoods lead to different conclusions, indicating the importance of analyzing FRN data with a method that accounts for the FRN survey design. PMID:25110586
Maximizing the significance in Higgs boson pair analyses [Mad-Maximized Higgs Pair Analyses
Kling, Felix; Plehn, Tilman; Schichtel, Peter
2017-02-22
Here, we study Higgs pair production with a subsequent decay to a pair of photons and a pair of bottoms at the LHC. We use the log-likelihood ratio to identify the kinematic regions which either allow us to separate the di-Higgs signal from backgrounds or to determine the Higgs self-coupling. We find that both regions are separate enough to ensure that details of the background modeling will not affect the determination of the self-coupling. Assuming dominant statistical uncertainties we determine the best precision with which the Higgs self-coupling can be probed in this channel. We finally comment on the samemore » questions at a future 100 TeV collider.« less
Maximizing the significance in Higgs boson pair analyses [Mad-Maximized Higgs Pair Analyses
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kling, Felix; Plehn, Tilman; Schichtel, Peter
Here, we study Higgs pair production with a subsequent decay to a pair of photons and a pair of bottoms at the LHC. We use the log-likelihood ratio to identify the kinematic regions which either allow us to separate the di-Higgs signal from backgrounds or to determine the Higgs self-coupling. We find that both regions are separate enough to ensure that details of the background modeling will not affect the determination of the self-coupling. Assuming dominant statistical uncertainties we determine the best precision with which the Higgs self-coupling can be probed in this channel. We finally comment on the samemore » questions at a future 100 TeV collider.« less
Maximizing Friend-Making Likelihood for Social Activity Organization
2015-05-22
selected individuals may not be able to socialize with each other effectively . 5.2 Performance Evaluation Baseline can only find the optimal solutions of...datasets to demonstrate the efficiency and effectiveness of our proposed algorithm. 1 Introduction With the popularity and accessibility of online...to-face activities are initiated in Meetup3. The activities organized via OSNs cover a wide variety of purposes, e.g., friend gatherings, cocktail
Biological Weapons and Modern Warfare
1991-04-01
every preparation for reducing Its effectiveness and thereby reduce the likelihood of Its use. In order to plan such preparation, It is advantageous to...attack rates could be maximized and the forces using the weapon protected from its effects . In today’s climate, BW agents are also attractive weapons...questions about the agreement’s true effectiveness . Verification of compliance was not addressed. D. World War It: Events during and following World War
Modeling the rejection probability in plant imports.
Surkov, I V; van der Werf, W; van Kooten, O; Lansink, A G J M Oude
2008-06-01
Phytosanitary inspection of imported plants and flowers is a major means for preventing pest invasions through international trade, but in a majority of countries availability of resources prevents inspection of all imports. Prediction of the likelihood of pest infestation in imported shipments could help maximize the efficiency of inspection by targeting inspection on shipments with the highest likelihood of infestation. This paper applies a multinomial logistic (MNL) regression model to data on import inspections of ornamental plant commodities in the Netherlands from 1998 to 2001 to investigate whether it is possible to predict the probability that a shipment will be (i) accepted for import, (ii) rejected for import because of detected pests, or (iii) rejected due to other reasons. Four models were estimated: (i) an all-species model, including all plant imports (136,251 shipments) in the data set, (ii) a four-species model, including records on the four ornamental commodities that accounted for 28.9% of inspected and 49.5% of rejected shipments, and two models for single commodities with large import volumes and percentages of rejections, (iii) Dianthus (16.9% of inspected and 46.3% of rejected shipments), and (iv) Chrysanthemum (6.9 and 8.6%, respectively). All models were highly significant (P < 0.001). The models for Dianthus and Chrysanthemum and for the set of four ornamental commodities showed a better fit to data than the model for all ornamental commodities. Variables that characterized the imported shipment's region of origin, the shipment's size, the company that imported the shipment, and season and year of import, were significant in most of the estimated models. The combined results of this study suggest that the MNL model can be a useful tool for modeling the probability of rejecting imported commodities even with a small set of explanatory variables. The MNL model can be helpful in better targeting of resources for import inspection. The inspecting agencies could enable development of these models by appropriately recording inspection results.
NASA Technical Reports Server (NTRS)
Parrish, R. V.; Steinmetz, G. G.
1972-01-01
A method of parameter extraction for stability and control derivatives of aircraft from flight test data, implementing maximum likelihood estimation, has been developed and successfully applied to actual lateral flight test data from a modern sophisticated jet fighter. This application demonstrates the important role played by the analyst in combining engineering judgment and estimator statistics to yield meaningful results. During the analysis, the problems of uniqueness of the extracted set of parameters and of longitudinal coupling effects were encountered and resolved. The results for all flight runs are presented in tabular form and as time history comparisons between the estimated states and the actual flight test data.
Effect of sampling rate and record length on the determination of stability and control derivatives
NASA Technical Reports Server (NTRS)
Brenner, M. J.; Iliff, K. W.; Whitman, R. K.
1978-01-01
Flight data from five aircraft were used to assess the effects of sampling rate and record length reductions on estimates of stability and control derivatives produced by a maximum likelihood estimation method. Derivatives could be extracted from flight data with the maximum likelihood estimation method even if there were considerable reductions in sampling rate and/or record length. Small amplitude pulse maneuvers showed greater degradation of the derivative maneuvers than large amplitude pulse maneuvers when these reductions were made. Reducing the sampling rate was found to be more desirable than reducing the record length as a method of lessening the total computation time required without greatly degrading the quantity of the estimates.
Characterization, parameter estimation, and aircraft response statistics of atmospheric turbulence
NASA Technical Reports Server (NTRS)
Mark, W. D.
1981-01-01
A nonGaussian three component model of atmospheric turbulence is postulated that accounts for readily observable features of turbulence velocity records, their autocorrelation functions, and their spectra. Methods for computing probability density functions and mean exceedance rates of a generic aircraft response variable are developed using nonGaussian turbulence characterizations readily extracted from velocity recordings. A maximum likelihood method is developed for optimal estimation of the integral scale and intensity of records possessing von Karman transverse of longitudinal spectra. Formulas for the variances of such parameter estimates are developed. The maximum likelihood and least-square approaches are combined to yield a method for estimating the autocorrelation function parameters of a two component model for turbulence.
Objectively combining AR5 instrumental period and paleoclimate climate sensitivity evidence
NASA Astrophysics Data System (ADS)
Lewis, Nicholas; Grünwald, Peter
2018-03-01
Combining instrumental period evidence regarding equilibrium climate sensitivity with largely independent paleoclimate proxy evidence should enable a more constrained sensitivity estimate to be obtained. Previous, subjective Bayesian approaches involved selection of a prior probability distribution reflecting the investigators' beliefs about climate sensitivity. Here a recently developed approach employing two different statistical methods—objective Bayesian and frequentist likelihood-ratio—is used to combine instrumental period and paleoclimate evidence based on data presented and assessments made in the IPCC Fifth Assessment Report. Probabilistic estimates from each source of evidence are represented by posterior probability density functions (PDFs) of physically-appropriate form that can be uniquely factored into a likelihood function and a noninformative prior distribution. The three-parameter form is shown accurately to fit a wide range of estimated climate sensitivity PDFs. The likelihood functions relating to the probabilistic estimates from the two sources are multiplicatively combined and a prior is derived that is noninformative for inference from the combined evidence. A posterior PDF that incorporates the evidence from both sources is produced using a single-step approach, which avoids the order-dependency that would arise if Bayesian updating were used. Results are compared with an alternative approach using the frequentist signed root likelihood ratio method. Results from these two methods are effectively identical, and provide a 5-95% range for climate sensitivity of 1.1-4.05 K (median 1.87 K).
Meyer, Karin; Kirkpatrick, Mark
2005-01-01
Principal component analysis is a widely used 'dimension reduction' technique, albeit generally at a phenotypic level. It is shown that we can estimate genetic principal components directly through a simple reparameterisation of the usual linear, mixed model. This is applicable to any analysis fitting multiple, correlated genetic effects, whether effects for individual traits or sets of random regression coefficients to model trajectories. Depending on the magnitude of genetic correlation, a subset of the principal component generally suffices to capture the bulk of genetic variation. Corresponding estimates of genetic covariance matrices are more parsimonious, have reduced rank and are smoothed, with the number of parameters required to model the dispersion structure reduced from k(k + 1)/2 to m(2k - m + 1)/2 for k effects and m principal components. Estimation of these parameters, the largest eigenvalues and pertaining eigenvectors of the genetic covariance matrix, via restricted maximum likelihood using derivatives of the likelihood, is described. It is shown that reduced rank estimation can reduce computational requirements of multivariate analyses substantially. An application to the analysis of eight traits recorded via live ultrasound scanning of beef cattle is given. PMID:15588566
A simulation study on Bayesian Ridge regression models for several collinearity levels
NASA Astrophysics Data System (ADS)
Efendi, Achmad; Effrihan
2017-12-01
When analyzing data with multiple regression model if there are collinearities, then one or several predictor variables are usually omitted from the model. However, there sometimes some reasons, for instance medical or economic reasons, the predictors are all important and should be included in the model. Ridge regression model is not uncommon in some researches to use to cope with collinearity. Through this modeling, weights for predictor variables are used for estimating parameters. The next estimation process could follow the concept of likelihood. Furthermore, for the estimation nowadays the Bayesian version could be an alternative. This estimation method does not match likelihood one in terms of popularity due to some difficulties; computation and so forth. Nevertheless, with the growing improvement of computational methodology recently, this caveat should not at the moment become a problem. This paper discusses about simulation process for evaluating the characteristic of Bayesian Ridge regression parameter estimates. There are several simulation settings based on variety of collinearity levels and sample sizes. The results show that Bayesian method gives better performance for relatively small sample sizes, and for other settings the method does perform relatively similar to the likelihood method.
Nordgren, Birgitta; Fridén, Cecilia; Jansson, Eva; Österlund, Ted; Grooten, Wilhelmus Johannes; Opava, Christina H; Rickenlund, Anette
2014-09-17
Aerobic capacity tests are important to evaluate exercise programs and to encourage individuals to have a physically active lifestyle. Submaximal tests, if proven valid and reliable could be used for estimation of maximal oxygen uptake (VO2max). The purpose of the study was to examine the criterion-validity of the submaximal self-monitoring Fox-walk test and the submaximal Åstrand cycle test against a maximal cycle test in people with rheumatoid arthritis (RA). A secondary aim was to study the influence of different formulas for age predicted maximal heart rate when estimating VO2max by the Åstrand test. Twenty seven subjects (81% female), mean (SD) age 62 (8.1) years, diagnosed with RA since 17.9 (11.7) years, participated in the study. They performed the Fox-walk test (775 meters), the Åstrand test and the maximal cycle test (measured VO2max test). Pearson's correlation coefficients were calculated to determine the direction and strength of the association between the tests, and paired t-tests were used to test potential differences between the tests. Bland and Altman methods were used to assess whether there was any systematic disagreement between the submaximal tests and the maximal test. The correlation between the estimated and measured VO2max values were strong and ranged between r = 0.52 and r = 0.82 including the use of different formulas for age predicted maximal heart rate, when estimating VO2max by the Åstrand test. VO2max was overestimated by 30% by the Fox-walk test and underestimated by 10% by the Åstrand test corrected for age. When the different formulas for age predicted maximal heart rate were used, the results showed that two formulas better predicted maximal heart rate and consequently a more precise estimation of VO2max. Despite the fact that the Fox-walk test overestimated VO2max substantially, the test is a promising method for self-monitoring VO2max and further development of the test is encouraged. The Åstrand test should be considered as highly valid and feasible and the two newly developed formulas for predicting maximal heart rate according to age are preferable to use when estimating VO2max by the Åstrand test.
Relationships between maximal anaerobic power of the arms and legs and javelin performance.
Bouhlel, E; Chelly, M S; Tabka, Z; Shephard, R
2007-06-01
The aim of this study was to examine relationships between maximal anaerobic power, as measured by leg and arm force-velocity tests, estimates of local muscle volume and javelin performance. Ten trained national level male javelin throwers (mean age 19.6+/- 2 years) participated in this study. Maximal anaerobic power, maximal force and maximal velocity were measured during leg (Wmax-L) and arm (Wmax-A) force-velocity tests, performed on appropriately modified forms of Monark cycle ergometer. Estimates of leg and arm muscle volume were made using a standard anthropometric kit. Maximal force of the leg (Fmax-L) was significantly correlated with estimated leg muscle volume (r=0.71, P<0.05). Wmax-L and Wmax-A were both significantly correlated with javelin performance (r=0.76, P<0.01; r=0.71, P <0.05, respectively). Maximal velocity of the leg (Vmax-L) was also significantly correlated with throwing performance (r=0.83; P<0.001). Wmax of both legs and arms were significantly correlated with javelin performance, the closest correlation being for Wmax-L; this emphasizes the importance of the leg muscles in this sport. Fmax-L and Vmax-L were related to muscle volume and to javelin performance, respectively. Force-velocity testing may have value in regulating conditioning and rehabilitation in sports involving throwing.
Exponential series approaches for nonparametric graphical models
NASA Astrophysics Data System (ADS)
Janofsky, Eric
Markov Random Fields (MRFs) or undirected graphical models are parsimonious representations of joint probability distributions. This thesis studies high-dimensional, continuous-valued pairwise Markov Random Fields. We are particularly interested in approximating pairwise densities whose logarithm belongs to a Sobolev space. For this problem we propose the method of exponential series which approximates the log density by a finite-dimensional exponential family with the number of sufficient statistics increasing with the sample size. We consider two approaches to estimating these models. The first is regularized maximum likelihood. This involves optimizing the sum of the log-likelihood of the data and a sparsity-inducing regularizer. We then propose a variational approximation to the likelihood based on tree-reweighted, nonparametric message passing. This approximation allows for upper bounds on risk estimates, leverages parallelization and is scalable to densities on hundreds of nodes. We show how the regularized variational MLE may be estimated using a proximal gradient algorithm. We then consider estimation using regularized score matching. This approach uses an alternative scoring rule to the log-likelihood, which obviates the need to compute the normalizing constant of the distribution. For general continuous-valued exponential families, we provide parameter and edge consistency results. As a special case we detail a new approach to sparse precision matrix estimation which has statistical performance competitive with the graphical lasso and computational performance competitive with the state-of-the-art glasso algorithm. We then describe results for model selection in the nonparametric pairwise model using exponential series. The regularized score matching problem is shown to be a convex program; we provide scalable algorithms based on consensus alternating direction method of multipliers (ADMM) and coordinate-wise descent. We use simulations to compare our method to others in the literature as well as the aforementioned TRW estimator.
Challenges in Species Tree Estimation Under the Multispecies Coalescent Model
Xu, Bo; Yang, Ziheng
2016-01-01
The multispecies coalescent (MSC) model has emerged as a powerful framework for inferring species phylogenies while accounting for ancestral polymorphism and gene tree-species tree conflict. A number of methods have been developed in the past few years to estimate the species tree under the MSC. The full likelihood methods (including maximum likelihood and Bayesian inference) average over the unknown gene trees and accommodate their uncertainties properly but involve intensive computation. The approximate or summary coalescent methods are computationally fast and are applicable to genomic datasets with thousands of loci, but do not make an efficient use of information in the multilocus data. Most of them take the two-step approach of reconstructing the gene trees for multiple loci by phylogenetic methods and then treating the estimated gene trees as observed data, without accounting for their uncertainties appropriately. In this article we review the statistical nature of the species tree estimation problem under the MSC, and explore the conceptual issues and challenges of species tree estimation by focusing mainly on simple cases of three or four closely related species. We use mathematical analysis and computer simulation to demonstrate that large differences in statistical performance may exist between the two classes of methods. We illustrate that several counterintuitive behaviors may occur with the summary methods but they are due to inefficient use of information in the data by summary methods and vanish when the data are analyzed using full-likelihood methods. These include (i) unidentifiability of parameters in the model, (ii) inconsistency in the so-called anomaly zone, (iii) singularity on the likelihood surface, and (iv) deterioration of performance upon addition of more data. We discuss the challenges and strategies of species tree inference for distantly related species when the molecular clock is violated, and highlight the need for improving the computational efficiency and model realism of the likelihood methods as well as the statistical efficiency of the summary methods. PMID:27927902
A proposed selection index for feedlot profitability based on estimated breeding values.
van der Westhuizen, R R; van der Westhuizen, J
2009-04-22
It is generally accepted that feed intake and growth (gain) are the most important economic components when calculating profitability in a growth test or feedlot. We developed a single post-weaning growth (feedlot) index based on the economic values of different components. Variance components, heritabilities and genetic correlations for and between initial weight (IW), final weight (FW), feed intake (FI), and shoulder height (SHD) were estimated by multitrait restricted maximum likelihood procedures. The estimated breeding values (EBVs) and the economic values for IW, FW and FI were used in a selection index to estimate a post-weaning or feedlot profitability value. Heritabilities for IW, FW, FI, and SHD were 0.41, 0.40, 0.33, and 0.51, respectively. The highest genetic correlations were 0.78 (between IW and FW) and 0.70 (between FI and FW). EBVs were used in a selection index to calculate a single economical value for each animal. This economic value is an indication of the gross profitability value or the gross test value (GTV) of the animal in a post-weaning growth test. GTVs varied between -R192.17 and R231.38 with an average of R9.31 and a standard deviation of R39.96. The Pearson correlations between EBVs (for production and efficiency traits) and GTV ranged from -0.51 to 0.68. The lowest correlation (closest to zero) was 0.26 between the Kleiber ratio and GTV. Correlations of 0.68 and -0.51 were estimated between average daily gain and GTV and feed conversion ratio and GTV, respectively. These results showed that it is possible to select for GTV. The selection index can benefit feedlotting in selecting offspring of bulls with high GTVs to maximize profitability.
Cheung, Li C; Pan, Qing; Hyun, Noorie; Schiffman, Mark; Fetterman, Barbara; Castle, Philip E; Lorey, Thomas; Katki, Hormuzd A
2017-09-30
For cost-effectiveness and efficiency, many large-scale general-purpose cohort studies are being assembled within large health-care providers who use electronic health records. Two key features of such data are that incident disease is interval-censored between irregular visits and there can be pre-existing (prevalent) disease. Because prevalent disease is not always immediately diagnosed, some disease diagnosed at later visits are actually undiagnosed prevalent disease. We consider prevalent disease as a point mass at time zero for clinical applications where there is no interest in time of prevalent disease onset. We demonstrate that the naive Kaplan-Meier cumulative risk estimator underestimates risks at early time points and overestimates later risks. We propose a general family of mixture models for undiagnosed prevalent disease and interval-censored incident disease that we call prevalence-incidence models. Parameters for parametric prevalence-incidence models, such as the logistic regression and Weibull survival (logistic-Weibull) model, are estimated by direct likelihood maximization or by EM algorithm. Non-parametric methods are proposed to calculate cumulative risks for cases without covariates. We compare naive Kaplan-Meier, logistic-Weibull, and non-parametric estimates of cumulative risk in the cervical cancer screening program at Kaiser Permanente Northern California. Kaplan-Meier provided poor estimates while the logistic-Weibull model was a close fit to the non-parametric. Our findings support our use of logistic-Weibull models to develop the risk estimates that underlie current US risk-based cervical cancer screening guidelines. Published 2017. This article has been contributed to by US Government employees and their work is in the public domain in the USA. Published 2017. This article has been contributed to by US Government employees and their work is in the public domain in the USA.
COSMIC MICROWAVE BACKGROUND LIKELIHOOD APPROXIMATION FOR BANDED PROBABILITY DISTRIBUTIONS
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gjerløw, E.; Mikkelsen, K.; Eriksen, H. K.
We investigate sets of random variables that can be arranged sequentially such that a given variable only depends conditionally on its immediate predecessor. For such sets, we show that the full joint probability distribution may be expressed exclusively in terms of uni- and bivariate marginals. Under the assumption that the cosmic microwave background (CMB) power spectrum likelihood only exhibits correlations within a banded multipole range, Δl{sub C}, we apply this expression to two outstanding problems in CMB likelihood analysis. First, we derive a statistically well-defined hybrid likelihood estimator, merging two independent (e.g., low- and high-l) likelihoods into a single expressionmore » that properly accounts for correlations between the two. Applying this expression to the Wilkinson Microwave Anisotropy Probe (WMAP) likelihood, we verify that the effect of correlations on cosmological parameters in the transition region is negligible in terms of cosmological parameters for WMAP; the largest relative shift seen for any parameter is 0.06σ. However, because this may not hold for other experimental setups (e.g., for different instrumental noise properties or analysis masks), but must rather be verified on a case-by-case basis, we recommend our new hybridization scheme for future experiments for statistical self-consistency reasons. Second, we use the same expression to improve the convergence rate of the Blackwell-Rao likelihood estimator, reducing the required number of Monte Carlo samples by several orders of magnitude, and thereby extend it to high-l applications.« less
2015-01-01
This research has the purpose to establish a foundation for new classification and estimation of CDMA signals. Keywords: DS / CDMA signals, BPSK, QPSK...DEVELOPMENT OF THE AVERAGE LIKELIHOOD FUNCTION FOR CODE DIVISION MULTIPLE ACCESS ( CDMA ) USING BPSK AND QPSK SYMBOLS JANUARY 2015...To) OCT 2013 – OCT 2014 4. TITLE AND SUBTITLE DEVELOPMENT OF THE AVERAGE LIKELIHOOD FUNCTION FOR CODE DIVISION MULTIPLE ACCESS ( CDMA ) USING BPSK
Estimation of brood and nest survival: Comparative methods in the presence of heterogeneity
Manly, Bryan F.J.; Schmutz, Joel A.
2001-01-01
The Mayfield method has been widely used for estimating survival of nests and young animals, especially when data are collected at irregular observation intervals. However, this method assumes survival is constant throughout the study period, which often ignores biologically relevant variation and may lead to biased survival estimates. We examined the bias and accuracy of 1 modification to the Mayfield method that allows for temporal variation in survival, and we developed and similarly tested 2 additional methods. One of these 2 new methods is simply an iterative extension of Klett and Johnson's method, which we refer to as the Iterative Mayfield method and bears similarity to Kaplan-Meier methods. The other method uses maximum likelihood techniques for estimation and is best applied to survival of animals in groups or families, rather than as independent individuals. We also examined how robust these estimators are to heterogeneity in the data, which can arise from such sources as dependent survival probabilities among siblings, inherent differences among families, and adoption. Testing of estimator performance with respect to bias, accuracy, and heterogeneity was done using simulations that mimicked a study of survival of emperor goose (Chen canagica) goslings. Assuming constant survival for inappropriately long periods of time or use of Klett and Johnson's methods resulted in large bias or poor accuracy (often >5% bias or root mean square error) compared to our Iterative Mayfield or maximum likelihood methods. Overall, estimator performance was slightly better with our Iterative Mayfield than our maximum likelihood method, but the maximum likelihood method provides a more rigorous framework for testing covariates and explicity models a heterogeneity factor. We demonstrated use of all estimators with data from emperor goose goslings. We advocate that future studies use the new methods outlined here rather than the traditional Mayfield method or its previous modifications.
Fithian, William; Elith, Jane; Hastie, Trevor; Keith, David A
2015-04-01
Presence-only records may provide data on the distributions of rare species, but commonly suffer from large, unknown biases due to their typically haphazard collection schemes. Presence-absence or count data collected in systematic, planned surveys are more reliable but typically less abundant.We proposed a probabilistic model to allow for joint analysis of presence-only and survey data to exploit their complementary strengths. Our method pools presence-only and presence-absence data for many species and maximizes a joint likelihood, simultaneously estimating and adjusting for the sampling bias affecting the presence-only data. By assuming that the sampling bias is the same for all species, we can borrow strength across species to efficiently estimate the bias and improve our inference from presence-only data.We evaluate our model's performance on data for 36 eucalypt species in south-eastern Australia. We find that presence-only records exhibit a strong sampling bias towards the coast and towards Sydney, the largest city. Our data-pooling technique substantially improves the out-of-sample predictive performance of our model when the amount of available presence-absence data for a given species is scarceIf we have only presence-only data and no presence-absence data for a given species, but both types of data for several other species that suffer from the same spatial sampling bias, then our method can obtain an unbiased estimate of the first species' geographic range.
Fithian, William; Elith, Jane; Hastie, Trevor; Keith, David A.
2016-01-01
Summary Presence-only records may provide data on the distributions of rare species, but commonly suffer from large, unknown biases due to their typically haphazard collection schemes. Presence–absence or count data collected in systematic, planned surveys are more reliable but typically less abundant.We proposed a probabilistic model to allow for joint analysis of presence-only and survey data to exploit their complementary strengths. Our method pools presence-only and presence–absence data for many species and maximizes a joint likelihood, simultaneously estimating and adjusting for the sampling bias affecting the presence-only data. By assuming that the sampling bias is the same for all species, we can borrow strength across species to efficiently estimate the bias and improve our inference from presence-only data.We evaluate our model’s performance on data for 36 eucalypt species in south-eastern Australia. We find that presence-only records exhibit a strong sampling bias towards the coast and towards Sydney, the largest city. Our data-pooling technique substantially improves the out-of-sample predictive performance of our model when the amount of available presence–absence data for a given species is scarceIf we have only presence-only data and no presence–absence data for a given species, but both types of data for several other species that suffer from the same spatial sampling bias, then our method can obtain an unbiased estimate of the first species’ geographic range. PMID:27840673
Image enhancement using the hypothesis selection filter: theory and application to JPEG decoding.
Wong, Tak-Shing; Bouman, Charles A; Pollak, Ilya
2013-03-01
We introduce the hypothesis selection filter (HSF) as a new approach for image quality enhancement. We assume that a set of filters has been selected a priori to improve the quality of a distorted image containing regions with different characteristics. At each pixel, HSF uses a locally computed feature vector to predict the relative performance of the filters in estimating the corresponding pixel intensity in the original undistorted image. The prediction result then determines the proportion of each filter used to obtain the final processed output. In this way, the HSF serves as a framework for combining the outputs of a number of different user selected filters, each best suited for a different region of an image. We formulate our scheme in a probabilistic framework where the HSF output is obtained as the Bayesian minimum mean square error estimate of the original image. Maximum likelihood estimates of the model parameters are determined from an offline fully unsupervised training procedure that is derived from the expectation-maximization algorithm. To illustrate how to apply the HSF and to demonstrate its potential, we apply our scheme as a post-processing step to improve the decoding quality of JPEG-encoded document images. The scheme consistently improves the quality of the decoded image over a variety of image content with different characteristics. We show that our scheme results in quantitative improvements over several other state-of-the-art JPEG decoding methods.
Kendall, W.L.; Nichols, J.D.; Hines, J.E.
1997-01-01
Statistical inference for capture-recapture studies of open animal populations typically relies on the assumption that all emigration from the studied population is permanent. However, there are many instances in which this assumption is unlikely to be met. We define two general models for the process of temporary emigration, completely random and Markovian. We then consider effects of these two types of temporary emigration on Jolly-Seber (Seber 1982) estimators and on estimators arising from the full-likelihood approach of Kendall et al. (1995) to robust design data. Capture-recapture data arising from Pollock's (1982) robust design provide the basis for obtaining unbiased estimates of demographic parameters in the presence of temporary emigration and for estimating the probability of temporary emigration. We present a likelihood-based approach to dealing with temporary emigration that permits estimation under different models of temporary emigration and yields tests for completely random and Markovian emigration. In addition, we use the relationship between capture probability estimates based on closed and open models under completely random temporary emigration to derive three ad hoc estimators for the probability of temporary emigration, two of which should be especially useful in situations where capture probabilities are heterogeneous among individual animals. Ad hoc and full-likelihood estimators are illustrated for small mammal capture-recapture data sets. We believe that these models and estimators will be useful for testing hypotheses about the process of temporary emigration, for estimating demographic parameters in the presence of temporary emigration, and for estimating probabilities of temporary emigration. These latter estimates are frequently of ecological interest as indicators of animal movement and, in some sampling situations, as direct estimates of breeding probabilities and proportions.
2015-08-01
McCullagh, P.; Nelder, J.A. Generalized Linear Model , 2nd ed.; Chapman and Hall: London, 1989. 7. Johnston, J. Econometric Methods, 3rd ed.; McGraw...FOR A DOSE-RESPONSE MODEL ECBC-TN-068 Kyong H. Park Steven J. Lagan RESEARCH AND TECHNOLOGY DIRECTORATE August 2015 Approved for public release...Likelihood Estimation Method for Completely Separated and Quasi-Completely Separated Data for a Dose-Response Model 5a. CONTRACT NUMBER 5b. GRANT
Campos-Filho, N; Franco, E L
1989-02-01
A frequent procedure in matched case-control studies is to report results from the multivariate unmatched analyses if they do not differ substantially from the ones obtained after conditioning on the matching variables. Although conceptually simple, this rule requires that an extensive series of logistic regression models be evaluated by both the conditional and unconditional maximum likelihood methods. Most computer programs for logistic regression employ only one maximum likelihood method, which requires that the analyses be performed in separate steps. This paper describes a Pascal microcomputer (IBM PC) program that performs multiple logistic regression by both maximum likelihood estimation methods, which obviates the need for switching between programs to obtain relative risk estimates from both matched and unmatched analyses. The program calculates most standard statistics and allows factoring of categorical or continuous variables by two distinct methods of contrast. A built-in, descriptive statistics option allows the user to inspect the distribution of cases and controls across categories of any given variable.
A Game Theoretical Approach to Hacktivism: Is Attack Likelihood a Product of Risks and Payoffs?
Bodford, Jessica E; Kwan, Virginia S Y
2018-02-01
The current study examines hacktivism (i.e., hacking to convey a moral, ethical, or social justice message) through a general game theoretic framework-that is, as a product of costs and benefits. Given the inherent risk of carrying out a hacktivist attack (e.g., legal action, imprisonment), it would be rational for the user to weigh these risks against perceived benefits of carrying out the attack. As such, we examined computer science students' estimations of risks, payoffs, and attack likelihood through a game theoretic design. Furthermore, this study aims at constructing a descriptive profile of potential hacktivists, exploring two predicted covariates of attack decision making, namely, peer prevalence of hacking and sex differences. Contrary to expectations, results suggest that participants' estimations of attack likelihood stemmed solely from expected payoffs, rather than subjective risks. Peer prevalence significantly predicted increased payoffs and attack likelihood, suggesting an underlying descriptive norm in social networks. Notably, we observed no sex differences in the decision to attack, nor in the factors predicting attack likelihood. Implications for policymakers and the understanding and prevention of hacktivism are discussed, as are the possible ramifications of widely communicated payoffs over potential risks in hacking communities.
Maximum-likelihood fitting of data dominated by Poisson statistical uncertainties
DOE Office of Scientific and Technical Information (OSTI.GOV)
Stoneking, M.R.; Den Hartog, D.J.
1996-06-01
The fitting of data by {chi}{sup 2}-minimization is valid only when the uncertainties in the data are normally distributed. When analyzing spectroscopic or particle counting data at very low signal level (e.g., a Thomson scattering diagnostic), the uncertainties are distributed with a Poisson distribution. The authors have developed a maximum-likelihood method for fitting data that correctly treats the Poisson statistical character of the uncertainties. This method maximizes the total probability that the observed data are drawn from the assumed fit function using the Poisson probability function to determine the probability for each data point. The algorithm also returns uncertainty estimatesmore » for the fit parameters. They compare this method with a {chi}{sup 2}-minimization routine applied to both simulated and real data. Differences in the returned fits are greater at low signal level (less than {approximately}20 counts per measurement). the maximum-likelihood method is found to be more accurate and robust, returning a narrower distribution of values for the fit parameters with fewer outliers.« less
The Inverse Problem for Confined Aquifer Flow: Identification and Estimation With Extensions
NASA Astrophysics Data System (ADS)
Loaiciga, Hugo A.; MariñO, Miguel A.
1987-01-01
The contributions of this work are twofold. First, a methodology for estimating the elements of parameter matrices in the governing equation of flow in a confined aquifer is developed. The estimation techniques for the distributed-parameter inverse problem pertain to linear least squares and generalized least squares methods. The linear relationship among the known heads and unknown parameters of the flow equation provides the background for developing criteria for determining the identifiability status of unknown parameters. Under conditions of exact or overidentification it is possible to develop statistically consistent parameter estimators and their asymptotic distributions. The estimation techniques, namely, two-stage least squares and three stage least squares, are applied to a specific groundwater inverse problem and compared between themselves and with an ordinary least squares estimator. The three-stage estimator provides the closer approximation to the actual parameter values, but it also shows relatively large standard errors as compared to the ordinary and two-stage estimators. The estimation techniques provide the parameter matrices required to simulate the unsteady groundwater flow equation. Second, a nonlinear maximum likelihood estimation approach to the inverse problem is presented. The statistical properties of maximum likelihood estimators are derived, and a procedure to construct confidence intervals and do hypothesis testing is given. The relative merits of the linear and maximum likelihood estimators are analyzed. Other topics relevant to the identification and estimation methodologies, i.e., a continuous-time solution to the flow equation, coping with noise-corrupted head measurements, and extension of the developed theory to nonlinear cases are also discussed. A simulation study is used to evaluate the methods developed in this study.
Comparing Three Estimation Methods for the Three-Parameter Logistic IRT Model
ERIC Educational Resources Information Center
Lamsal, Sunil
2015-01-01
Different estimation procedures have been developed for the unidimensional three-parameter item response theory (IRT) model. These techniques include the marginal maximum likelihood estimation, the fully Bayesian estimation using Markov chain Monte Carlo simulation techniques, and the Metropolis-Hastings Robbin-Monro estimation. With each…
Efficient Robust Regression via Two-Stage Generalized Empirical Likelihood
Bondell, Howard D.; Stefanski, Leonard A.
2013-01-01
Large- and finite-sample efficiency and resistance to outliers are the key goals of robust statistics. Although often not simultaneously attainable, we develop and study a linear regression estimator that comes close. Efficiency obtains from the estimator’s close connection to generalized empirical likelihood, and its favorable robustness properties are obtained by constraining the associated sum of (weighted) squared residuals. We prove maximum attainable finite-sample replacement breakdown point, and full asymptotic efficiency for normal errors. Simulation evidence shows that compared to existing robust regression estimators, the new estimator has relatively high efficiency for small sample sizes, and comparable outlier resistance. The estimator is further illustrated and compared to existing methods via application to a real data set with purported outliers. PMID:23976805
NASA Technical Reports Server (NTRS)
Grove, R. D.; Mayhew, S. C.
1973-01-01
A computer program (Langley program C1123) has been developed for estimating aircraft stability and control parameters from flight test data. These parameters are estimated by the maximum likelihood estimation procedure implemented on a real-time digital simulation system, which uses the Control Data 6600 computer. This system allows the investigator to interact with the program in order to obtain satisfactory results. Part of this system, the control and display capabilities, is described for this program. This report also describes the computer program by presenting the program variables, subroutines, flow charts, listings, and operational features. Program usage is demonstrated with a test case using pseudo or simulated flight data.
Stability region maximization by decomposition-aggregation method. [Skylab stability
NASA Technical Reports Server (NTRS)
Siljak, D. D.; Cuk, S. M.
1974-01-01
This work is to improve the estimates of the stability regions by formulating and resolving a proper maximization problem. The solution of the problem provides the best estimate of the maximal value of the structural parameter and at the same time yields the optimum comparison system, which can be used to determine the degree of stability of the Skylab. The analysis procedure is completely computerized, resulting in a flexible and powerful tool for stability considerations of large-scale linear as well as nonlinear systems.
Zee, Jarcy; Xie, Sharon X.
2015-01-01
Summary When a true survival endpoint cannot be assessed for some subjects, an alternative endpoint that measures the true endpoint with error may be collected, which often occurs when obtaining the true endpoint is too invasive or costly. We develop an estimated likelihood function for the situation where we have both uncertain endpoints for all participants and true endpoints for only a subset of participants. We propose a nonparametric maximum estimated likelihood estimator of the discrete survival function of time to the true endpoint. We show that the proposed estimator is consistent and asymptotically normal. We demonstrate through extensive simulations that the proposed estimator has little bias compared to the naïve Kaplan-Meier survival function estimator, which uses only uncertain endpoints, and more efficient with moderate missingness compared to the complete-case Kaplan-Meier survival function estimator, which uses only available true endpoints. Finally, we apply the proposed method to a dataset for estimating the risk of developing Alzheimer's disease from the Alzheimer's Disease Neuroimaging Initiative. PMID:25916510
Maximum likelihood phase-retrieval algorithm: applications.
Nahrstedt, D A; Southwell, W H
1984-12-01
The maximum likelihood estimator approach is shown to be effective in determining the wave front aberration in systems involving laser and flow field diagnostics and optical testing. The robustness of the algorithm enables convergence even in cases of severe wave front error and real, nonsymmetrical, obscured amplitude distributions.
Cramer-Rao Bound, MUSIC, and Maximum Likelihood. Effects of Temporal Phase Difference
1990-11-01
Technical Report 1373 November 1990 Cramer-Rao Bound, MUSIC , And Maximum Likelihood Effects of Temporal Phase o Difference C. V. TranI OTIC Approved... MUSIC , and Maximum Likelihood (ML) asymptotic variances corresponding to the two-source direction-of-arrival estimation where sources were modeled as...1pI = 1.00, SNR = 20 dB ..................................... 27 2. MUSIC for two equipowered signals impinging on a 5-element ULA (a) IpI = 0.50, SNR
Bayesian experimental design for models with intractable likelihoods.
Drovandi, Christopher C; Pettitt, Anthony N
2013-12-01
In this paper we present a methodology for designing experiments for efficiently estimating the parameters of models with computationally intractable likelihoods. The approach combines a commonly used methodology for robust experimental design, based on Markov chain Monte Carlo sampling, with approximate Bayesian computation (ABC) to ensure that no likelihood evaluations are required. The utility function considered for precise parameter estimation is based upon the precision of the ABC posterior distribution, which we form efficiently via the ABC rejection algorithm based on pre-computed model simulations. Our focus is on stochastic models and, in particular, we investigate the methodology for Markov process models of epidemics and macroparasite population evolution. The macroparasite example involves a multivariate process and we assess the loss of information from not observing all variables. © 2013, The International Biometric Society.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lee, Youngrok
2013-05-15
Heterogeneity exists on a data set when samples from di erent classes are merged into the data set. Finite mixture models can be used to represent a survival time distribution on heterogeneous patient group by the proportions of each class and by the survival time distribution within each class as well. The heterogeneous data set cannot be explicitly decomposed to homogeneous subgroups unless all the samples are precisely labeled by their origin classes; such impossibility of decomposition is a barrier to overcome for estimating nite mixture models. The expectation-maximization (EM) algorithm has been used to obtain maximum likelihood estimates ofmore » nite mixture models by soft-decomposition of heterogeneous samples without labels for a subset or the entire set of data. In medical surveillance databases we can find partially labeled data, that is, while not completely unlabeled there is only imprecise information about class values. In this study we propose new EM algorithms that take advantages of using such partial labels, and thus incorporate more information than traditional EM algorithms. We particularly propose four variants of the EM algorithm named EM-OCML, EM-PCML, EM-HCML and EM-CPCML, each of which assumes a specific mechanism of missing class values. We conducted a simulation study on exponential survival trees with five classes and showed that the advantages of incorporating substantial amount of partially labeled data can be highly signi cant. We also showed model selection based on AIC values fairly works to select the best proposed algorithm on each specific data set. A case study on a real-world data set of gastric cancer provided by Surveillance, Epidemiology and End Results (SEER) program showed a superiority of EM-CPCML to not only the other proposed EM algorithms but also conventional supervised, unsupervised and semi-supervised learning algorithms.« less
Genovesio, Auguste; Liedl, Tim; Emiliani, Valentina; Parak, Wolfgang J; Coppey-Moisan, Maité; Olivo-Marin, Jean-Christophe
2006-05-01
We propose a method to detect and track multiple moving biological spot-like particles showing different kinds of dynamics in image sequences acquired through multidimensional fluorescence microscopy. It enables the extraction and analysis of information such as number, position, speed, movement, and diffusion phases of, e.g., endosomal particles. The method consists of several stages. After a detection stage performed by a three-dimensional (3-D) undecimated wavelet transform, we compute, for each detected spot, several predictions of its future state in the next frame. This is accomplished thanks to an interacting multiple model (IMM) algorithm which includes several models corresponding to different biologically realistic movement types. Tracks are constructed, thereafter, by a data association algorithm based on the maximization of the likelihood of each IMM. The last stage consists of updating the IMM filters in order to compute final estimations for the present image and to improve predictions for the next image. The performances of the method are validated on synthetic image data and used to characterize the 3-D movement of endocytic vesicles containing quantum dots.
A segmentation/clustering model for the analysis of array CGH data.
Picard, F; Robin, S; Lebarbier, E; Daudin, J-J
2007-09-01
Microarray-CGH (comparative genomic hybridization) experiments are used to detect and map chromosomal imbalances. A CGH profile can be viewed as a succession of segments that represent homogeneous regions in the genome whose representative sequences share the same relative copy number on average. Segmentation methods constitute a natural framework for the analysis, but they do not provide a biological status for the detected segments. We propose a new model for this segmentation/clustering problem, combining a segmentation model with a mixture model. We present a new hybrid algorithm called dynamic programming-expectation maximization (DP-EM) to estimate the parameters of the model by maximum likelihood. This algorithm combines DP and the EM algorithm. We also propose a model selection heuristic to select the number of clusters and the number of segments. An example of our procedure is presented, based on publicly available data sets. We compare our method to segmentation methods and to hidden Markov models, and we show that the new segmentation/clustering model is a promising alternative that can be applied in the more general context of signal processing.
Cytoprophet: a Cytoscape plug-in for protein and domain interaction networks inference.
Morcos, Faruck; Lamanna, Charles; Sikora, Marcin; Izaguirre, Jesús
2008-10-01
Cytoprophet is a software tool that allows prediction and visualization of protein and domain interaction networks. It is implemented as a plug-in of Cytoscape, an open source software framework for analysis and visualization of molecular networks. Cytoprophet implements three algorithms that predict new potential physical interactions using the domain composition of proteins and experimental assays. The algorithms for protein and domain interaction inference include maximum likelihood estimation (MLE) using expectation maximization (EM); the set cover approach maximum specificity set cover (MSSC) and the sum-product algorithm (SPA). After accepting an input set of proteins with Uniprot ID/Accession numbers and a selected prediction algorithm, Cytoprophet draws a network of potential interactions with probability scores and GO distances as edge attributes. A network of domain interactions between the domains of the initial protein list can also be generated. Cytoprophet was designed to take advantage of the visual capabilities of Cytoscape and be simple to use. An example of inference in a signaling network of myxobacterium Myxococcus xanthus is presented and available at Cytoprophet's website. http://cytoprophet.cse.nd.edu.
Mismatch removal via coherent spatial relations
NASA Astrophysics Data System (ADS)
Chen, Jun; Ma, Jiayi; Yang, Changcai; Tian, Jinwen
2014-07-01
We propose a method for removing mismatches from the given putative point correspondences in image pairs based on "coherent spatial relations." Under the Bayesian framework, we formulate our approach as a maximum likelihood problem and solve a coherent spatial relation between the putative point correspondences using an expectation-maximization (EM) algorithm. Our approach associates each point correspondence with a latent variable indicating it as being either an inlier or an outlier, and alternatively estimates the inlier set and recovers the coherent spatial relation. It can handle not only the case of image pairs with rigid motions but also the case of image pairs with nonrigid motions. To parameterize the coherent spatial relation, we choose two-view geometry and thin-plate spline as models for rigid and nonrigid cases, respectively. The mismatches could be successfully removed via the coherent spatial relations after the EM algorithm converges. The quantitative results on various experimental data demonstrate that our method outperforms many state-of-the-art methods, it is not affected by low initial correct match percentages, and is robust to most geometric transformations including a large viewing angle, image rotation, and affine transformation.
Modeling and segmentation of intra-cochlear anatomy in conventional CT
NASA Astrophysics Data System (ADS)
Noble, Jack H.; Rutherford, Robert B.; Labadie, Robert F.; Majdani, Omid; Dawant, Benoit M.
2010-03-01
Cochlear implant surgery is a procedure performed to treat profound hearing loss. Since the cochlea is not visible in surgery, the physician uses anatomical landmarks to estimate the pose of the cochlea. Research has indicated that implanting the electrode in a particular cavity of the cochlea, the scala tympani, results in better hearing restoration. The success of the scala tympani implantation is largely dependent on the point of entry and angle of electrode insertion. Errors can occur due to the imprecise nature of landmark-based, manual navigation as well as inter-patient variations between scala tympani and the anatomical landmarks. In this work, we use point distribution models of the intra-cochlear anatomy to study the inter-patient variations between the cochlea and the typical anatomic landmarks, and we implement an active shape model technique to automatically localize intra-cochlear anatomy in conventional CT images, where intra-cochlear structures are not visible. This fully automatic segmentation could aid the surgeon to choose the point of entry and angle of approach to maximize the likelihood of scala tympani insertion, resulting in more substantial hearing restoration.
Li, Dongming; Sun, Changming; Yang, Jinhua; Liu, Huan; Peng, Jiaqi; Zhang, Lijuan
2017-04-06
An adaptive optics (AO) system provides real-time compensation for atmospheric turbulence. However, an AO image is usually of poor contrast because of the nature of the imaging process, meaning that the image contains information coming from both out-of-focus and in-focus planes of the object, which also brings about a loss in quality. In this paper, we present a robust multi-frame adaptive optics image restoration algorithm via maximum likelihood estimation. Our proposed algorithm uses a maximum likelihood method with image regularization as the basic principle, and constructs the joint log likelihood function for multi-frame AO images based on a Poisson distribution model. To begin with, a frame selection method based on image variance is applied to the observed multi-frame AO images to select images with better quality to improve the convergence of a blind deconvolution algorithm. Then, by combining the imaging conditions and the AO system properties, a point spread function estimation model is built. Finally, we develop our iterative solutions for AO image restoration addressing the joint deconvolution issue. We conduct a number of experiments to evaluate the performances of our proposed algorithm. Experimental results show that our algorithm produces accurate AO image restoration results and outperforms the current state-of-the-art blind deconvolution methods.
Li, Dongming; Sun, Changming; Yang, Jinhua; Liu, Huan; Peng, Jiaqi; Zhang, Lijuan
2017-01-01
An adaptive optics (AO) system provides real-time compensation for atmospheric turbulence. However, an AO image is usually of poor contrast because of the nature of the imaging process, meaning that the image contains information coming from both out-of-focus and in-focus planes of the object, which also brings about a loss in quality. In this paper, we present a robust multi-frame adaptive optics image restoration algorithm via maximum likelihood estimation. Our proposed algorithm uses a maximum likelihood method with image regularization as the basic principle, and constructs the joint log likelihood function for multi-frame AO images based on a Poisson distribution model. To begin with, a frame selection method based on image variance is applied to the observed multi-frame AO images to select images with better quality to improve the convergence of a blind deconvolution algorithm. Then, by combining the imaging conditions and the AO system properties, a point spread function estimation model is built. Finally, we develop our iterative solutions for AO image restoration addressing the joint deconvolution issue. We conduct a number of experiments to evaluate the performances of our proposed algorithm. Experimental results show that our algorithm produces accurate AO image restoration results and outperforms the current state-of-the-art blind deconvolution methods. PMID:28383503
GRID-BASED EXPLORATION OF COSMOLOGICAL PARAMETER SPACE WITH SNAKE
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mikkelsen, K.; Næss, S. K.; Eriksen, H. K., E-mail: kristin.mikkelsen@astro.uio.no
2013-11-10
We present a fully parallelized grid-based parameter estimation algorithm for investigating multidimensional likelihoods called Snake, and apply it to cosmological parameter estimation. The basic idea is to map out the likelihood grid-cell by grid-cell according to decreasing likelihood, and stop when a certain threshold has been reached. This approach improves vastly on the 'curse of dimensionality' problem plaguing standard grid-based parameter estimation simply by disregarding grid cells with negligible likelihood. The main advantages of this method compared to standard Metropolis-Hastings Markov Chain Monte Carlo methods include (1) trivial extraction of arbitrary conditional distributions; (2) direct access to Bayesian evidences; (3)more » better sampling of the tails of the distribution; and (4) nearly perfect parallelization scaling. The main disadvantage is, as in the case of brute-force grid-based evaluation, a dependency on the number of parameters, N{sub par}. One of the main goals of the present paper is to determine how large N{sub par} can be, while still maintaining reasonable computational efficiency; we find that N{sub par} = 12 is well within the capabilities of the method. The performance of the code is tested by comparing cosmological parameters estimated using Snake and the WMAP-7 data with those obtained using CosmoMC, the current standard code in the field. We find fully consistent results, with similar computational expenses, but shorter wall time due to the perfect parallelization scheme.« less
Estimating Model Probabilities using Thermodynamic Markov Chain Monte Carlo Methods
NASA Astrophysics Data System (ADS)
Ye, M.; Liu, P.; Beerli, P.; Lu, D.; Hill, M. C.
2014-12-01
Markov chain Monte Carlo (MCMC) methods are widely used to evaluate model probability for quantifying model uncertainty. In a general procedure, MCMC simulations are first conducted for each individual model, and MCMC parameter samples are then used to approximate marginal likelihood of the model by calculating the geometric mean of the joint likelihood of the model and its parameters. It has been found the method of evaluating geometric mean suffers from the numerical problem of low convergence rate. A simple test case shows that even millions of MCMC samples are insufficient to yield accurate estimation of the marginal likelihood. To resolve this problem, a thermodynamic method is used to have multiple MCMC runs with different values of a heating coefficient between zero and one. When the heating coefficient is zero, the MCMC run is equivalent to a random walk MC in the prior parameter space; when the heating coefficient is one, the MCMC run is the conventional one. For a simple case with analytical form of the marginal likelihood, the thermodynamic method yields more accurate estimate than the method of using geometric mean. This is also demonstrated for a case of groundwater modeling with consideration of four alternative models postulated based on different conceptualization of a confining layer. This groundwater example shows that model probabilities estimated using the thermodynamic method are more reasonable than those obtained using the geometric method. The thermodynamic method is general, and can be used for a wide range of environmental problem for model uncertainty quantification.
WEIGHTED LIKELIHOOD ESTIMATION UNDER TWO-PHASE SAMPLING
Saegusa, Takumi; Wellner, Jon A.
2013-01-01
We develop asymptotic theory for weighted likelihood estimators (WLE) under two-phase stratified sampling without replacement. We also consider several variants of WLEs involving estimated weights and calibration. A set of empirical process tools are developed including a Glivenko–Cantelli theorem, a theorem for rates of convergence of M-estimators, and a Donsker theorem for the inverse probability weighted empirical processes under two-phase sampling and sampling without replacement at the second phase. Using these general results, we derive asymptotic distributions of the WLE of a finite-dimensional parameter in a general semiparametric model where an estimator of a nuisance parameter is estimable either at regular or nonregular rates. We illustrate these results and methods in the Cox model with right censoring and interval censoring. We compare the methods via their asymptotic variances under both sampling without replacement and the more usual (and easier to analyze) assumption of Bernoulli sampling at the second phase. PMID:24563559
Maximum Likelihood Estimation of Nonlinear Structural Equation Models with Ignorable Missing Data
ERIC Educational Resources Information Center
Lee, Sik-Yum; Song, Xin-Yuan; Lee, John C. K.
2003-01-01
The existing maximum likelihood theory and its computer software in structural equation modeling are established on the basis of linear relationships among latent variables with fully observed data. However, in social and behavioral sciences, nonlinear relationships among the latent variables are important for establishing more meaningful models…
A Composite Likelihood Inference in Latent Variable Models for Ordinal Longitudinal Responses
ERIC Educational Resources Information Center
Vasdekis, Vassilis G. S.; Cagnone, Silvia; Moustaki, Irini
2012-01-01
The paper proposes a composite likelihood estimation approach that uses bivariate instead of multivariate marginal probabilities for ordinal longitudinal responses using a latent variable model. The model considers time-dependent latent variables and item-specific random effects to be accountable for the interdependencies of the multivariate…
Multilevel and Latent Variable Modeling with Composite Links and Exploded Likelihoods
ERIC Educational Resources Information Center
Rabe-Hesketh, Sophia; Skrondal, Anders
2007-01-01
Composite links and exploded likelihoods are powerful yet simple tools for specifying a wide range of latent variable models. Applications considered include survival or duration models, models for rankings, small area estimation with census information, models for ordinal responses, item response models with guessing, randomized response models,…
Emura, Takeshi; Konno, Yoshihiko; Michimae, Hirofumi
2015-07-01
Doubly truncated data consist of samples whose observed values fall between the right- and left- truncation limits. With such samples, the distribution function of interest is estimated using the nonparametric maximum likelihood estimator (NPMLE) that is obtained through a self-consistency algorithm. Owing to the complicated asymptotic distribution of the NPMLE, the bootstrap method has been suggested for statistical inference. This paper proposes a closed-form estimator for the asymptotic covariance function of the NPMLE, which is computationally attractive alternative to bootstrapping. Furthermore, we develop various statistical inference procedures, such as confidence interval, goodness-of-fit tests, and confidence bands to demonstrate the usefulness of the proposed covariance estimator. Simulations are performed to compare the proposed method with both the bootstrap and jackknife methods. The methods are illustrated using the childhood cancer dataset.
Estimating residual fault hitting rates by recapture sampling
NASA Technical Reports Server (NTRS)
Lee, Larry; Gupta, Rajan
1988-01-01
For the recapture debugging design introduced by Nayak (1988) the problem of estimating the hitting rates of the faults remaining in the system is considered. In the context of a conditional likelihood, moment estimators are derived and are shown to be asymptotically normal and fully efficient. Fixed sample properties of the moment estimators are compared, through simulation, with those of the conditional maximum likelihood estimators. Properties of the conditional model are investigated such as the asymptotic distribution of linear functions of the fault hitting frequencies and a representation of the full data vector in terms of a sequence of independent random vectors. It is assumed that the residual hitting rates follow a log linear rate model and that the testing process is truncated when the gaps between the detection of new errors exceed a fixed amount of time.
Galili, Tal; Meilijson, Isaac
2016-01-02
The Rao-Blackwell theorem offers a procedure for converting a crude unbiased estimator of a parameter θ into a "better" one, in fact unique and optimal if the improvement is based on a minimal sufficient statistic that is complete. In contrast, behind every minimal sufficient statistic that is not complete, there is an improvable Rao-Blackwell improvement. This is illustrated via a simple example based on the uniform distribution, in which a rather natural Rao-Blackwell improvement is uniformly improvable. Furthermore, in this example the maximum likelihood estimator is inefficient, and an unbiased generalized Bayes estimator performs exceptionally well. Counterexamples of this sort can be useful didactic tools for explaining the true nature of a methodology and possible consequences when some of the assumptions are violated. [Received December 2014. Revised September 2015.].
New robust statistical procedures for the polytomous logistic regression models.
Castilla, Elena; Ghosh, Abhik; Martin, Nirian; Pardo, Leandro
2018-05-17
This article derives a new family of estimators, namely the minimum density power divergence estimators, as a robust generalization of the maximum likelihood estimator for the polytomous logistic regression model. Based on these estimators, a family of Wald-type test statistics for linear hypotheses is introduced. Robustness properties of both the proposed estimators and the test statistics are theoretically studied through the classical influence function analysis. Appropriate real life examples are presented to justify the requirement of suitable robust statistical procedures in place of the likelihood based inference for the polytomous logistic regression model. The validity of the theoretical results established in the article are further confirmed empirically through suitable simulation studies. Finally, an approach for the data-driven selection of the robustness tuning parameter is proposed with empirical justifications. © 2018, The International Biometric Society.
Rosenblum, Michael; van der Laan, Mark J.
2010-01-01
Models, such as logistic regression and Poisson regression models, are often used to estimate treatment effects in randomized trials. These models leverage information in variables collected before randomization, in order to obtain more precise estimates of treatment effects. However, there is the danger that model misspecification will lead to bias. We show that certain easy to compute, model-based estimators are asymptotically unbiased even when the working model used is arbitrarily misspecified. Furthermore, these estimators are locally efficient. As a special case of our main result, we consider a simple Poisson working model containing only main terms; in this case, we prove the maximum likelihood estimate of the coefficient corresponding to the treatment variable is an asymptotically unbiased estimator of the marginal log rate ratio, even when the working model is arbitrarily misspecified. This is the log-linear analog of ANCOVA for linear models. Our results demonstrate one application of targeted maximum likelihood estimation. PMID:20628636
Dang, Cuong Cao; Lefort, Vincent; Le, Vinh Sy; Le, Quang Si; Gascuel, Olivier
2011-10-01
Amino acid replacement rate matrices are an essential basis of protein studies (e.g. in phylogenetics and alignment). A number of general purpose matrices have been proposed (e.g. JTT, WAG, LG) since the seminal work of Margaret Dayhoff and co-workers. However, it has been shown that matrices specific to certain protein groups (e.g. mitochondrial) or life domains (e.g. viruses) differ significantly from general average matrices, and thus perform better when applied to the data to which they are dedicated. This Web server implements the maximum-likelihood estimation procedure that was used to estimate LG, and provides a number of tools and facilities. Users upload a set of multiple protein alignments from their domain of interest and receive the resulting matrix by email, along with statistics and comparisons with other matrices. A non-parametric bootstrap is performed optionally to assess the variability of replacement rate estimates. Maximum-likelihood trees, inferred using the estimated rate matrix, are also computed optionally for each input alignment. Finely tuned procedures and up-to-date ML software (PhyML 3.0, XRATE) are combined to perform all these heavy calculations on our clusters. http://www.atgc-montpellier.fr/ReplacementMatrix/ olivier.gascuel@lirmm.fr Supplementary data are available at http://www.atgc-montpellier.fr/ReplacementMatrix/
DOE Office of Scientific and Technical Information (OSTI.GOV)
West, R. Derek; Gunther, Jacob H.; Moon, Todd K.
In this study, we derive a comprehensive forward model for the data collected by stripmap synthetic aperture radar (SAR) that is linear in the ground reflectivity parameters. It is also shown that if the noise model is additive, then the forward model fits into the linear statistical model framework, and the ground reflectivity parameters can be estimated by statistical methods. We derive the maximum likelihood (ML) estimates for the ground reflectivity parameters in the case of additive white Gaussian noise. Furthermore, we show that obtaining the ML estimates of the ground reflectivity requires two steps. The first step amounts tomore » a cross-correlation of the data with a model of the data acquisition parameters, and it is shown that this step has essentially the same processing as the so-called convolution back-projection algorithm. The second step is a complete system inversion that is capable of mitigating the sidelobes of the spatially variant impulse responses remaining after the correlation processing. We also state the Cramer-Rao lower bound (CRLB) for the ML ground reflectivity estimates.We show that the CRLB is linked to the SAR system parameters, the flight path of the SAR sensor, and the image reconstruction grid.We demonstrate the ML image formation and the CRLB bound for synthetically generated data.« less
West, R. Derek; Gunther, Jacob H.; Moon, Todd K.
2016-12-01
In this study, we derive a comprehensive forward model for the data collected by stripmap synthetic aperture radar (SAR) that is linear in the ground reflectivity parameters. It is also shown that if the noise model is additive, then the forward model fits into the linear statistical model framework, and the ground reflectivity parameters can be estimated by statistical methods. We derive the maximum likelihood (ML) estimates for the ground reflectivity parameters in the case of additive white Gaussian noise. Furthermore, we show that obtaining the ML estimates of the ground reflectivity requires two steps. The first step amounts tomore » a cross-correlation of the data with a model of the data acquisition parameters, and it is shown that this step has essentially the same processing as the so-called convolution back-projection algorithm. The second step is a complete system inversion that is capable of mitigating the sidelobes of the spatially variant impulse responses remaining after the correlation processing. We also state the Cramer-Rao lower bound (CRLB) for the ML ground reflectivity estimates.We show that the CRLB is linked to the SAR system parameters, the flight path of the SAR sensor, and the image reconstruction grid.We demonstrate the ML image formation and the CRLB bound for synthetically generated data.« less
A general methodology for maximum likelihood inference from band-recovery data
Conroy, M.J.; Williams, B.K.
1984-01-01
A numerical procedure is described for obtaining maximum likelihood estimates and associated maximum likelihood inference from band- recovery data. The method is used to illustrate previously developed one-age-class band-recovery models, and is extended to new models, including the analysis with a covariate for survival rates and variable-time-period recovery models. Extensions to R-age-class band- recovery, mark-recapture models, and twice-yearly marking are discussed. A FORTRAN program provides computations for these models.
Program for Weibull Analysis of Fatigue Data
NASA Technical Reports Server (NTRS)
Krantz, Timothy L.
2005-01-01
A Fortran computer program has been written for performing statistical analyses of fatigue-test data that are assumed to be adequately represented by a two-parameter Weibull distribution. This program calculates the following: (1) Maximum-likelihood estimates of the Weibull distribution; (2) Data for contour plots of relative likelihood for two parameters; (3) Data for contour plots of joint confidence regions; (4) Data for the profile likelihood of the Weibull-distribution parameters; (5) Data for the profile likelihood of any percentile of the distribution; and (6) Likelihood-based confidence intervals for parameters and/or percentiles of the distribution. The program can account for tests that are suspended without failure (the statistical term for such suspension of tests is "censoring"). The analytical approach followed in this program for the software is valid for type-I censoring, which is the removal of unfailed units at pre-specified times. Confidence regions and intervals are calculated by use of the likelihood-ratio method.
Benedict, Matthew N.; Mundy, Michael B.; Henry, Christopher S.; ...
2014-10-16
Genome-scale metabolic models provide a powerful means to harness information from genomes to deepen biological insights. With exponentially increasing sequencing capacity, there is an enormous need for automated reconstruction techniques that can provide more accurate models in a short time frame. Current methods for automated metabolic network reconstruction rely on gene and reaction annotations to build draft metabolic networks and algorithms to fill gaps in these networks. However, automated reconstruction is hampered by database inconsistencies, incorrect annotations, and gap filling largely without considering genomic information. Here we develop an approach for applying genomic information to predict alternative functions for genesmore » and estimate their likelihoods from sequence homology. We show that computed likelihood values were significantly higher for annotations found in manually curated metabolic networks than those that were not. We then apply these alternative functional predictions to estimate reaction likelihoods, which are used in a new gap filling approach called likelihood-based gap filling to predict more genomically consistent solutions. To validate the likelihood-based gap filling approach, we applied it to models where essential pathways were removed, finding that likelihood-based gap filling identified more biologically relevant solutions than parsimony-based gap filling approaches. We also demonstrate that models gap filled using likelihood-based gap filling provide greater coverage and genomic consistency with metabolic gene functions compared to parsimony-based approaches. Interestingly, despite these findings, we found that likelihoods did not significantly affect consistency of gap filled models with Biolog and knockout lethality data. This indicates that the phenotype data alone cannot necessarily be used to discriminate between alternative solutions for gap filling and therefore, that the use of other information is necessary to obtain a more accurate network. All described workflows are implemented as part of the DOE Systems Biology Knowledgebase (KBase) and are publicly available via API or command-line web interface.« less
Benedict, Matthew N.; Mundy, Michael B.; Henry, Christopher S.; Chia, Nicholas; Price, Nathan D.
2014-01-01
Genome-scale metabolic models provide a powerful means to harness information from genomes to deepen biological insights. With exponentially increasing sequencing capacity, there is an enormous need for automated reconstruction techniques that can provide more accurate models in a short time frame. Current methods for automated metabolic network reconstruction rely on gene and reaction annotations to build draft metabolic networks and algorithms to fill gaps in these networks. However, automated reconstruction is hampered by database inconsistencies, incorrect annotations, and gap filling largely without considering genomic information. Here we develop an approach for applying genomic information to predict alternative functions for genes and estimate their likelihoods from sequence homology. We show that computed likelihood values were significantly higher for annotations found in manually curated metabolic networks than those that were not. We then apply these alternative functional predictions to estimate reaction likelihoods, which are used in a new gap filling approach called likelihood-based gap filling to predict more genomically consistent solutions. To validate the likelihood-based gap filling approach, we applied it to models where essential pathways were removed, finding that likelihood-based gap filling identified more biologically relevant solutions than parsimony-based gap filling approaches. We also demonstrate that models gap filled using likelihood-based gap filling provide greater coverage and genomic consistency with metabolic gene functions compared to parsimony-based approaches. Interestingly, despite these findings, we found that likelihoods did not significantly affect consistency of gap filled models with Biolog and knockout lethality data. This indicates that the phenotype data alone cannot necessarily be used to discriminate between alternative solutions for gap filling and therefore, that the use of other information is necessary to obtain a more accurate network. All described workflows are implemented as part of the DOE Systems Biology Knowledgebase (KBase) and are publicly available via API or command-line web interface. PMID:25329157
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pražnikar, Jure; University of Primorska,; Turk, Dušan, E-mail: dusan.turk@ijs.si
2014-12-01
The maximum-likelihood free-kick target, which calculates model error estimates from the work set and a randomly displaced model, proved superior in the accuracy and consistency of refinement of crystal structures compared with the maximum-likelihood cross-validation target, which calculates error estimates from the test set and the unperturbed model. The refinement of a molecular model is a computational procedure by which the atomic model is fitted to the diffraction data. The commonly used target in the refinement of macromolecular structures is the maximum-likelihood (ML) function, which relies on the assessment of model errors. The current ML functions rely on cross-validation. Theymore » utilize phase-error estimates that are calculated from a small fraction of diffraction data, called the test set, that are not used to fit the model. An approach has been developed that uses the work set to calculate the phase-error estimates in the ML refinement from simulating the model errors via the random displacement of atomic coordinates. It is called ML free-kick refinement as it uses the ML formulation of the target function and is based on the idea of freeing the model from the model bias imposed by the chemical energy restraints used in refinement. This approach for the calculation of error estimates is superior to the cross-validation approach: it reduces the phase error and increases the accuracy of molecular models, is more robust, provides clearer maps and may use a smaller portion of data for the test set for the calculation of R{sub free} or may leave it out completely.« less
Script-theory virtual case: A novel tool for education and research.
Hayward, Jake; Cheung, Amandy; Velji, Alkarim; Altarejos, Jenny; Gill, Peter; Scarfe, Andrew; Lewis, Melanie
2016-11-01
Context/Setting: The script theory of diagnostic reasoning proposes that clinicians evaluate cases in the context of an "illness script," iteratively testing internal hypotheses against new information eventually reaching a diagnosis. We present a novel tool for teaching diagnostic reasoning to undergraduate medical students based on an adaptation of script theory. We developed a virtual patient case that used clinically authentic audio and video, interactive three-dimensional (3D) body images, and a simulated electronic medical record. Next, we used interactive slide bars to record respondents' likelihood estimates of diagnostic possibilities at various stages of the case. Responses were dynamically compared to data from expert clinicians and peers. Comparative frequency distributions were presented to the learner and final diagnostic likelihood estimates were analyzed. Detailed student feedback was collected. Over two academic years, 322 students participated. Student diagnostic likelihood estimates were similar year to year, but were consistently different from expert clinician estimates. Student feedback was overwhelmingly positive: students found the case was novel, innovative, clinically authentic, and a valuable learning experience. We demonstrate the successful implementation of a novel approach to teaching diagnostic reasoning. Future study may delineate reasoning processes associated with differences between novice and expert responses.
Shen, Yi; Dai, Wei; Richards, Virginia M
2015-03-01
A MATLAB toolbox for the efficient estimation of the threshold, slope, and lapse rate of the psychometric function is described. The toolbox enables the efficient implementation of the updated maximum-likelihood (UML) procedure. The toolbox uses an object-oriented architecture for organizing the experimental variables and computational algorithms, which provides experimenters with flexibility in experimental design and data management. Descriptions of the UML procedure and the UML Toolbox are provided, followed by toolbox use examples. Finally, guidelines and recommendations of parameter configurations are given.
Wald Sequential Probability Ratio Test for Analysis of Orbital Conjunction Data
NASA Technical Reports Server (NTRS)
Carpenter, J. Russell; Markley, F. Landis; Gold, Dara
2013-01-01
We propose a Wald Sequential Probability Ratio Test for analysis of commonly available predictions associated with spacecraft conjunctions. Such predictions generally consist of a relative state and relative state error covariance at the time of closest approach, under the assumption that prediction errors are Gaussian. We show that under these circumstances, the likelihood ratio of the Wald test reduces to an especially simple form, involving the current best estimate of collision probability, and a similar estimate of collision probability that is based on prior assumptions about the likelihood of collision.
NASA Astrophysics Data System (ADS)
De Santis, Alberto; Dellepiane, Umberto; Lucidi, Stefano
2012-11-01
In this paper we investigate the estimation problem for a model of the commodity prices. This model is a stochastic state space dynamical model and the problem unknowns are the state variables and the system parameters. Data are represented by the commodity spot prices, very seldom time series of Futures contracts are available for free. Both the system joint likelihood function (state variables and parameters) and the system marginal likelihood (the state variables are eliminated) function are addressed.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kagie, Matthew J.; Lanterman, Aaron D.
2017-12-01
This paper addresses parameter estimation for an optical transient signal when the received data has been right-censored. We develop an expectation-maximization (EM) algorithm to estimate the amplitude of a Poisson intensity with a known shape in the presence of additive background counts, where the measurements are subject to saturation effects. We compare the results of our algorithm with those of an EM algorithm that is unaware of the censoring.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lemaire, H.; Barat, E.; Carrel, F.
In this work, we tested Maximum likelihood expectation-maximization (MLEM) algorithms optimized for gamma imaging applications on two recent coded mask gamma cameras. We respectively took advantage of the characteristics of the GAMPIX and Caliste HD-based gamma cameras: noise reduction thanks to mask/anti-mask procedure but limited energy resolution for GAMPIX, high energy resolution for Caliste HD. One of our short-term perspectives is the test of MAPEM algorithms integrating specific prior values for the data to reconstruct adapted to the gamma imaging topic. (authors)
Model Checking with Multi-Threaded IC3 Portfolios
2015-01-15
different runs varies randomly depending on the thread interleaving. The use of a portfolio of solvers to maximize the likelihood of a quick solution is...empirically show (cf. Sec. 5.2) that the predictions based on this formula have high accuracy. Note that each solver in the portfolio potentially searches...speedup of over 300. We also show that widening the proof search of ic3 by randomizing its SAT solver is not as effective as paral- lelization
Proportion estimation using prior cluster purities
NASA Technical Reports Server (NTRS)
Terrell, G. R. (Principal Investigator)
1980-01-01
The prior distribution of CLASSY component purities is studied, and this information incorporated into maximum likelihood crop proportion estimators. The method is tested on Transition Year spring small grain segments.
Hurdle models for multilevel zero-inflated data via h-likelihood.
Molas, Marek; Lesaffre, Emmanuel
2010-12-30
Count data often exhibit overdispersion. One type of overdispersion arises when there is an excess of zeros in comparison with the standard Poisson distribution. Zero-inflated Poisson and hurdle models have been proposed to perform a valid likelihood-based analysis to account for the surplus of zeros. Further, data often arise in clustered, longitudinal or multiple-membership settings. The proper analysis needs to reflect the design of a study. Typically random effects are used to account for dependencies in the data. We examine the h-likelihood estimation and inference framework for hurdle models with random effects for complex designs. We extend the h-likelihood procedures to fit hurdle models, thereby extending h-likelihood to truncated distributions. Two applications of the methodology are presented. Copyright © 2010 John Wiley & Sons, Ltd.
User's manual for MMLE3, a general FORTRAN program for maximum likelihood parameter estimation
NASA Technical Reports Server (NTRS)
Maine, R. E.; Iliff, K. W.
1980-01-01
A user's manual for the FORTRAN IV computer program MMLE3 is described. It is a maximum likelihood parameter estimation program capable of handling general bilinear dynamic equations of arbitrary order with measurement noise and/or state noise (process noise). The theory and use of the program is described. The basic MMLE3 program is quite general and, therefore, applicable to a wide variety of problems. The basic program can interact with a set of user written problem specific routines to simplify the use of the program on specific systems. A set of user routines for the aircraft stability and control derivative estimation problem is provided with the program.
Multilevel modeling of single-case data: A comparison of maximum likelihood and Bayesian estimation.
Moeyaert, Mariola; Rindskopf, David; Onghena, Patrick; Van den Noortgate, Wim
2017-12-01
The focus of this article is to describe Bayesian estimation, including construction of prior distributions, and to compare parameter recovery under the Bayesian framework (using weakly informative priors) and the maximum likelihood (ML) framework in the context of multilevel modeling of single-case experimental data. Bayesian estimation results were found similar to ML estimation results in terms of the treatment effect estimates, regardless of the functional form and degree of information included in the prior specification in the Bayesian framework. In terms of the variance component estimates, both the ML and Bayesian estimation procedures result in biased and less precise variance estimates when the number of participants is small (i.e., 3). By increasing the number of participants to 5 or 7, the relative bias is close to 5% and more precise estimates are obtained for all approaches, except for the inverse-Wishart prior using the identity matrix. When a more informative prior was added, more precise estimates for the fixed effects and random effects were obtained, even when only 3 participants were included. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
Dorazio, R.M.; Rago, P.J.
1991-01-01
We simulated mark–recapture experiments to evaluate a method for estimating fishing mortality and migration rates of populations stratified at release and recovery. When fish released in two or more strata were recovered from different recapture strata in nearly the same proportions, conditional recapture probabilities were estimated outside the [0, 1] interval. The maximum likelihood estimates tended to be biased and imprecise when the patterns of recaptures produced extremely "flat" likelihood surfaces. Absence of bias was not guaranteed, however, in experiments where recapture rates could be estimated within the [0, 1] interval. Inadequate numbers of tag releases and recoveries also produced biased estimates, although the bias was easily detected by the high sampling variability of the estimates. A stratified tag–recapture experiment with sockeye salmon (Oncorhynchus nerka) was used to demonstrate procedures for analyzing data that produce biased estimates of recapture probabilities. An estimator was derived to examine the sensitivity of recapture rate estimates to assumed differences in natural and tagging mortality, tag loss, and incomplete reporting of tag recoveries.
Quasar microlensing models with constraints on the Quasar light curves
NASA Astrophysics Data System (ADS)
Tie, S. S.; Kochanek, C. S.
2018-01-01
Quasar microlensing analyses implicitly generate a model of the variability of the source quasar. The implied source variability may be unrealistic yet its likelihood is generally not evaluated. We used the damped random walk (DRW) model for quasar variability to evaluate the likelihood of the source variability and applied the revized algorithm to a microlensing analysis of the lensed quasar RX J1131-1231. We compared estimates of the size of the quasar disc and the average stellar mass of the lens galaxy with and without applying the DRW likelihoods for the source variability model and found no significant effect on the estimated physical parameters. The most likely explanation is that unreliastic source light-curve models are generally associated with poor microlensing fits that already make a negligible contribution to the probability distributions of the derived parameters.
Estimation of Multinomial Probabilities.
1978-11-01
1971) and Alam (1978) have shown that the maximum likelihood estimator is admissible with respect to the quadratic loss. Steinhaus (1957) and Trybula...appear). Johnson, B. Mck. (1971). On admissible estimators for certain fixed sample binomial populations. Ann. Math. Statist. 92, 1579-1587. Steinhaus , H
Fitting power-laws in empirical data with estimators that work for all exponents
Hanel, Rudolf; Corominas-Murtra, Bernat; Liu, Bo; Thurner, Stefan
2017-01-01
Most standard methods based on maximum likelihood (ML) estimates of power-law exponents can only be reliably used to identify exponents smaller than minus one. The argument that power laws are otherwise not normalizable, depends on the underlying sample space the data is drawn from, and is true only for sample spaces that are unbounded from above. Power-laws obtained from bounded sample spaces (as is the case for practically all data related problems) are always free of such limitations and maximum likelihood estimates can be obtained for arbitrary powers without restrictions. Here we first derive the appropriate ML estimator for arbitrary exponents of power-law distributions on bounded discrete sample spaces. We then show that an almost identical estimator also works perfectly for continuous data. We implemented this ML estimator and discuss its performance with previous attempts. We present a general recipe of how to use these estimators and present the associated computer codes. PMID:28245249
Fang, Yun; Wu, Hulin; Zhu, Li-Xing
2011-07-01
We propose a two-stage estimation method for random coefficient ordinary differential equation (ODE) models. A maximum pseudo-likelihood estimator (MPLE) is derived based on a mixed-effects modeling approach and its asymptotic properties for population parameters are established. The proposed method does not require repeatedly solving ODEs, and is computationally efficient although it does pay a price with the loss of some estimation efficiency. However, the method does offer an alternative approach when the exact likelihood approach fails due to model complexity and high-dimensional parameter space, and it can also serve as a method to obtain the starting estimates for more accurate estimation methods. In addition, the proposed method does not need to specify the initial values of state variables and preserves all the advantages of the mixed-effects modeling approach. The finite sample properties of the proposed estimator are studied via Monte Carlo simulations and the methodology is also illustrated with application to an AIDS clinical data set.
Cohn, Timothy A.
2005-01-01
This paper presents an adjusted maximum likelihood estimator (AMLE) that can be used to estimate fluvial transport of contaminants, like phosphorus, that are subject to censoring because of analytical detection limits. The AMLE is a generalization of the widely accepted minimum variance unbiased estimator (MVUE), and Monte Carlo experiments confirm that it shares essentially all of the MVUE's desirable properties, including high efficiency and negligible bias. In particular, the AMLE exhibits substantially less bias than alternative censored‐data estimators such as the MLE (Tobit) or the MLE followed by a jackknife. As with the MLE and the MVUE the AMLE comes close to achieving the theoretical Frechet‐Cramér‐Rao bounds on its variance. This paper also presents a statistical framework, applicable to both censored and complete data, for understanding and estimating the components of uncertainty associated with load estimates. This can serve to lower the cost and improve the efficiency of both traditional and real‐time water quality monitoring.
Gang, Grace J; Siewerdsen, Jeffrey H; Stayman, J Webster
2017-12-01
This paper presents a joint optimization of dynamic fluence field modulation (FFM) and regularization in quadratic penalized-likelihood reconstruction that maximizes a task-based imaging performance metric. We adopted a task-driven imaging framework for prospective designs of the imaging parameters. A maxi-min objective function was adopted to maximize the minimum detectability index ( ) throughout the image. The optimization algorithm alternates between FFM (represented by low-dimensional basis functions) and local regularization (including the regularization strength and directional penalty weights). The task-driven approach was compared with three FFM strategies commonly proposed for FBP reconstruction (as well as a task-driven TCM strategy) for a discrimination task in an abdomen phantom. The task-driven FFM assigned more fluence to less attenuating anteroposterior views and yielded approximately constant fluence behind the object. The optimal regularization was almost uniform throughout image. Furthermore, the task-driven FFM strategy redistribute fluence across detector elements in order to prescribe more fluence to the more attenuating central region of the phantom. Compared with all strategies, the task-driven FFM strategy not only improved minimum by at least 17.8%, but yielded higher over a large area inside the object. The optimal FFM was highly dependent on the amount of regularization, indicating the importance of a joint optimization. Sample reconstructions of simulated data generally support the performance estimates based on computed . The improvements in detectability show the potential of the task-driven imaging framework to improve imaging performance at a fixed dose, or, equivalently, to provide a similar level of performance at reduced dose.
Lindsey, J C; Ryan, L M
1994-01-01
The three-state illness-death model provides a useful way to characterize data from a rodent tumorigenicity experiment. Most parametrizations proposed recently in the literature assume discrete time for the death process and either discrete or continuous time for the tumor onset process. We compare these approaches with a third alternative that uses a piecewise continuous model on the hazards for tumor onset and death. All three models assume proportional hazards to characterize tumor lethality and the effect of dose on tumor onset and death rate. All of the models can easily be fitted using an Expectation Maximization (EM) algorithm. The piecewise continuous model is particularly appealing in this context because the complete data likelihood corresponds to a standard piecewise exponential model with tumor presence as a time-varying covariate. It can be shown analytically that differences between the parameter estimates given by each model are explained by varying assumptions about when tumor onsets, deaths, and sacrifices occur within intervals. The mixed-time model is seen to be an extension of the grouped data proportional hazards model [Mutat. Res. 24:267-278 (1981)]. We argue that the continuous-time model is preferable to the discrete- and mixed-time models because it gives reasonable estimates with relatively few intervals while still making full use of the available information. Data from the ED01 experiment illustrate the results. PMID:8187731
Unclassified Publications of Lincoln Laboratory, 1 January - 31 December 1990. Volume 16
1990-12-31
Apr. 1990 ADA223419 Hopped Communication Systems with Nonuniform Hopping Distributions 880 Bistatic Radar Cross Section of a Fenn, A.J. 2 May1990...EXPERIMENT JA-6241 MS-8424 LUNAR PERTURBATION MAXIMUM LIKELIHOOD ALGORITHM JA-6241 JA-6467 LWIR SPECTRAL BAND MAXIMUM LIKELIHOOD ESTIMATOR JA-6476 MS-8466
Using the β-binomial distribution to characterize forest health
S.J. Zarnoch; R.L. Anderson; R.M. Sheffield
1995-01-01
The β-binomial distribution is suggested as a model for describing and analyzing the dichotomous data obtained from programs monitoring the health of forests in the United States. Maximum likelihood estimation of the parameters is given as well as asymptotic likelihood ratio tests. The procedure is illustrated with data on dogwood anthracnose infection (caused...
Power and Sample Size Calculations for Logistic Regression Tests for Differential Item Functioning
ERIC Educational Resources Information Center
Li, Zhushan
2014-01-01
Logistic regression is a popular method for detecting uniform and nonuniform differential item functioning (DIF) effects. Theoretical formulas for the power and sample size calculations are derived for likelihood ratio tests and Wald tests based on the asymptotic distribution of the maximum likelihood estimators for the logistic regression model.…
Bias and Efficiency in Structural Equation Modeling: Maximum Likelihood versus Robust Methods
ERIC Educational Resources Information Center
Zhong, Xiaoling; Yuan, Ke-Hai
2011-01-01
In the structural equation modeling literature, the normal-distribution-based maximum likelihood (ML) method is most widely used, partly because the resulting estimator is claimed to be asymptotically unbiased and most efficient. However, this may not hold when data deviate from normal distribution. Outlying cases or nonnormally distributed data,…
High-Dimensional Exploratory Item Factor Analysis by a Metropolis-Hastings Robbins-Monro Algorithm
ERIC Educational Resources Information Center
Cai, Li
2010-01-01
A Metropolis-Hastings Robbins-Monro (MH-RM) algorithm for high-dimensional maximum marginal likelihood exploratory item factor analysis is proposed. The sequence of estimates from the MH-RM algorithm converges with probability one to the maximum likelihood solution. Details on the computer implementation of this algorithm are provided. The…
A time series intervention analysis (TSIA) of dendrochronological data to infer the tree growth-climate-disturbance relations and forest disturbance history is described. Maximum likelihood is used to estimate the parameters of a structural time series model with components for ...
Berglund, Lars; Garmo, Hans; Lindbäck, Johan; Svärdsudd, Kurt; Zethelius, Björn
2008-09-30
The least-squares estimator of the slope in a simple linear regression model is biased towards zero when the predictor is measured with random error. A corrected slope may be estimated by adding data from a reliability study, which comprises a subset of subjects from the main study. The precision of this corrected slope depends on the design of the reliability study and estimator choice. Previous work has assumed that the reliability study constitutes a random sample from the main study. A more efficient design is to use subjects with extreme values on their first measurement. Previously, we published a variance formula for the corrected slope, when the correction factor is the slope in the regression of the second measurement on the first. In this paper we show that both designs improve by maximum likelihood estimation (MLE). The precision gain is explained by the inclusion of data from all subjects for estimation of the predictor's variance and by the use of the second measurement for estimation of the covariance between response and predictor. The gain of MLE enhances with stronger true relationship between response and predictor and with lower precision in the predictor measurements. We present a real data example on the relationship between fasting insulin, a surrogate marker, and true insulin sensitivity measured by a gold-standard euglycaemic insulin clamp, and simulations, where the behavior of profile-likelihood-based confidence intervals is examined. MLE was shown to be a robust estimator for non-normal distributions and efficient for small sample situations. Copyright (c) 2008 John Wiley & Sons, Ltd.
Estimation Methods for One-Parameter Testlet Models
ERIC Educational Resources Information Center
Jiao, Hong; Wang, Shudong; He, Wei
2013-01-01
This study demonstrated the equivalence between the Rasch testlet model and the three-level one-parameter testlet model and explored the Markov Chain Monte Carlo (MCMC) method for model parameter estimation in WINBUGS. The estimation accuracy from the MCMC method was compared with those from the marginalized maximum likelihood estimation (MMLE)…
Likelihood Ratios for Glaucoma Diagnosis Using Spectral Domain Optical Coherence Tomography
Lisboa, Renato; Mansouri, Kaweh; Zangwill, Linda M.; Weinreb, Robert N.; Medeiros, Felipe A.
2014-01-01
Purpose To present a methodology for calculating likelihood ratios for glaucoma diagnosis for continuous retinal nerve fiber layer (RNFL) thickness measurements from spectral domain optical coherence tomography (spectral-domain OCT). Design Observational cohort study. Methods 262 eyes of 187 patients with glaucoma and 190 eyes of 100 control subjects were included in the study. Subjects were recruited from the Diagnostic Innovations Glaucoma Study. Eyes with preperimetric and perimetric glaucomatous damage were included in the glaucoma group. The control group was composed of healthy eyes with normal visual fields from subjects recruited from the general population. All eyes underwent RNFL imaging with Spectralis spectral-domain OCT. Likelihood ratios for glaucoma diagnosis were estimated for specific global RNFL thickness measurements using a methodology based on estimating the tangents to the Receiver Operating Characteristic (ROC) curve. Results Likelihood ratios could be determined for continuous values of average RNFL thickness. Average RNFL thickness values lower than 86μm were associated with positive LRs, i.e., LRs greater than 1; whereas RNFL thickness values higher than 86μm were associated with negative LRs, i.e., LRs smaller than 1. A modified Fagan nomogram was provided to assist calculation of post-test probability of disease from the calculated likelihood ratios and pretest probability of disease. Conclusion The methodology allowed calculation of likelihood ratios for specific RNFL thickness values. By avoiding arbitrary categorization of test results, it potentially allows for an improved integration of test results into diagnostic clinical decision-making. PMID:23972303
A comprehensive assessment of collision likelihood in Geosynchronous Earth Orbit
NASA Astrophysics Data System (ADS)
Oltrogge, D. L.; Alfano, S.; Law, C.; Cacioni, A.; Kelso, T. S.
2018-06-01
Knowing the likelihood of collision for satellites operating in Geosynchronous Earth Orbit (GEO) is of extreme importance and interest to the global community and the operators of GEO spacecraft. Yet for all of its importance, a comprehensive assessment of GEO collision likelihood is difficult to do and has never been done. In this paper, we employ six independent and diverse assessment methods to estimate GEO collision likelihood. Taken in aggregate, this comprehensive assessment offer new insights into GEO collision likelihood that are within a factor of 3.5 of each other. These results are then compared to four collision and seven encounter rate estimates previously published. Collectively, these new findings indicate that collision likelihood in GEO is as much as four orders of magnitude higher than previously published by other researchers. Results indicate that a collision is likely to occur every 4 years for one satellite out of the entire GEO active satellite population against a 1 cm RSO catalogue, and every 50 years against a 20 cm RSO catalogue. Further, previous assertions that collision relative velocities are low (i.e., <1 km/s) in GEO are disproven, with some GEO relative velocities as high as 4 km/s identified. These new findings indicate that unless operators successfully mitigate this collision risk, the GEO orbital arc is and will remain at high risk of collision, with the potential for serious follow-on collision threats from post-collision debris when a substantial GEO collision occurs.
Coral reef fish populations can persist without immigration
Salles, Océane C.; Maynard, Jeffrey A.; Joannides, Marc; Barbu, Corentin M.; Saenz-Agudelo, Pablo; Almany, Glenn R.; Berumen, Michael L.; Thorrold, Simon R.; Jones, Geoffrey P.; Planes, Serge
2015-01-01
Determining the conditions under which populations may persist requires accurate estimates of demographic parameters, including immigration, local reproductive success, and mortality rates. In marine populations, empirical estimates of these parameters are rare, due at least in part to the pelagic dispersal stage common to most marine organisms. Here, we evaluate population persistence and turnover for a population of orange clownfish, Amphiprion percula, at Kimbe Island in Papua New Guinea. All fish in the population were sampled and genotyped on five occasions at 2-year intervals spanning eight years. The genetic data enabled estimates of reproductive success retained in the same population (reproductive success to self-recruitment), reproductive success exported to other subpopulations (reproductive success to local connectivity), and immigration and mortality rates of sub-adults and adults. Approximately 50% of the recruits were assigned to parents from the Kimbe Island population and this was stable through the sampling period. Stability in the proportion of local and immigrant settlers is likely due to: low annual mortality rates and stable egg production rates, and the short larval stages and sensory capacities of reef fish larvae. Biannual mortality rates ranged from 0.09 to 0.55 and varied significantly spatially. We used these data to parametrize a model that estimated the probability of the Kimbe Island population persisting in the absence of immigration. The Kimbe Island population was found to persist without significant immigration. Model results suggest the island population persists because the largest of the subpopulations are maintained due to having low mortality and high self-recruitment rates. Our results enable managers to appropriately target and scale actions to maximize persistence likelihood as disturbance frequencies increase. PMID:26582017
GDP and efficiency of Russian economy
NASA Astrophysics Data System (ADS)
Borodachev, Sergey M.
2018-01-01
The goal is to study GDP (gross domestic product) as an unobservable characteristic of the Russian national economy state on the basis of more reliable observed data on gross output (systems output) and final consumption (systems control). To do this, the dynamic Leontief model is presented in a system-like form and its parameters and GDP dynamics are estimated by the Kalman filter (KF). We consider that all previous year's investments affect the growth of the gross output by the next year. The weights of these investments in the sum are equal to unity and decrease in geometric progression. The estimation of the model parameters was carried out by the maximum likelihood method. The original data on the gross output and final consumption in the period from 1995 to 2015 years where taken from the Rosstat website, where maximally aggregated economy of Russia is reflected in the system of national accounts. The growth of direct costs and capital expenditures at gross output increase has been discovered, which indicates the extensive character of the development of the economy. Investments are being absorbed 2 - 4 years; any change of them causes a surge of commissioned fixed assets fluctuation with a period of 2 years. Then these parameter values were used in the KF to estimate the states of the system. The emerging tendency of the transition of GDP growth to its fall means that the rate of growth of final consumption is higher than the rate of GDP growth. In general, the behavior of the curve of Rosstat GDP obviously follows the declared investments, whereas in the present calculation it is closer to the behavior of final consumption. Estimated GDP and investments that really increased it were significantly less after the crisis of 2008-2009 years than officially published data.
Coral reef fish populations can persist without immigration.
Salles, Océane C; Maynard, Jeffrey A; Joannides, Marc; Barbu, Corentin M; Saenz-Agudelo, Pablo; Almany, Glenn R; Berumen, Michael L; Thorrold, Simon R; Jones, Geoffrey P; Planes, Serge
2015-11-22
Determining the conditions under which populations may persist requires accurate estimates of demographic parameters, including immigration, local reproductive success, and mortality rates. In marine populations, empirical estimates of these parameters are rare, due at least in part to the pelagic dispersal stage common to most marine organisms. Here, we evaluate population persistence and turnover for a population of orange clownfish, Amphiprion percula, at Kimbe Island in Papua New Guinea. All fish in the population were sampled and genotyped on five occasions at 2-year intervals spanning eight years. The genetic data enabled estimates of reproductive success retained in the same population (reproductive success to self-recruitment), reproductive success exported to other subpopulations (reproductive success to local connectivity), and immigration and mortality rates of sub-adults and adults. Approximately 50% of the recruits were assigned to parents from the Kimbe Island population and this was stable through the sampling period. Stability in the proportion of local and immigrant settlers is likely due to: low annual mortality rates and stable egg production rates, and the short larval stages and sensory capacities of reef fish larvae. Biannual mortality rates ranged from 0.09 to 0.55 and varied significantly spatially. We used these data to parametrize a model that estimated the probability of the Kimbe Island population persisting in the absence of immigration. The Kimbe Island population was found to persist without significant immigration. Model results suggest the island population persists because the largest of the subpopulations are maintained due to having low mortality and high self-recruitment rates. Our results enable managers to appropriately target and scale actions to maximize persistence likelihood as disturbance frequencies increase. © 2015 The Author(s).
MIXOR: a computer program for mixed-effects ordinal regression analysis.
Hedeker, D; Gibbons, R D
1996-03-01
MIXOR provides maximum marginal likelihood estimates for mixed-effects ordinal probit, logistic, and complementary log-log regression models. These models can be used for analysis of dichotomous and ordinal outcomes from either a clustered or longitudinal design. For clustered data, the mixed-effects model assumes that data within clusters are dependent. The degree of dependency is jointly estimated with the usual model parameters, thus adjusting for dependence resulting from clustering of the data. Similarly, for longitudinal data, the mixed-effects approach can allow for individual-varying intercepts and slopes across time, and can estimate the degree to which these time-related effects vary in the population of individuals. MIXOR uses marginal maximum likelihood estimation, utilizing a Fisher-scoring solution. For the scoring solution, the Cholesky factor of the random-effects variance-covariance matrix is estimated, along with the effects of model covariates. Examples illustrating usage and features of MIXOR are provided.
Optimism following a tornado disaster.
Suls, Jerry; Rose, Jason P; Windschitl, Paul D; Smith, Andrew R
2013-05-01
Effects of exposure to a severe weather disaster on perceived future vulnerability were assessed in college students, local residents contacted through random-digit dialing, and community residents of affected versus unaffected neighborhoods. Students and community residents reported being less vulnerable than their peers at 1 month, 6 months, and 1 year after the disaster. In Studies 1 and 2, absolute risk estimates were more optimistic with time, whereas comparative vulnerability was stable. Residents of affected neighborhoods (Study 3), surprisingly, reported less comparative vulnerability and lower "gut-level" numerical likelihood estimates at 6 months, but later their estimates resembled the unaffected residents. Likelihood estimates (10%-12%), however, exceeded the 1% risk calculated by storm experts, and gut-level versus statistical-level estimates were more optimistic. Although people believed they had approximately a 1-in-10 chance of injury from future tornadoes (i.e., an overestimate), they thought their risk was lower than peers.