likelihood estimation procedure: Topics by Science.gov

Sample records for likelihood estimation procedure

The numerical evaluation of maximum-likelihood estimates of the parameters for a mixture of normal distributions from partially identified samples

NASA Technical Reports Server (NTRS)

Walker, H. F.

1976-01-01

Likelihood equations determined by the two types of samples which are necessary conditions for a maximum-likelihood estimate were considered. These equations suggest certain successive approximations iterative procedures for obtaining maximum likelihood estimates. The procedures, which are generalized steepest ascent (deflected gradient) procedures, contain those of Hosmer as a special case.
A Comparison of a Bayesian and a Maximum Likelihood Tailored Testing Procedure.

ERIC Educational Resources Information Center

McKinley, Robert L.; Reckase, Mark D.

A study was conducted to compare tailored testing procedures based on a Bayesian ability estimation technique and on a maximum likelihood ability estimation technique. The Bayesian tailored testing procedure selected items so as to minimize the posterior variance of the ability estimate distribution, while the maximum likelihood tailored testing…
On the existence of maximum likelihood estimates for presence-only data

USGS Publications Warehouse

Hefley, Trevor J.; Hooten, Mevin B.

2015-01-01

It is important to identify conditions for which maximum likelihood estimates are unlikely to be identifiable from presence-only data. In data sets where the maximum likelihood estimates do not exist, penalized likelihood and Bayesian methods will produce coefficient estimates, but these are sensitive to the choice of estimation procedure and prior or penalty term. When sample size is small or it is thought that habitat preferences are strong, we propose a suite of estimation procedures researchers can consider using.
An iterative procedure for obtaining maximum-likelihood estimates of the parameters for a mixture of normal distributions

NASA Technical Reports Server (NTRS)

Peters, B. C., Jr.; Walker, H. F.

1975-01-01

A general iterative procedure is given for determining the consistent maximum likelihood estimates of normal distributions. In addition, a local maximum of the log-likelihood function, Newton's method, a method of scoring, and modifications of these procedures are discussed.
An iterative procedure for obtaining maximum-likelihood estimates of the parameters for a mixture of normal distributions, Addendum

NASA Technical Reports Server (NTRS)

Peters, B. C., Jr.; Walker, H. F.

1975-01-01

New results and insights concerning a previously published iterative procedure for obtaining maximum-likelihood estimates of the parameters for a mixture of normal distributions were discussed. It was shown that the procedure converges locally to the consistent maximum likelihood estimate as long as a specified parameter is bounded between two limits. Bound values were given to yield optimal local convergence.
An iterative procedure for obtaining maximum-likelihood estimates of the parameters for a mixture of normal distributions

NASA Technical Reports Server (NTRS)

Peters, B. C., Jr.; Walker, H. F.

1978-01-01

This paper addresses the problem of obtaining numerically maximum-likelihood estimates of the parameters for a mixture of normal distributions. In recent literature, a certain successive-approximations procedure, based on the likelihood equations, was shown empirically to be effective in numerically approximating such maximum-likelihood estimates; however, the reliability of this procedure was not established theoretically. Here, we introduce a general iterative procedure, of the generalized steepest-ascent (deflected-gradient) type, which is just the procedure known in the literature when the step-size is taken to be 1. We show that, with probability 1 as the sample size grows large, this procedure converges locally to the strongly consistent maximum-likelihood estimate whenever the step-size lies between 0 and 2. We also show that the step-size which yields optimal local convergence rates for large samples is determined in a sense by the 'separation' of the component normal densities and is bounded below by a number between 1 and 2.
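The role of the step-size described above can be illustrated with a minimal sketch. It assumes a one-dimensional, two-component mixture and takes the base successive-approximations map to be the familiar EM-style update; the relaxed iteration `theta + step * (G(theta) - theta)` is one common way to write a step-size variant and is an assumption here, not a transcription of the paper's procedure. The function names (`em_map`, `fit_mixture`, `simulate_mixture`) are hypothetical.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def em_map(data, params):
    """One EM-style successive-approximations update (assumed base map G)."""
    w, m1, v1, m2, v2 = params
    # Posterior probability that each observation came from component 1.
    resp = []
    for x in data:
        p1 = w * normal_pdf(x, m1, v1)
        p2 = (1.0 - w) * normal_pdf(x, m2, v2)
        resp.append(p1 / (p1 + p2))
    n1 = sum(resp)
    n2 = len(data) - n1
    new_m1 = sum(r * x for r, x in zip(resp, data)) / n1
    new_m2 = sum((1.0 - r) * x for r, x in zip(resp, data)) / n2
    new_v1 = sum(r * (x - new_m1) ** 2 for r, x in zip(resp, data)) / n1
    new_v2 = sum((1.0 - r) * (x - new_m2) ** 2 for r, x in zip(resp, data)) / n2
    return (n1 / len(data), new_m1, new_v1, new_m2, new_v2)


def fit_mixture(data, start, step=1.0, iters=100):
    """Relaxed iteration: step=1.0 reproduces the plain update.

    No safeguard is included against a large step pushing a variance or the
    mixing weight outside its valid range; this is a sketch, not a robust fitter.
    """
    params = tuple(start)
    for _ in range(iters):
        target = em_map(data, params)
        params = tuple(p + step * (t - p) for p, t in zip(params, target))
    return params


def simulate_mixture(n, seed=0):
    """Synthetic data: 40% from N(-2, 1) and 60% from N(3, 1)."""
    rng = random.Random(seed)
    return [rng.gauss(-2.0, 1.0) if rng.random() < 0.4 else rng.gauss(3.0, 1.0)
            for _ in range(n)]


if __name__ == "__main__":
    sample = simulate_mixture(1000)
    for s in (1.0, 1.3):
        print(s, fit_mixture(sample, (0.5, -1.0, 1.0, 1.0, 1.0), step=s))
```

With well-separated components, both step sizes should settle near the same estimates; the abstract's claim concerns local convergence for step sizes between 0 and 2, which this toy run does not prove.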
An iterative procedure for obtaining maximum-likelihood estimates of the parameters for a mixture of normal distributions, 2

NASA Technical Reports Server (NTRS)

Peters, B. C., Jr.; Walker, H. F.

1976-01-01

The problem of obtaining numerically maximum likelihood estimates of the parameters for a mixture of normal distributions is addressed. In recent literature, a certain successive approximations procedure, based on the likelihood equations, is shown empirically to be effective in numerically approximating such maximum-likelihood estimates; however, the reliability of this procedure was not established theoretically. Here, a general iterative procedure is introduced, of the generalized steepest-ascent (deflected-gradient) type, which is just the procedure known in the literature when the step-size is taken to be 1. With probability 1 as the sample size grows large, it is shown that this procedure converges locally to the strongly consistent maximum-likelihood estimate whenever the step-size lies between 0 and 2. The step-size which yields optimal local convergence rates for large samples is determined in a sense by the separation of the component normal densities and is bounded below by a number between 1 and 2.
The numerical evaluation of maximum-likelihood estimates of the parameters for a mixture of normal distributions from partially identified samples

NASA Technical Reports Server (NTRS)

Walker, H. F.

1976-01-01

Likelihood equations determined by the two types of samples which are necessary conditions for a maximum-likelihood estimate are considered. These equations suggest certain successive-approximations iterative procedures for obtaining maximum-likelihood estimates. These are generalized steepest ascent (deflected gradient) procedures. It is shown that, with probability 1 as N sub 0 approaches infinity (regardless of the relative sizes of N sub 0 and N sub i, i=1,...,m), these procedures converge locally to the strongly consistent maximum-likelihood estimates whenever the step size is between 0 and 2. Furthermore, the value of the step size which yields optimal local convergence rates is bounded from below by a number which always lies between 1 and 2.
Effects of Estimation Bias on Multiple-Category Classification with an IRT-Based Adaptive Classification Procedure

ERIC Educational Resources Information Center

Yang, Xiangdong; Poggio, John C.; Glasnapp, Douglas R.

2006-01-01

The effects of five ability estimators, that is, maximum likelihood estimator, weighted likelihood estimator, maximum a posteriori, expected a posteriori, and Owen's sequential estimator, on the performances of the item response theory-based adaptive classification procedure on multiple categories were studied via simulations. The following…
Consistency of Rasch Model Parameter Estimation: A Simulation Study.

ERIC Educational Resources Information Center

van den Wollenberg, Arnold L.; And Others

1988-01-01

The unconditional--simultaneous--maximum likelihood (UML) estimation procedure for the one-parameter logistic model produces biased estimators. The UML method is inconsistent and is not a good alternative to conditional maximum likelihood method, at least with small numbers of items. The minimum Chi-square estimation procedure produces unbiased…
Models and analysis for multivariate failure time data

NASA Astrophysics Data System (ADS)

Shih, Joanna Huang

The goal of this research is to develop and investigate models and analytic methods for multivariate failure time data. We compare models in terms of direct modeling of the margins, flexibility of dependency structure, local vs. global measures of association, and ease of implementation. In particular, we study copula models, and models produced by right neutral cumulative hazard functions and right neutral hazard functions. We examine the changes of association over time for families of bivariate distributions induced from these models by displaying their density contour plots, conditional density plots, correlation curves of Doksum et al., and local cross ratios of Oakes. We know that bivariate distributions with the same margins might exhibit quite different dependency structures. In addition to modeling, we study estimation procedures. For copula models, we investigate three estimation procedures. The first procedure is full maximum likelihood. The second procedure is two-stage maximum likelihood. At stage 1, we estimate the parameters in the margins by maximizing the marginal likelihood. At stage 2, we estimate the dependency structure by fixing the margins at the estimated ones. The third procedure is two-stage partially parametric maximum likelihood. It is similar to the second procedure, but we estimate the margins by the Kaplan-Meier estimate. We derive asymptotic properties for these three estimation procedures and compare their efficiency by Monte-Carlo simulations and direct computations. For models produced by right neutral cumulative hazards and right neutral hazards, we derive the likelihood and investigate the properties of the maximum likelihood estimates. Finally, we develop goodness of fit tests for the dependency structure in the copula models. We derive a test statistic and its asymptotic properties based on the test of homogeneity of Zelterman and Chen (1988), and a graphical diagnostic procedure based on the empirical Bayes approach. We study the performance of these two methods using actual and computer generated data.
Procedure for estimating stability and control parameters from flight test data by using maximum likelihood methods employing a real-time digital system

NASA Technical Reports Server (NTRS)

Grove, R. D.; Bowles, R. L.; Mayhew, S. C.

1972-01-01

A maximum likelihood parameter estimation procedure and program were developed for the extraction of the stability and control derivatives of aircraft from flight test data. Nonlinear six-degree-of-freedom equations describing aircraft dynamics were used to derive sensitivity equations for quasilinearization. The maximum likelihood function with quasilinearization was used to derive the parameter change equations, the covariance matrices for the parameters and measurement noise, and the performance index function. The maximum likelihood estimator was mechanized into an iterative estimation procedure utilizing a real time digital computer and graphic display system. This program was developed for 8 measured state variables and 40 parameters. Test cases were conducted with simulated data for validation of the estimation procedure and program. The program was applied to a V/STOL tilt wing aircraft, a military fighter airplane, and a light single engine airplane. The particular nonlinear equations of motion, derivation of the sensitivity equations, addition of accelerations into the algorithm, operational features of the real time digital system, and test cases are described.
Parameter estimation in astronomy through application of the likelihood ratio. [satellite data analysis techniques]

NASA Technical Reports Server (NTRS)

Cash, W.

1979-01-01

Many problems in the experimental estimation of parameters for models can be solved through use of the likelihood ratio test. Applications of the likelihood ratio, with particular attention to photon counting experiments, are discussed. The procedures presented solve a greater range of problems than those currently in use, yet are no more difficult to apply. The procedures are proved analytically, and examples from current problems in astronomy are discussed.
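As a hedged illustration of a likelihood-ratio procedure for photon counting, the sketch below uses the Poisson statistic C = 2 * sum(e_i - n_i * ln e_i), where n_i are observed counts and e_i are model-predicted counts, and finds an approximate confidence interval for a single parameter as the set of values whose C exceeds the minimum by at most a chosen threshold. The constant-rate model, the grid scan, and the function names are assumptions made for the example only.

```python
import math


def cash_statistic(counts, expected):
    """Poisson likelihood-ratio statistic, up to a term independent of the model."""
    return 2.0 * sum(e - n * math.log(e) for n, e in zip(counts, expected))


def constant_rate_interval(counts, delta=1.0, grid=2000):
    """Interval of constant rates whose statistic lies within `delta` of the minimum.

    For a constant-rate model the maximum likelihood rate is the sample mean
    (assumed positive here). delta=1.0 corresponds to the usual one-parameter
    68% interval under the chi-square approximation.
    """
    n_bins = len(counts)
    best = sum(counts) / n_bins
    c_min = cash_statistic(counts, [best] * n_bins)
    lo_scan, hi_scan = 0.2 * best, 3.0 * best
    accepted = []
    for k in range(grid + 1):
        rate = lo_scan + (hi_scan - lo_scan) * k / grid
        if cash_statistic(counts, [rate] * n_bins) - c_min <= delta:
            accepted.append(rate)
    return best, min(accepted), max(accepted)


if __name__ == "__main__":
    observed = [3, 5, 4, 6, 2, 4, 5, 3]
    print(constant_rate_interval(observed))
```

The grid scan is deliberately crude; a root finder on the difference in C would give the interval endpoints more precisely.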
Estimating multilevel logistic regression models when the number of clusters is low: a comparison of different statistical software procedures.

PubMed

Austin, Peter C

2010-04-22

Multilevel logistic regression models are increasingly being used to analyze clustered data in medical, public health, epidemiological, and educational research. Procedures for estimating the parameters of such models are available in many statistical software packages. There is currently little evidence on the minimum number of clusters necessary to reliably fit multilevel regression models. We conducted a Monte Carlo study to compare the performance of different statistical software procedures for estimating multilevel logistic regression models when the number of clusters was low. We examined procedures available in BUGS, HLM, R, SAS, and Stata. We found that there were qualitative differences in the performance of different software procedures for estimating multilevel logistic models when the number of clusters was low. Among the likelihood-based procedures, estimation methods based on adaptive Gauss-Hermite approximations to the likelihood (glmer in R and xtlogit in Stata) or adaptive Gaussian quadrature (Proc NLMIXED in SAS) tended to have superior performance for estimating variance components when the number of clusters was small, compared to software procedures based on penalized quasi-likelihood. However, only Bayesian estimation with BUGS allowed for accurate estimation of variance components when there were fewer than 10 clusters. For all statistical software procedures, estimation of variance components tended to be poor when there were only five subjects per cluster, regardless of the number of clusters.
Empirical best linear unbiased prediction method for small areas with restricted maximum likelihood and bootstrap procedure to estimate the average of household expenditure per capita in Banjar Regency

NASA Astrophysics Data System (ADS)

Aminah, Agustin Siti; Pawitan, Gandhi; Tantular, Bertho

2017-03-01

So far, most of the data published by Statistics Indonesia (BPS), the provider of national statistics, are still limited to the district level. At smaller area levels the sample size is insufficient, so direct estimation of poverty indicators produces high standard errors, and analysis based on it is unreliable. To solve this problem, an estimation method that provides better accuracy by combining survey data with other auxiliary data is required. One method often used for this purpose is Small Area Estimation (SAE). Among the many methods used in SAE is Empirical Best Linear Unbiased Prediction (EBLUP). EBLUP under the maximum likelihood (ML) procedure does not account for the loss of degrees of freedom due to estimating β with β̂. This drawback motivates the use of the restricted maximum likelihood (REML) procedure. This paper proposes EBLUP with the REML procedure for estimating poverty indicators by modeling the average household expenditure per capita, and implements a bootstrap procedure to calculate the MSE (mean square error) in order to compare the accuracy of the EBLUP method with that of the direct estimation method. Results show that the EBLUP method reduced the MSE in small area estimation.
Formulating the Rasch Differential Item Functioning Model under the Marginal Maximum Likelihood Estimation Context and Its Comparison with Mantel-Haenszel Procedure in Short Test and Small Sample Conditions

ERIC Educational Resources Information Center

Paek, Insu; Wilson, Mark

2011-01-01

This study elaborates the Rasch differential item functioning (DIF) model formulation under the marginal maximum likelihood estimation context. Also, the Rasch DIF model performance was examined and compared with the Mantel-Haenszel (MH) procedure in small sample and short test length conditions through simulations. The theoretically known…
A Class of Factor Analysis Estimation Procedures with Common Asymptotic Sampling Properties

ERIC Educational Resources Information Center

Swain, A. J.

1975-01-01

Considers a class of estimation procedures for the factor model. The procedures are shown to yield estimates possessing the same asymptotic sampling properties as those from estimation by maximum likelihood or generalized least squares, both special members of the class. General expressions for the derivatives needed for Newton-Raphson…
Estimating the variance for heterogeneity in arm-based network meta-analysis.

PubMed

Piepho, Hans-Peter; Madden, Laurence V; Roger, James; Payne, Roger; Williams, Emlyn R

2018-04-19

Network meta-analysis can be implemented by using arm-based or contrast-based models. Here we focus on arm-based models and fit them using generalized linear mixed model procedures. Full maximum likelihood (ML) estimation leads to biased trial-by-treatment interaction variance estimates for heterogeneity. Thus, our objective is to investigate alternative approaches to variance estimation that reduce bias compared with full ML. Specifically, we use penalized quasi-likelihood/pseudo-likelihood and hierarchical (h) likelihood approaches. In addition, we consider a novel model modification that yields estimators akin to the residual maximum likelihood estimator for linear mixed models. The proposed methods are compared by simulation, and 2 real datasets are used for illustration. Simulations show that penalized quasi-likelihood/pseudo-likelihood and h-likelihood reduce bias and yield satisfactory coverage rates. Sum-to-zero restriction and baseline contrasts for random trial-by-treatment interaction effects, as well as a residual ML-like adjustment, also reduce bias compared with an unconstrained model when ML is used, but coverage rates are not quite as good. Penalized quasi-likelihood/pseudo-likelihood and h-likelihood are therefore recommended. Copyright © 2018 John Wiley & Sons, Ltd.
A MATLAB toolbox for the efficient estimation of the psychometric function using the updated maximum-likelihood adaptive procedure.

PubMed

Shen, Yi; Dai, Wei; Richards, Virginia M

2015-03-01

A MATLAB toolbox for the efficient estimation of the threshold, slope, and lapse rate of the psychometric function is described. The toolbox enables the efficient implementation of the updated maximum-likelihood (UML) procedure. The toolbox uses an object-oriented architecture for organizing the experimental variables and computational algorithms, which provides experimenters with flexibility in experimental design and data management. Descriptions of the UML procedure and the UML Toolbox are provided, followed by toolbox use examples. Finally, guidelines and recommendations of parameter configurations are given.
Two models for evaluating landslide hazards

USGS Publications Warehouse

Davis, J.C.; Chung, C.-J.; Ohlmacher, G.C.

2006-01-01

Two alternative procedures for estimating landslide hazards were evaluated using data on topographic digital elevation models (DEMs) and bedrock lithologies in an area adjacent to the Missouri River in Atchison County, Kansas, USA. The two procedures are based on the likelihood ratio model but utilize different assumptions. The empirical likelihood ratio model is based on non-parametric empirical univariate frequency distribution functions under an assumption of conditional independence while the multivariate logistic discriminant model assumes that likelihood ratios can be expressed in terms of logistic functions. The relative hazards of occurrence of landslides were estimated by an empirical likelihood ratio model and by multivariate logistic discriminant analysis. Predictor variables consisted of grids containing topographic elevations, slope angles, and slope aspects calculated from a 30-m DEM. An integer grid of coded bedrock lithologies taken from digitized geologic maps was also used as a predictor variable. Both statistical models yield relative estimates in the form of the proportion of total map area predicted to already contain or to be the site of future landslides. The stabilities of estimates were checked by cross-validation of results from random subsamples, using each of the two procedures. Cell-by-cell comparisons of hazard maps made by the two models show that the two sets of estimates are virtually identical. This suggests that the empirical likelihood ratio and the logistic discriminant analysis models are robust with respect to the conditional independence assumption and the logistic function assumption, respectively, and that either model can be used successfully to evaluate landslide hazards. © 2006.

Improving and Evaluating Nested Sampling Algorithm for Marginal Likelihood Estimation

NASA Astrophysics Data System (ADS)

Ye, M.; Zeng, X.; Wu, J.; Wang, D.; Liu, J.

2016-12-01

With the growing impacts of climate change and human activities on the cycle of water resources, an increasing number of studies focus on the quantification of modeling uncertainty. Bayesian model averaging (BMA) provides a popular framework for quantifying conceptual model and parameter uncertainty. The ensemble prediction is generated by combining each plausible model's prediction, and each model is assigned a weight determined by the model's prior weight and marginal likelihood. Thus, the estimation of a model's marginal likelihood is crucial for reliable and accurate BMA prediction. The nested sampling estimator (NSE) is a newly proposed method for marginal likelihood estimation. NSE searches the parameter space gradually from low-likelihood to high-likelihood regions, and this evolution is carried out iteratively via a local sampling procedure. Thus, the efficiency of NSE is dominated by the strength of the local sampling procedure. Currently, the Metropolis-Hastings (M-H) algorithm is often used for local sampling. However, M-H is not an efficient sampling algorithm for high-dimensional or complicated parameter spaces. To improve the efficiency of NSE, the robust and efficient sampling algorithm DREAMzs could be incorporated into the local sampling of NSE. The comparison results demonstrated that the improved NSE could improve the efficiency of marginal likelihood estimation significantly. However, both the improved and original NSEs suffer from heavy instability. In addition, the heavy computational cost of a huge number of model executions is overcome by using adaptive sparse grid surrogates.
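The nested sampling loop described in this record can be sketched on a toy problem. The example below is an assumption-laden illustration: it uses a one-dimensional uniform prior, a Gaussian-shaped likelihood, and plain rejection sampling from the prior as the local sampling step, standing in for the M-H or DREAMzs samplers named in the abstract. All function names are hypothetical.

```python
import math
import random


def toy_log_like(theta):
    """Gaussian-shaped log-likelihood centred at 0.5 with width 0.1."""
    return -0.5 * ((theta - 0.5) / 0.1) ** 2


def toy_prior(rng):
    """Draw from a uniform prior on [0, 1]."""
    return rng.random()


def nested_sampling_evidence(log_like, sample_prior, n_live=100, n_iter=600, seed=0):
    """Estimate the marginal likelihood with a basic nested sampling loop."""
    rng = random.Random(seed)
    live = [sample_prior(rng) for _ in range(n_live)]
    live_ll = [log_like(t) for t in live]
    evidence = 0.0
    x_prev = 1.0  # prior volume enclosed by the current likelihood contour
    for i in range(1, n_iter + 1):
        worst = min(range(n_live), key=live_ll.__getitem__)
        ll_star = live_ll[worst]
        x_i = math.exp(-i / n_live)  # expected shrinkage of the prior volume
        evidence += math.exp(ll_star) * (x_prev - x_i)
        x_prev = x_i
        # Local sampling step: replace the worst point by a prior draw with
        # higher likelihood (rejection sampling, feasible only for toy cases).
        while True:
            cand = sample_prior(rng)
            cand_ll = log_like(cand)
            if cand_ll > ll_star:
                break
        live[worst] = cand
        live_ll[worst] = cand_ll
    # Contribution of the remaining live points.
    evidence += x_prev * sum(math.exp(ll) for ll in live_ll) / n_live
    return evidence


if __name__ == "__main__":
    print(nested_sampling_evidence(toy_log_like, toy_prior))
```

For this toy setup the exact marginal likelihood is close to 0.1 * sqrt(2 * pi), about 0.25, so the estimate can be checked directly. Rejection sampling becomes very costly as the contour shrinks, which is the inefficiency that motivates better local samplers.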
An Improved Nested Sampling Algorithm for Model Selection and Assessment

NASA Astrophysics Data System (ADS)

Zeng, X.; Ye, M.; Wu, J.; Wang, D.

2017-12-01

The multimodel strategy is a general approach for treating model structure uncertainty in recent research. The unknown groundwater system is represented by several plausible conceptual models. Each alternative conceptual model is assigned a weight which represents the plausibility of this model. In the Bayesian framework, the posterior model weight is computed as the product of the model prior weight and the marginal likelihood (also termed model evidence). As a result, estimating marginal likelihoods is crucial for reliable model selection and assessment in multimodel analysis. The nested sampling estimator (NSE) is a newly proposed algorithm for marginal likelihood estimation. NSE searches the parameter space gradually from low-likelihood to high-likelihood regions, and this evolution is carried out iteratively via a local sampling procedure. Thus, the efficiency of NSE is dominated by the strength of the local sampling procedure. Currently, the Metropolis-Hastings (M-H) algorithm and its variants are often used for local sampling in NSE. However, M-H is not an efficient sampling algorithm for high-dimensional or complex likelihood functions. To improve the performance of NSE, the more efficient and elaborate sampling algorithm DREAMzs could be integrated into the local sampling. In addition, to overcome the computational burden of a large number of repeated model executions in marginal likelihood estimation, an adaptive sparse grid stochastic collocation method is used to build surrogates for the original groundwater model.
A MATLAB toolbox for the efficient estimation of the psychometric function using the updated maximum-likelihood adaptive procedure

PubMed Central

Richards, V. M.; Dai, W.

2014-01-01

A MATLAB toolbox for the efficient estimation of the threshold, slope, and lapse rate of the psychometric function is described. The toolbox enables the efficient implementation of the updated maximum-likelihood (UML) procedure. The toolbox uses an object-oriented architecture for organizing the experimental variables and computational algorithms, which provides experimenters with flexibility in experimental design and data management. Descriptions of the UML procedure and the UML Toolbox are provided, followed by toolbox use examples. Finally, guidelines and recommendations of parameter configurations are given. PMID:24671826
A baseline-free procedure for transformation models under interval censorship.

PubMed

Gu, Ming Gao; Sun, Liuquan; Zuo, Guoxin

2005-12-01

An important property of the Cox regression model is that the estimation of regression parameters using the partial likelihood procedure does not depend on its baseline survival function. We call such a procedure baseline-free. Using marginal likelihood, we show that a baseline-free procedure can be derived for a class of general transformation models under the interval censoring framework. The baseline-free procedure results in a simplified and stable computation algorithm for some complicated and important semiparametric models, such as frailty models and heteroscedastic hazard/rank regression models, where the estimation procedures so far available involve estimation of the infinite dimensional baseline function. A detailed computational algorithm using Markov Chain Monte Carlo stochastic approximation is presented. The proposed procedure is demonstrated through extensive simulation studies, showing the validity of asymptotic consistency and normality. We also illustrate the procedure with a real data set from a study of breast cancer. A heuristic argument showing that the score function is a mean zero martingale is provided.
Estimation of descriptive statistics for multiply censored water quality data

USGS Publications Warehouse

Helsel, Dennis R.; Cohn, Timothy A.

1988-01-01

This paper extends the work of Gilliom and Helsel (1986) on procedures for estimating descriptive statistics of water quality data that contain “less than” observations. Previously, procedures were evaluated when only one detection limit was present. Here we investigate the performance of estimators for data that have multiple detection limits. Probability plotting and maximum likelihood methods perform substantially better than simple substitution procedures now commonly in use. Therefore simple substitution procedures (e.g., substitution of the detection limit) should be avoided. Probability plotting methods are more robust than maximum likelihood methods to misspecification of the parent distribution and their use should be encouraged in the typical situation where the parent distribution is unknown. When utilized correctly, less than values frequently contain nearly as much information for estimating population moments and quantiles as would the same observations had the detection limit been below them.
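To make the maximum likelihood treatment of "less than" observations concrete, here is a minimal sketch. It assumes the data are normally distributed on the log scale, which is an assumption of this example rather than a statement of the paper's setup; each detected value contributes a density term and each censored value contributes the probability of falling below its own detection limit. The coarse grid search and the function names are hypothetical conveniences.

```python
import math
import random


def censored_normal_loglik(mu, sigma, detected, censored_limits):
    """Log-likelihood for left-censored normal data with per-value limits."""
    ll = 0.0
    for y in detected:
        z = (y - mu) / sigma
        ll += -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)
    for limit in censored_limits:
        z = (limit - mu) / sigma
        cdf = 0.5 * math.erfc(-z / math.sqrt(2.0))  # P(value < limit)
        ll += math.log(max(cdf, 1e-300))  # guard against underflow to zero
    return ll


def grid_mle(detected, censored_limits, mu_grid, sigma_grid):
    """Pick the grid point with the highest log-likelihood (crude optimizer)."""
    best = max((censored_normal_loglik(m, s, detected, censored_limits), m, s)
               for m in mu_grid for s in sigma_grid)
    return best[1], best[2]


def simulate_censored(n, mu, sigma, limits, seed=1):
    """Simulate values and censor each against a detection limit chosen in turn."""
    rng = random.Random(seed)
    detected, censored_limits = [], []
    for i in range(n):
        y = rng.gauss(mu, sigma)
        limit = limits[i % len(limits)]
        if y < limit:
            censored_limits.append(limit)
        else:
            detected.append(y)
    return detected, censored_limits


if __name__ == "__main__":
    det, cens = simulate_censored(300, 1.0, 0.5, [0.6, 1.0])
    mu_grid = [0.5 + 0.05 * i for i in range(21)]
    sigma_grid = [0.2 + 0.05 * i for i in range(17)]
    print(grid_mle(det, cens, mu_grid, sigma_grid))
```

In practice a numerical optimizer would replace the grid, and the abstract's caution applies: the likelihood approach depends on the assumed parent distribution being close to correct.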
On non-parametric maximum likelihood estimation of the bivariate survivor function.

PubMed

Prentice, R L

The likelihood function for the bivariate survivor function F, under independent censorship, is maximized to obtain a non-parametric maximum likelihood estimator F̂. F̂ may or may not be unique depending on the configuration of singly- and doubly-censored pairs. The likelihood function can be maximized by placing all mass on the grid formed by the uncensored failure times, or half lines beyond the failure time grid, or in the upper right quadrant beyond the grid. By accumulating the mass along lines (or regions) where the likelihood is flat, one obtains a partially maximized likelihood as a function of parameters that can be uniquely estimated. The score equations corresponding to these point mass parameters are derived, using a Lagrange multiplier technique to ensure unit total mass, and a modified Newton procedure is used to calculate the parameter estimates in some limited simulation studies. Some considerations for the further development of non-parametric bivariate survivor function estimators are briefly described.
Signal detection theory and vestibular perception: III. Estimating unbiased fit parameters for psychometric functions.

PubMed

Chaudhuri, Shomesh E; Merfeld, Daniel M

2013-03-01

Psychophysics generally relies on estimating a subject's ability to perform a specific task as a function of an observed stimulus. For threshold studies, the fitted functions are called psychometric functions. While fitting psychometric functions to data acquired using adaptive sampling procedures (e.g., "staircase" procedures), investigators have encountered a bias in the spread ("slope" or "threshold") parameter that has been attributed to the serial dependency of the adaptive data. Using simulations, we confirm this bias for cumulative Gaussian parametric maximum likelihood fits on data collected via adaptive sampling procedures, and then present a bias-reduced maximum likelihood fit that substantially reduces the bias without reducing the precision of the spread parameter estimate and without reducing the accuracy or precision of the other fit parameters. As a separate topic, we explain how to implement this bias reduction technique using generalized linear model fits as well as other numeric maximum likelihood techniques such as the Nelder-Mead simplex. We then provide a comparison of the iterative bootstrap and observed information matrix techniques for estimating parameter fit variance from adaptive sampling procedure data sets. The iterative bootstrap technique is shown to be slightly more accurate; however, the observed information technique executes in a small fraction (0.005 %) of the time required by the iterative bootstrap technique, which is an advantage when a real-time estimate of parameter fit variance is required.
PACM: A Two-Stage Procedure for Analyzing Structural Models.

ERIC Educational Resources Information Center

Lehmann, Donald R.; Gupta, Sunil

1989-01-01

Path Analysis of Covariance Matrix (PACM) is described as a way to separately estimate measurement and structural models using standard least squares procedures. PACM was empirically compared to simultaneous maximum likelihood estimation and use of the LISREL computer program, and its advantages are identified. (SLD)
Marginal Maximum A Posteriori Item Parameter Estimation for the Generalized Graded Unfolding Model

ERIC Educational Resources Information Center

Roberts, James S.; Thompson, Vanessa M.

2011-01-01

A marginal maximum a posteriori (MMAP) procedure was implemented to estimate item parameters in the generalized graded unfolding model (GGUM). Estimates from the MMAP method were compared with those derived from marginal maximum likelihood (MML) and Markov chain Monte Carlo (MCMC) procedures in a recovery simulation that varied sample size,…
Maximum Likelihood as an Operational Tool in Socio-Economic Modeling : As Outlined in a Recent Thesis of D. W. Peterson

DOT National Transportation Integrated Search

1977-02-01

The limitations of currently used estimation procedures in socio-economic modeling have been highlighted in the ongoing work of Senge, in which it is shown where more sophisticated estimation procedures may become necessary. One such advanced method ...
Approximated maximum likelihood estimation in multifractal random walks

NASA Astrophysics Data System (ADS)

Løvsletten, O.; Rypdal, M.

2012-04-01

We present an approximated maximum likelihood method for the multifractal random walk processes of [E. Bacry et al., Phys. Rev. E 64, 026103 (2001)]. The likelihood is computed using a Laplace approximation and a truncation in the dependency structure for the latent volatility. The procedure is implemented as a package in the R computer language. Its performance is tested on synthetic data and compared to an inference approach based on the generalized method of moments. The method is applied to estimate parameters for various financial stock indices.
New robust statistical procedures for the polytomous logistic regression models.

PubMed

Castilla, Elena; Ghosh, Abhik; Martin, Nirian; Pardo, Leandro

2018-05-17

This article derives a new family of estimators, namely the minimum density power divergence estimators, as a robust generalization of the maximum likelihood estimator for the polytomous logistic regression model. Based on these estimators, a family of Wald-type test statistics for linear hypotheses is introduced. Robustness properties of both the proposed estimators and the test statistics are theoretically studied through the classical influence function analysis. Appropriate real-life examples are presented to justify the requirement of suitable robust statistical procedures in place of the likelihood based inference for the polytomous logistic regression model. The validity of the theoretical results established in the article is further confirmed empirically through suitable simulation studies. Finally, an approach for the data-driven selection of the robustness tuning parameter is proposed with empirical justifications. © 2018, The International Biometric Society.
A Model-Based Approach for Visualizing the Dimensional Structure of Ordered Successive Categories Preference Data

ERIC Educational Resources Information Center

DeSarbo, Wayne S.; Park, Joonwook; Scott, Crystal J.

2008-01-01

A cyclical conditional maximum likelihood estimation procedure is developed for the multidimensional unfolding of two- or three-way dominance data (e.g., preference, choice, consideration) measured on ordered successive category rating scales. The technical description of the proposed model and estimation procedure are discussed, as well as the…
Practical aspects of a maximum likelihood estimation method to extract stability and control derivatives from flight data

NASA Technical Reports Server (NTRS)

Iliff, K. W.; Maine, R. E.

1976-01-01

A maximum likelihood estimation method was applied to flight data and procedures to facilitate the routine analysis of a large amount of flight data were described. Techniques that can be used to obtain stability and control derivatives from aircraft maneuvers that are less than ideal for this purpose are described. The techniques involve detecting and correcting the effects of dependent or nearly dependent variables, structural vibration, data drift, inadequate instrumentation, and difficulties with the data acquisition system and the mathematical model. The use of uncertainty levels and multiple maneuver analysis also proved to be useful in improving the quality of the estimated coefficients. The procedures used for editing the data and for overall analysis are also discussed.
Comparing Three Estimation Methods for the Three-Parameter Logistic IRT Model

ERIC Educational Resources Information Center

Lamsal, Sunil

2015-01-01

Different estimation procedures have been developed for the unidimensional three-parameter item response theory (IRT) model. These techniques include the marginal maximum likelihood estimation, the fully Bayesian estimation using Markov chain Monte Carlo simulation techniques, and the Metropolis-Hastings Robbins-Monro estimation. With each…
Computation of nonlinear least squares estimator and maximum likelihood using principles in matrix calculus

NASA Astrophysics Data System (ADS)

Mahaboob, B.; Venkateswarlu, B.; Sankar, J. Ravi; Balasiddamuni, P.

2017-11-01

This paper uses matrix calculus techniques to obtain the Nonlinear Least Squares Estimator (NLSE), the Maximum Likelihood Estimator (MLE) and a linear pseudo model for the nonlinear regression model. David Pollard and Peter Radchenko [1] explained analytic techniques to compute the NLSE. The present paper instead introduces a method to compute the NLSE using principles of multivariate calculus, and is concerned with new optimization techniques for computing the MLE and NLSE. Anh [2] derived the NLSE and MLE of a heteroscedastic regression model. Lemcoff [3] discussed a procedure to get a linear pseudo model for a nonlinear regression model. In this article a new technique is developed to get the linear pseudo model for a nonlinear regression model using multivariate calculus. The linear pseudo model of Edmond Malinvaud [4] is explained in a different way in this paper. In 2006, David Pollard et al. used empirical process techniques to study the asymptotics of least-squares estimation (LSE) for the fitting of a nonlinear regression function. In Jae Myung [13] provided a conceptual introduction to maximum likelihood estimation in his work “Tutorial on maximum likelihood estimation”.
Estimating Function Approaches for Spatial Point Processes

NASA Astrophysics Data System (ADS)

Deng, Chong

Spatial point pattern data consist of locations of events that are often of interest in biological and ecological studies. Such data are commonly viewed as a realization from a stochastic process called a spatial point process. To fit a parametric spatial point process model to such data, likelihood-based methods have been widely studied. However, while maximum likelihood estimation is often too computationally intensive for Cox and cluster processes, pairwise likelihood methods such as composite likelihood and Palm likelihood usually suffer from a loss of information because they ignore the correlation among pairs. For many types of correlated data other than spatial point processes, when likelihood-based approaches are not desirable, estimating functions have been widely used for model fitting. In this dissertation, we explore the estimating function approaches for fitting spatial point process models. These approaches, which are based on the asymptotic optimal estimating function theories, can be used to incorporate the correlation among data and yield more efficient estimators. We conducted a series of studies to demonstrate that these estimating function approaches are good alternatives to balance the trade-off between computational complexity and estimating efficiency. First, we propose a new estimating procedure that improves the efficiency of the pairwise composite likelihood method in estimating clustering parameters. Our approach combines estimating functions derived from pairwise composite likelihood estimation and estimating functions that account for correlations among the pairwise contributions. Our method can be used to fit a variety of parametric spatial point process models and can yield more efficient estimators for the clustering parameters than pairwise composite likelihood estimation. We demonstrate its efficacy through a simulation study and an application to the longleaf pine data.
Second, we further explore the quasi-likelihood approach for fitting the second-order intensity function of spatial point processes. However, the original second-order quasi-likelihood is barely feasible due to the intense computation and high memory requirement needed to solve a large linear system. Motivated by the existence of geometric regular patterns in the stationary point processes, we find a lower dimension representation of the optimal weight function and propose a reduced second-order quasi-likelihood approach. Through a simulation study, we show that the proposed method not only demonstrates superior performance in fitting the clustering parameter but also has the merit of relaxing the constraint on the tuning parameter, H. Third, we studied the quasi-likelihood type estimating function that is optimal in a certain class of first-order estimating functions for estimating the regression parameter in spatial point process models. Then, by using a novel spectral representation, we construct an implementation that is computationally much more efficient and can be applied to a more general setup than the original quasi-likelihood method.
Using the β-binomial distribution to characterize forest health

Treesearch

S.J. Zarnoch; R.L. Anderson; R.M. Sheffield

1995-01-01

The β-binomial distribution is suggested as a model for describing and analyzing the dichotomous data obtained from programs monitoring the health of forests in the United States. Maximum likelihood estimation of the parameters is given as well as asymptotic likelihood ratio tests. The procedure is illustrated with data on dogwood anthracnose infection (caused...
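To make the estimation step concrete, the following is a minimal sketch of a β-binomial log-likelihood and a coarse grid search for its two shape parameters. The parameter names `a` and `b`, the grid, and the plot counts are illustrative assumptions and are not taken from the report, which may use a different parameterization and a proper numerical optimizer.

```python
import math

def log_beta(a, b):
    """Logarithm of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def betabinom_logpmf(k, n, a, b):
    """Log-probability of k affected units out of n under a beta-binomial(a, b)."""
    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_choose + log_beta(k + a, n - k + b) - log_beta(a, b)

def log_likelihood(data, a, b):
    """Sum of log-probabilities over (k, n) pairs, one pair per sampled plot."""
    return sum(betabinom_logpmf(k, n, a, b) for k, n in data)

def grid_mle(data, grid):
    """Crude ML estimate: the (a, b) pair on the grid with the largest log-likelihood."""
    pairs = ((a, b) for a in grid for b in grid)
    return max(pairs, key=lambda ab: log_likelihood(data, ab[0], ab[1]))

# Hypothetical data: (infected trees, trees examined) on five plots.
plots = [(0, 10), (2, 10), (9, 10), (1, 10), (10, 10)]
grid = [0.25 * i for i in range(1, 41)]  # 0.25, 0.50, ..., 10.0
a_hat, b_hat = grid_mle(plots, grid)
print(a_hat, b_hat, log_likelihood(plots, a_hat, b_hat))
```

A grid search is used here only because it needs no external library; a likelihood ratio test of the kind mentioned in the abstract would compare maximized log-likelihoods from two nested fits.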
New method to incorporate Type B uncertainty into least-squares procedures in radionuclide metrology.

PubMed

Han, Jubong; Lee, K B; Lee, Jong-Man; Park, Tae Soon; Oh, J S; Oh, Pil-Jei

2016-03-01

We discuss a new method to incorporate Type B uncertainty into least-squares procedures. The new method is based on an extension of the likelihood function from which a conventional least-squares function is derived. The extended likelihood function is the product of the original likelihood function with additional PDFs (Probability Density Functions) that characterize the Type B uncertainties. The PDFs are considered to describe one's incomplete knowledge of correction factors, which are called nuisance parameters. We use the extended likelihood function to make point and interval estimations of parameters in basically the same way as the least-squares function used in the conventional least-squares method is derived. Since the nuisance parameters are not of interest and should be prevented from appearing in the final result, we eliminate such nuisance parameters by using the profile likelihood. As an example, we present a case study for a linear regression analysis with a common component of Type B uncertainty. In this example we compare the analysis results obtained from using our procedure with those from conventional methods. Copyright © 2015. Published by Elsevier Ltd.
Maximum likelihood estimation and EM algorithm of Copas-like selection model for publication bias correction.

PubMed

Ning, Jing; Chen, Yong; Piao, Jin

2017-07-01

Publication bias occurs when the published research results are systematically unrepresentative of the population of studies that have been conducted, and is a potential threat to meaningful meta-analysis. The Copas selection model provides a flexible framework for correcting estimates and offers considerable insight into the publication bias. However, maximizing the observed likelihood under the Copas selection model is challenging because the observed data contain very little information on the latent variable. In this article, we study a Copas-like selection model and propose an expectation-maximization (EM) algorithm for estimation based on the full likelihood. Empirical simulation studies show that the EM algorithm and its associated inferential procedure perform well and avoid the non-convergence problem encountered when maximizing the observed likelihood. © The Author 2017. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

Composite Partial Likelihood Estimation Under Length-Biased Sampling, With Application to a Prevalent Cohort Study of Dementia

PubMed Central

Huang, Chiung-Yu; Qin, Jing

2013-01-01

The Canadian Study of Health and Aging (CSHA) employed a prevalent cohort design to study survival after onset of dementia, where patients with dementia were sampled and the onset time of dementia was determined retrospectively. The prevalent cohort sampling scheme favors individuals who survive longer. Thus, the observed survival times are subject to length bias. In recent years, there has been a rising interest in developing estimation procedures for prevalent cohort survival data that not only account for length bias but also actually exploit the incidence distribution of the disease to improve efficiency. This article considers semiparametric estimation of the Cox model for the time from dementia onset to death under a stationarity assumption with respect to the disease incidence. Under the stationarity condition, the semiparametric maximum likelihood estimation is expected to be fully efficient yet difficult to perform for statistical practitioners, as the likelihood depends on the baseline hazard function in a complicated way. Moreover, the asymptotic properties of the semiparametric maximum likelihood estimator are not well-studied. Motivated by the composite likelihood method (Besag 1974), we develop a composite partial likelihood method that retains the simplicity of the popular partial likelihood estimator and can be easily performed using standard statistical software. When applied to the CSHA data, the proposed method estimates a significant difference in survival between the vascular dementia group and the possible Alzheimer’s disease group, while the partial likelihood method for left-truncated and right-censored data yields a greater standard error and a 95% confidence interval covering 0, thus highlighting the practical value of employing a more efficient methodology. To check the assumption of stable disease for the CSHA data, we also present new graphical and numerical tests in the article. 
The R code used to obtain the maximum composite partial likelihood estimator for the CSHA data is available in the online Supplementary Material, posted on the journal web site. PMID:24000265
ReplacementMatrix: a web server for maximum-likelihood estimation of amino acid replacement rate matrices.

PubMed

Dang, Cuong Cao; Lefort, Vincent; Le, Vinh Sy; Le, Quang Si; Gascuel, Olivier

2011-10-01

Amino acid replacement rate matrices are an essential basis of protein studies (e.g. in phylogenetics and alignment). A number of general purpose matrices have been proposed (e.g. JTT, WAG, LG) since the seminal work of Margaret Dayhoff and co-workers. However, it has been shown that matrices specific to certain protein groups (e.g. mitochondrial) or life domains (e.g. viruses) differ significantly from general average matrices, and thus perform better when applied to the data to which they are dedicated. This Web server implements the maximum-likelihood estimation procedure that was used to estimate LG, and provides a number of tools and facilities. Users upload a set of multiple protein alignments from their domain of interest and receive the resulting matrix by email, along with statistics and comparisons with other matrices. A non-parametric bootstrap is performed optionally to assess the variability of replacement rate estimates. Maximum-likelihood trees, inferred using the estimated rate matrix, are also computed optionally for each input alignment. Finely tuned procedures and up-to-date ML software (PhyML 3.0, XRATE) are combined to perform all these heavy calculations on our clusters. Availability: http://www.atgc-montpellier.fr/ReplacementMatrix/ Contact: olivier.gascuel@lirmm.fr Supplementary data are available at http://www.atgc-montpellier.fr/ReplacementMatrix/
Large Area Crop Inventory Experiment (LACIE). Development of procedure M for multicrop inventory, with tests of a spring-wheat configuration

NASA Technical Reports Server (NTRS)

Horvath, R. (Principal Investigator); Cicone, R.; Crist, E.; Kauth, R. J.; Lambeck, P.; Malila, W. A.; Richardson, W.

1979-01-01

The author has identified the following significant results. An outgrowth of research and development activities in support of LACIE was a multicrop area estimation procedure, Procedure M. This procedure was a flexible, modular system that could be operated within the LACIE framework. Its distinctive features were refined preprocessing (including spatially varying correction for atmospheric haze), definition of field-like spatial features for labeling, spectral stratification, unbiased selection of samples to label, and crop area estimation without conventional maximum likelihood classification.
Standard and goodness-of-fit parameter estimation methods for the three-parameter lognormal distribution

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kane, V.E.

1982-01-01

A class of goodness-of-fit estimators is found to provide a useful alternative in certain situations to the standard maximum likelihood method, which has some undesirable estimation characteristics for the three-parameter lognormal distribution. The class of goodness-of-fit tests considered includes the Shapiro-Wilk and Filliben tests, which reduce to a weighted linear combination of the order statistics that can be maximized in estimation problems. The weighted order statistic estimators are compared to the standard procedures in Monte Carlo simulations. Robustness of the procedures is examined and example data sets are analyzed.
A general methodology for maximum likelihood inference from band-recovery data

USGS Publications Warehouse

Conroy, M.J.; Williams, B.K.

1984-01-01

A numerical procedure is described for obtaining maximum likelihood estimates and associated maximum likelihood inference from band-recovery data. The method is used to illustrate previously developed one-age-class band-recovery models, and is extended to new models, including the analysis with a covariate for survival rates and variable-time-period recovery models. Extensions to R-age-class band-recovery, mark-recapture models, and twice-yearly marking are discussed. A FORTRAN program provides computations for these models.
Refinement of a Bias-Correction Procedure for the Weighted Likelihood Estimator of Ability. Research Report. ETS RR-07-23

ERIC Educational Resources Information Center

Zhang, Jinming; Lu, Ting

2007-01-01

In practical applications of item response theory (IRT), item parameters are usually estimated first from a calibration sample. After treating these estimates as fixed and known, ability parameters are then estimated. However, the statistical inferences based on the estimated abilities can be misleading if the uncertainty of the item parameter…
Comparing adaptive procedures for estimating the psychometric function for an auditory gap detection task.

PubMed

Shen, Yi

2013-05-01

A subject's sensitivity to a stimulus variation can be studied by estimating the psychometric function. Generally speaking, three parameters of the psychometric function are of interest: the performance threshold, the slope of the function, and the rate at which attention lapses occur. In the present study, three psychophysical procedures were used to estimate the three-parameter psychometric function for an auditory gap detection task. These were an up-down staircase (up-down) procedure, an entropy-based Bayesian (entropy) procedure, and an updated maximum-likelihood (UML) procedure. Data collected from four young, normal-hearing listeners showed that while all three procedures provided similar estimates of the threshold parameter, the up-down procedure performed slightly better in estimating the slope and lapse rate for 200 trials of data collection. When the lapse rate was increased by mixing in random responses for the three adaptive procedures, the larger lapse rate was especially detrimental to the efficiency of the up-down procedure, and the UML procedure provided better estimates of the threshold and slope than did the other two procedures.
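As a rough illustration of the three parameters named above, the sketch below writes a psychometric function with a threshold, a slope, and a lapse rate, and fits the threshold by maximum likelihood over a small set of candidates. The logistic form, the guess rate of 0.5, the fixed slope and lapse values, and the trial counts are all assumptions made for the example; the study itself may use a different functional form and fits all three parameters adaptively.

```python
import math

def psychometric(x, threshold, slope, lapse, guess=0.5):
    """Probability of a correct response at stimulus level x (logistic form assumed)."""
    core = 1.0 / (1.0 + math.exp(-slope * (x - threshold)))
    return guess + (1.0 - guess - lapse) * core

def log_likelihood(trials, threshold, slope, lapse, guess=0.5):
    """Binomial log-likelihood of (level, n_correct, n_trials) records."""
    total = 0.0
    for level, n_correct, n_trials in trials:
        p = psychometric(level, threshold, slope, lapse, guess)
        total += n_correct * math.log(p) + (n_trials - n_correct) * math.log(1.0 - p)
    return total

def fit_threshold(trials, candidates, slope, lapse, guess=0.5):
    """Candidate threshold with the largest log-likelihood (slope and lapse held fixed)."""
    return max(candidates,
               key=lambda t: log_likelihood(trials, t, slope, lapse, guess))

# Hypothetical data: (stimulus level, correct responses, trials at that level).
trials = [(1, 10, 20), (2, 10, 20), (3, 11, 20), (4, 15, 20),
          (5, 19, 20), (6, 20, 20), (7, 20, 20)]
best = fit_threshold(trials, candidates=[1, 2, 3, 4, 5, 6, 7], slope=2.0, lapse=0.02)
print(best)
```

Keeping the lapse rate strictly positive also keeps `math.log(1.0 - p)` finite when a listener is correct on every trial at a given level.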
A Bayesian approach to parameter and reliability estimation in the Poisson distribution.

NASA Technical Reports Server (NTRS)

Canavos, G. C.

1972-01-01

For life testing procedures, a Bayesian analysis is developed with respect to a random intensity parameter in the Poisson distribution. Bayes estimators are derived for the Poisson parameter and the reliability function based on uniform and gamma prior distributions of that parameter. A Monte Carlo procedure is implemented to make possible an empirical mean-squared error comparison between Bayes and existing minimum variance unbiased, as well as maximum likelihood, estimators. As expected, the Bayes estimators have mean-squared errors that are appreciably smaller than those of the other two.
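The comparison described above can be sketched in a few lines for the Poisson rate alone. The example assumes a gamma prior with hypothetical shape `a` and rate `b` and squared-error loss, so the Bayes estimator is the posterior mean (a + Σx)/(b + n), while the maximum likelihood estimator is the sample mean. The function names, prior values, and simulation settings are illustrative and not taken from the report; the reliability function and the uniform prior are not covered.

```python
import math
import random

def poisson_mle(counts):
    """Maximum likelihood estimate of the Poisson rate: the sample mean."""
    return sum(counts) / len(counts)

def poisson_bayes_gamma(counts, a, b):
    """Posterior mean of the rate under a Gamma(shape=a, rate=b) prior."""
    return (a + sum(counts)) / (b + len(counts))

def draw_poisson(rate, rng):
    """Draw one Poisson variate by Knuth's multiplication method."""
    limit = math.exp(-rate)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def mse_comparison(a, b, n, reps, seed=0):
    """Monte Carlo mean-squared errors of both estimators when the rate is random."""
    rng = random.Random(seed)
    se_mle = se_bayes = 0.0
    for _ in range(reps):
        rate = rng.gammavariate(a, 1.0 / b)  # gammavariate takes a scale, so use 1/b
        counts = [draw_poisson(rate, rng) for _ in range(n)]
        se_mle += (poisson_mle(counts) - rate) ** 2
        se_bayes += (poisson_bayes_gamma(counts, a, b) - rate) ** 2
    return se_mle / reps, se_bayes / reps

mse_mle, mse_bayes = mse_comparison(a=2.0, b=2.0, n=3, reps=5000)
print(mse_mle, mse_bayes)
```

The advantage of the Bayes estimator in this sketch depends on the rate actually being drawn from the assumed prior; with a badly chosen prior the ordering can reverse.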
A stochastic estimation procedure for intermittently-observed semi-Markov multistate models with back transitions.

PubMed

Aralis, Hilary; Brookmeyer, Ron

2017-01-01

Multistate models provide an important method for analyzing a wide range of life history processes including disease progression and patient recovery following medical intervention. Panel data consisting of the states occupied by an individual at a series of discrete time points are often used to estimate transition intensities of the underlying continuous-time process. When transition intensities depend on the time elapsed in the current state and back transitions between states are possible, this intermittent observation process presents difficulties in estimation due to intractability of the likelihood function. In this manuscript, we present an iterative stochastic expectation-maximization algorithm that relies on a simulation-based approximation to the likelihood function and implement this algorithm using rejection sampling. In a simulation study, we demonstrate the feasibility and performance of the proposed procedure. We then demonstrate application of the algorithm to a study of dementia, the Nun Study, consisting of intermittently-observed elderly subjects in one of four possible states corresponding to intact cognition, impaired cognition, dementia, and death. We show that the proposed stochastic expectation-maximization algorithm substantially reduces bias in model parameter estimates compared to an alternative approach used in the literature, minimal path estimation. We conclude that in estimating intermittently observed semi-Markov models, the proposed approach is a computationally feasible and accurate estimation procedure that leads to substantial improvements in back transition estimates.
A maximum likelihood algorithm for genome mapping of cytogenetic loci from meiotic configuration data.

PubMed Central

Reyes-Valdés, M H; Stelly, D M

1995-01-01

Frequencies of meiotic configurations in cytogenetic stocks are dependent on chiasma frequencies in segments defined by centromeres, breakpoints, and telomeres. The expectation maximization algorithm is proposed as a general method to perform maximum likelihood estimations of the chiasma frequencies in the intervals between such locations. The estimates can be translated via mapping functions into genetic maps of cytogenetic landmarks. One set of observational data was analyzed to exemplify application of these methods, results of which were largely concordant with other comparable data. The method was also tested by Monte Carlo simulation of frequencies of meiotic configurations from a monotelodisomic translocation heterozygote, assuming six different sample sizes. The estimate averages were always close to the values given initially to the parameters. The maximum likelihood estimation procedures can be extended readily to other kinds of cytogenetic stocks and allow the pooling of diverse cytogenetic data to collectively estimate lengths of segments, arms, and chromosomes. PMID:7568226
Simple Penalties on Maximum-Likelihood Estimates of Genetic Parameters to Reduce Sampling Variation

PubMed Central

Meyer, Karin

2016-01-01

Multivariate estimates of genetic parameters are subject to substantial sampling variation, especially for smaller data sets and more than a few traits. A simple modification of standard, maximum-likelihood procedures for multivariate analyses to estimate genetic covariances is described, which can improve estimates by substantially reducing their sampling variances. This is achieved by maximizing the likelihood subject to a penalty. Borrowing from Bayesian principles, we propose a mild, default penalty—derived assuming a Beta distribution of scale-free functions of the covariance components to be estimated—rather than laboriously attempting to determine the stringency of penalization from the data. An extensive simulation study is presented, demonstrating that such penalties can yield very worthwhile reductions in loss, i.e., the difference from population values, for a wide range of scenarios and without distorting estimates of phenotypic covariances. Moreover, mild default penalties tend not to increase loss in difficult cases and, on average, achieve reductions in loss of similar magnitude to computationally demanding schemes to optimize the degree of penalization. Pertinent details required for the adaptation of standard algorithms to locate the maximum of the likelihood function are outlined. PMID:27317681
Two-part models with stochastic processes for modelling longitudinal semicontinuous data: Computationally efficient inference and modelling the overall marginal mean.

PubMed

Yiu, Sean; Tom, Brian Dm

2017-01-01

Several researchers have described two-part models with patient-specific stochastic processes for analysing longitudinal semicontinuous data. In theory, such models can offer greater flexibility than the standard two-part model with patient-specific random effects. However, in practice, the high dimensional integrations involved in the marginal likelihood (i.e. integrated over the stochastic processes) significantly complicate model fitting. Thus, only non-standard, computationally intensive procedures based on simulating the marginal likelihood have so far been proposed. In this paper, we describe an efficient method of implementation by demonstrating how the high dimensional integrations involved in the marginal likelihood can be computed efficiently. Specifically, by using a property of the multivariate normal distribution and the standard marginal cumulative distribution function identity, we transform the marginal likelihood so that the high dimensional integrations are contained in the cumulative distribution function of a multivariate normal distribution, which can then be efficiently evaluated. Hence, maximum likelihood estimation can be used to obtain parameter estimates and asymptotic standard errors (from the observed information matrix) of model parameters. We describe our proposed efficient implementation procedure for the standard two-part model parameterisation and when it is of interest to directly model the overall marginal mean. The methodology is applied on a psoriatic arthritis data set concerning functional disability.
A method for modeling bias in a person's estimates of likelihoods of events

NASA Technical Reports Server (NTRS)

Nygren, Thomas E.; Morera, Osvaldo

1988-01-01

It is of practical importance in decision situations involving risk to train individuals to transform uncertainties into subjective probability estimates that are both accurate and unbiased. We have found that in decision situations involving risk, people often introduce subjective bias in their estimation of the likelihoods of events depending on whether the possible outcomes are perceived as being good or bad. Until now, however, the successful measurement of individual differences in the magnitude of such biases has not been attempted. In this paper we illustrate a modification of a procedure originally outlined by Davidson, Suppes, and Siegel (3) to allow for a quantitatively-based methodology for simultaneously estimating an individual's subjective utility and subjective probability functions. The procedure is now an interactive computer-based algorithm, DSS, that allows for the measurement of biases in probability estimation by obtaining independent measures of two subjective probability functions (S+ and S-) for winning (i.e., good outcomes) and for losing (i.e., bad outcomes) respectively for each individual, and for different experimental conditions within individuals. The algorithm and some recent empirical data are described.
Empirical Likelihood-Based Estimation of the Treatment Effect in a Pretest-Posttest Study.

PubMed

Huang, Chiung-Yu; Qin, Jing; Follmann, Dean A

2008-09-01

The pretest-posttest study design is commonly used in medical and social science research to assess the effect of a treatment or an intervention. Recently, interest has been rising in developing inference procedures that improve efficiency while relaxing assumptions used in the pretest-posttest data analysis, especially when the posttest measurement might be missing. In this article we propose a semiparametric estimation procedure based on empirical likelihood (EL) that incorporates the common baseline covariate information to improve efficiency. The proposed method also yields an asymptotically unbiased estimate of the response distribution. Thus functions of the response distribution, such as the median, can be estimated straightforwardly, and the EL method can provide a more appealing estimate of the treatment effect for skewed data. We show that, compared with existing methods, the proposed EL estimator has appealing theoretical properties, especially when the working model for the underlying relationship between the pretest and posttest measurements is misspecified. A series of simulation studies demonstrates that the EL-based estimator outperforms its competitors when the working model is misspecified and the data are missing at random. We illustrate the methods by analyzing data from an AIDS clinical trial (ACTG 175).
Empirical Likelihood-Based Estimation of the Treatment Effect in a Pretest–Posttest Study

PubMed Central

Huang, Chiung-Yu; Qin, Jing; Follmann, Dean A.

2013-01-01

The pretest–posttest study design is commonly used in medical and social science research to assess the effect of a treatment or an intervention. Recently, interest has been rising in developing inference procedures that improve efficiency while relaxing assumptions used in the pretest–posttest data analysis, especially when the posttest measurement might be missing. In this article we propose a semiparametric estimation procedure based on empirical likelihood (EL) that incorporates the common baseline covariate information to improve efficiency. The proposed method also yields an asymptotically unbiased estimate of the response distribution. Thus functions of the response distribution, such as the median, can be estimated straightforwardly, and the EL method can provide a more appealing estimate of the treatment effect for skewed data. We show that, compared with existing methods, the proposed EL estimator has appealing theoretical properties, especially when the working model for the underlying relationship between the pretest and posttest measurements is misspecified. A series of simulation studies demonstrates that the EL-based estimator outperforms its competitors when the working model is misspecified and the data are missing at random. We illustrate the methods by analyzing data from an AIDS clinical trial (ACTG 175). PMID:23729942
Applied Missing Data Analysis. Methodology in the Social Sciences Series

ERIC Educational Resources Information Center

Enders, Craig K.

2010-01-01

Walking readers step by step through complex concepts, this book translates missing data techniques into something that applied researchers and graduate students can understand and utilize in their own research. Enders explains the rationale and procedural details for maximum likelihood estimation, Bayesian estimation, multiple imputation, and…
Group Comparisons in the Presence of Missing Data Using Latent Variable Modeling Techniques

ERIC Educational Resources Information Center

Raykov, Tenko; Marcoulides, George A.

2010-01-01

A latent variable modeling approach for examining population similarities and differences in observed variable relationship and mean indexes in incomplete data sets is discussed. The method is based on the full information maximum likelihood procedure of model fitting and parameter estimation. The procedure can be employed to test group identities…
A real-time digital program for estimating aircraft stability and control parameters from flight test data by using the maximum likelihood method

NASA Technical Reports Server (NTRS)

Grove, R. D.; Mayhew, S. C.

1973-01-01

A computer program (Langley program C1123) has been developed for estimating aircraft stability and control parameters from flight test data. These parameters are estimated by the maximum likelihood estimation procedure implemented on a real-time digital simulation system, which uses the Control Data 6600 computer. This system allows the investigator to interact with the program in order to obtain satisfactory results. Part of this system, the control and display capabilities, is described for this program. This report also describes the computer program by presenting the program variables, subroutines, flow charts, listings, and operational features. Program usage is demonstrated with a test case using pseudo or simulated flight data.
Estimating Interaction Effects With Incomplete Predictor Variables

PubMed Central

Enders, Craig K.; Baraldi, Amanda N.; Cham, Heining

2014-01-01

The existing missing data literature does not provide a clear prescription for estimating interaction effects with missing data, particularly when the interaction involves a pair of continuous variables. In this article, we describe maximum likelihood and multiple imputation procedures for this common analysis problem. We outline 3 latent variable model specifications for interaction analyses with missing data. These models apply procedures from the latent variable interaction literature to analyses with a single indicator per construct (e.g., a regression analysis with scale scores). We also discuss multiple imputation for interaction effects, emphasizing an approach that applies standard imputation procedures to the product of 2 raw score predictors. We thoroughly describe the process of probing interaction effects with maximum likelihood and multiple imputation. For both missing data handling techniques, we outline centering and transformation strategies that researchers can implement in popular software packages, and we use a series of real data analyses to illustrate these methods. Finally, we use computer simulations to evaluate the performance of the proposed techniques. PMID:24707955
PERIODIC AUTOREGRESSIVE-MOVING AVERAGE (PARMA) MODELING WITH APPLICATIONS TO WATER RESOURCES.

USGS Publications Warehouse

Vecchia, A.V.

1985-01-01

Results involving correlation properties and parameter estimation for autoregressive-moving average models with periodic parameters are presented. A multivariate representation of the PARMA model is used to derive parameter space restrictions and difference equations for the periodic autocorrelations. Close approximation to the likelihood function for Gaussian PARMA processes results in efficient maximum-likelihood estimation procedures. Terms in the Fourier expansion of the parameters are sequentially included, and a selection criterion is given for determining the optimal number of harmonics to be included. Application of the techniques is demonstrated through analysis of a monthly streamflow time series.

Robust Gaussian Graphical Modeling via l1 Penalization

PubMed Central

Sun, Hokeun; Li, Hongzhe

2012-01-01

Gaussian graphical models have been widely used as an effective method for studying the conditional independency structure among genes and for constructing genetic networks. However, gene expression data typically have heavier tails or more outlying observations than the standard Gaussian distribution. Such outliers in gene expression data can lead to wrong inference on the dependency structure among the genes. We propose an l1-penalized estimation procedure for sparse Gaussian graphical models that is robustified against possible outliers. The likelihood function is weighted according to how far each observation deviates, where the deviation is measured by the observation's own likelihood. An efficient computational algorithm based on the coordinate gradient descent method is developed to obtain the minimizer of the negative penalized robustified likelihood, where nonzero elements of the concentration matrix represent the graphical links among the genes. After the graphical structure is obtained, we re-estimate the positive definite concentration matrix using an iterative proportional fitting algorithm. Through simulations, we demonstrate that the proposed robust method performs much better than the graphical Lasso for the Gaussian graphical models in terms of both graph structure selection and estimation when outliers are present. We apply the robust estimation procedure to an analysis of yeast gene expression data and show that the resulting graph has a better biological interpretation than that obtained from the graphical Lasso. PMID:23020775
Use of inequality constrained least squares estimation in small area estimation

NASA Astrophysics Data System (ADS)

Abeygunawardana, R. A. B.; Wickremasinghe, W. N.

2017-05-01

Traditional surveys provide estimates that are based only on the sample observations collected for the population characteristic of interest. However, these estimates may have unacceptably large variance for certain domains. Small Area Estimation (SAE) deals with determining precise and accurate estimates for population characteristics of interest for such domains. SAE usually uses least squares or maximum likelihood procedures incorporating prior information and current survey data. Many available methods in SAE use constraints in equality form. However there are practical situations where certain inequality restrictions on model parameters are more realistic. It will lead to Inequality Constrained Least Squares (ICLS) estimates if the method used is least squares. In this study ICLS estimation procedure is applied to many proposed small area estimates.
Fitting a three-parameter lognormal distribution with applications to hydrogeochemical data from the National Uranium Resource Evaluation Program

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kane, V.E.

1979-10-01

The standard maximum likelihood and moment estimation procedures are shown to have some undesirable characteristics for estimating the parameters in a three-parameter lognormal distribution. A class of goodness-of-fit estimators is found which provides a useful alternative to the standard methods. The class of goodness-of-fit tests considered includes the Shapiro-Wilk and Shapiro-Francia tests, which reduce to a weighted linear combination of the order statistics that can be maximized in estimation problems. The weighted-order statistic estimators are compared to the standard procedures in Monte Carlo simulations. Bias and robustness of the procedures are examined and example data sets analyzed, including geochemical data from the National Uranium Resource Evaluation Program.
How to Deal with Interval-Censored Data Practically while Assessing the Progression-Free Survival: A Step-by-Step Guide Using SAS and R Software.

PubMed

Dugué, Audrey Emmanuelle; Pulido, Marina; Chabaud, Sylvie; Belin, Lisa; Gal, Jocelyn

2016-12-01

We describe how to estimate progression-free survival while dealing with interval-censored data in the setting of clinical trials in oncology. Three procedures with SAS and R statistical software are described: one allowing for a nonparametric maximum likelihood estimation of the survival curve using the EM-ICM (Expectation and Maximization-Iterative Convex Minorant) algorithm as described by Wellner and Zhan in 1997; a sensitivity analysis procedure in which the progression time is assigned (i) at the midpoint, (ii) at the upper limit (reflecting the standard analysis when the progression time is assigned at the first radiologic exam showing progressive disease), or (iii) at the lower limit of the censoring interval; and finally, two multiple imputations are described considering a uniform or the nonparametric maximum likelihood estimation (NPMLE) distribution. Clin Cancer Res; 22(23); 5629-35. ©2016 AACR.
The Role of Parametric Assumptions in Adaptive Bayesian Estimation

ERIC Educational Resources Information Center

Alcala-Quintana, Rocio; Garcia-Perez, Miguel A.

2004-01-01

Variants of adaptive Bayesian procedures for estimating the 5% point on a psychometric function were studied by simulation. Bias and standard error were the criteria to evaluate performance. The results indicated a superiority of (a) uniform priors, (b) model likelihood functions that are odd symmetric about threshold and that have parameter…
Logistic Achievement Test Scaling and Equating with Fixed versus Estimated Lower Asymptotes.

ERIC Educational Resources Information Center

Phillips, S. E.

This study compared the lower asymptotes estimated by the maximum likelihood procedures of the LOGIST computer program with those obtained via application of the Norton methodology. The study also compared the equating results from the three-parameter logistic model with those obtained from the equipercentile, Rasch, and conditional…
Parameter Estimation for a Model of Space-Time Rainfall

NASA Astrophysics Data System (ADS)

Smith, James A.; Karr, Alan F.

1985-08-01

In this paper, parameter estimation procedures, based on data from a network of rainfall gages, are developed for a class of space-time rainfall models. The models, which are designed to represent the spatial distribution of daily rainfall, have three components, one that governs the temporal occurrence of storms, a second that distributes rain cells spatially for a given storm, and a third that determines the rainfall pattern within a rain cell. Maximum likelihood and method of moments procedures are developed. We illustrate that limitations on model structure are imposed by restricting data sources to rain gage networks. The estimation procedures are applied to a 240-mi2 (621 km2) catchment in the Potomac River basin.
Load estimator (LOADEST): a FORTRAN program for estimating constituent loads in streams and rivers

USGS Publications Warehouse

Runkel, Robert L.; Crawford, Charles G.; Cohn, Timothy A.

2004-01-01

LOAD ESTimator (LOADEST) is a FORTRAN program for estimating constituent loads in streams and rivers. Given a time series of streamflow, additional data variables, and constituent concentration, LOADEST assists the user in developing a regression model for the estimation of constituent load (calibration). Explanatory variables within the regression model include various functions of streamflow, decimal time, and additional user-specified data variables. The formulated regression model then is used to estimate loads over a user-specified time interval (estimation). Mean load estimates, standard errors, and 95 percent confidence intervals are developed on a monthly and(or) seasonal basis. The calibration and estimation procedures within LOADEST are based on three statistical estimation methods. The first two methods, Adjusted Maximum Likelihood Estimation (AMLE) and Maximum Likelihood Estimation (MLE), are appropriate when the calibration model errors (residuals) are normally distributed. Of the two, AMLE is the method of choice when the calibration data set (time series of streamflow, additional data variables, and concentration) contains censored data. The third method, Least Absolute Deviation (LAD), is an alternative to maximum likelihood estimation when the residuals are not normally distributed. LOADEST output includes diagnostic tests and warnings to assist the user in determining the appropriate estimation method and in interpreting the estimated loads. This report describes the development and application of LOADEST. Sections of the report describe estimation theory, input/output specifications, sample applications, and installation instructions.
Hurdle models for multilevel zero-inflated data via h-likelihood.

PubMed

Molas, Marek; Lesaffre, Emmanuel

2010-12-30

Count data often exhibit overdispersion. One type of overdispersion arises when there is an excess of zeros in comparison with the standard Poisson distribution. Zero-inflated Poisson and hurdle models have been proposed to perform a valid likelihood-based analysis to account for the surplus of zeros. Further, data often arise in clustered, longitudinal or multiple-membership settings. The proper analysis needs to reflect the design of a study. Typically random effects are used to account for dependencies in the data. We examine the h-likelihood estimation and inference framework for hurdle models with random effects for complex designs. We extend the h-likelihood procedures to fit hurdle models, thereby extending h-likelihood to truncated distributions. Two applications of the methodology are presented. Copyright © 2010 John Wiley & Sons, Ltd.
Statistical inference based on the nonparametric maximum likelihood estimator under double-truncation.

PubMed

Emura, Takeshi; Konno, Yoshihiko; Michimae, Hirofumi

2015-07-01

Doubly truncated data consist of samples whose observed values fall between the right- and left-truncation limits. With such samples, the distribution function of interest is estimated using the nonparametric maximum likelihood estimator (NPMLE) that is obtained through a self-consistency algorithm. Owing to the complicated asymptotic distribution of the NPMLE, the bootstrap method has been suggested for statistical inference. This paper proposes a closed-form estimator for the asymptotic covariance function of the NPMLE, which is a computationally attractive alternative to bootstrapping. Furthermore, we develop various statistical inference procedures, such as confidence intervals, goodness-of-fit tests, and confidence bands, to demonstrate the usefulness of the proposed covariance estimator. Simulations are performed to compare the proposed method with both the bootstrap and jackknife methods. The methods are illustrated using the childhood cancer dataset.
An Example of an Improvable Rao-Blackwell Improvement, Inefficient Maximum Likelihood Estimator, and Unbiased Generalized Bayes Estimator.

PubMed

Galili, Tal; Meilijson, Isaac

2016-01-02

The Rao-Blackwell theorem offers a procedure for converting a crude unbiased estimator of a parameter θ into a "better" one, in fact unique and optimal if the improvement is based on a minimal sufficient statistic that is complete. In contrast, behind every minimal sufficient statistic that is not complete, there is an improvable Rao-Blackwell improvement. This is illustrated via a simple example based on the uniform distribution, in which a rather natural Rao-Blackwell improvement is uniformly improvable. Furthermore, in this example the maximum likelihood estimator is inefficient, and an unbiased generalized Bayes estimator performs exceptionally well. Counterexamples of this sort can be useful didactic tools for explaining the true nature of a methodology and possible consequences when some of the assumptions are violated. [Received December 2014. Revised September 2015.].
Epidemiologic programs for computers and calculators. A microcomputer program for multiple logistic regression by unconditional and conditional maximum likelihood methods.

PubMed

Campos-Filho, N; Franco, E L

1989-02-01

A frequent procedure in matched case-control studies is to report results from the multivariate unmatched analyses if they do not differ substantially from the ones obtained after conditioning on the matching variables. Although conceptually simple, this rule requires that an extensive series of logistic regression models be evaluated by both the conditional and unconditional maximum likelihood methods. Most computer programs for logistic regression employ only one maximum likelihood method, which requires that the analyses be performed in separate steps. This paper describes a Pascal microcomputer (IBM PC) program that performs multiple logistic regression by both maximum likelihood estimation methods, which obviates the need for switching between programs to obtain relative risk estimates from both matched and unmatched analyses. The program calculates most standard statistics and allows factoring of categorical or continuous variables by two distinct methods of contrast. A built-in, descriptive statistics option allows the user to inspect the distribution of cases and controls across categories of any given variable.
Statistical Properties of Maximum Likelihood Estimators of Power Law Spectra Information

NASA Technical Reports Server (NTRS)

Howell, L. W., Jr.

2003-01-01

A simple power law model consisting of a single spectral index, sigma(sub 1), is believed to be an adequate description of the galactic cosmic-ray (GCR) proton flux at energies below 10(exp 13) eV, with a transition at the knee energy, E(sub k), to a steeper spectral index sigma(sub 2) greater than sigma(sub 1) above E(sub k). The maximum likelihood (ML) procedure was developed for estimating the single parameter sigma(sub 1) of a simple power law energy spectrum and generalized to estimate the three spectral parameters of the broken power law energy spectrum from simulated detector responses and real cosmic-ray data. The ML estimator was investigated and shown to have three desirable properties: (P1) consistency (asymptotically unbiased), (P2) efficiency (asymptotically attains the Cramer-Rao minimum variance bound), and (P3) asymptotic normality, under a wide range of potential detector response functions. Attainment of these properties necessarily implies that the ML estimation procedure provides the best unbiased estimator possible. While simulation studies can easily determine if a given estimation procedure provides an unbiased estimate of the spectra information, and whether or not the estimator is approximately normally distributed, attainment of the Cramer-Rao bound (CRB) can only be ascertained by calculating the CRB for an assumed energy spectrum-detector response function combination, which can be quite formidable in practice. However, the effort in calculating the CRB is very worthwhile because it provides the necessary means to compare the efficiency of competing estimation techniques and, furthermore, provides a stopping rule in the search for the best unbiased estimator. Consequently, the CRB for both the simple and broken power law energy spectra are derived herein and the conditions under which they are attained in practice are investigated.
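As a rough illustration of the simple (unbroken) power law case, the sketch below assumes an idealized setting that is not taken from the report: energies are observed directly, with no detector response function, above a known lower cutoff `e_min`, and follow a density proportional to E^(-index) with index > 1. Under those assumptions the ML estimate has the closed form 1 + n / sum(ln(E_i / e_min)), and the Cramer-Rao bound on its standard deviation is (index - 1) / sqrt(n). All function names are hypothetical.

```python
import math
import random


def sample_power_law(index, e_min, n, rng):
    """Draw n energies from a pure power law by inverse-CDF sampling.

    Assumes density proportional to E**(-index) for E >= e_min, index > 1.
    """
    return [e_min * (1.0 - rng.random()) ** (-1.0 / (index - 1.0))
            for _ in range(n)]


def ml_spectral_index(energies, e_min):
    """Closed-form ML estimate of the spectral index (idealized case)."""
    n = len(energies)
    log_sum = sum(math.log(e / e_min) for e in energies)
    return 1.0 + n / log_sum


def crb_std(index, n):
    """Cramer-Rao lower bound on the standard deviation of the estimate."""
    return (index - 1.0) / math.sqrt(n)


if __name__ == "__main__":
    rng = random.Random(0)
    data = sample_power_law(2.7, 1.0, 20000, rng)
    print("ML estimate:", ml_spectral_index(data, 1.0))
    print("CRB std dev:", crb_std(2.7, len(data)))
```

Repeating the simulation with different seeds and comparing the spread of the estimates with `crb_std` is one simple way to check efficiency empirically in this idealized case.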
Markov Chain Monte Carlo Estimation of Item Parameters for the Generalized Graded Unfolding Model

ERIC Educational Resources Information Center

de la Torre, Jimmy; Stark, Stephen; Chernyshenko, Oleksandr S.

2006-01-01

The authors present a Markov Chain Monte Carlo (MCMC) parameter estimation procedure for the generalized graded unfolding model (GGUM) and compare it to the marginal maximum likelihood (MML) approach implemented in the GGUM2000 computer program, using simulated and real personality data. In the simulation study, test length, number of response…
Iterative Procedures for Exact Maximum Likelihood Estimation in the First-Order Gaussian Moving Average Model

DTIC Science & Technology

1990-11-01

...$^{-1} = Q^{-1} - \frac{1}{1 + a'Q^{-1}a}\, Q^{-1} a a' Q^{-1}$. This is a simple case of a general formula called Woodbury's formula by some authors; see, for example, Phadke and... 2. The First-Order Moving Average Model ... 3. Some Approaches to the Iterative...the approximate likelihood function in some time series models. Useful suggestions have been the Cholesky decomposition of the covariance matrix and
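The fragment above appears to quote the rank-one special case of Woodbury's formula, (Q + aa')^{-1} = Q^{-1} - Q^{-1} a a' Q^{-1} / (1 + a' Q^{-1} a); that reading is an assumption, since the left-hand side is cut off in the record. A minimal sketch of the identity, with hypothetical function names and a small made-up example, might look like this:

```python
def sherman_morrison_inverse(q_inv, a):
    """Return (Q + a a')^{-1} given Q^{-1} and the vector a.

    Uses the rank-one update
    Q^{-1} - Q^{-1} a a' Q^{-1} / (1 + a' Q^{-1} a).
    """
    n = len(a)
    # u = Q^{-1} a (column vector)
    u = [sum(q_inv[i][j] * a[j] for j in range(n)) for i in range(n)]
    # v = a' Q^{-1} (row vector)
    v = [sum(a[i] * q_inv[i][j] for i in range(n)) for j in range(n)]
    denom = 1.0 + sum(a[i] * u[i] for i in range(n))
    return [[q_inv[i][j] - u[i] * v[j] / denom for j in range(n)]
            for i in range(n)]


def matmul(x, y):
    """Plain-Python matrix product, used only to check the result."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]


if __name__ == "__main__":
    # Illustrative values only: Q = diag(2, 4), a = (1, 2).
    q_inv = [[0.5, 0.0], [0.0, 0.25]]
    a = [1.0, 2.0]
    updated_inv = sherman_morrison_inverse(q_inv, a)
    q_plus_aat = [[3.0, 2.0], [2.0, 8.0]]
    # The product should be close to the identity matrix.
    print(matmul(q_plus_aat, updated_inv))
```

The update costs on the order of n^2 operations once Q^{-1} is known, which is why identities of this kind are attractive inside iterative likelihood computations.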
AN ASSESSMENT OF THE FATE OF METAL OXIDE NANOMATERIALS IN POROUS MEDIA

EPA Science Inventory

Developing procedures for assessing the potential environmental fate and transport of nanomaterials is an active endeavor of the environmental technical research community. Insufficient information exists for estimating the likelihood of nanomaterial deposition on natural surface...
Estimating cellular parameters through optimization procedures: elementary principles and applications.

PubMed

Kimura, Akatsuki; Celani, Antonio; Nagao, Hiromichi; Stasevich, Timothy; Nakamura, Kazuyuki

2015-01-01

Construction of quantitative models is a primary goal of quantitative biology, which aims to understand cellular and organismal phenomena in a quantitative manner. In this article, we introduce optimization procedures to search for parameters in a quantitative model that can reproduce experimental data. The aim of optimization is to minimize the sum of squared errors (SSE) in a prediction or to maximize likelihood. A (local) maximum of likelihood or (local) minimum of the SSE can efficiently be identified using gradient approaches. Addition of a stochastic process enables us to identify the global maximum/minimum without becoming trapped in local maxima/minima. Sampling approaches take advantage of increasing computational power to test numerous sets of parameters in order to determine the optimum set. By combining Bayesian inference with gradient or sampling approaches, we can estimate both the optimum parameters and the form of the likelihood function related to the parameters. Finally, we introduce four examples of research that utilize parameter optimization to obtain biological insights from quantified data: transcriptional regulation, bacterial chemotaxis, morphogenesis, and cell cycle regulation. With practical knowledge of parameter optimization, cell and developmental biologists can develop realistic models that reproduce their observations and thus, obtain mechanistic insights into phenomena of interest.
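To make the gradient approach concrete, here is a minimal sketch that fits a one-parameter model y = k * x by gradient descent on the SSE. The model, data, learning rate, and function names are all illustrative assumptions, not taken from the article. For independent Gaussian errors with constant variance, minimizing the SSE is equivalent to maximizing the likelihood.

```python
def sse(k, xs, ys):
    """Sum of squared errors for the model y = k * x."""
    return sum((y - k * x) ** 2 for x, y in zip(xs, ys))


def fit_slope_by_gradient_descent(xs, ys, k0=0.0, learning_rate=0.01, steps=200):
    """Minimize the SSE over k with plain gradient descent.

    d(SSE)/dk = -2 * sum(x * (y - k * x)).
    """
    k = k0
    for _ in range(steps):
        gradient = -2.0 * sum(x * (y - k * x) for x, y in zip(xs, ys))
        k -= learning_rate * gradient
    return k


if __name__ == "__main__":
    xs = [1.0, 2.0, 3.0]
    ys = [2.0, 4.0, 6.0]
    k_hat = fit_slope_by_gradient_descent(xs, ys)
    print("estimated slope:", k_hat, "SSE:", sse(k_hat, xs, ys))
```

This SSE surface has a single minimum, so gradient descent suffices; the stochastic and sampling approaches mentioned in the abstract matter when the surface has several local minima.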
Bayesian estimation of the transmissivity spatial structure from pumping test data

NASA Astrophysics Data System (ADS)

Demir, Mehmet Taner; Copty, Nadim K.; Trinchero, Paolo; Sanchez-Vila, Xavier

2017-06-01

Estimating the statistical parameters (mean, variance, and integral scale) that define the spatial structure of the transmissivity or hydraulic conductivity fields is a fundamental step for the accurate prediction of subsurface flow and contaminant transport. In practice, the determination of the spatial structure is a challenge because of spatial heterogeneity and data scarcity. In this paper, we describe a novel approach that uses time-drawdown data from multiple pumping tests to determine the transmissivity statistical spatial structure. The method builds on the pumping test interpretation procedure of Copty et al. (2011) (Continuous Derivation method, CD), which uses the time-drawdown data and its time derivative to estimate apparent transmissivity values as a function of radial distance from the pumping well. A Bayesian approach is then used to infer the statistical parameters of the transmissivity field by combining prior information about the parameters and the likelihood function expressed in terms of radially-dependent apparent transmissivities determined from pumping tests. A major advantage of the proposed Bayesian approach is that the likelihood function is readily determined from randomly generated multiple realizations of the transmissivity field, without the need to solve the groundwater flow equation. Applying the method to synthetically-generated pumping test data, we demonstrate that, through a relatively simple procedure, information on the spatial structure of the transmissivity may be inferred from pumping test data. It is also shown that the prior parameter distribution has a significant influence on the estimates, given the non-uniqueness of the estimation procedure. Results also indicate that the reliability of the estimated transmissivity statistical parameters increases with the number of available pumping tests.
Factors associated with adverse clinical outcomes among obstetric trainees

PubMed Central

Aiken, Catherine E.; Aiken, Abigail; Park, Hannah; Brockelsby, Jeremy C.; Prentice, Andrew

2016-01-01

Objective To determine whether UK obstetric trainees transitioning from directly to indirectly-supervised practice have a higher likelihood of adverse patient outcomes from operative deliveries compared to other indirectly supervised trainees and to examine whether performing more procedures under direct supervision is associated with fewer adverse outcomes in initial indirect practice. Methods We examined all deliveries (13,861) conducted by obstetricians at a single centre over 5 years (2008-2013). Mixed-effects logistic regression models were used to compare estimated blood loss, maternal trauma, umbilical arterial pH, delayed neonatal respiration, failed instrumental delivery, and critical incidents for trainees in their first indirectly-supervised year with trainees in all other years of indirect practice. Outcomes for trainees in their first indirectly-supervised 3 months were compared to their outcomes for the remainder of the year. Linear regression was used to examine the relationship between number of procedures performed under direct supervision and initial outcomes under indirect supervision. Results Trainees in their first indirectly-supervised year had a higher likelihood of >2 litres estimated blood loss at any delivery (OR 1.32;CI(1.01-1.64) p<0.05) and of failed instrumental delivery (OR 2.33;CI(1.37-3.29) p<0.05) compared with other indirectly-supervised trainees. Other measured outcomes showed no significant differences. Within the first three months of indirect supervision, the likelihood of operative vaginal deliveries with >1 litre estimated blood loss (OR 2.54;CI(1.88-3.20) p<0.05) was higher compared to the remainder of the first year. Performing more deliveries under direct supervision prior to beginning indirectly-supervised training was associated with decreased risk of >1 litre estimated blood loss (p<0.05).
Conclusions Obstetric trainees in their first year of indirectly-supervised practice have a higher likelihood of immediate adverse delivery outcomes, which are primarily maternal rather than neonatal. Undertaking more directly supervised procedures prior to transitioning to indirectly-supervised practice may reduce adverse outcomes, suggesting that experience is a key consideration in obstetric training programme design. PMID:26077215
Empirical Bayes Approaches to Multivariate Fuzzy Partitions.

ERIC Educational Resources Information Center

Woodbury, Max A.; Manton, Kenneth G.

1991-01-01

An empirical Bayes-maximum likelihood estimation procedure is presented for the application of fuzzy partition models in describing high dimensional discrete response data. The model describes individuals in terms of partial membership in multiple latent categories that represent bounded discrete spaces. (SLD)

Spatial scan statistics for detection of multiple clusters with arbitrary shapes.

PubMed

Lin, Pei-Sheng; Kung, Yi-Hung; Clayton, Murray

2016-12-01

In applying scan statistics for public health research, it would be valuable to develop a detection method for multiple clusters that accommodates spatial correlation and covariate effects in an integrated model. In this article, we connect the concepts of the likelihood ratio (LR) scan statistic and the quasi-likelihood (QL) scan statistic to provide a series of detection procedures sufficiently flexible to apply to clusters of arbitrary shape. First, we use an independent scan model for detection of clusters and then a variogram tool to examine the existence of spatial correlation and regional variation based on residuals of the independent scan model. When the estimate of regional variation is significantly different from zero, a mixed QL estimating equation is developed to estimate coefficients of geographic clusters and covariates. We use the Benjamini-Hochberg procedure (1995) to find a threshold for p-values to address the multiple testing problem. A quasi-deviance criterion is used to regroup the estimated clusters to find geographic clusters with arbitrary shapes. We conduct simulations to compare the performance of the proposed method with other scan statistics. For illustration, the method is applied to enterovirus data from Taiwan. © 2016, The International Biometric Society.
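The Benjamini-Hochberg step mentioned in this abstract can be sketched on its own. The function below implements the standard step-up rule: sort the m p-values, find the largest rank k with p_(k) <= k * q / m, and reject the k smallest. It says nothing about how the authors combine the rule with their scan statistics; the function name, the level `q`, and the example p-values are illustrative assumptions.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Standard Benjamini-Hochberg step-up procedure.

    Returns (threshold, rejected), where threshold is the largest
    rejected p-value (None if nothing is rejected) and rejected is
    the sorted list of indices of rejected hypotheses.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k = 0
    for rank, idx in enumerate(order, start=1):
        # Keep the largest rank whose p-value is under its BH line.
        if p_values[idx] <= rank * q / m:
            k = rank
    if k == 0:
        return None, []
    threshold = p_values[order[k - 1]]
    return threshold, sorted(order[:k])


if __name__ == "__main__":
    # Made-up p-values for ten candidate clusters.
    p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
    print(benjamini_hochberg(p, q=0.05))
```

Because the rule is step-up, a p-value above its own line can still be rejected if a larger p-value falls under a later line, which is why the loop records the last rank that passes rather than stopping at the first failure.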
Mixed model approaches for diallel analysis based on a bio-model.

PubMed

Zhu, J; Weir, B S

1996-12-01

A MINQUE(1) procedure, which is the minimum norm quadratic unbiased estimation (MINQUE) method with 1 for all the prior values, is suggested for estimating variance and covariance components in a bio-model for diallel crosses. Unbiasedness and efficiency of estimation were compared for MINQUE(1), restricted maximum likelihood (REML), and MINQUE theta, which uses the parameter values as the prior values. MINQUE(1) is almost as efficient as MINQUE theta for unbiased estimation of genetic variance and covariance components. The bio-model is efficient and robust for estimating variance and covariance components for maternal and paternal effects as well as for nuclear effects. A procedure of adjusted unbiased prediction (AUP) is proposed for predicting random genetic effects in the bio-model. The jack-knife procedure is suggested for estimation of sampling variances of estimated variance and covariance components and of predicted genetic effects. Worked examples are given for estimation of variance and covariance components and for prediction of genetic merits.
Incorporating partially identified sample segments into acreage estimation procedures: Estimates using only observations from the current year

NASA Technical Reports Server (NTRS)

Sielken, R. L., Jr. (Principal Investigator)

1981-01-01

Several methods of estimating individual crop acreages using a mixture of completely identified and partially identified (generic) segments from a single growing year are derived and discussed. A small Monte Carlo study of eight estimators is presented. The relative empirical behavior of these estimators is discussed, as are the effects of segment sample size and amount of partial identification. The principal recommendations are (1) to incorporate, rather than exclude, partially identified sample segments in the estimation procedure, (2) to avoid having a large percentage (say 80%) of only partially identified segments in the sample, and (3) to use the maximum likelihood estimator, although the weighted least squares estimator and least squares ratio estimator both perform almost as well. Sets of spring small grains (North Dakota) data were used.
Stochastic control system parameter identifiability

NASA Technical Reports Server (NTRS)

Lee, C. H.; Herget, C. J.

1975-01-01

The parameter identification problem of general discrete time, nonlinear, multiple input/multiple output dynamic systems with Gaussian white distributed measurement errors is considered. The system parameterization was assumed to be known. Concepts of local parameter identifiability and local constrained maximum likelihood parameter identifiability were established. A set of sufficient conditions for the existence of a region of parameter identifiability was derived. A computation procedure employing interval arithmetic was provided for finding the regions of parameter identifiability. If the vector of the true parameters is locally constrained maximum likelihood (CML) identifiable, then with probability one, the vector of true parameters is a unique maximal point of the maximum likelihood function in the region of parameter identifiability and the constrained maximum likelihood estimation sequence will converge to the vector of true parameters.
Methods of extending crop signatures from one area to another

NASA Technical Reports Server (NTRS)

Minter, T. C. (Principal Investigator)

1979-01-01

Efforts to develop a technology for signature extension during LACIE phases 1 and 2 are described. A number of haze and Sun angle correction procedures were developed and tested. These included the ROOSTER and OSCAR cluster-matching algorithms and their modifications, the MLEST and UHMLE maximum likelihood estimation procedures, and the ATCOR procedure. All these algorithms were tested on simulated data and consecutive-day LANDSAT imagery. The ATCOR, OSCAR, and MLEST algorithms were also tested for their capability to geographically extend signatures using LANDSAT imagery.
Multilevel modeling of single-case data: A comparison of maximum likelihood and Bayesian estimation.

PubMed

Moeyaert, Mariola; Rindskopf, David; Onghena, Patrick; Van den Noortgate, Wim

2017-12-01

The focus of this article is to describe Bayesian estimation, including construction of prior distributions, and to compare parameter recovery under the Bayesian framework (using weakly informative priors) and the maximum likelihood (ML) framework in the context of multilevel modeling of single-case experimental data. Bayesian estimation results were found similar to ML estimation results in terms of the treatment effect estimates, regardless of the functional form and degree of information included in the prior specification in the Bayesian framework. In terms of the variance component estimates, both the ML and Bayesian estimation procedures result in biased and less precise variance estimates when the number of participants is small (i.e., 3). By increasing the number of participants to 5 or 7, the relative bias is close to 5% and more precise estimates are obtained for all approaches, except for the inverse-Wishart prior using the identity matrix. When a more informative prior was added, more precise estimates for the fixed effects and random effects were obtained, even when only 3 participants were included. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
Evaluation of a mark-recapture method for estimating mortality and migration rates of stratified populations

USGS Publications Warehouse

Dorazio, R.M.; Rago, P.J.

1991-01-01

We simulated mark–recapture experiments to evaluate a method for estimating fishing mortality and migration rates of populations stratified at release and recovery. When fish released in two or more strata were recovered from different recapture strata in nearly the same proportions, conditional recapture probabilities were estimated outside the [0, 1] interval. The maximum likelihood estimates tended to be biased and imprecise when the patterns of recaptures produced extremely "flat" likelihood surfaces. Absence of bias was not guaranteed, however, in experiments where recapture rates could be estimated within the [0, 1] interval. Inadequate numbers of tag releases and recoveries also produced biased estimates, although the bias was easily detected by the high sampling variability of the estimates. A stratified tag–recapture experiment with sockeye salmon (Oncorhynchus nerka) was used to demonstrate procedures for analyzing data that produce biased estimates of recapture probabilities. An estimator was derived to examine the sensitivity of recapture rate estimates to assumed differences in natural and tagging mortality, tag loss, and incomplete reporting of tag recoveries.
Maximum likelihood estimation for semiparametric transformation models with interval-censored data

PubMed Central

Mao, Lu; Lin, D. Y.

2016-01-01

Interval censoring arises frequently in clinical, epidemiological, financial and sociological studies, where the event or failure of interest is known only to occur within an interval induced by periodic monitoring. We formulate the effects of potentially time-dependent covariates on the interval-censored failure time through a broad class of semiparametric transformation models that encompasses proportional hazards and proportional odds models. We consider nonparametric maximum likelihood estimation for this class of models with an arbitrary number of monitoring times for each subject. We devise an EM-type algorithm that converges stably, even in the presence of time-dependent covariates, and show that the estimators for the regression parameters are consistent, asymptotically normal, and asymptotically efficient with an easily estimated covariance matrix. Finally, we demonstrate the performance of our procedures through simulation studies and application to an HIV/AIDS study conducted in Thailand. PMID:27279656
A comparison of minimum distance and maximum likelihood techniques for proportion estimation

NASA Technical Reports Server (NTRS)

Woodward, W. A.; Schucany, W. R.; Lindsey, H.; Gray, H. L.

1982-01-01

The estimation of mixing proportions $p_1, p_2, \ldots, p_m$ in the mixture density $f(x) = \sum_{i=1}^{m} p_i f_i(x)$ is often encountered in agricultural remote sensing problems, in which case the $p_i$'s usually represent crop proportions. In these remote sensing applications, component densities $f_i(x)$ have typically been assumed to be normal, and parameter estimation has been accomplished using maximum likelihood (ML) techniques. Minimum distance (MD) estimation is examined as an alternative to ML where, in this investigation, both procedures are based upon normal components. Results indicate that ML techniques are superior to MD when component distributions actually are normal, while MD estimation provides better estimates than ML under symmetric departures from normality. When component distributions are not symmetric, however, it is seen that neither of these normal based techniques provides satisfactory results.
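As a rough illustration of ML estimation of mixing proportions, the sketch below runs EM-style fixed-point iterations on simulated data. It assumes the normal component means and standard deviations are known and only the proportions are unknown, which the abstract does not state; the function names, the simulated data, and the tolerance are all hypothetical choices and not the authors' procedure.

```python
import numpy as np

def normal_pdf(x, mean, sd):
    """Density of a normal distribution evaluated at x."""
    return np.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))

def ml_mixing_proportions(x, means, sds, n_iter=500, tol=1e-10):
    """EM iterations for the mixing proportions with known normal components."""
    x = np.asarray(x, dtype=float)
    # dens[j, i] = f_i(x_j), fixed because the components are assumed known
    dens = np.column_stack([normal_pdf(x, m, s) for m, s in zip(means, sds)])
    p = np.full(dens.shape[1], 1.0 / dens.shape[1])  # start from equal proportions
    for _ in range(n_iter):
        weighted = dens * p
        resp = weighted / weighted.sum(axis=1, keepdims=True)  # posterior memberships
        p_new = resp.mean(axis=0)                               # updated proportions
        converged = np.max(np.abs(p_new - p)) < tol
        p = p_new
        if converged:
            break
    return p

# Simulated "crop proportion" data (illustrative values only)
rng = np.random.default_rng(0)
true_p = np.array([0.6, 0.3, 0.1])
means = np.array([0.0, 4.0, 8.0])
sds = np.array([1.0, 1.0, 1.0])
labels = rng.choice(3, size=5000, p=true_p)
x = rng.normal(means[labels], sds[labels])

p_hat = ml_mixing_proportions(x, means, sds)
print(p_hat)
```

Each iteration keeps the proportions non-negative and summing to one, so no explicit constraint handling is needed in this simplified setting.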
A Statistical Test for Comparing Nonnested Covariance Structure Models.

ERIC Educational Resources Information Center

Levy, Roy; Hancock, Gregory R.

While statistical procedures are well known for comparing hierarchically related (nested) covariance structure models, statistical tests for comparing nonhierarchically related (nonnested) models have proven more elusive. While isolated attempts have been made, none exists within the commonly used maximum likelihood estimation framework, thereby…
Exploiting Non-sequence Data in Dynamic Model Learning

DTIC Science & Technology

2013-10-01

For our experiments here and in Section 3.5, we implement the proposed algorithms in MATLAB and use the maximum directed spanning tree solver...embarrassingly parallelizable, whereas PM’s maximum directed spanning tree procedure is harder to parallelize. In this experiment, our MATLAB...some estimation problems, this approach is able to give unique and consistent estimates while the maximum-likelihood method gets entangled in
Absolute magnitude calibration using trigonometric parallax - Incomplete, spectroscopic samples

NASA Technical Reports Server (NTRS)

Ratnatunga, Kavan U.; Casertano, Stefano

1991-01-01

A new numerical algorithm is used to calibrate the absolute magnitude of spectroscopically selected stars from their observed trigonometric parallax. This procedure, based on maximum-likelihood estimation, can retrieve unbiased estimates of the intrinsic absolute magnitude and its dispersion even from incomplete samples suffering from selection biases in apparent magnitude and color. It can also make full use of low accuracy and negative parallaxes and incorporate censorship on reported parallax values. Accurate error estimates are derived for each of the fitted parameters. The algorithm allows an a posteriori check of whether the fitted model gives a good representation of the observations. The procedure is described in general and applied to both real and simulated data.
Maximum likelihood estimation in calibrating a stereo camera setup.

PubMed

Muijtjens, A M; Roos, J M; Arts, T; Hasman, A

1999-02-01

Motion and deformation of the cardiac wall may be measured by following the positions of implanted radiopaque markers in three dimensions, using two x-ray cameras simultaneously. Regularly, calibration of the position measurement system is obtained by registration of the images of a calibration object, containing 10-20 radiopaque markers at known positions. Unfortunately, an accidental change of the position of a camera after calibration requires complete recalibration. Alternatively, redundant information in the measured image positions of stereo pairs can be used for calibration. Thus, a separate calibration procedure can be avoided. In the current study a model is developed that describes the geometry of the camera setup by five dimensionless parameters. Maximum Likelihood (ML) estimates of these parameters were obtained in an error analysis. It is shown that the ML estimates can be found by application of a nonlinear least squares procedure. Compared to the standard unweighted least squares procedure, the ML method resulted in more accurate estimates without noticeable bias. The accuracy of the ML method was investigated in relation to the object aperture. The reconstruction problem appeared well conditioned as long as the object aperture is larger than 0.1 rad. The angle between the two viewing directions appeared to be the parameter that was most likely to cause major inaccuracies in the reconstruction of the 3-D positions of the markers. Hence, attempts to improve the robustness of the method should primarily focus on reduction of the error in this parameter.
Maximum likelihood techniques applied to quasi-elastic light scattering

NASA Technical Reports Server (NTRS)

Edwards, Robert V.

1992-01-01

An automatic procedure is needed for reliably estimating the quality of particle size measurements from QELS (Quasi-Elastic Light Scattering). Getting the measurement itself, before any error estimates can be made, is a problem because it is obtained by a very indirect measurement of a signal derived from the motion of particles in the system and requires the solution of an inverse problem. The eigenvalue structure of the transform that generates the signal is such that an arbitrarily small amount of noise can obliterate parts of any practical inversion spectrum. This project uses Maximum Likelihood Estimation (MLE) as a framework to generate a theory and a functioning set of software to oversee the measurement process and extract the particle size information, while at the same time providing error estimates for those measurements. The theory involved verifying a correct form of the covariance matrix for the noise on the measurement and then estimating particle size parameters using a modified histogram approach.
Constrained inference in mixed-effects models for longitudinal data with application to hearing loss.

PubMed

Davidov, Ori; Rosen, Sophia

2011-04-01

In medical studies, endpoints are often measured for each patient longitudinally. The mixed-effects model has been a useful tool for the analysis of such data. There are situations in which the parameters of the model are subject to some restrictions or constraints. For example, in hearing loss studies, we expect hearing to deteriorate with time. This means that hearing thresholds which reflect hearing acuity will, on average, increase over time. Therefore, the regression coefficients associated with the mean effect of time on hearing ability will be constrained. Such constraints should be accounted for in the analysis. We propose maximum likelihood estimation procedures, based on the expectation-conditional maximization either algorithm, to estimate the parameters of the model while accounting for the constraints on them. The proposed methods improve, in terms of mean square error, on the unconstrained estimators. In some settings, the improvement may be substantial. Hypotheses testing procedures that incorporate the constraints are developed. Specifically, likelihood ratio, Wald, and score tests are proposed and investigated. Their empirical significance levels and power are studied using simulations. It is shown that incorporating the constraints improves the mean squared error of the estimates and the power of the tests. These improvements may be substantial. The methodology is used to analyze a hearing loss study.
A Comparison of Missing-Data Procedures for Arima Time-Series Analysis

ERIC Educational Resources Information Center

Velicer, Wayne F.; Colby, Suzanne M.

2005-01-01

Missing data are a common practical problem for longitudinal designs. Time-series analysis is a longitudinal method that involves a large number of observations on a single unit. Four different missing-data methods (deletion, mean substitution, mean of adjacent observations, and maximum likelihood estimation) were evaluated. Computer-generated…
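Three of the four missing-data methods named above are simple enough to sketch directly (the maximum likelihood approach is not shown). This is only an illustration under stated assumptions: missing values are coded as NaN, and the adjacent-mean rule assumes each missing value is isolated and not at either end of the series. The function names and the toy series are hypothetical.

```python
import numpy as np

def delete_missing(series):
    """Deletion: drop the missing observations and close up the series."""
    return series[~np.isnan(series)]

def mean_substitution(series):
    """Replace every missing value with the mean of the observed values."""
    filled = series.copy()
    filled[np.isnan(filled)] = np.nanmean(series)
    return filled

def adjacent_mean(series):
    """Replace each isolated, interior missing value with the mean of its neighbours."""
    filled = series.copy()
    for i in np.flatnonzero(np.isnan(series)):
        filled[i] = 0.5 * (series[i - 1] + series[i + 1])
    return filled

series = np.array([1.0, 2.0, np.nan, 4.0, 5.0, np.nan, 7.0])
print(delete_missing(series))
print(mean_substitution(series))
print(adjacent_mean(series))
```

Note that deletion shortens the series and so changes the spacing between observations, which matters for a time-series model in a way it would not for independent data.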
A Note on the Computation of the Second-Order Derivatives of the Elementary Symmetric Functions in the Rasch Model.

ERIC Educational Resources Information Center

Formann, Anton K.

1986-01-01

It is shown that for equal parameters explicit formulas exist, facilitating the application of the Newton-Raphson procedure to estimate the parameters in the Rasch model and related models according to the conditional maximum likelihood principle. (Author/LMO)
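The abstract does not reproduce the explicit formulas for the derivatives. As a related illustration only, the sketch below shows the underlying fact for the elementary symmetric functions themselves: when all n item parameters equal a common value e, the order-r function reduces to C(n, r) e^r. The function name and the use of the standard summation recursion are assumptions made for this example, not the paper's derivation.

```python
from math import comb

def elementary_symmetric(eps):
    """Return [gamma_0, ..., gamma_n], where gamma_r sums the products of all r-subsets of eps."""
    gamma = [1.0] + [0.0] * len(eps)
    for k, e in enumerate(eps, start=1):
        # Update from high order to low so each item is used at most once per product
        for r in range(k, 0, -1):
            gamma[r] += e * gamma[r - 1]
    return gamma

n, e = 6, 0.8
recursive = elementary_symmetric([e] * n)
closed_form = [comb(n, r) * e ** r for r in range(n + 1)]
print(recursive)
print(closed_form)
```

With unequal parameters only the recursion applies; the closed form is specific to the equal-parameter case discussed in the abstract.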
Local Influence and Robust Procedures for Mediation Analysis

ERIC Educational Resources Information Center

Zu, Jiyun; Yuan, Ke-Hai

2010-01-01

Existing studies of mediation models have been limited to normal-theory maximum likelihood (ML). Because real data in the social and behavioral sciences are seldom normally distributed and often contain outliers, classical methods generally lead to inefficient or biased parameter estimates. Consequently, the conclusions from a mediation analysis…
LS-APC v1.0: a tuning-free method for the linear inverse problem and its application to source-term determination

NASA Astrophysics Data System (ADS)

Tichý, Ondřej; Šmídl, Václav; Hofman, Radek; Stohl, Andreas

2016-11-01

Estimation of pollutant releases into the atmosphere is an important problem in the environmental sciences. It is typically formalized as an inverse problem using a linear model that can explain observable quantities (e.g., concentrations or deposition values) as a product of the source-receptor sensitivity (SRS) matrix obtained from an atmospheric transport model multiplied by the unknown source-term vector. Since this problem is typically ill-posed, current state-of-the-art methods are based on regularization of the problem and solution of a formulated optimization problem. This procedure depends on manual settings of uncertainties that are often very poorly quantified, effectively making them tuning parameters. We formulate a probabilistic model that has the same maximum likelihood solution as the conventional method using pre-specified uncertainties. Replacement of the maximum likelihood solution by full Bayesian estimation also allows estimation of all tuning parameters from the measurements. The estimation procedure is based on the variational Bayes approximation, which is evaluated by an iterative algorithm. The resulting method is thus very similar to the conventional approach, but with the possibility to also estimate all tuning parameters from the observations. The proposed algorithm is tested and compared with the standard methods on data from the European Tracer Experiment (ETEX), where advantages of the new method are demonstrated. A MATLAB implementation of the proposed algorithm is available for download.
Estimating Model Probabilities using Thermodynamic Markov Chain Monte Carlo Methods

NASA Astrophysics Data System (ADS)

Ye, M.; Liu, P.; Beerli, P.; Lu, D.; Hill, M. C.

2014-12-01

Markov chain Monte Carlo (MCMC) methods are widely used to evaluate model probability for quantifying model uncertainty. In a general procedure, MCMC simulations are first conducted for each individual model, and the MCMC parameter samples are then used to approximate the marginal likelihood of the model by calculating the geometric mean of the joint likelihood of the model and its parameters. The geometric-mean method has been found to suffer from a low convergence rate. A simple test case shows that even millions of MCMC samples are insufficient to yield an accurate estimate of the marginal likelihood. To resolve this problem, a thermodynamic method is used that performs multiple MCMC runs with different values of a heating coefficient between zero and one. When the heating coefficient is zero, the MCMC run is equivalent to a random walk MC in the prior parameter space; when the heating coefficient is one, the MCMC run is the conventional one. For a simple case with an analytical form of the marginal likelihood, the thermodynamic method yields a more accurate estimate than the geometric-mean method. This is also demonstrated for a case of groundwater modeling that considers four alternative models postulated based on different conceptualizations of a confining layer. This groundwater example shows that model probabilities estimated using the thermodynamic method are more reasonable than those obtained using the geometric-mean method. The thermodynamic method is general and can be used for a wide range of environmental problems for model uncertainty quantification.
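The thermodynamic idea can be sketched on a toy problem with a known answer. The example below is an assumption-laden illustration, not the authors' implementation: it uses a conjugate normal model (unit-variance data, zero-mean normal prior with variance `tau2`), for which the tempered ("power") posterior at each heating coefficient is itself normal, so exact draws stand in for the MCMC runs. The log marginal likelihood is then the integral over the heating coefficient of the expected log-likelihood, approximated with the trapezoid rule on a grid clustered near zero. All names and settings are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n, tau2 = 20, 1.0                      # sample size and prior variance (illustrative)
y = rng.normal(0.5, 1.0, size=n)       # simulated data, y_i ~ N(theta, 1)

def log_lik(theta):
    """Joint log-likelihood of y for each parameter value in the array theta."""
    resid = y[None, :] - theta[:, None]
    return -0.5 * n * np.log(2.0 * np.pi) - 0.5 * np.sum(resid ** 2, axis=1)

def sample_power_posterior(beta, size):
    """Exact draws from prior * likelihood**beta (normal here, replacing an MCMC run)."""
    prec = 1.0 / tau2 + beta * n
    mean = beta * y.sum() / prec
    return rng.normal(mean, np.sqrt(1.0 / prec), size=size)

# Heating coefficients between zero (prior) and one (posterior), denser near zero
betas = np.linspace(0.0, 1.0, 40) ** 5
expected = np.array([log_lik(sample_power_posterior(b, 4000)).mean() for b in betas])

# Trapezoid rule for the integral of E_beta[log-likelihood] over beta
log_z_ti = np.sum(0.5 * (expected[1:] + expected[:-1]) * np.diff(betas))

# Analytical log marginal likelihood: y ~ N(0, I + tau2 * 1 1^T)
s = y.sum()
log_z_exact = (-0.5 * n * np.log(2.0 * np.pi)
               - 0.5 * np.log(1.0 + n * tau2)
               - 0.5 * (np.sum(y ** 2) - tau2 * s ** 2 / (1.0 + n * tau2)))
print(log_z_ti, log_z_exact)
```

In a realistic model the draws at each heating coefficient would come from separate MCMC chains, and the number and spacing of the coefficients becomes a tuning choice.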

Asymptotically optimum multialternative sequential procedures for discernment of processes minimizing average length of observations

NASA Astrophysics Data System (ADS)

Fishman, M. M.

1985-01-01

The problem of multialternative sequential discernment of processes is formulated in terms of conditionally optimum procedures minimizing the average length of observations, without any probabilistic assumptions about any one occurring process, rather than in terms of Bayes procedures minimizing the average risk. The problem is to find the procedure that will transform inequalities into equalities. The problem is formulated for various models of signal observation and data processing: (1) discernment of signals from background interference by a multichannel system; (2) discernment of pulse sequences with unknown time delay; (3) discernment of harmonic signals with unknown frequency. An asymptotically optimum sequential procedure is constructed which compares the statistics of the likelihood ratio with the mean-weighted likelihood ratio and estimates the upper bound for conditional average lengths of observations. This procedure is shown to remain valid as the upper bound for the probability of erroneous partial solutions decreases approaching zero and the number of hypotheses increases approaching infinity. It also remains valid under certain special constraints on the probability such as a threshold. A comparison with a fixed-length procedure reveals that this sequential procedure decreases the length of observations to one quarter, on the average, when the probability of erroneous partial solutions is low.
An empirical Bayes approach for the Poisson life distribution.

NASA Technical Reports Server (NTRS)

Canavos, G. C.

1973-01-01

A smooth empirical Bayes estimator is derived for the intensity parameter (hazard rate) in the Poisson distribution as used in life testing. The reliability function is also estimated either by using the empirical Bayes estimate of the parameter, or by obtaining the expectation of the reliability function. The behavior of the empirical Bayes procedure is studied through Monte Carlo simulation in which estimates of mean-squared errors of the empirical Bayes estimators are compared with those of conventional estimators such as minimum variance unbiased or maximum likelihood. Results indicate a significant reduction in mean-squared error of the empirical Bayes estimators over the conventional variety.
Deterministic annealing for density estimation by multivariate normal mixtures

NASA Astrophysics Data System (ADS)

Kloppenburg, Martin; Tavan, Paul

1997-03-01

An approach to maximum-likelihood density estimation by mixtures of multivariate normal distributions for large high-dimensional data sets is presented. Conventionally that problem is tackled by notoriously unstable expectation-maximization (EM) algorithms. We remove these instabilities by the introduction of soft constraints, enabling deterministic annealing. Our developments are motivated by the proof that algorithmically stable fuzzy clustering methods that are derived from statistical physics analogs are special cases of EM procedures.
Generalizing the Iterative Proportional Fitting Procedure.

DTIC Science & Technology

1980-04-01

Csiszar gives conditions under which P (R) exists (it is always unique) and develops a geometry of I-divergence by using an analogue of Pythagoras' Theorem. As our goal is to study maximum likelihood estimation in contingency tables, we turn briefly to the problem of estimating a multinomial...invoke a result of Csiszar (due originally to Kullback (1959)), giving the form of the density of the I-projection. Csiszar’s Theorem 3.1, which we
Development of advanced acreage estimation methods

NASA Technical Reports Server (NTRS)

Guseman, L. F., Jr. (Principal Investigator)

1980-01-01

The use of the AMOEBA clustering/classification algorithm was investigated as a basis for both a color display generation technique and maximum likelihood proportion estimation procedure. An approach to analyzing large data reduction systems was formulated and an exploratory empirical study of spatial correlation in LANDSAT data was also carried out. Topics addressed include: (1) development of multiimage color images; (2) spectral spatial classification algorithm development; (3) spatial correlation studies; and (4) evaluation of data systems.
Stochastic capture zone analysis of an arsenic-contaminated well using the generalized likelihood uncertainty estimator (GLUE) methodology

NASA Astrophysics Data System (ADS)

Morse, Brad S.; Pohll, Greg; Huntington, Justin; Rodriguez Castillo, Ramiro

2003-06-01

In 1992, Mexican researchers discovered concentrations of arsenic in excess of World Health Organization (WHO) standards in several municipal wells in the Zimapan Valley of Mexico. This study describes a method to delineate a capture zone for one of the most highly contaminated wells to aid in future well siting. A stochastic approach was used to model the capture zone because of the high level of uncertainty in several input parameters. Two stochastic techniques were performed and compared: "standard" Monte Carlo analysis and the generalized likelihood uncertainty estimator (GLUE) methodology. The GLUE procedure differs from standard Monte Carlo analysis in that it incorporates a goodness of fit (termed a likelihood measure) in evaluating the model. This allows for more information (in this case, head data) to be used in the uncertainty analysis, resulting in smaller prediction uncertainty. Two likelihood measures are tested in this study to determine which is in better agreement with the observed heads. While the standard Monte Carlo approach does not aid in parameter estimation, the GLUE methodology indicates best fit models when hydraulic conductivity is approximately 10^-6.5 m/s, with vertically isotropic conditions and large quantities of interbasin flow entering the basin. Probabilistic isochrones (capture zone boundaries) are then presented, and as predicted, the GLUE-derived capture zones are significantly smaller in area than those from the standard Monte Carlo approach.
The Inverse Problem for Confined Aquifer Flow: Identification and Estimation With Extensions

NASA Astrophysics Data System (ADS)

Loaiciga, Hugo A.; Mariño, Miguel A.

1987-01-01

The contributions of this work are twofold. First, a methodology for estimating the elements of parameter matrices in the governing equation of flow in a confined aquifer is developed. The estimation techniques for the distributed-parameter inverse problem pertain to linear least squares and generalized least squares methods. The linear relationship among the known heads and unknown parameters of the flow equation provides the background for developing criteria for determining the identifiability status of unknown parameters. Under conditions of exact or overidentification it is possible to develop statistically consistent parameter estimators and their asymptotic distributions. The estimation techniques, namely, two-stage least squares and three stage least squares, are applied to a specific groundwater inverse problem and compared between themselves and with an ordinary least squares estimator. The three-stage estimator provides the closer approximation to the actual parameter values, but it also shows relatively large standard errors as compared to the ordinary and two-stage estimators. The estimation techniques provide the parameter matrices required to simulate the unsteady groundwater flow equation. Second, a nonlinear maximum likelihood estimation approach to the inverse problem is presented. The statistical properties of maximum likelihood estimators are derived, and a procedure to construct confidence intervals and do hypothesis testing is given. The relative merits of the linear and maximum likelihood estimators are analyzed. Other topics relevant to the identification and estimation methodologies, i.e., a continuous-time solution to the flow equation, coping with noise-corrupted head measurements, and extension of the developed theory to nonlinear cases are also discussed. A simulation study is used to evaluate the methods developed in this study.
A Variance Distribution Model of Surface EMG Signals Based on Inverse Gamma Distribution.

PubMed

Hayashi, Hideaki; Furui, Akira; Kurita, Yuichi; Tsuji, Toshio

2017-11-01

Objective: This paper describes the formulation of a surface electromyogram (EMG) model capable of representing the variance distribution of EMG signals. Methods: In the model, EMG signals are handled based on a Gaussian white noise process with a mean of zero for each variance value. EMG signal variance is taken as a random variable that follows inverse gamma distribution, allowing the representation of noise superimposed onto this variance. Variance distribution estimation based on marginal likelihood maximization is also outlined in this paper. The procedure can be approximated using rectified and smoothed EMG signals, thereby allowing the determination of distribution parameters in real time at low computational cost. Results: A simulation experiment was performed to evaluate the accuracy of distribution estimation using artificially generated EMG signals, with results demonstrating that the proposed model's accuracy is higher than that of maximum-likelihood-based estimation. Analysis of variance distribution using real EMG data also suggested a relationship between variance distribution and signal-dependent noise. Conclusion: The study reported here was conducted to examine the performance of a proposed surface EMG model capable of representing variance distribution and a related distribution parameter estimation method. Experiments using artificial and real EMG data demonstrated the validity of the model. Significance: Variance distribution estimated using the proposed model exhibits potential in the estimation of muscle force.
Statistical Properties of Maximum Likelihood Estimators of Power Law Spectra Information

NASA Technical Reports Server (NTRS)

Howell, L. W.

2002-01-01

A simple power law model consisting of a single spectral index, $\alpha_1$, is believed to be an adequate description of the galactic cosmic-ray (GCR) proton flux at energies below $10^{13}$ eV, with a transition at the knee energy, $E_k$, to a steeper spectral index $\alpha_2 > \alpha_1$ above $E_k$. The maximum likelihood (ML) procedure was developed for estimating the single parameter $\alpha_1$ of a simple power law energy spectrum and generalized to estimate the three spectral parameters of the broken power law energy spectrum from simulated detector responses and real cosmic-ray data. The statistical properties of the ML estimator were investigated and shown to have the three desirable properties: (P1) consistency (asymptotically unbiased), (P2) efficiency (asymptotically attains the Cramer-Rao minimum variance bound), and (P3) asymptotically normally distributed, under a wide range of potential detector response functions. Attainment of these properties necessarily implies that the ML estimation procedure provides the best unbiased estimator possible. While simulation studies can easily determine if a given estimation procedure provides an unbiased estimate of the spectra information, and whether or not the estimator is approximately normally distributed, attainment of the Cramer-Rao bound (CRB) can only be ascertained by calculating the CRB for an assumed energy spectrum-detector response function combination, which can be quite formidable in practice. However, the effort in calculating the CRB is very worthwhile because it provides the necessary means to compare the efficiency of competing estimation techniques and, furthermore, provides a stopping rule in the search for the best unbiased estimator. Consequently, the CRB for both the simple and broken power law energy spectra are derived herein and the conditions under which they are attained in practice are investigated.
The ML technique is then extended to estimate spectra information from an arbitrary number of astrophysics data sets produced by vastly different science instruments. This theory and its successful implementation will facilitate the interpretation of spectral information from multiple astrophysics missions and thereby permit the derivation of superior spectral parameter estimates based on the combination of data sets.
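For the simple (unbroken) power law, the ML estimate and the Cramer-Rao bound have textbook closed forms when energies are observed directly. The sketch below assumes exactly that idealized case: no detector response function, no knee, and a known lower energy limit `e_min`. These simplifications, the function name, and the simulated values are assumptions for illustration and do not reproduce the report's estimator.

```python
import numpy as np

def power_law_index_mle(energies, e_min):
    """Closed-form ML estimate of alpha for a density proportional to E**(-alpha), E >= e_min."""
    energies = np.asarray(energies, dtype=float)
    return 1.0 + energies.size / np.sum(np.log(energies / e_min))

rng = np.random.default_rng(2)
alpha_true, e_min, n = 2.7, 1.0, 100_000   # illustrative values

# Inverse-CDF sampling: the survival function is (E / e_min)**(-(alpha - 1))
u = 1.0 - rng.random(n)                    # uniform on (0, 1]
energies = e_min * u ** (-1.0 / (alpha_true - 1.0))

alpha_hat = power_law_index_mle(energies, e_min)

# Cramer-Rao bound for this idealized model: Var(alpha_hat) >= (alpha - 1)**2 / n
crb_sd = (alpha_true - 1.0) / np.sqrt(n)
print(alpha_hat, crb_sd)
```

Comparing the spread of `alpha_hat` over repeated simulations with `crb_sd` is the kind of efficiency check the abstract describes, here in a setting where the bound is easy to compute.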
Estimating the probability of rare events: addressing zero failure data.

PubMed

Quigley, John; Revie, Matthew

2011-07-01

Traditional statistical procedures for estimating the probability of an event result in an estimate of zero when no events are realized. Alternative inferential procedures have been proposed for the situation where zero events have been realized but often these are ad hoc, relying on selecting methods dependent on the data that have been realized. Such data-dependent inference decisions violate fundamental statistical principles, resulting in estimation procedures whose benefits are difficult to assess. In this article, we propose estimating the probability of an event occurring through minimax inference on the probability that future samples of equal size realize no more events than that in the data on which the inference is based. Although motivated by inference on rare events, the method is not restricted to zero event data and closely approximates the maximum likelihood estimate (MLE) for nonzero data. The use of the minimax procedure provides a risk-averse inferential procedure where there are no events realized. A comparison is made with the MLE and regions of the underlying probability are identified where this approach is superior. Moreover, a comparison is made with three standard approaches to supporting inference where no event data are realized, which we argue are unduly pessimistic. We show that for situations of zero events the estimator can be simply approximated with 1/(2.5n), where n is the number of trials. © 2011 Society for Risk Analysis.
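A small worked comparison makes the zero-event case concrete. The sketch below only evaluates the MLE (events divided by trials) and the 1/(2.5n) approximation quoted in the abstract; it does not implement the minimax procedure itself, and the function names are hypothetical.

```python
def mle_probability(events, trials):
    """Maximum likelihood estimate of an event probability: observed proportion."""
    return events / trials

def zero_event_approximation(trials):
    """The 1/(2.5 n) approximation quoted in the abstract for the zero-event case."""
    return 1.0 / (2.5 * trials)

# With no events observed, the MLE is zero for every n, while the approximation shrinks with n
for n in (10, 100, 1000):
    print(n, mle_probability(0, n), zero_event_approximation(n))
```

For example, 100 trials with no events give an MLE of 0 but an approximate estimate of 0.004.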
Testing homogeneity of proportion ratios for stratified correlated bilateral data in two-arm randomized clinical trials.

PubMed

Pei, Yanbo; Tian, Guo-Liang; Tang, Man-Lai

2014-11-10

Stratified data analysis is an important research topic in many biomedical studies and clinical trials. In this article, we develop five test statistics for testing the homogeneity of proportion ratios for stratified correlated bilateral binary data based on an equal correlation model assumption. Bootstrap procedures based on these test statistics are also considered. To evaluate the performance of these statistics and procedures, we conduct Monte Carlo simulations to study their empirical sizes and powers under various scenarios. Our results suggest that the procedure based on score statistic performs well generally and is highly recommended. When the sample size is large, procedures based on the commonly used weighted least square estimate and logarithmic transformation with Mantel-Haenszel estimate are recommended as they do not involve any computation of maximum likelihood estimates requiring iterative algorithms. We also derive approximate sample size formulas based on the recommended test procedures. Finally, we apply the proposed methods to analyze a multi-center randomized clinical trial for scleroderma patients. Copyright © 2014 John Wiley & Sons, Ltd.
Parametric Model Based On Imputations Techniques for Partly Interval Censored Data

NASA Astrophysics Data System (ADS)

Zyoud, Abdallah; Elfaki, F. A. M.; Hrairi, Meftah

2017-12-01

The term ‘survival analysis’ is used in a broad sense to describe a collection of statistical procedures for data analysis in which the outcome variable of interest is the time until an event occurs. The time to failure of a specific experimental unit might be censored, and the censoring can be right, left, interval, or partly interval (PIC). In this paper, the analysis was conducted using a parametric Cox model for PIC data. Several imputation techniques were used: midpoint, left and right point, random, mean, and median. Maximum likelihood estimation was used to obtain the estimated survival function. These estimates were then compared with existing models, namely the Turnbull and Cox models, on clinical trial data (breast cancer data), which showed the validity of the proposed model. Results for the data set indicated that the parametric Cox model performed better in terms of estimation of survival functions, likelihood ratio tests, and their P-values. Moreover, among the imputation techniques, the midpoint, random, mean, and median showed better results with respect to the estimation of the survival function.
Free kick instead of cross-validation in maximum-likelihood refinement of macromolecular crystal structures

DOE Office of Scientific and Technical Information (OSTI.GOV)

Pražnikar, Jure; University of Primorska; Turk, Dušan, E-mail: dusan.turk@ijs.si

2014-12-01

The maximum-likelihood free-kick target, which calculates model error estimates from the work set and a randomly displaced model, proved superior in the accuracy and consistency of refinement of crystal structures compared with the maximum-likelihood cross-validation target, which calculates error estimates from the test set and the unperturbed model. The refinement of a molecular model is a computational procedure by which the atomic model is fitted to the diffraction data. The commonly used target in the refinement of macromolecular structures is the maximum-likelihood (ML) function, which relies on the assessment of model errors. The current ML functions rely on cross-validation. They utilize phase-error estimates that are calculated from a small fraction of diffraction data, called the test set, that are not used to fit the model. An approach has been developed that uses the work set to calculate the phase-error estimates in the ML refinement from simulating the model errors via the random displacement of atomic coordinates. It is called ML free-kick refinement as it uses the ML formulation of the target function and is based on the idea of freeing the model from the model bias imposed by the chemical energy restraints used in refinement. This approach for the calculation of error estimates is superior to the cross-validation approach: it reduces the phase error and increases the accuracy of molecular models, is more robust, provides clearer maps and may use a smaller portion of data for the test set for the calculation of R_free or may leave it out completely.
Outcome-Dependent Sampling Design and Inference for Cox's Proportional Hazards Model.

PubMed

Yu, Jichang; Liu, Yanyan; Cai, Jianwen; Sandler, Dale P; Zhou, Haibo

2016-11-01

We propose a cost-effective outcome-dependent sampling design for the failure time data and develop an efficient inference procedure for data collected with this design. To account for the biased sampling scheme, we derive estimators from a weighted partial likelihood estimating equation. The proposed estimators for regression parameters are shown to be consistent and asymptotically normally distributed. A criterion that can be used to optimally implement the ODS design in practice is proposed and studied. The small sample performance of the proposed method is evaluated by simulation studies. The proposed design and inference procedure are shown to be statistically more powerful than existing alternative designs with the same sample sizes. We illustrate the proposed method with an existing real data set from the Cancer Incidence and Mortality of Uranium Miners Study.
PSYCHOACOUSTICS: a comprehensive MATLAB toolbox for auditory testing.

PubMed

Soranzo, Alessandro; Grassi, Massimo

2014-01-01

PSYCHOACOUSTICS is a new MATLAB toolbox which implements three classic adaptive procedures for auditory threshold estimation. The first includes those of the Staircase family (method of limits, simple up-down and transformed up-down); the second is the Parameter Estimation by Sequential Testing (PEST); and the third is the Maximum Likelihood Procedure (MLP). The toolbox comes with more than twenty built-in experiments each provided with the recommended (default) parameters. However, if desired, these parameters can be modified through an intuitive and user friendly graphical interface and stored for future use (no programming skills are required). Finally, PSYCHOACOUSTICS is very flexible as it comes with several signal generators and can be easily extended for any experiment.
PSYCHOACOUSTICS: a comprehensive MATLAB toolbox for auditory testing

PubMed Central

Soranzo, Alessandro; Grassi, Massimo

2014-01-01

PSYCHOACOUSTICS is a new MATLAB toolbox which implements three classic adaptive procedures for auditory threshold estimation. The first includes those of the Staircase family (method of limits, simple up-down and transformed up-down); the second is the Parameter Estimation by Sequential Testing (PEST); and the third is the Maximum Likelihood Procedure (MLP). The toolbox comes with more than twenty built-in experiments each provided with the recommended (default) parameters. However, if desired, these parameters can be modified through an intuitive and user friendly graphical interface and stored for future use (no programming skills are required). Finally, PSYCHOACOUSTICS is very flexible as it comes with several signal generators and can be easily extended for any experiment. PMID:25101013
Efficient Levenberg-Marquardt minimization of the maximum likelihood estimator for Poisson deviates

DOE Office of Scientific and Technical Information (OSTI.GOV)

Laurence, T; Chromy, B

2009-11-10

Histograms of counted events are Poisson distributed, but are typically fitted without justification using nonlinear least squares fitting. The more appropriate maximum likelihood estimator (MLE) for Poisson distributed data is seldom used. We extend the use of the Levenberg-Marquardt algorithm commonly used for nonlinear least squares minimization for use with the MLE for Poisson distributed data. In so doing, we remove any excuse for not using this more appropriate MLE. We demonstrate the use of the algorithm and the superior performance of the MLE using simulations and experiments in the context of fluorescence lifetime imaging. Scientists commonly form histograms of counted events from their data, and extract parameters by fitting to a specified model. Assuming that the probability of occurrence for each bin is small, event counts in the histogram bins will be distributed according to the Poisson distribution. We develop here an efficient algorithm for fitting event counting histograms using the maximum likelihood estimator (MLE) for Poisson distributed data, rather than the non-linear least squares measure. This algorithm is a simple extension of the common Levenberg-Marquardt (L-M) algorithm, is simple to implement, quick and robust. Fitting using a least squares measure is most common, but it is the maximum likelihood estimator only for Gaussian-distributed data. Non-linear least squares methods may be applied to event counting histograms in cases where the number of events is very large, so that the Poisson distribution is well approximated by a Gaussian. However, it is not easy to satisfy this criterion in practice - which requires a large number of events. It has been well-known for years that least squares procedures lead to biased results when applied to Poisson-distributed data; a recent paper providing extensive characterization of these biases in exponential fitting is given.
The more appropriate measure based on the maximum likelihood estimator (MLE) for the Poisson distribution is also well known, but has not become generally used. This is primarily because, in contrast to non-linear least squares fitting, there has been no quick, robust, and general fitting method. In the field of fluorescence lifetime spectroscopy and imaging, there have been some efforts to use this estimator through minimization routines such as Nelder-Mead optimization, exhaustive line searches, and Gauss-Newton minimization. Minimization based on specific one- or multi-exponential models has been used to obtain quick results, but this procedure does not allow the incorporation of the instrument response, and is not generally applicable to models found in other fields. Methods for using the MLE for Poisson-distributed data have been published by the wider spectroscopic community, including iterative minimization schemes based on Gauss-Newton minimization. The slow acceptance of these procedures for fitting event counting histograms may also be explained by the use of the ubiquitous, fast Levenberg-Marquardt (L-M) fitting procedure for fitting non-linear models using least squares fitting (simple searches obtain approximately 10000 references - this doesn't include those who use it, but don't know they are using it). The benefits of L-M include a seamless transition between Gauss-Newton minimization and downward gradient minimization through the use of a regularization parameter. This transition is desirable because Gauss-Newton methods converge quickly, but only within a limited domain of convergence; on the other hand the downward gradient methods have a much wider domain of convergence, but converge extremely slowly nearer the minimum. L-M has the advantages of both procedures: relative insensitivity to initial parameters and rapid convergence. Scientists, when wanting an answer quickly, will fit data using L-M, get an answer, and move on.
Only those that are aware of the bias issues will bother to fit using the more appropriate MLE for Poisson deviates. However, since there is a simple, analytical formula for the appropriate MLE measure for Poisson deviates, it is inexcusable that least squares estimators are used almost exclusively when fitting event counting histograms. There have been ways found to use successive non-linear least squares fitting to obtain similarly unbiased results, but this procedure is justified by simulation, must be re-tested when conditions change significantly, and requires two successive fits. There is a great need for a fitting routine for the MLE for Poisson deviates that has convergence domains and rates comparable to the non-linear least squares L-M fitting. We show in this report that a simple way to achieve that goal is to use the L-M fitting procedure not to minimize the least squares measure, but the MLE for Poisson deviates.
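The bias that this abstract attributes to least squares fitting of Poisson counts can be seen in a deliberately simple case. The sketch below is not the Levenberg-Marquardt extension described in the report; it assumes a constant-rate model with one parameter, where both estimators have closed forms. The Poisson MLE is the arithmetic mean of the counts, while least squares weighted by the observed counts (Neyman's chi-square) gives their harmonic mean, which is never larger. All function names and the example counts are hypothetical.

```python
import math


def poisson_deviance(counts, model):
    """Poisson MLE measure 2 * sum(m - n + n * ln(n / m)); the log term is 0 when n == 0."""
    total = 0.0
    for n, m in zip(counts, model):
        total += m - n
        if n > 0:
            total += n * math.log(n / m)
    return 2.0 * total


def fit_constant_poisson_mle(counts):
    """Constant rate minimizing the Poisson deviance: the arithmetic mean."""
    return sum(counts) / len(counts)


def fit_constant_weighted_least_squares(counts):
    """Constant rate minimizing sum((n - m)**2 / n): the harmonic mean (needs n > 0)."""
    return len(counts) / sum(1.0 / n for n in counts)


# Hypothetical low-count histogram with a flat underlying rate.
counts = [3, 5, 2, 8, 4, 6, 1, 7]
rate_mle = fit_constant_poisson_mle(counts)
rate_wls = fit_constant_weighted_least_squares(counts)
print("Poisson MLE rate:", rate_mle)
print("Weighted least squares rate:", rate_wls)
print("Deviance at MLE:", poisson_deviance(counts, [rate_mle] * len(counts)))
print("Deviance at WLS:", poisson_deviance(counts, [rate_wls] * len(counts)))
```

For models with several parameters there is no closed form, which is where an iterative minimizer such as the L-M procedure discussed above comes in.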
On modeling animal movements using Brownian motion with measurement error.

PubMed

Pozdnyakov, Vladimir; Meyer, Thomas; Wang, Yu-Bo; Yan, Jun

2014-02-01

Modeling animal movements with Brownian motion (or more generally by a Gaussian process) has a long tradition in ecological studies. The recent Brownian bridge movement model (BBMM), which incorporates measurement errors, has been quickly adopted by ecologists because of its simplicity and tractability. We discuss some nontrivial properties of the discrete-time stochastic process that results from observing a Brownian motion with added normal noise at discrete times. In particular, we demonstrate that the observed sequence of random variables is not Markov. Consequently the expected occupation time between two successively observed locations does not depend on just those two observations; the whole path must be taken into account. Nonetheless, the exact likelihood function of the observed time series remains tractable; it requires only sparse matrix computations. The likelihood-based estimation procedure is described in detail and compared to the BBMM estimation.
A hyperbolastic type-I diffusion process: Parameter estimation by means of the firefly algorithm.

PubMed

Barrera, Antonio; Román-Román, Patricia; Torres-Ruiz, Francisco

2018-01-01

A stochastic diffusion process, whose mean function is a hyperbolastic curve of type I, is presented. The main characteristics of the process are studied and the problem of maximum likelihood estimation for the parameters of the process is considered. To this end, the firefly metaheuristic optimization algorithm is applied after bounding the parametric space by a stagewise procedure. Some examples based on simulated sample paths and real data illustrate this development. Copyright © 2017 Elsevier B.V. All rights reserved.
Monte Carlo studies of ocean wind vector measurements by SCATT: Objective criteria and maximum likelihood estimates for removal of aliases, and effects of cell size on accuracy of vector winds

NASA Technical Reports Server (NTRS)

Pierson, W. J.

1982-01-01

The scatterometer on the National Oceanic Satellite System (NOSS) is studied by means of Monte Carlo techniques so as to determine the effect of two additional antennas for alias (or ambiguity) removal by means of an objective criteria technique and a normalized maximum likelihood estimator. Cells nominally 10 km by 10 km, 10 km by 50 km, and 50 km by 50 km are simulated for winds of 4, 8, 12 and 24 m/s and incidence angles of 29, 39, 47, and 53.5 deg for 15 deg changes in direction. The normalized maximum likelihood estimate (MLE) is correct a large part of the time, but the objective criterion technique is recommended as a reserve, and more quickly computed, procedure. Both methods for alias removal depend on the differences in the present model function at upwind and downwind. For 10 km by 10 km cells, it is found that the MLE method introduces a correlation between wind speed errors and aspect angle (wind direction) errors that can be as high as 0.8 or 0.9 and that the wind direction errors are unacceptably large, compared to those obtained for the SASS for similar assumptions.

A Recommended Procedure for Estimating the Cosmic-Ray Spectral Parameter of a Simple Power Law With Applications to Detector Design

NASA Technical Reports Server (NTRS)

Howell, L. W.

2001-01-01

A simple power law model consisting of a single spectral index alpha-1 is believed to be an adequate description of the galactic cosmic-ray (GCR) proton flux at energies below 10^13 eV. Two procedures for estimating alpha-1, the method of moments and maximum likelihood (ML), are developed and their statistical performance compared. It is concluded that the ML procedure attains the most desirable statistical properties and is hence the recommended statistical estimation procedure for estimating alpha-1. The ML procedure is then generalized for application to a set of real cosmic-ray data and thereby makes this approach applicable to existing cosmic-ray data sets. Several other important results, such as the relationship between collecting power and detector energy resolution, as well as inclusion of a non-Gaussian detector response function, are presented. These results have many practical benefits in the design phase of a cosmic-ray detector as they permit instrument developers to make important trade studies in design parameters as a function of one of the science objectives. This is particularly important for space-based detectors where physical parameters, such as dimension and weight, impose rigorous practical limits to the design envelope.
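To make the comparison between the two estimators concrete, here is a minimal sketch under simplifying assumptions that are not taken from the report: an idealized power law with density proportional to E^(-alpha) above a known threshold e_min, no upper energy cutoff, and a perfect detector. Under those assumptions the ML estimate is 1 + n / sum(ln(E_i / e_min)), and the method of moments (valid only for alpha > 2) inverts mean(E) = e_min (alpha - 1) / (alpha - 2). The function names, threshold, sample size, and true index are all hypothetical.

```python
import math
import random


def sample_power_law(alpha, e_min, n, rng):
    """Draw n energies from p(E) ~ E**(-alpha), E >= e_min, by inverse transform sampling."""
    return [e_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0)) for _ in range(n)]


def spectral_index_ml(energies, e_min):
    """Maximum likelihood estimate of alpha for the idealized power law."""
    log_sum = sum(math.log(e / e_min) for e in energies)
    return 1.0 + len(energies) / log_sum


def spectral_index_moments(energies, e_min):
    """Method-of-moments estimate from the sample mean (requires alpha > 2)."""
    ratio = (sum(energies) / len(energies)) / e_min
    return (2.0 * ratio - 1.0) / (ratio - 1.0)


rng = random.Random(0)  # fixed seed so the run is repeatable
true_alpha = 2.7        # hypothetical index chosen for illustration
energies = sample_power_law(true_alpha, 1.0, 20000, rng)
alpha_ml = spectral_index_ml(energies, 1.0)
alpha_mom = spectral_index_moments(energies, 1.0)
print("ML estimate:", alpha_ml)
print("Method-of-moments estimate:", alpha_mom)
```

The sample mean of such a heavy-tailed distribution converges slowly, so in this toy setting the moment estimate tends to scatter more than the ML estimate, which is in line with the recommendation stated in the abstract.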
Robustness of fit indices to outliers and leverage observations in structural equation modeling.

PubMed

Yuan, Ke-Hai; Zhong, Xiaoling

2013-06-01

Normal-distribution-based maximum likelihood (NML) is the most widely used method in structural equation modeling (SEM), although practical data tend to be nonnormally distributed. The effect of nonnormally distributed data or data contamination on the normal-distribution-based likelihood ratio (LR) statistic is well understood due to many analytical and empirical studies. In SEM, fit indices are used as widely as the LR statistic. In addition to NML, robust procedures have been developed for more efficient and less biased parameter estimates with practical data. This article studies the effect of outliers and leverage observations on fit indices following NML and two robust methods. Analysis and empirical results indicate that good leverage observations following NML and one of the robust methods lead most fit indices to give more support to the substantive model. While outliers tend to make a good model superficially bad according to many fit indices following NML, they have little effect on those following the two robust procedures. Implications of the results to data analysis are discussed, and recommendations are provided regarding the use of estimation methods and interpretation of fit indices. (PsycINFO Database Record (c) 2013 APA, all rights reserved).
The application of parameter estimation to flight measurements to obtain lateral-directional stability derivatives of an augmented jet-flap STOL airplane

NASA Technical Reports Server (NTRS)

Stephenson, J. D.

1983-01-01

Flight experiments with an augmented jet flap STOL aircraft provided data from which the lateral directional stability and control derivatives were calculated by applying a linear regression parameter estimation procedure. The tests, which were conducted with the jet flaps set at a 65 deg deflection, covered a large range of angles of attack and engine power settings. The effect of changing the angle of the jet thrust vector was also investigated. Test results are compared with stability derivatives that had been predicted. The roll damping derived from the tests was significantly larger than had been predicted, whereas the other derivatives were generally in agreement with the predictions. Results obtained using a maximum likelihood estimation procedure are compared with those from the linear regression solutions.
Outcome-Dependent Sampling Design and Inference for Cox’s Proportional Hazards Model

PubMed Central

Yu, Jichang; Liu, Yanyan; Cai, Jianwen; Sandler, Dale P.; Zhou, Haibo

2016-01-01

We propose a cost-effective outcome-dependent sampling design for the failure time data and develop an efficient inference procedure for data collected with this design. To account for the biased sampling scheme, we derive estimators from a weighted partial likelihood estimating equation. The proposed estimators for regression parameters are shown to be consistent and asymptotically normally distributed. A criterion that can be used to optimally implement the ODS design in practice is proposed and studied. The small sample performance of the proposed method is evaluated by simulation studies. The proposed design and inference procedure are shown to be statistically more powerful than existing alternative designs with the same sample sizes. We illustrate the proposed method with an existing real data set from the Cancer Incidence and Mortality of Uranium Miners Study. PMID:28090134
An evaluation of inferential procedures for adaptive clinical trial designs with pre-specified rules for modifying the sample size.

PubMed

Levin, Gregory P; Emerson, Sarah C; Emerson, Scott S

2014-09-01

Many papers have introduced adaptive clinical trial methods that allow modifications to the sample size based on interim estimates of treatment effect. There has been extensive commentary on type I error control and efficiency considerations, but little research on estimation after an adaptive hypothesis test. We evaluate the reliability and precision of different inferential procedures in the presence of an adaptive design with pre-specified rules for modifying the sampling plan. We extend group sequential orderings of the outcome space based on the stage at stopping, likelihood ratio statistic, and sample mean to the adaptive setting in order to compute median-unbiased point estimates, exact confidence intervals, and P-values uniformly distributed under the null hypothesis. The likelihood ratio ordering is found to average shorter confidence intervals and produce higher probabilities of P-values below important thresholds than alternative approaches. The bias adjusted mean demonstrates the lowest mean squared error among candidate point estimates. A conditional error-based approach in the literature has the benefit of being the only method that accommodates unplanned adaptations. We compare the performance of this and other methods in order to quantify the cost of failing to plan ahead in settings where adaptations could realistically be pre-specified at the design stage. We find the cost to be meaningful for all designs and treatment effects considered, and to be substantial for designs frequently proposed in the literature. © 2014, The International Biometric Society.
A unified procedure for meta-analytic evaluation of surrogate end points in randomized clinical trials

PubMed Central

Dai, James Y.; Hughes, James P.

2012-01-01

The meta-analytic approach to evaluating surrogate end points assesses the predictiveness of treatment effect on the surrogate toward treatment effect on the clinical end point based on multiple clinical trials. Definition and estimation of the correlation of treatment effects were developed in linear mixed models and later extended to binary or failure time outcomes on a case-by-case basis. In a general regression setting that covers nonnormal outcomes, we discuss in this paper several metrics that are useful in the meta-analytic evaluation of surrogacy. We propose a unified 3-step procedure to assess these metrics in settings with binary end points, time-to-event outcomes, or repeated measures. First, the joint distribution of estimated treatment effects is ascertained by an estimating equation approach; second, the restricted maximum likelihood method is used to estimate the means and the variance components of the random treatment effects; finally, confidence intervals are constructed by a parametric bootstrap procedure. The proposed method is evaluated by simulations and applications to 2 clinical trials. PMID:22394448
Estimation of Model's Marginal likelihood Using Adaptive Sparse Grid Surrogates in Bayesian Model Averaging

NASA Astrophysics Data System (ADS)

Zeng, X.

2015-12-01

A large number of model executions are required to obtain alternative conceptual models' predictions and their posterior probabilities in Bayesian model averaging (BMA). The posterior model probability is estimated through models' marginal likelihood and prior probability. The heavy computation burden hinders the implementation of BMA prediction, especially for the elaborated marginal likelihood estimator. For overcoming the computation burden of BMA, an adaptive sparse grid (SG) stochastic collocation method is used to build surrogates for alternative conceptual models through the numerical experiment of a synthetic groundwater model. BMA predictions depend on model posterior weights (or marginal likelihoods), and this study also evaluated four marginal likelihood estimators, including arithmetic mean estimator (AME), harmonic mean estimator (HME), stabilized harmonic mean estimator (SHME), and thermodynamic integration estimator (TIE). The results demonstrate that TIE is accurate in estimating conceptual models' marginal likelihoods. The BMA-TIE has better predictive performance than other BMA predictions. TIE has high stability for estimating conceptual model's marginal likelihood. Repeated estimates of a conceptual model's marginal likelihood by TIE have significantly less variability than those obtained with the other estimators. In addition, the SG surrogates are efficient to facilitate BMA predictions, especially for BMA-TIE. The number of model executions needed for building surrogates is 4.13%, 6.89%, 3.44%, and 0.43% of the required model executions of BMA-AME, BMA-HME, BMA-SHME, and BMA-TIE, respectively.
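Two of the marginal likelihood estimators named in this abstract, the arithmetic mean estimator (AME) and the harmonic mean estimator (HME), can be sketched on a toy problem where the exact answer is known. The setup below is an assumption made purely for illustration and has nothing to do with the groundwater model of the study: one observation y with likelihood N(y; theta, 1) and prior theta ~ N(0, 1), so the exact marginal likelihood is N(y; 0, 2) and the posterior is N(y/2, 1/2). The function names are hypothetical.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def arithmetic_mean_estimator(y, n, rng):
    """AME: average the likelihood over draws from the prior N(0, 1)."""
    return sum(normal_pdf(y, rng.gauss(0.0, 1.0), 1.0) for _ in range(n)) / n


def harmonic_mean_estimator(y, n, rng):
    """HME: harmonic mean of the likelihood over draws from the posterior N(y/2, 1/2)."""
    post_sd = math.sqrt(0.5)
    inverse_mean = sum(
        1.0 / normal_pdf(y, rng.gauss(y / 2.0, post_sd), 1.0) for _ in range(n)
    ) / n
    return 1.0 / inverse_mean


rng = random.Random(1)  # fixed seed so the run is repeatable
y_obs = 1.0             # hypothetical single observation
exact = normal_pdf(y_obs, 0.0, 2.0)
ame = arithmetic_mean_estimator(y_obs, 50000, rng)
hme = harmonic_mean_estimator(y_obs, 50000, rng)
print("Exact marginal likelihood:", exact)
print("AME:", ame)
print("HME:", hme)
```

In this toy case the HME summand has infinite variance, so its value can shift noticeably between seeds, which is one reason stabilized variants and thermodynamic integration are considered; neither SHME nor TIE is implemented in this sketch.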
Variable selection in subdistribution hazard frailty models with competing risks data

PubMed Central

Do Ha, Il; Lee, Minjung; Oh, Seungyoung; Jeong, Jong-Hyeon; Sylvester, Richard; Lee, Youngjo

2014-01-01

The proportional subdistribution hazards model (i.e. Fine-Gray model) has been widely used for analyzing univariate competing risks data. Recently, this model has been extended to clustered competing risks data via frailty. To the best of our knowledge, however, there has been no literature on variable selection method for such competing risks frailty models. In this paper, we propose a simple but unified procedure via a penalized h-likelihood (HL) for variable selection of fixed effects in a general class of subdistribution hazard frailty models, in which random effects may be shared or correlated. We consider three penalty functions (LASSO, SCAD and HL) in our variable selection procedure. We show that the proposed method can be easily implemented using a slight modification to existing h-likelihood estimation approaches. Numerical studies demonstrate that the proposed procedure using the HL penalty performs well, providing a higher probability of choosing the true model than LASSO and SCAD methods without losing prediction accuracy. The usefulness of the new method is illustrated using two actual data sets from multi-center clinical trials. PMID:25042872
A class of semiparametric cure models with current status data.

PubMed

Diao, Guoqing; Yuan, Ao

2018-02-08

Current status data occur in many biomedical studies where we only know whether the event of interest occurs before or after a particular time point. In practice, some subjects may never experience the event of interest, i.e., a certain fraction of the population is cured or is not susceptible to the event of interest. We consider a class of semiparametric transformation cure models for current status data with a survival fraction. This class includes both the proportional hazards and the proportional odds cure models as two special cases. We develop efficient likelihood-based estimation and inference procedures. We show that the maximum likelihood estimators for the regression coefficients are consistent, asymptotically normal, and asymptotically efficient. Simulation studies demonstrate that the proposed methods perform well in finite samples. For illustration, we provide an application of the models to a study on the calcification of the hydrogel intraocular lenses.
Determining the accuracy of maximum likelihood parameter estimates with colored residuals

NASA Technical Reports Server (NTRS)

Morelli, Eugene A.; Klein, Vladislav

1994-01-01

An important part of building high fidelity mathematical models based on measured data is calculating the accuracy associated with statistical estimates of the model parameters. Indeed, without some idea of the accuracy of parameter estimates, the estimates themselves have limited value. In this work, an expression based on theoretical analysis was developed to properly compute parameter accuracy measures for maximum likelihood estimates with colored residuals. This result is important because experience from the analysis of measured data reveals that the residuals from maximum likelihood estimation are almost always colored. The calculations involved can be appended to conventional maximum likelihood estimation algorithms. Simulated data runs were used to show that the parameter accuracy measures computed with this technique accurately reflect the quality of the parameter estimates from maximum likelihood estimation without the need for analysis of the output residuals in the frequency domain or heuristically determined multiplication factors. The result is general, although the application studied here is maximum likelihood estimation of aerodynamic model parameters from flight test data.
Classification of longitudinal data through a semiparametric mixed-effects model based on lasso-type estimators.

PubMed

Arribas-Gil, Ana; De la Cruz, Rolando; Lebarbier, Emilie; Meza, Cristian

2015-06-01

We propose a classification method for longitudinal data. The Bayes classifier is classically used to determine a classification rule where the underlying density in each class needs to be well modeled and estimated. This work is motivated by a real dataset of hormone levels measured at the early stages of pregnancy that can be used to predict normal versus abnormal pregnancy outcomes. The proposed model, which is a semiparametric linear mixed-effects model (SLMM), is a particular case of the semiparametric nonlinear mixed-effects class of models (SNMM) in which finite dimensional (fixed effects and variance components) and infinite dimensional (an unknown function) parameters have to be estimated. In SNMMs, maximum likelihood estimation is performed iteratively, alternating parametric and nonparametric procedures. However, if one can make the assumption that the random effects and the unknown function interact in a linear way, more efficient estimation methods can be used. Our contribution is the proposal of a unified estimation procedure based on a penalized EM-type algorithm. The Expectation and Maximization steps are explicit. In this latter step, the unknown function is estimated in a nonparametric fashion using a lasso-type procedure. A simulation study and an application on real data are performed. © 2015, The International Biometric Society.
Inverse sequential procedures for the monitoring of time series

NASA Technical Reports Server (NTRS)

Radok, Uwe; Brown, Timothy

1993-01-01

Climate changes traditionally have been detected from long series of observations and long after they happened. The 'inverse sequential' monitoring procedure is designed to detect changes as soon as they occur. Frequency distribution parameters are estimated both from the most recent existing set of observations and from the same set augmented by 1,2,...j new observations. Individual-value probability products ('likelihoods') are then calculated which yield probabilities for erroneously accepting the existing parameter(s) as valid for the augmented data set and vice versa. A parameter change is signaled when these probabilities (or a more convenient and robust compound 'no change' probability) show a progressive decrease. New parameters are then estimated from the new observations alone to restart the procedure. The detailed algebra is developed and tested for Gaussian means and variances, Poisson and chi-square means, and linear or exponential trends; a comprehensive and interactive Fortran program is provided in the appendix.
Maximum Likelihood Item Easiness Models for Test Theory Without an Answer Key

PubMed Central

Batchelder, William H.

2014-01-01

Cultural consensus theory (CCT) is a data aggregation technique with many applications in the social and behavioral sciences. We describe the intuition and theory behind a set of CCT models for continuous type data using maximum likelihood inference methodology. We describe how bias parameters can be incorporated into these models. We introduce two extensions to the basic model in order to account for item rating easiness/difficulty. The first extension is a multiplicative model and the second is an additive model. We show how the multiplicative model is related to the Rasch model. We describe several maximum-likelihood estimation procedures for the models and discuss issues of model fit and identifiability. We describe how the CCT models could be used to give alternative consensus-based measures of reliability. We demonstrate the utility of both the basic and extended models on a set of essay rating data and give ideas for future research. PMID:29795812
A Maximum Likelihood Approach to Functional Mapping of Longitudinal Binary Traits

PubMed Central

Wang, Chenguang; Li, Hongying; Wang, Zhong; Wang, Yaqun; Wang, Ningtao; Wang, Zuoheng; Wu, Rongling

2013-01-01

Despite their importance in biology and biomedicine, genetic mapping of binary traits that change over time has not been well explored. In this article, we develop a statistical model for mapping quantitative trait loci (QTLs) that govern longitudinal responses of binary traits. The model is constructed within the maximum likelihood framework by which the association between binary responses is modeled in terms of conditional log odds-ratios. With this parameterization, the maximum likelihood estimates (MLEs) of marginal mean parameters are robust to the misspecification of time dependence. We implement an iterative procedure to obtain the MLEs of QTL genotype-specific parameters that define longitudinal binary responses. The usefulness of the model was validated by analyzing a real example in rice. Simulation studies were performed to investigate the statistical properties of the model, showing that the model has power to identify and map specific QTLs responsible for the temporal pattern of binary traits. PMID:23183762
Uncertainty analysis of signal deconvolution using a measured instrument response function

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hartouni, E. P.; Beeman, B.; Caggiano, J. A.

2016-10-05

A common analysis procedure minimizes the ln-likelihood that a set of experimental observables matches a parameterized model of the observation. The model includes a description of the underlying physical process as well as the instrument response function (IRF). Here, we investigate the National Ignition Facility (NIF) neutron time-of-flight (nTOF) spectrometers, for which the IRF is constructed from measurements and models. IRF measurements have a finite precision that can make significant contributions to the uncertainty estimate of the physical model’s parameters. Finally, we apply a Bayesian analysis to properly account for IRF uncertainties in calculating the ln-likelihood function used to find the optimum physical parameters.
Work Status and Return to the Workforce after Coronary Artery Bypass Grafting and/or Heart Valve Surgery: A One-Year-Follow Up Study

PubMed Central

Fonager, Kirsten; Lundbye-Christensen, Søren; Andreasen, Jan Jesper; Futtrup, Mikkel; Christensen, Anette Luther; Ahmad, Khalil; Nørgaard, Martin Agge

2014-01-01

Background. Several characteristics appear to be important for estimating the likelihood of reentering the workforce after surgery. The aim of the present study was to describe work status in a two-year time period around the time of cardiac surgery and estimate the probability of returning to the workforce. Methods. We included 681 patients undergoing coronary artery bypass grafting and/or heart valve procedures from 2003 to 2007 in the North Denmark Region. We linked hospital data to data in the DREAM database which holds information of everyone receiving social benefits. Results. At the time of surgery 17.3% were allocated disability pension and 2.3% were allocated a permanent part-time benefit. Being unemployed one year before surgery reduced the likelihood of return to the workforce (RR = 0.74 (0.60–0.92)) whereas unemployment at the time of surgery had no impact on return to the workforce (RR = 0.96 (0.78–1.18)). Sickness absence before surgery reduced the likelihood of return to the workforce. Conclusion. This study found the work status before surgery to be associated with the likelihood of return to the workforce within one year after surgery. Before surgery one-fifth of the population either was allocated disability pension or received a permanent part-time benefit. PMID:25024848
Finite mixture model: A maximum likelihood estimation approach on time series data

NASA Astrophysics Data System (ADS)

Yen, Phoong Seuk; Ismail, Mohd Tahir; Hamzah, Firdaus Mohamad

2014-09-01

Recently, statisticians have emphasized fitting finite mixture models by maximum likelihood estimation because of its asymptotic properties. In addition, the estimator is consistent as the sample size increases to infinity, which illustrates that maximum likelihood estimation is asymptotically unbiased. Moreover, the parameter estimates obtained by maximum likelihood estimation have the smallest variance compared with other statistical methods as the sample size increases. Thus, maximum likelihood estimation is adopted in this paper to fit a two-component mixture model in order to explore the relationship between rubber price and exchange rate for Malaysia, Thailand, Philippines and Indonesia. The results indicate a negative relationship between rubber price and exchange rate for all selected countries.
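As a rough illustration of fitting a two-component mixture by maximum likelihood, the sketch below runs the EM algorithm on a univariate Gaussian mixture. The abstract does not say which optimizer or which component densities the authors used, so the EM iteration, the Gaussian components, the starting values and the function names (normal_pdf, fit_two_component_mixture) are all assumptions made for this example.

```python
import math
import random


def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def fit_two_component_mixture(data, n_iter=200):
    """EM iterations for a two-component univariate Gaussian mixture (illustrative)."""
    ordered = sorted(data)
    half = len(ordered) // 2
    # Crude starting values (an assumption): means of the lower and upper halves.
    mu = [sum(ordered[:half]) / half, sum(ordered[half:]) / (len(ordered) - half)]
    spread = (ordered[-1] - ordered[0]) / 4.0 or 1.0
    sigma = [spread, spread]
    weight = 0.5  # mixing proportion of component 0
    for _ in range(n_iter):
        # E-step: posterior probability that each point belongs to component 0.
        resp = []
        for x in data:
            p0 = weight * normal_pdf(x, mu[0], sigma[0])
            p1 = (1.0 - weight) * normal_pdf(x, mu[1], sigma[1])
            resp.append(p0 / (p0 + p1))
        # M-step: responsibility-weighted means, variances and mixing proportion.
        n0 = sum(resp)
        n1 = len(data) - n0
        mu[0] = sum(r * x for r, x in zip(resp, data)) / n0
        mu[1] = sum((1.0 - r) * x for r, x in zip(resp, data)) / n1
        var0 = sum(r * (x - mu[0]) ** 2 for r, x in zip(resp, data)) / n0
        var1 = sum((1.0 - r) * (x - mu[1]) ** 2 for r, x in zip(resp, data)) / n1
        # Floor the standard deviations so a component cannot collapse to a point.
        sigma = [max(math.sqrt(var0), 1e-6), max(math.sqrt(var1), 1e-6)]
        weight = n0 / len(data)
    return weight, mu, sigma


if __name__ == "__main__":
    random.seed(0)
    # Synthetic data only; not the rubber price or exchange rate series.
    sample = [random.gauss(0.0, 1.0) for _ in range(300)]
    sample += [random.gauss(5.0, 1.0) for _ in range(300)]
    print(fit_two_component_mixture(sample))
```

EM increases the likelihood at every iteration but can stop at a local maximum, so in practice several starting values are usually tried.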
Flexible Modeling of Survival Data with Covariates Subject to Detection Limits via Multiple Imputation.

PubMed

Bernhardt, Paul W; Wang, Huixia Judy; Zhang, Daowen

2014-01-01

Models for survival data generally assume that covariates are fully observed. However, in medical studies it is not uncommon for biomarkers to be censored at known detection limits. A computationally-efficient multiple imputation procedure for modeling survival data with covariates subject to detection limits is proposed. This procedure is developed in the context of an accelerated failure time model with a flexible seminonparametric error distribution. The consistency and asymptotic normality of the multiple imputation estimator are established and a consistent variance estimator is provided. An iterative version of the proposed multiple imputation algorithm that approximates the EM algorithm for maximum likelihood is also suggested. Simulation studies demonstrate that the proposed multiple imputation methods work well while alternative methods lead to estimates that are either biased or more variable. The proposed methods are applied to analyze the dataset from a recently-conducted GenIMS study.
SEPARABLE FACTOR ANALYSIS WITH APPLICATIONS TO MORTALITY DATA

PubMed Central

Fosdick, Bailey K.; Hoff, Peter D.

2014-01-01

Human mortality data sets can be expressed as multiway data arrays, the dimensions of which correspond to categories by which mortality rates are reported, such as age, sex, country and year. Regression models for such data typically assume an independent error distribution or an error model that allows for dependence along at most one or two dimensions of the data array. However, failing to account for other dependencies can lead to inefficient estimates of regression parameters, inaccurate standard errors and poor predictions. An alternative to assuming independent errors is to allow for dependence along each dimension of the array using a separable covariance model. However, the number of parameters in this model increases rapidly with the dimensions of the array and, for many arrays, maximum likelihood estimates of the covariance parameters do not exist. In this paper, we propose a submodel of the separable covariance model that estimates the covariance matrix for each dimension as having factor analytic structure. This model can be viewed as an extension of factor analysis to array-valued data, as it uses a factor model to estimate the covariance along each dimension of the array. We discuss properties of this model as they relate to ordinary factor analysis, describe maximum likelihood and Bayesian estimation methods, and provide a likelihood ratio testing procedure for selecting the factor model ranks. We apply this methodology to the analysis of data from the Human Mortality Database, and show in a cross-validation experiment how it outperforms simpler methods. Additionally, we use this model to impute mortality rates for countries that have no mortality data for several years. Unlike other approaches, our methodology is able to estimate similarities between the mortality rates of countries, time periods and sexes, and use this information to assist with the imputations. PMID:25489353

Multistage classification of multispectral Earth observational data: The design approach

NASA Technical Reports Server (NTRS)

Bauer, M. E. (Principal Investigator); Muasher, M. J.; Landgrebe, D. A.

1981-01-01

An algorithm is proposed which predicts the optimal features at every node in a binary tree procedure. The algorithm estimates the probability of error by approximating the area under the likelihood ratio function for two classes and taking into account the number of training samples used in estimating each of these two classes. Some results on feature selection techniques, particularly in the presence of a very limited set of training samples, are presented. Results comparing probabilities of error predicted by the proposed algorithm as a function of dimensionality as compared to experimental observations are shown for aircraft and LANDSAT data. Results are obtained for both real and simulated data. Finally, two binary tree examples which use the algorithm are presented to illustrate the usefulness of the procedure.
Estimation of mating system parameters in plant populations using marker loci with null alleles.

PubMed

Ross, H A

1986-06-01

An Expectation-Maximization (EM)-algorithm procedure is presented that extends the method of Cheliak et al. (1983) for maximum-likelihood estimation of mating system parameters of mixed mating system models. The extension permits the estimation of the rate of self-fertilization (s) and allele frequencies (Pi) at loci in outcrossing pollen, at marker loci having recessive null alleles. The algorithm makes use of maternal and filial genotypic arrays obtained by the electrophoretic analysis of cohorts of progeny. The genotypes of maternal plants must be known. Explicit equations are given for cases when the genotype of the maternal gamete inherited by a seed can (gymnosperms) or cannot (angiosperms) be determined. The procedure can accommodate any number of codominant alleles, but only one recessive null allele at each locus. An example, using actual data from Pinus banksiana, is presented to illustrate the application of this EM algorithm to the estimation of mating system parameters using marker loci having both codominant and recessive alleles.
Statistical inference for extended or shortened phase II studies based on Simon's two-stage designs.

PubMed

Zhao, Junjun; Yu, Menggang; Feng, Xi-Ping

2015-06-07

Simon's two-stage designs are popular choices for conducting phase II clinical trials, especially in the oncology trials to reduce the number of patients placed on ineffective experimental therapies. Recently Koyama and Chen (2008) discussed how to conduct proper inference for such studies because they found that inference procedures used with Simon's designs almost always ignore the actual sampling plan used. In particular, they proposed an inference method for studies when the actual second stage sample sizes differ from planned ones. We consider an alternative inference method based on likelihood ratio. In particular, we order permissible sample paths under Simon's two-stage designs using their corresponding conditional likelihood. In this way, we can calculate p-values using the common definition: the probability of obtaining a test statistic value at least as extreme as that observed under the null hypothesis. In addition to providing inference for a couple of scenarios where Koyama and Chen's method can be difficult to apply, the resulting estimate based on our method appears to have certain advantage in terms of inference properties in many numerical simulations. It generally led to smaller biases and narrower confidence intervals while maintaining similar coverages. We also illustrated the two methods in a real data setting. Inference procedures used with Simon's designs almost always ignore the actual sampling plan. Reported P-values, point estimates and confidence intervals for the response rate are not usually adjusted for the design's adaptiveness. Proper statistical inference procedures should be used.
A New Maximum Likelihood Approach for Free Energy Profile Construction from Molecular Simulations

PubMed Central

Lee, Tai-Sung; Radak, Brian K.; Pabis, Anna; York, Darrin M.

2013-01-01

A novel variational method for construction of free energy profiles from molecular simulation data is presented. The variational free energy profile (VFEP) method uses the maximum likelihood principle applied to the global free energy profile based on the entire set of simulation data (e.g., from multiple biased simulations) that spans the free energy surface. The new method addresses common obstacles in two major problems usually observed in traditional methods for estimating free energy surfaces: the need for overlap in the re-weighting procedure and the problem of data representation. Test cases demonstrate that VFEP outperforms other methods in terms of the amount and sparsity of the data needed to construct the overall free energy profiles. For typical chemical reactions, only ~5 windows and ~20-35 independent data points per window are sufficient to obtain an overall qualitatively correct free energy profile with sampling errors an order of magnitude smaller than the free energy barrier. The proposed approach thus provides a feasible mechanism to quickly construct the global free energy profile and identify free energy barriers and basins in free energy simulations via a robust, variational procedure that determines an analytic representation of the free energy profile without the requirement of numerically unstable histograms or binning procedures. It can serve as a new framework for biased simulations and is suitable for use together with other methods to tackle the free energy estimation problem. PMID:23457427
Optimal HRF and smoothing parameters for fMRI time series within an autoregressive modeling framework.

PubMed

Galka, Andreas; Siniatchkin, Michael; Stephani, Ulrich; Groening, Kristina; Wolff, Stephan; Bosch-Bayard, Jorge; Ozaki, Tohru

2010-12-01

The analysis of time series obtained by functional magnetic resonance imaging (fMRI) may be approached by fitting predictive parametric models, such as nearest-neighbor autoregressive models with exogenous input (NNARX). As a part of the modeling procedure, it is possible to apply instantaneous linear transformations to the data. Spatial smoothing, a common preprocessing step, may be interpreted as such a transformation. The autoregressive parameters may be constrained, such that they provide a response behavior that corresponds to the canonical haemodynamic response function (HRF). We present an algorithm for estimating the parameters of the linear transformations and of the HRF within a rigorous maximum-likelihood framework. Using this approach, an optimal amount of both the spatial smoothing and the HRF can be estimated simultaneously for a given fMRI data set. An example from a motor-task experiment is discussed. It is found that, for this data set, weak, but non-zero, spatial smoothing is optimal. Furthermore, it is demonstrated that activated regions can be estimated within the maximum-likelihood framework.
Do bacterial cell numbers follow a theoretical Poisson distribution? Comparison of experimentally obtained numbers of single cells with random number generation via computer simulation.

PubMed

Koyama, Kento; Hokunan, Hidekazu; Hasegawa, Mayumi; Kawamura, Shuso; Koseki, Shigenobu

2016-12-01

We investigated a bacterial sample preparation procedure for single-cell studies. In the present study, we examined whether single bacterial cells obtained via 10-fold dilution followed a theoretical Poisson distribution. Four serotypes of Salmonella enterica, three serotypes of enterohaemorrhagic Escherichia coli and one serotype of Listeria monocytogenes were used as sample bacteria. An inoculum of each serotype was prepared via a 10-fold dilution series to obtain bacterial cell counts with mean values of one or two. To determine whether the experimentally obtained bacterial cell counts follow a theoretical Poisson distribution, a likelihood ratio test was conducted between the experimentally obtained cell counts and a Poisson distribution whose parameter was estimated by maximum likelihood estimation (MLE). The bacterial cell counts of each serotype sufficiently followed a Poisson distribution. Furthermore, to examine the validity of the parameters of the Poisson distribution from experimentally obtained bacterial cell counts, we compared these with the parameters of a Poisson distribution that were estimated using random number generation via computer simulation. The Poisson distribution parameters experimentally obtained from bacterial cell counts were within the range of the parameters estimated using a computer simulation. These results demonstrate that the bacterial cell counts of each serotype obtained via 10-fold dilution followed a Poisson distribution. The fact that the frequency of bacterial cell counts follows a Poisson distribution at low numbers could be applied to some single-cell studies with a few bacterial cells. In particular, the procedure presented in this study enables us to develop an inactivation model at the single-cell level that can estimate the variability of surviving bacterial numbers during the bacterial death process. Copyright © 2016 Elsevier Ltd. All rights reserved.
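The two statistical steps named in this abstract can be sketched in a few lines. The MLE of a Poisson mean is the sample mean, and a likelihood ratio statistic compares the observed frequency of each count value with the frequency expected under the fitted Poisson distribution. The exact form of the test used by the authors is not given in the abstract, so the statistic G = 2 * sum(O * ln(O / E)) below is one common choice, and the function names and example counts are hypothetical.

```python
import math
from collections import Counter


def poisson_mle(counts):
    """The maximum likelihood estimate of the Poisson mean is the sample mean."""
    return sum(counts) / len(counts)


def poisson_lr_statistic(counts):
    """Return (lambda_hat, G), where G = 2 * sum(O * ln(O / E)).

    O is the observed number of samples with a given cell count and E is the
    number expected under a Poisson distribution with mean lambda_hat.
    Count values that were never observed contribute zero to the sum.
    """
    n = len(counts)
    lam = poisson_mle(counts)
    g = 0.0
    for k, observed in Counter(counts).items():
        expected = n * math.exp(-lam) * lam ** k / math.factorial(k)
        g += 2.0 * observed * math.log(observed / expected)
    return lam, g


if __name__ == "__main__":
    # Made-up cell counts per inoculum, for illustration only.
    cells = [0, 1, 1, 2, 0, 1, 3, 1, 0, 1]
    print(poisson_lr_statistic(cells))
```

In practice G is compared with a chi-squared reference distribution after sparse count classes are pooled; the pooling rule and degrees of freedom are left out of this sketch.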
VARIABLE SELECTION FOR REGRESSION MODELS WITH MISSING DATA

PubMed Central

Garcia, Ramon I.; Ibrahim, Joseph G.; Zhu, Hongtu

2009-01-01

We consider the variable selection problem for a class of statistical models with missing data, including missing covariate and/or response data. We investigate the smoothly clipped absolute deviation penalty (SCAD) and adaptive LASSO and propose a unified model selection and estimation procedure for use in the presence of missing data. We develop a computationally attractive algorithm for simultaneously optimizing the penalized likelihood function and estimating the penalty parameters. Particularly, we propose to use a model selection criterion, called the ICQ statistic, for selecting the penalty parameters. We show that the variable selection procedure based on ICQ automatically and consistently selects the important covariates and leads to efficient estimates with oracle properties. The methodology is very general and can be applied to numerous situations involving missing data, from covariates missing at random in arbitrary regression models to nonignorably missing longitudinal responses and/or covariates. Simulations are given to demonstrate the methodology and examine the finite sample performance of the variable selection procedures. Melanoma data from a cancer clinical trial is presented to illustrate the proposed methodology. PMID:20336190
Improvements in prevalence trend fitting and incidence estimation in EPP 2013

PubMed Central

Brown, Tim; Bao, Le; Eaton, Jeffrey W.; Hogan, Daniel R.; Mahy, Mary; Marsh, Kimberly; Mathers, Bradley M.; Puckett, Robert

2014-01-01

Objective: Describe modifications to the latest version of the Joint United Nations Programme on AIDS (UNAIDS) Estimation and Projection Package component of Spectrum (EPP 2013) to improve prevalence fitting and incidence trend estimation in national epidemics and global estimates of HIV burden. Methods: Key changes made under the guidance of the UNAIDS Reference Group on Estimates, Modelling and Projections include: availability of a range of incidence calculation models and guidance for selecting a model; a shift to reporting the Bayesian median instead of the maximum likelihood estimate; procedures for comparison and validation against reported HIV and AIDS data; incorporation of national surveys as an integral part of the fitting and calibration procedure, allowing survey trends to inform the fit; improved antenatal clinic calibration procedures in countries without surveys; adjustment of national antiretroviral therapy reports used in the fitting to include only those aged 15–49 years; better estimates of mortality among people who inject drugs; and enhancements to speed fitting. Results: The revised models in EPP 2013 allow closer fits to observed prevalence trend data and reflect improving understanding of HIV epidemics and associated data. Conclusion: Spectrum and EPP continue to adapt to make better use of the existing data sources, incorporate new sources of information in their fitting and validation procedures, and correct for quantifiable biases in inputs as they are identified and understood. These adaptations provide countries with better calibrated estimates of incidence and prevalence, which increase epidemic understanding and provide a solid base for program and policy planning. PMID:25406747
Sparse representation and dictionary learning penalized image reconstruction for positron emission tomography.

PubMed

Chen, Shuhang; Liu, Huafeng; Shi, Pengcheng; Chen, Yunmei

2015-01-21

Accurate and robust reconstruction of the radioactivity concentration is of great importance in positron emission tomography (PET) imaging. Given the Poisson nature of photon-counting measurements, we present a reconstruction framework that integrates a sparsity penalty on a dictionary into a maximum likelihood estimator. Patch-sparsity on a dictionary provides the regularization for our effort, and iterative procedures are used to solve the maximum likelihood function formulated on Poisson statistics. Specifically, in our formulation, a dictionary could be trained on CT images, to provide intrinsic anatomical structures for the reconstructed images, or adaptively learned from the noisy measurements of PET. The accuracy of the strategy is demonstrated with very promising results from Monte Carlo simulations and real data.
The effect of prenatal care on birthweight: a full-information maximum likelihood approach.

PubMed

Rous, Jeffrey J; Jewell, R Todd; Brown, Robert W

2004-03-01

This paper uses a full-information maximum likelihood estimation procedure, the Discrete Factor Method, to estimate the relationship between birthweight and prenatal care. This technique controls for the potential biases surrounding both the sample selection of the pregnancy-resolution decision and the endogeneity of prenatal care. In addition, we use the actual number of prenatal care visits; other studies have normally measured prenatal care as the month care is initiated. We estimate a birthweight production function using 1993 data from the US state of Texas. The results underscore the importance of correcting for estimation problems. Specifically, a model that does not control for sample selection and endogeneity overestimates the benefit of an additional visit for women who have relatively few visits. This overestimation may indicate 'positive fetal selection,' i.e., women who did not abort may have healthier babies. Also, a model that does not control for self-selection and endogeneity predicts that past 17 visits, an additional visit leads to lower birthweight, while a model that corrects for these estimation problems predicts a positive effect for additional visits. This result shows the effect of mothers with less healthy fetuses making more prenatal care visits, known as 'adverse selection' in prenatal care. Copyright 2003 John Wiley & Sons, Ltd.
A statistically robust EEG re-referencing procedure to mitigate reference effect

PubMed Central

Lepage, Kyle Q.; Kramer, Mark A.; Chu, Catherine J.

2014-01-01

Background: The electroencephalogram (EEG) remains the primary tool for diagnosis of abnormal brain activity in clinical neurology and for in vivo recordings of human neurophysiology in neuroscience research. In EEG data acquisition, voltage is measured at positions on the scalp with respect to a reference electrode. When this reference electrode responds to electrical activity or artifact, all electrodes are affected. Successful analysis of EEG data often involves re-referencing procedures that modify the recorded traces and seek to minimize the impact of reference electrode activity upon functions of the original EEG recordings. New method: We provide a novel, statistically robust procedure that adapts a robust maximum-likelihood type estimator to the problem of reference estimation, reduces the influence of neural activity from the re-referencing operation, and maintains good performance in a wide variety of empirical scenarios. Results: The performance of the proposed and existing re-referencing procedures is validated in simulation and with examples of EEG recordings. To facilitate this comparison, channel-to-channel correlations are investigated theoretically and in simulation. Comparison with existing methods: The proposed procedure avoids using data contaminated by neural signal and remains unbiased in recording scenarios where physical references, the common average reference (CAR) and the reference estimation standardization technique (REST) are not optimal. Conclusion: The proposed procedure is simple, fast, and avoids the potential for substantial bias when analyzing low-density EEG data. PMID:24975291
Validation of a heteroscedastic hazards regression model.

PubMed

Wu, Hong-Dar Isaac; Hsieh, Fushing; Chen, Chen-Hsin

2002-03-01

A Cox-type regression model accommodating heteroscedasticity, with a power factor of the baseline cumulative hazard, is investigated for analyzing data with crossing hazards behavior. Since the approach of partial likelihood cannot eliminate the baseline hazard, an overidentified estimating equation (OEE) approach is introduced in the estimation procedure. Its by-product, a model checking statistic, is presented to test for the overall adequacy of the heteroscedastic model. Further, under the heteroscedastic model setting, we propose two statistics to test the proportional hazards assumption. Implementation of this model is illustrated in a data analysis of a cancer clinical trial.
Pearson-type goodness-of-fit test with bootstrap maximum likelihood estimation.

PubMed

Yin, Guosheng; Ma, Yanyuan

2013-01-01

The Pearson test statistic is constructed by partitioning the data into bins and computing the difference between the observed and expected counts in these bins. If the maximum likelihood estimator (MLE) of the original data is used, the statistic generally does not follow a chi-squared distribution or any explicit distribution. We propose a bootstrap-based modification of the Pearson test statistic to recover the chi-squared distribution. We compute the observed and expected counts in the partitioned bins by using the MLE obtained from a bootstrap sample. This bootstrap-sample MLE adjusts exactly the right amount of randomness to the test statistic, and recovers the chi-squared distribution. The bootstrap chi-squared test is easy to implement, as it only requires fitting exactly the same model to the bootstrap data to obtain the corresponding MLE, and then constructs the bin counts based on the original data. We examine the test size and power of the new model diagnostic procedure using simulation studies and illustrate it with a real data set.
Calculation of Weibull strength parameters and Batdorf flaw-density constants for volume- and surface-flaw-induced fracture in ceramics

NASA Technical Reports Server (NTRS)

Shantaram, S. Pai; Gyekenyesi, John P.

1989-01-01

The calculation of shape and scale parameters of the two-parameter Weibull distribution is described using the least-squares analysis and maximum likelihood methods for volume- and surface-flaw-induced fracture in ceramics with complete and censored samples. Detailed procedures are given for evaluating 90 percent confidence intervals for maximum likelihood estimates of shape and scale parameters, the unbiased estimates of the shape parameters, and the Weibull mean values and corresponding standard deviations. Furthermore, the necessary steps are described for detecting outliers and for calculating the Kolmogorov-Smirnov and the Anderson-Darling goodness-of-fit statistics and 90 percent confidence bands about the Weibull distribution. It also shows how to calculate the Batdorf flaw-density constants by using the Weibull distribution statistical parameters. The techniques described were verified with several example problems, from the open literature, and were coded in the Structural Ceramics Analysis and Reliability Evaluation (SCARE) design program.
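For a complete (uncensored) sample, the maximum likelihood estimate of the Weibull shape parameter m solves sum(x^m ln x) / sum(x^m) - 1/m - mean(ln x) = 0, and the scale estimate then follows as (sum(x^m) / n)^(1/m). The sketch below solves that equation by bisection. It is not the SCARE implementation: the function name weibull_mle, the search bracket and the synthetic strengths are assumptions, and the censored-sample case mentioned in the abstract is not covered.

```python
import math
import random


def weibull_mle(strengths, lo=0.01, hi=200.0, tol=1e-10):
    """Shape and scale MLEs of a two-parameter Weibull from a complete sample."""
    logs = [math.log(x) for x in strengths]
    n = len(logs)
    mean_log = sum(logs) / n
    max_log = max(logs)

    def weights(m):
        # x**m divided by max(x)**m, so no term can overflow for large m.
        return [math.exp(m * (value - max_log)) for value in logs]

    def score(m):
        # Left-hand side of the likelihood equation for the shape parameter.
        w = weights(m)
        weighted_mean = sum(wi * li for wi, li in zip(w, logs)) / sum(w)
        return weighted_mean - 1.0 / m - mean_log

    if score(lo) > 0.0 or score(hi) < 0.0:
        raise ValueError("shape estimate lies outside the search bracket")
    # score() is increasing in m, so bisection converges to the unique root.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if score(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    shape = 0.5 * (lo + hi)
    # scale = (sum(x**shape) / n) ** (1 / shape), evaluated on the log scale.
    scale = math.exp(max_log + math.log(sum(weights(shape)) / n) / shape)
    return shape, scale


if __name__ == "__main__":
    random.seed(1)
    # Synthetic fracture strengths: scale 400, shape 10 (arbitrary units).
    data = [random.weibullvariate(400.0, 10.0) for _ in range(2000)]
    print(weibull_mle(data))
```

The shape MLE is biased upward in small samples, which is presumably why the report also gives unbiased estimates of the shape parameter; no correction is applied here.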
APPROXIMATION AND ESTIMATION OF s-CONCAVE DENSITIES VIA RÉNYI DIVERGENCES.

PubMed

Han, Qiyang; Wellner, Jon A

2016-01-01

In this paper, we study the approximation and estimation of s-concave densities via Rényi divergence. We first show that the approximation of a probability measure Q by an s-concave density exists and is unique via the procedure of minimizing a divergence functional proposed by [Ann. Statist. 38 (2010) 2998-3027] if and only if Q admits full-dimensional support and a first moment. We also show continuity of the divergence functional in Q: if Qn → Q in the Wasserstein metric, then the projected densities converge in weighted L1 metrics and uniformly on closed subsets of the continuity set of the limit. Moreover, directional derivatives of the projected densities also enjoy local uniform convergence. This contains both on-the-model and off-the-model situations, and entails strong consistency of the divergence estimator of an s-concave density under mild conditions. One interesting and important feature for the Rényi divergence estimator of an s-concave density is that the estimator is intrinsically related with the estimation of log-concave densities via maximum likelihood methods. In fact, we show that for d = 1 at least, the Rényi divergence estimators for s-concave densities converge to the maximum likelihood estimator of a log-concave density as s ↗ 0. The Rényi divergence estimator shares similar characterizations as the MLE for log-concave distributions, which allows us to develop pointwise asymptotic distribution theory assuming that the underlying density is s-concave.
APPROXIMATION AND ESTIMATION OF s-CONCAVE DENSITIES VIA RÉNYI DIVERGENCES

PubMed Central

Han, Qiyang; Wellner, Jon A.

2017-01-01

In this paper, we study the approximation and estimation of s-concave densities via Rényi divergence. We first show that the approximation of a probability measure Q by an s-concave density exists and is unique via the procedure of minimizing a divergence functional proposed by [Ann. Statist. 38 (2010) 2998–3027] if and only if Q admits full-dimensional support and a first moment. We also show continuity of the divergence functional in Q: if Qn → Q in the Wasserstein metric, then the projected densities converge in weighted L1 metrics and uniformly on closed subsets of the continuity set of the limit. Moreover, directional derivatives of the projected densities also enjoy local uniform convergence. This contains both on-the-model and off-the-model situations, and entails strong consistency of the divergence estimator of an s-concave density under mild conditions. One interesting and important feature for the Rényi divergence estimator of an s-concave density is that the estimator is intrinsically related with the estimation of log-concave densities via maximum likelihood methods. In fact, we show that for d = 1 at least, the Rényi divergence estimators for s-concave densities converge to the maximum likelihood estimator of a log-concave density as s ↗ 0. The Rényi divergence estimator shares similar characterizations as the MLE for log-concave distributions, which allows us to develop pointwise asymptotic distribution theory assuming that the underlying density is s-concave. PMID:28966410
Blind estimation of reverberation time

NASA Astrophysics Data System (ADS)

Ratnam, Rama; Jones, Douglas L.; Wheeler, Bruce C.; O'Brien, William D.; Lansing, Charissa R.; Feng, Albert S.

2003-11-01

The reverberation time (RT) is an important parameter for characterizing the quality of an auditory space. Sounds in reverberant environments are subject to coloration. This affects speech intelligibility and sound localization. Many state-of-the-art audio signal processing algorithms, for example in hearing-aids and telephony, are expected to have the ability to characterize the listening environment, and turn on an appropriate processing strategy accordingly. Thus, a method for characterization of room RT based on passively received microphone signals represents an important enabling technology. Current RT estimators, such as Schroeder's method, depend on a controlled sound source, and thus cannot produce an online, blind RT estimate. Here, a method for estimating RT without prior knowledge of sound sources or room geometry is presented. The diffusive tail of reverberation was modeled as an exponentially damped Gaussian white noise process. The time-constant of the decay, which provided a measure of the RT, was estimated using a maximum-likelihood procedure. The estimates were obtained continuously, and an order-statistics filter was used to extract the most likely RT from the accumulated estimates. The procedure was illustrated for connected speech. Results obtained for simulated and real room data are in good agreement with the real RT values.
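The decay model in this abstract can be illustrated with a minimal sketch. It assumes a discrete-time version of the model, y[n] = a^n x[n] with x[n] i.i.d. Gaussian white noise, and replaces whatever optimizer the authors used with a plain grid search over the per-sample decay factor a. The function names, the grid, the sampling rate, and the simulated signal are all illustrative assumptions, not taken from the paper, and the order-statistics filtering step is not shown.

```python
import math
import random

def ml_decay(y, a_grid):
    """Grid-search ML estimate of the per-sample decay factor a.

    Assumed model (a sketch): y[n] = a**n * x[n], x[n] i.i.d. N(0, sigma^2).
    """
    N = len(y)
    sum_n = N * (N - 1) / 2  # sum of n for n = 0..N-1
    best_a, best_ll = None, -math.inf
    for a in a_grid:
        # ML variance for this a: mean of the "undamped" squared samples
        sigma2 = sum((y[n] ** 2) * a ** (-2 * n) for n in range(N)) / N
        # Profile log-likelihood, dropping terms that do not depend on a
        ll = -0.5 * N * math.log(sigma2) - sum_n * math.log(a)
        if ll > best_ll:
            best_a, best_ll = a, ll
    return best_a

def rt60_from_decay(a, fs):
    """Convert a per-sample decay factor to a 60 dB decay time in seconds."""
    tau = -1.0 / (fs * math.log(a))   # time constant of the envelope exp(-t/tau)
    return 3.0 * math.log(10.0) * tau  # amplitude falls by 10**-3 after RT60

# Simulated decay tail (hypothetical values, for illustration only)
random.seed(0)
fs = 16000
a_true = 0.998
y = [a_true ** n * random.gauss(0.0, 1.0) for n in range(2000)]
a_grid = [0.990 + 0.0001 * k for k in range(96)]  # 0.9900 .. 0.9995
a_hat = ml_decay(y, a_grid)
rt60_hat = rt60_from_decay(a_hat, fs)
print(a_hat, rt60_hat)
```

In a running (online) setting, an estimate like this would be recomputed on successive frames, which is where a filter over the accumulated estimates becomes useful.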
Online estimation of room reverberation time

NASA Astrophysics Data System (ADS)

Ratnam, Rama; Jones, Douglas L.; Wheeler, Bruce C.; Feng, Albert S.

2003-04-01

The reverberation time (RT) is an important parameter for characterizing the quality of an auditory space. Sounds in reverberant environments are subject to coloration. This affects speech intelligibility and sound localization. State-of-the-art signal processing algorithms for hearing aids are expected to have the ability to evaluate the characteristics of the listening environment and turn on an appropriate processing strategy accordingly. Thus, a method for the characterization of room RT based on passively received microphone signals represents an important enabling technology. Current RT estimators, such as Schroeder's method or regression, depend on a controlled sound source, and thus cannot produce an online, blind RT estimate. Here, we describe a method for estimating RT without prior knowledge of sound sources or room geometry. The diffusive tail of reverberation was modeled as an exponentially damped Gaussian white noise process. The time constant of the decay, which provided a measure of the RT, was estimated using a maximum-likelihood procedure. The estimates were obtained continuously, and an order-statistics filter was used to extract the most likely RT from the accumulated estimates. The procedure was illustrated for connected speech. Results obtained for simulated and real room data are in good agreement with the real RT values.
Bias correction in the hierarchical likelihood approach to the analysis of multivariate survival data.

PubMed

Jeon, Jihyoun; Hsu, Li; Gorfine, Malka

2012-07-01

Frailty models are useful for measuring unobserved heterogeneity in risk of failures across clusters, providing cluster-specific risk prediction. In a frailty model, the latent frailties shared by members within a cluster are assumed to act multiplicatively on the hazard function. In order to obtain parameter and frailty variate estimates, we consider the hierarchical likelihood (H-likelihood) approach (Ha, Lee and Song, 2001. Hierarchical-likelihood approach for frailty models. Biometrika 88, 233-243) in which the latent frailties are treated as "parameters" and estimated jointly with other parameters of interest. We find that the H-likelihood estimators perform well when the censoring rate is low, however, they are substantially biased when the censoring rate is moderate to high. In this paper, we propose a simple and easy-to-implement bias correction method for the H-likelihood estimators under a shared frailty model. We also extend the method to a multivariate frailty model, which incorporates complex dependence structure within clusters. We conduct an extensive simulation study and show that the proposed approach performs very well for censoring rates as high as 80%. We also illustrate the method with a breast cancer data set. Since the H-likelihood is the same as the penalized likelihood function, the proposed bias correction method is also applicable to the penalized likelihood estimators.
An analysis and demonstration of clock synchronization by VLBI

NASA Technical Reports Server (NTRS)

Hurd, W. J.

1972-01-01

A prototype of a semireal-time system for synchronizing the DSN station clocks by radio interferometry was successfully demonstrated. The system utilized an approximate maximum likelihood estimation procedure for processing the data, thereby achieving essentially optimum time synchronization estimates for a given amount of data, or equivalently, minimizing the amount of data required for reliable estimation. Synchronization accuracies as good as 100 nsec rms were achieved between DSS 11 and DSS 12, both at Goldstone, California. The accuracy can be improved by increasing the system bandwidth until the fundamental limitations due to position uncertainties of baseline and source and atmospheric effects are reached. These limitations are under ten nsec for transcontinental baselines.

New algorithms and methods to estimate maximum-likelihood phylogenies: assessing the performance of PhyML 3.0.

PubMed

Guindon, Stéphane; Dufayard, Jean-François; Lefort, Vincent; Anisimova, Maria; Hordijk, Wim; Gascuel, Olivier

2010-05-01

PhyML is a phylogeny software based on the maximum-likelihood principle. Early PhyML versions used a fast algorithm performing nearest neighbor interchanges to improve a reasonable starting tree topology. Since the original publication (Guindon S., Gascuel O. 2003. A simple, fast and accurate algorithm to estimate large phylogenies by maximum likelihood. Syst. Biol. 52:696-704), PhyML has been widely used (>2500 citations in ISI Web of Science) because of its simplicity and a fair compromise between accuracy and speed. In the meantime, research around PhyML has continued, and this article describes the new algorithms and methods implemented in the program. First, we introduce a new algorithm to search the tree space with user-defined intensity using subtree pruning and regrafting topological moves. The parsimony criterion is used here to filter out the least promising topology modifications with respect to the likelihood function. The analysis of a large collection of real nucleotide and amino acid data sets of various sizes demonstrates the good performance of this method. Second, we describe a new test to assess the support of the data for internal branches of a phylogeny. This approach extends the recently proposed approximate likelihood-ratio test and relies on a nonparametric, Shimodaira-Hasegawa-like procedure. A detailed analysis of real alignments sheds light on the links between this new approach and the more classical nonparametric bootstrap method. Overall, our tests show that the last version (3.0) of PhyML is fast, accurate, stable, and ready to use. A Web server and binary files are available from http://www.atgc-montpellier.fr/phyml/.
Reliability Stress-Strength Models for Dependent Observations with Applications in Clinical Trials

NASA Technical Reports Server (NTRS)

Kushary, Debashis; Kulkarni, Pandurang M.

1995-01-01

We consider the applications of stress-strength models in studies involving clinical trials. When studying the effects and side effects of certain procedures (treatments), it is often the case that observations are correlated due to subject effect, repeated measurements and observing many characteristics simultaneously. We develop maximum likelihood estimator (MLE) and uniform minimum variance unbiased estimator (UMVUE) of the reliability which in clinical trial studies could be considered as the chances of increased side effects due to a particular procedure compared to another. The results developed apply to both univariate and multivariate situations. Also, for the univariate situations we develop simple to use lower confidence bounds for the reliability. Further, we consider the cases when both stress and strength constitute time dependent processes. We define the future reliability and obtain methods of constructing lower confidence bounds for this reliability. Finally, we conduct simulation studies to evaluate all the procedures developed and also to compare the MLE and the UMVUE.
Real-time realizations of the Bayesian Infrasonic Source Localization Method

NASA Astrophysics Data System (ADS)

Pinsky, V.; Arrowsmith, S.; Hofstetter, A.; Nippress, A.

2015-12-01

The Bayesian Infrasonic Source Localization method (BISL), introduced by Mordak et al. (2010) and upgraded by Marcillo et al. (2014), is designed for accurate estimation of the origin of atmospheric events at local, regional and global scales using seismic and infrasonic networks and arrays. BISL is based on probabilistic models of the source-station infrasonic signal propagation time, picking time and azimuth estimate, merged with prior knowledge about the celerity distribution. At each hypothetical source location, it requires integration of the product of the corresponding source-station likelihood functions multiplied by a prior probability density function of celerity over the multivariate parameter space. The present BISL realization is generally a time-consuming procedure based on numerical integration. The proposed computational scheme simplifies the target function so that the integrals are taken exactly and are represented via standard functions. This makes the procedure much faster and realizable in real time without practical loss of accuracy. The procedure, executed as PYTHON-FORTRAN code, demonstrates high performance on a set of model and real data.
Fast automated analysis of strong gravitational lenses with convolutional neural networks.

PubMed

Hezaveh, Yashar D; Levasseur, Laurence Perreault; Marshall, Philip J

2017-08-30

Quantifying image distortions caused by strong gravitational lensing (the formation of multiple images of distant sources due to the deflection of their light by the gravity of intervening structures) and estimating the corresponding matter distribution of these structures (the 'gravitational lens') has primarily been performed using maximum likelihood modelling of observations. This procedure is typically time- and resource-consuming, requiring sophisticated lensing codes, several data preparation steps, and finding the maximum likelihood model parameters in a computationally expensive process with downhill optimizers. Accurate analysis of a single gravitational lens can take up to a few weeks and requires expert knowledge of the physical processes and methods involved. Tens of thousands of new lenses are expected to be discovered with the upcoming generation of ground and space surveys. Here we report the use of deep convolutional neural networks to estimate lensing parameters in an extremely fast and automated way, circumventing the difficulties that are faced by maximum likelihood methods. We also show that the removal of lens light can be made fast and automated using independent component analysis of multi-filter imaging data. Our networks can recover the parameters of the 'singular isothermal ellipsoid' density profile, which is commonly used to model strong lensing systems, with an accuracy comparable to the uncertainties of sophisticated models but about ten million times faster: 100 systems in approximately one second on a single graphics processing unit. These networks can provide a way for non-experts to obtain estimates of lensing parameters for large samples of data.
Variance Difference between Maximum Likelihood Estimation Method and Expected A Posteriori Estimation Method Viewed from Number of Test Items

ERIC Educational Resources Information Center

Mahmud, Jumailiyah; Sutikno, Muzayanah; Naga, Dali S.

2016-01-01

The aim of this study is to determine the variance difference between the maximum likelihood and expected a posteriori estimation methods, viewed from the number of test items of an aptitude test. The variance represents the accuracy achieved by both the maximum likelihood and Bayes estimation methods. The test consists of three subtests, each with 40 multiple-choice…
Finding the breech: Influence of breech presentation on mode of delivery based on timing of diagnosis, attempt at external cephalic version, and provider success with version.

PubMed

Andrews, Suzanne; Leeman, Lawrence; Yonke, Nicole

2017-09-01

Breech presentation affects 3-4% of pregnancies at term and malpresentation is the primary indication for 10-15% of cesarean deliveries. External cephalic version is an effective intervention that can decrease the need for cesarean delivery; however, timely identification of breech presentation is required. We hypothesized that women with a fetus in a breech presentation that is diagnosed after 38 weeks' estimated gestational age have a decreased likelihood of external cephalic version attempted and an increased likelihood of cesarean delivery. This was a retrospective cohort study. A chart review was performed for 251 women with breech presentation at term presenting to our tertiary referral university hospital for external cephalic version, cesarean for breech presentation, or vaginal breech delivery. Vaginal delivery was significantly more likely (31.1% vs 12.5%; P<.01) in women with breech presentation diagnosed before 38 weeks' estimated gestational age as external cephalic version was offered, and subsequently attempted in a greater proportion of women diagnosed before 38 weeks. External cephalic version was more successful when performed by physicians with greater procedural volume during the 3.5 year period of the study (59.1% for providers performing at least 10 procedures vs 31.3% if performing fewer than 10 procedures, P<.01). Results support the need for interventions to increase timely diagnosis of breech presentation as well as improved patient counseling and use of experienced providers for external cephalic version. © 2017 Wiley Periodicals, Inc.
Inventory and mapping of flood inundation using interactive digital image analysis techniques

USGS Publications Warehouse

Rohde, Wayne G.; Nelson, Charles A.; Taranik, J.V.

1979-01-01

LANDSAT digital data and color infra-red photographs were used in a multiphase sampling scheme to estimate the area of agricultural land affected by a flood. The LANDSAT data were classified with a maximum likelihood algorithm. Stratification of the LANDSAT data, prior to classification, greatly reduced misclassification errors. The classification results were used to prepare a map overlay showing the areal extent of flooding. These data also provided statistics required to estimate sample size in a two-phase sampling scheme, and provided quick, accurate estimates of areas flooded for the first phase. The measurements made in the second phase, based on ground data and photo-interpretation, were used with two-phase sampling statistics to estimate the area of agricultural land affected by flooding. These results show that LANDSAT digital data can be used to prepare map overlays showing the extent of flooding on agricultural land and, with two-phase sampling procedures, can provide acreage estimates with sampling errors of about 5 percent. This procedure provides a technique for rapidly assessing the areal extent of flood conditions on agricultural land and would provide a basis for designing a sampling framework to estimate the impact of flooding on crop production.
A hybrid model for combining case-control and cohort studies in systematic reviews of diagnostic tests

PubMed Central

Chen, Yong; Liu, Yulun; Ning, Jing; Cormier, Janice; Chu, Haitao

2014-01-01

Systematic reviews of diagnostic tests often involve a mixture of case-control and cohort studies. The standard methods for evaluating diagnostic accuracy only focus on sensitivity and specificity and ignore the information on disease prevalence contained in cohort studies. Consequently, such methods cannot provide estimates of measures related to disease prevalence, such as population averaged or overall positive and negative predictive values, which reflect the clinical utility of a diagnostic test. In this paper, we propose a hybrid approach that jointly models the disease prevalence along with the diagnostic test sensitivity and specificity in cohort studies, and the sensitivity and specificity in case-control studies. In order to overcome the potential computational difficulties in the standard full likelihood inference of the proposed hybrid model, we propose an alternative inference procedure based on the composite likelihood. Such composite likelihood based inference does not suffer computational problems and maintains high relative efficiency. In addition, it is more robust to model mis-specifications compared to the standard full likelihood inference. We apply our approach to a review of the performance of contemporary diagnostic imaging modalities for detecting metastases in patients with melanoma. PMID:25897179
Diallel analysis for sex-linked and maternal effects.

PubMed

Zhu, J; Weir, B S

1996-01-01

Genetic models including sex-linked and maternal effects as well as autosomal gene effects are described. Monte Carlo simulations were conducted to compare efficiencies of estimation by minimum norm quadratic unbiased estimation (MINQUE) and restricted maximum likelihood (REML) methods. MINQUE(1), which has 1 for all prior values, has a similar efficiency to MINQUE(θ), which requires prior estimates of parameter values. MINQUE(1) has the advantage over REML of unbiased estimation and convenient computation. An adjusted unbiased prediction (AUP) method is developed for predicting random genetic effects. AUP is desirable for its easy computation and unbiasedness of both mean and variance of predictors. The jackknife procedure is appropriate for estimating the sampling variances of estimated variances (or covariances) and of predicted genetic effects. A t-test based on jackknife variances is applicable for detecting significance of variation. Worked examples from mice and silkworm data are given in order to demonstrate variance and covariance estimation and genetic effect prediction.
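The jackknife step mentioned in this abstract can be sketched generically. The block below is a plain delete-one jackknife applied to a sample variance, not the authors' genetic-model code; the function names and the data values are hypothetical and chosen only to make the example runnable.

```python
def jackknife_variance(data, estimator):
    """Delete-one jackknife estimate of the sampling variance of estimator(data)."""
    n = len(data)
    # Leave-one-out replicates of the estimator
    reps = [estimator(data[:i] + data[i + 1:]) for i in range(n)]
    mean_rep = sum(reps) / n
    return (n - 1) / n * sum((r - mean_rep) ** 2 for r in reps)

def sample_mean(xs):
    return sum(xs) / len(xs)

def sample_variance(xs):
    m = sample_mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

# Hypothetical observations, for illustration only
data = [2.1, 3.4, 1.9, 5.0, 4.2, 3.3, 2.8, 4.9]

var_hat = sample_variance(data)
se_var = jackknife_variance(data, sample_variance) ** 0.5
# A t-type statistic for "is the variance different from zero?"
t_stat = var_hat / se_var
print(var_hat, se_var, t_stat)
```

As a sanity check on the formula, applying the same function to the sample mean reproduces the familiar s²/n exactly.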
The recursive maximum likelihood proportion estimator: User's guide and test results

NASA Technical Reports Server (NTRS)

Vanrooy, D. L.

1976-01-01

Implementation of the recursive maximum likelihood proportion estimator is described. A user's guide to programs as they currently exist on the IBM 360/67 at LARS, Purdue is included, and test results on LANDSAT data are described. On Hill County data, the algorithm yields results comparable to the standard maximum likelihood proportion estimator.
Regression analysis of mixed recurrent-event and panel-count data

PubMed Central

Zhu, Liang; Tong, Xinwei; Sun, Jianguo; Chen, Manhua; Srivastava, Deo Kumar; Leisenring, Wendy; Robison, Leslie L.

2014-01-01

In event history studies concerning recurrent events, two types of data have been extensively discussed. One is recurrent-event data (Cook and Lawless, 2007. The Analysis of Recurrent Event Data. New York: Springer), and the other is panel-count data (Zhao and others, 2010. Nonparametric inference based on panel-count data. Test 20, 1–42). In the former case, all study subjects are monitored continuously; thus, complete information is available for the underlying recurrent-event processes of interest. In the latter case, study subjects are monitored periodically; thus, only incomplete information is available for the processes of interest. In reality, however, a third type of data could occur in which some study subjects are monitored continuously, but others are monitored periodically. When this occurs, we have mixed recurrent-event and panel-count data. This paper discusses regression analysis of such mixed data and presents two estimation procedures for the problem. One is a maximum likelihood estimation procedure, and the other is an estimating equation procedure. The asymptotic properties of both resulting estimators of regression parameters are established. Also, the methods are applied to a set of mixed recurrent-event and panel-count data that arose from a Childhood Cancer Survivor Study and motivated this investigation. PMID:24648408
Condition Number Regularized Covariance Estimation*

PubMed Central

Won, Joong-Ho; Lim, Johan; Kim, Seung-Jean; Rajaratnam, Bala

2012-01-01

Estimation of high-dimensional covariance matrices is known to be a difficult problem, has many applications, and is of current interest to the larger statistics community. In many applications, including the so-called “large p small n” setting, the estimate of the covariance matrix is required to be not only invertible, but also well-conditioned. Although many regularization schemes attempt to do this, none of them address the ill-conditioning problem directly. In this paper, we propose a maximum likelihood approach, with the direct goal of obtaining a well-conditioned estimator. No sparsity assumptions on either the covariance matrix or its inverse are imposed, thus making our procedure more widely applicable. We demonstrate that the proposed regularization scheme is computationally efficient, yields a type of Steinian shrinkage estimator, and has a natural Bayesian interpretation. We investigate the theoretical properties of the regularized covariance estimator comprehensively, including its regularization path, and proceed to develop an approach that adaptively determines the level of regularization that is required. Finally, we demonstrate the performance of the regularized estimator in decision-theoretic comparisons and in the financial portfolio optimization setting. The proposed approach has desirable properties, and can serve as a competitive procedure, especially when the sample size is small and when a well-conditioned estimator is required. PMID:23730197
Condition Number Regularized Covariance Estimation.

PubMed

Won, Joong-Ho; Lim, Johan; Kim, Seung-Jean; Rajaratnam, Bala

2013-06-01

Estimation of high-dimensional covariance matrices is known to be a difficult problem, has many applications, and is of current interest to the larger statistics community. In many applications, including the so-called "large p small n" setting, the estimate of the covariance matrix is required to be not only invertible, but also well-conditioned. Although many regularization schemes attempt to do this, none of them address the ill-conditioning problem directly. In this paper, we propose a maximum likelihood approach, with the direct goal of obtaining a well-conditioned estimator. No sparsity assumptions on either the covariance matrix or its inverse are imposed, thus making our procedure more widely applicable. We demonstrate that the proposed regularization scheme is computationally efficient, yields a type of Steinian shrinkage estimator, and has a natural Bayesian interpretation. We investigate the theoretical properties of the regularized covariance estimator comprehensively, including its regularization path, and proceed to develop an approach that adaptively determines the level of regularization that is required. Finally, we demonstrate the performance of the regularized estimator in decision-theoretic comparisons and in the financial portfolio optimization setting. The proposed approach has desirable properties, and can serve as a competitive procedure, especially when the sample size is small and when a well-conditioned estimator is required.
Using CV-GLUE procedure in analysis of wetland model predictive uncertainty.

PubMed

Huang, Chun-Wei; Lin, Yu-Pin; Chiang, Li-Chi; Wang, Yung-Chieh

2014-07-01

This study develops a procedure that is related to Generalized Likelihood Uncertainty Estimation (GLUE), called the CV-GLUE procedure, for assessing the predictive uncertainty that is associated with different model structures with varying degrees of complexity. The proposed procedure comprises model calibration, validation, and predictive uncertainty estimation in terms of a characteristic coefficient of variation (characteristic CV). The procedure first performed two-stage Monte-Carlo simulations to ensure predictive accuracy by obtaining behavior parameter sets, and then the estimation of CV-values of the model outcomes, which represent the predictive uncertainties for a model structure of interest with its associated behavior parameter sets. Three commonly used wetland models (the first-order K-C model, the plug flow with dispersion model, and the Wetland Water Quality Model; WWQM) were compared based on data that were collected from a free water surface constructed wetland with paddy cultivation in Taipei, Taiwan. The results show that the first-order K-C model, which is simpler than the other two models, has greater predictive uncertainty. This finding shows that predictive uncertainty does not necessarily increase with the complexity of the model structure because in this case, the more simplistic representation (first-order K-C model) of reality results in a higher uncertainty in the prediction made by the model. The CV-GLUE procedure is suggested to be a useful tool not only for designing constructed wetlands but also for other aspects of environmental management. Copyright © 2014 Elsevier Ltd. All rights reserved.
Quasi- and pseudo-maximum likelihood estimators for discretely observed continuous-time Markov branching processes

PubMed Central

Chen, Rui; Hyrien, Ollivier

2011-01-01

This article deals with quasi- and pseudo-likelihood estimation in a class of continuous-time multi-type Markov branching processes observed at discrete points in time. “Conventional” and conditional estimation are discussed for both approaches. We compare their properties and identify situations where they lead to asymptotically equivalent estimators. Both approaches possess robustness properties, and coincide with maximum likelihood estimation in some cases. Quasi-likelihood functions involving only linear combinations of the data may be unable to estimate all model parameters. Remedial measures exist, including resorting either to non-linear functions of the data or to conditioning the moments on appropriate sigma-algebras. The method of pseudo-likelihood may also resolve this issue. We investigate the properties of these approaches in three examples: the pure birth process, the linear birth-and-death process, and a two-type process that generalizes the previous two examples. Simulation studies are conducted to evaluate performance in finite samples. PMID:21552356
An algorithm for computing moments-based flood quantile estimates when historical flood information is available

USGS Publications Warehouse

Cohn, T.A.; Lane, W.L.; Baier, W.G.

1997-01-01

This paper presents the expected moments algorithm (EMA), a simple and efficient method for incorporating historical and paleoflood information into flood frequency studies. EMA can utilize three types of at-site flood information: systematic stream gage record; information about the magnitude of historical floods; and knowledge of the number of years in the historical period when no large flood occurred. EMA employs an iterative procedure to compute method-of-moments parameter estimates. Initial parameter estimates are calculated from systematic stream gage data. These moments are then updated by including the measured historical peaks and the expected moments, given the previously estimated parameters, of the below-threshold floods from the historical period. The updated moments result in new parameter estimates, and the last two steps are repeated until the algorithm converges. Monte Carlo simulations compare EMA, Bulletin 17B's [United States Water Resources Council, 1982] historically weighted moments adjustment, and maximum likelihood estimators when fitting the three parameters of the log-Pearson type III distribution. These simulations demonstrate that EMA is more efficient than the Bulletin 17B method, and that it is nearly as efficient as maximum likelihood estimation (MLE). The experiments also suggest that EMA has two advantages over MLE when dealing with the log-Pearson type III distribution: It appears that EMA estimates always exist and that they are unique, although neither result has been proven. EMA can be used with binomial or interval-censored data and with any distributional family amenable to method-of-moments estimation.
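The iteration described in this abstract can be sketched in a deliberately simplified setting. The block below assumes a two-parameter normal distribution rather than the log-Pearson type III, tracks only the first two moments, and omits any bias-correction factors, so it illustrates the idea of "update the moments using expected moments of the below-threshold years, then refit" rather than the published EMA. All names and the simulated record are hypothetical.

```python
import math
import random

def norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def expected_moments_fit(systematic, hist_peaks, n_hist_years, threshold,
                         tol=1e-10, max_iter=500):
    """Iterative method-of-moments fit of a normal distribution (sketch).

    systematic   : gaged values, fully observed
    hist_peaks   : historical values observed because they exceeded threshold
    n_hist_years : length of the historical period in years
    """
    n_below = n_hist_years - len(hist_peaks)  # years known only to be < threshold
    n_total = len(systematic) + n_hist_years
    # Initial estimates from the systematic record alone
    mu = sum(systematic) / len(systematic)
    var = sum((x - mu) ** 2 for x in systematic) / len(systematic)
    sum1 = sum(systematic) + sum(hist_peaks)
    sum2 = sum(x * x for x in systematic) + sum(x * x for x in hist_peaks)
    for it in range(1, max_iter + 1):
        sigma = math.sqrt(var)
        z = (threshold - mu) / sigma
        lam = norm_pdf(z) / norm_cdf(z)
        # Expected first and second moments of X given X < threshold
        e1 = mu - sigma * lam
        e2 = mu * mu + var - sigma * lam * (threshold + mu)
        # Updated sample moments and the resulting parameter estimates
        new_mu = (sum1 + n_below * e1) / n_total
        new_var = (sum2 + n_below * e2) / n_total - new_mu ** 2
        converged = abs(new_mu - mu) < tol and abs(new_var - var) < tol
        mu, var = new_mu, new_var
        if converged:
            return mu, math.sqrt(var), it
    return mu, math.sqrt(var), max_iter

# Simulated record (hypothetical): 50 gaged years plus a 100-year historical period
random.seed(1)
systematic = [random.gauss(5.0, 1.0) for _ in range(50)]
history = [random.gauss(5.0, 1.0) for _ in range(100)]
T = 7.0
peaks = [x for x in history if x > T]  # only exceedances are recorded
mu_hat, sigma_hat, n_iter = expected_moments_fit(systematic, peaks, 100, T)
print(mu_hat, sigma_hat, n_iter)
```

Extending the sketch toward the setting in the abstract would mean working with logarithms of the flows, adding the third moment, and replacing the truncated-normal formulas with the corresponding expectations for the chosen distribution.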
An algorithm for computing moments-based flood quantile estimates when historical flood information is available

NASA Astrophysics Data System (ADS)

Cohn, T. A.; Lane, W. L.; Baier, W. G.

This paper presents the expected moments algorithm (EMA), a simple and efficient method for incorporating historical and paleoflood information into flood frequency studies. EMA can utilize three types of at-site flood information: systematic stream gage record; information about the magnitude of historical floods; and knowledge of the number of years in the historical period when no large flood occurred. EMA employs an iterative procedure to compute method-of-moments parameter estimates. Initial parameter estimates are calculated from systematic stream gage data. These moments are then updated by including the measured historical peaks and the expected moments, given the previously estimated parameters, of the below-threshold floods from the historical period. The updated moments result in new parameter estimates, and the last two steps are repeated until the algorithm converges. Monte Carlo simulations compare EMA, Bulletin 17B's [United States Water Resources Council, 1982] historically weighted moments adjustment, and maximum likelihood estimators when fitting the three parameters of the log-Pearson type III distribution. These simulations demonstrate that EMA is more efficient than the Bulletin 17B method, and that it is nearly as efficient as maximum likelihood estimation (MLE). The experiments also suggest that EMA has two advantages over MLE when dealing with the log-Pearson type III distribution: It appears that EMA estimates always exist and that they are unique, although neither result has been proven. EMA can be used with binomial or interval-censored data and with any distributional family amenable to method-of-moments estimation.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Blaut, Arkadiusz; Babak, Stanislav; Krolak, Andrzej

We present data analysis methods used in the detection and estimation of parameters of gravitational-wave signals from the white dwarf binaries in the mock LISA data challenge. Our main focus is on the analysis of challenge 3.1, where the gravitational-wave signals from more than 6×10^7 Galactic binaries were added to the simulated Gaussian instrumental noise. The majority of the signals at low frequencies are not resolved individually. The confusion between the signals is strongly reduced at frequencies above 5 mHz. Our basic data analysis procedure is the maximum likelihood detection method. We filter the data through the template bank at the first step of the search, then refine the parameters using the Nelder-Mead algorithm, remove the strongest signal found, and repeat the procedure. We reliably detect, and accurately estimate the parameters of, more than ten thousand signals from white dwarf binaries.
Bias Correction for the Maximum Likelihood Estimate of Ability. Research Report. ETS RR-05-15

ERIC Educational Resources Information Center

Zhang, Jinming

2005-01-01

Lord's bias function and the weighted likelihood estimation method are effective in reducing the bias of the maximum likelihood estimate of an examinee's ability under the assumption that the true item parameters are known. This paper presents simulation studies to determine the effectiveness of these two methods in reducing the bias when the item…
Information matrix estimation procedures for cognitive diagnostic models.

PubMed

Liu, Yanlou; Xin, Tao; Andersson, Björn; Tian, Wei

2018-03-06

Two new methods to estimate the asymptotic covariance matrix for marginal maximum likelihood estimation of cognitive diagnosis models (CDMs), the inverse of the observed information matrix and the sandwich-type estimator, are introduced. Unlike several previous covariance matrix estimators, the new methods take into account both the item and structural parameters. The relationships between the observed information matrix, the empirical cross-product information matrix, the sandwich-type covariance matrix and the two approaches proposed by de la Torre (2009, J. Educ. Behav. Stat., 34, 115) are discussed. Simulation results show that, for a correctly specified CDM and Q-matrix or with a slightly misspecified probability model, the observed information matrix and the sandwich-type covariance matrix exhibit good performance with respect to providing consistent standard errors of item parameter estimates. However, with substantial model misspecification only the sandwich-type covariance matrix exhibits robust performance. © 2018 The British Psychological Society.

Fast maximum likelihood estimation of mutation rates using a birth-death process.

PubMed

Wu, Xiaowei; Zhu, Hongxiao

2015-02-07

Since fluctuation analysis was first introduced by Luria and Delbrück in 1943, it has been widely used to make inferences about spontaneous mutation rates in cultured cells. Under certain model assumptions, the probability distribution of the number of mutants that appear in a fluctuation experiment can be derived explicitly, which provides the basis of mutation rate estimation. It has been shown that, among various existing estimators, the maximum likelihood estimator usually demonstrates some desirable properties such as consistency and lower mean squared error. However, its application to real experimental data is often hindered by slow computation of the likelihood due to the recursive form of the mutant-count distribution. We propose a fast maximum likelihood estimator of mutation rates, MLE-BD, based on a birth-death process model with a non-differential growth assumption. Simulation studies demonstrate that, compared with the conventional maximum likelihood estimator derived from the Luria-Delbrück distribution, MLE-BD achieves a substantial improvement in computational speed and is applicable to an arbitrarily large number of mutants. In addition, it still retains good accuracy in point estimation. Published by Elsevier Ltd.
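To make the "recursive form of the mutant-count distribution" concrete, here is a minimal sketch of the conventional estimator that the abstract uses as its baseline, not of MLE-BD itself. It assumes the standard Lea-Coulson form of the Luria-Delbrück distribution, whose probabilities satisfy p_0 = exp(-m) and p_n = (m/n) Σ_{i<n} p_i / (n - i + 1), where m is the expected number of mutations per culture. The function names, the search bounds and the use of a golden-section search are illustrative assumptions.

```python
import math


def ld_pmf(m, n_max):
    """Probabilities of 0..n_max mutants for m expected mutations per culture."""
    p = [math.exp(-m)]
    for n in range(1, n_max + 1):
        # Each new term needs all earlier terms: cost grows as O(n_max^2).
        p.append(m / n * sum(p[i] / (n - i + 1) for i in range(n)))
    return p


def ld_log_likelihood(m, counts):
    """Log-likelihood of observed mutant counts, one count per culture."""
    p = ld_pmf(m, max(counts))
    return sum(math.log(p[c]) for c in counts)


def ld_mle(counts, lo=1e-6, hi=50.0, tol=1e-8):
    """Maximise the log-likelihood over m by golden-section search.

    Assumes the likelihood is unimodal on [lo, hi].
    """
    g = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c, d = b - g * (b - a), a + g * (b - a)
    fc, fd = ld_log_likelihood(c, counts), ld_log_likelihood(d, counts)
    while b - a > tol:
        if fc > fd:
            b, d, fd = d, c, fc
            c = b - g * (b - a)
            fc = ld_log_likelihood(c, counts)
        else:
            a, c, fc = c, d, fd
            d = a + g * (b - a)
            fd = ld_log_likelihood(d, counts)
    return 0.5 * (a + b)
```

Because every likelihood evaluation recomputes the whole recursion up to the largest observed count, the cost rises quadratically with that count, which is the bottleneck the abstract refers to.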
On Muthen's Maximum Likelihood for Two-Level Covariance Structure Models

ERIC Educational Resources Information Center

Yuan, Ke-Hai; Hayashi, Kentaro

2005-01-01

Data in social and behavioral sciences are often hierarchically organized. Special statistical procedures that take into account the dependence of such observations have been developed. Among procedures for 2-level covariance structure analysis, Muthen's maximum likelihood (MUML) has the advantage of easier computation and faster convergence. When…
An alternative method to measure the likelihood of a financial crisis in an emerging market

NASA Astrophysics Data System (ADS)

Özlale, Ümit; Metin-Özcan, Kıvılcım

2007-07-01

This paper utilizes an early warning system in order to measure the likelihood of a financial crisis in an emerging market economy. We introduce a methodology, where we can both obtain a likelihood series and analyze the time-varying effects of several macroeconomic variables on this likelihood. Since the issue is analyzed in a non-linear state space framework, the extended Kalman filter emerges as the optimal estimation algorithm. Taking the Turkish economy as our laboratory, the results indicate that both the derived likelihood measure and the estimated time-varying parameters are meaningful and can successfully explain the path that the Turkish economy followed between 2000 and 2006. The estimated parameters also suggest that an overvalued domestic currency, a current account deficit and an increase in default risk raise the likelihood of an economic crisis. Overall, the findings suggest that the estimation methodology introduced in this paper can be applied to other emerging market economies as well.
F-111C Flight Data Reduction and Analysis Procedures

DTIC Science & Technology

1990-12-01

...Appendix G - A priori Data from Six Degree of Freedom Flight Dynamic Model. The six degree of freedom flight dynamic mathematical model of... [Table and figure residue: channel usage listing (BPHI, BTHE, BPSI, BH, LVEL, LBET, LALP, LPHI, LTHE, LPSI, LH) and Table 2 inputs (Ax, Av); flow-diagram labels: estimated mathematical model response of aircraft, Gauss-Newton computational algorithm, maximum likelihood cost function.]
A Regional Analysis of Non-Methane Hydrocarbons And Meteorology of The Rural Southeast United States

DTIC Science & Technology

1996-01-01

Zt is an ARIMA time series. This is a typical regression model, except that it allows for autocorrelation in the error term Z. In this work, an ARMA...data=folder; var residual; run; II Statistical output of 1992 regression model on 1993 ozone data ARIMA Procedure Maximum Likelihood Estimation Approx...at each of the sites, and to show the effect of synoptic meteorology on high ozone by examining NOAA daily weather maps and climatic data
Maximum likelihood estimation of signal-to-noise ratio and combiner weight

NASA Technical Reports Server (NTRS)

Kalson, S.; Dolinar, S. J.

1986-01-01

An algorithm for estimating signal to noise ratio and combiner weight parameters for a discrete time series is presented. The algorithm is based upon the joint maximum likelihood estimate of the signal and noise power. The discrete-time series are the sufficient statistics obtained after matched filtering of a biphase modulated signal in additive white Gaussian noise, before maximum likelihood decoding is performed.
Comparison of Maximum Likelihood Estimation Approach and Regression Approach in Detecting Quantitative Trait Loci Using RAPD Markers

Treesearch

Changren Weng; Thomas L. Kubisiak; C. Dana Nelson; James P. Geaghan; Michael Stine

1999-01-01

Single marker regression and single marker maximum likelihood estimation were tried to detect quantitative trait loci (QTLs) controlling the early height growth of longleaf pine and slash pine using a ((longleaf pine x slash pine) x slash pine) BC1 population consisting of 83 progeny. Maximum likelihood estimation was found to be more powerful than regression and could...
Estimation of the ARNO model baseflow parameters using daily streamflow data

NASA Astrophysics Data System (ADS)

Abdulla, F. A.; Lettenmaier, D. P.; Liang, Xu

1999-09-01

An approach is described for estimation of baseflow parameters of the ARNO model, using historical baseflow recession sequences extracted from daily streamflow records. This approach allows four of the model parameters to be estimated without rainfall data, and effectively facilitates partitioning of the parameter estimation procedure so that parsimonious search procedures can be used to estimate the remaining storm response parameters separately. Three methods of optimization are evaluated for estimation of four baseflow parameters. These methods are the downhill Simplex (S), Simulated Annealing combined with the Simplex method (SA) and Shuffled Complex Evolution (SCE). These estimation procedures are explored in conjunction with four objective functions: (1) ordinary least squares; (2) ordinary least squares with Box-Cox transformation; (3) ordinary least squares on prewhitened residuals; (4) ordinary least squares on prewhitened residuals with Box-Cox transformation. The effects of changing the random generator seed for both the SA and SCE methods are also explored, as are the effects of the bounds of the parameters. Although all schemes converge to the same values of the objective function, the SCE method was found to be less sensitive to these issues than both the SA and the Simplex schemes. Parameter uncertainty and interactions are investigated through estimation of the variance-covariance matrix and confidence intervals. As expected, the parameters were found to be correlated and the covariance matrix was found to be not diagonal. Furthermore, the linearized confidence interval theory failed for about one-fourth of the catchments while the maximum likelihood theory did not fail for any of the catchments.
Fast automated analysis of strong gravitational lenses with convolutional neural networks

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hezaveh, Yashar D.; Levasseur, Laurence Perreault; Marshall, Philip J.

Quantifying image distortions caused by strong gravitational lensing (the formation of multiple images of distant sources due to the deflection of their light by the gravity of intervening structures) and estimating the corresponding matter distribution of these structures (the ‘gravitational lens’) has primarily been performed using maximum likelihood modelling of observations. This procedure is typically time- and resource-consuming, requiring sophisticated lensing codes, several data preparation steps, and finding the maximum likelihood model parameters in a computationally expensive process with downhill optimizers. Accurate analysis of a single gravitational lens can take up to a few weeks and requires expert knowledge of the physical processes and methods involved. Tens of thousands of new lenses are expected to be discovered with the upcoming generation of ground and space surveys. We report the use of deep convolutional neural networks to estimate lensing parameters in an extremely fast and automated way, circumventing the difficulties that are faced by maximum likelihood methods. We also show that the removal of lens light can be made fast and automated using independent component analysis of multi-filter imaging data. Our networks can recover the parameters of the ‘singular isothermal ellipsoid’ density profile, which is commonly used to model strong lensing systems, with an accuracy comparable to the uncertainties of sophisticated models but about ten million times faster: 100 systems in approximately one second on a single graphics processing unit. These networks can provide a way for non-experts to obtain estimates of lensing parameters for large samples of data.
Fast automated analysis of strong gravitational lenses with convolutional neural networks

DOE PAGES

Hezaveh, Yashar D.; Levasseur, Laurence Perreault; Marshall, Philip J.

2017-08-30

Quantifying image distortions caused by strong gravitational lensing (the formation of multiple images of distant sources due to the deflection of their light by the gravity of intervening structures) and estimating the corresponding matter distribution of these structures (the ‘gravitational lens’) has primarily been performed using maximum likelihood modelling of observations. This procedure is typically time- and resource-consuming, requiring sophisticated lensing codes, several data preparation steps, and finding the maximum likelihood model parameters in a computationally expensive process with downhill optimizers. Accurate analysis of a single gravitational lens can take up to a few weeks and requires expert knowledge of the physical processes and methods involved. Tens of thousands of new lenses are expected to be discovered with the upcoming generation of ground and space surveys. We report the use of deep convolutional neural networks to estimate lensing parameters in an extremely fast and automated way, circumventing the difficulties that are faced by maximum likelihood methods. We also show that the removal of lens light can be made fast and automated using independent component analysis of multi-filter imaging data. Our networks can recover the parameters of the ‘singular isothermal ellipsoid’ density profile, which is commonly used to model strong lensing systems, with an accuracy comparable to the uncertainties of sophisticated models but about ten million times faster: 100 systems in approximately one second on a single graphics processing unit. These networks can provide a way for non-experts to obtain estimates of lensing parameters for large samples of data.
Fast automated analysis of strong gravitational lenses with convolutional neural networks

NASA Astrophysics Data System (ADS)

Hezaveh, Yashar D.; Levasseur, Laurence Perreault; Marshall, Philip J.

2017-08-01

Quantifying image distortions caused by strong gravitational lensing—the formation of multiple images of distant sources due to the deflection of their light by the gravity of intervening structures—and estimating the corresponding matter distribution of these structures (the ‘gravitational lens’) has primarily been performed using maximum likelihood modelling of observations. This procedure is typically time- and resource-consuming, requiring sophisticated lensing codes, several data preparation steps, and finding the maximum likelihood model parameters in a computationally expensive process with downhill optimizers. Accurate analysis of a single gravitational lens can take up to a few weeks and requires expert knowledge of the physical processes and methods involved. Tens of thousands of new lenses are expected to be discovered with the upcoming generation of ground and space surveys. Here we report the use of deep convolutional neural networks to estimate lensing parameters in an extremely fast and automated way, circumventing the difficulties that are faced by maximum likelihood methods. We also show that the removal of lens light can be made fast and automated using independent component analysis of multi-filter imaging data. Our networks can recover the parameters of the ‘singular isothermal ellipsoid’ density profile, which is commonly used to model strong lensing systems, with an accuracy comparable to the uncertainties of sophisticated models but about ten million times faster: 100 systems in approximately one second on a single graphics processing unit. These networks can provide a way for non-experts to obtain estimates of lensing parameters for large samples of data.
Investigating the Impact of Uncertainty about Item Parameters on Ability Estimation

ERIC Educational Resources Information Center

Zhang, Jinming; Xie, Minge; Song, Xiaolan; Lu, Ting

2011-01-01

Asymptotic expansions of the maximum likelihood estimator (MLE) and weighted likelihood estimator (WLE) of an examinee's ability are derived while item parameter estimators are treated as covariates measured with error. The asymptotic formulae present the amount of bias of the ability estimators due to the uncertainty of item parameter estimators.…
Calculation of Weibull strength parameters and Batdorf flaw-density constants for volume- and surface-flaw-induced fracture in ceramics

NASA Technical Reports Server (NTRS)

Pai, Shantaram S.; Gyekenyesi, John P.

1988-01-01

The calculation of shape and scale parameters of the two-parameter Weibull distribution is described using the least-squares analysis and maximum likelihood methods for volume- and surface-flaw-induced fracture in ceramics with complete and censored samples. Detailed procedures are given for evaluating 90 percent confidence intervals for maximum likelihood estimates of shape and scale parameters, the unbiased estimates of the shape parameters, and the Weibull mean values and corresponding standard deviations. Furthermore, the necessary steps are described for detecting outliers and for calculating the Kolmogorov-Smirnov and the Anderson-Darling goodness-of-fit statistics and 90 percent confidence bands about the Weibull distribution. It also shows how to calculate the Batdorf flaw-density constants by using the Weibull distribution statistical parameters. The techniques described were verified with several example problems from the open literature, and were coded in the Structural Ceramics Analysis and Reliability Evaluation (SCARE) design program.
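As an illustration of the maximum likelihood step for the two-parameter Weibull distribution, the sketch below handles the complete-sample case only (no censoring) and is not taken from the SCARE program. For shape k and scale s, the shape estimate solves Σ x^k ln x / Σ x^k - 1/k - mean(ln x) = 0, and the scale then follows in closed form. The function name `weibull_mle`, the bisection bounds and the tolerance are assumptions made for this example.

```python
import math


def weibull_mle(samples, k_lo=1e-3, k_hi=100.0, tol=1e-10):
    """Maximum likelihood shape and scale for a complete (uncensored) sample."""
    logs = [math.log(x) for x in samples]
    mean_log = sum(logs) / len(logs)
    x_max = max(samples)  # used to rescale powers and avoid overflow

    def score(k):
        # Profile likelihood equation in the shape parameter; increasing in k.
        w = [(x / x_max) ** k for x in samples]
        weighted_log = sum(wi * li for wi, li in zip(w, logs)) / sum(w)
        return weighted_log - 1.0 / k - mean_log

    # Bisection on the monotone score function.
    lo, hi = k_lo, k_hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if score(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    shape = 0.5 * (lo + hi)

    # Closed-form scale given the shape: s = (mean of x^k)^(1/k).
    mean_pow = sum((x / x_max) ** shape for x in samples) / len(samples)
    scale = x_max * mean_pow ** (1.0 / shape)
    return shape, scale
```

Bisection is used here because the score is monotone in the shape parameter, so no derivative or starting guess is needed; a Newton iteration would converge faster but needs a reasonable initial value.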
On the use of Bayesian Monte-Carlo in evaluation of nuclear data

NASA Astrophysics Data System (ADS)

De Saint Jean, Cyrille; Archier, Pascal; Privas, Edwin; Noguere, Gilles

2017-09-01

As model parameters, necessary ingredients of theoretical models, are not always predicted by theory, a formal mathematical framework associated with the evaluation work is needed to obtain the best set of parameters (resonance parameters, optical models, fission barrier, average width, multigroup cross sections) with Bayesian statistical inference by comparing theory to experiment. The formal rule related to this methodology is to estimate the posterior probability density function of a set of parameters by solving an equation of the following type: pdf(posterior) ∝ pdf(prior) × likelihood function. A fitting procedure can be seen as an estimation of the posterior probability density of a set of parameters (referred to as the vector x), given prior information on these parameters and a likelihood which gives the probability density function of observing a data set given x. To solve this problem, two major paths could be taken: add approximations and hypotheses and obtain an equation to be solved numerically (minimum of a cost function or Generalized Least Squares method, referred to as GLS), or use Monte Carlo sampling of all prior distributions and estimate the final posterior distribution. Monte Carlo methods are a natural solution for Bayesian inference problems. They avoid approximations (existing in the traditional adjustment procedure based on chi-square minimization) and offer alternatives in the choice of probability density distributions for priors and likelihoods. This paper proposes the use of what we call Bayesian Monte Carlo (referred to as BMC in the rest of the manuscript) over the whole energy range, from thermal and resonance to continuum, for all nuclear reaction models at these energies. Algorithms based on Monte Carlo sampling and Markov chains will be presented.
The objectives of BMC are to propose a reference calculation for validating the GLS calculations and approximations, to test the effects of probability density distributions, and to provide a framework for finding the global minimum if several local minima exist. Applications to resolved resonance, unresolved resonance and continuum evaluation, as well as multigroup cross section data assimilation, will be presented.
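The rule "posterior ∝ prior × likelihood" combined with sampling of the prior can be illustrated with a toy problem. This is a minimal sketch under strong assumptions, not the authors' BMC code: a single parameter, a Gaussian prior, and a trivial "theory" in which the model prediction equals the parameter itself, observed with Gaussian noise. The function name `bmc_posterior_moments` and its arguments are hypothetical.

```python
import math
import random


def bmc_posterior_moments(data, noise_sd, prior_mean, prior_sd,
                          n_samples=50000, seed=0):
    """Posterior mean and variance by weighting prior samples with the likelihood."""
    rng = random.Random(seed)

    # 1. Sample the parameter from its prior distribution.
    thetas = [rng.gauss(prior_mean, prior_sd) for _ in range(n_samples)]

    # 2. Log-likelihood of the data for each sample (toy model: prediction = theta).
    log_w = [-0.5 * sum(((y - t) / noise_sd) ** 2 for y in data) for t in thetas]

    # 3. Normalise the weights in a numerically stable way.
    shift = max(log_w)
    weights = [math.exp(lw - shift) for lw in log_w]
    total = sum(weights)

    # 4. Weighted averages estimate the posterior moments.
    mean = sum(w * t for w, t in zip(weights, thetas)) / total
    var = sum(w * (t - mean) ** 2 for w, t in zip(weights, thetas)) / total
    return mean, var
```

For this conjugate toy case the exact posterior is known in closed form, which makes it a convenient check of the sampler; the weighting scheme itself makes no use of that fact and would work unchanged with a non-Gaussian prior or likelihood, at the cost of needing more samples when prior and posterior overlap poorly.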
Maximum likelihood estimation of finite mixture model for economic data

NASA Astrophysics Data System (ADS)

Phoong, Seuk-Yen; Ismail, Mohd Tahir

2014-06-01

A finite mixture model is a mixture model of finite dimension. Such models provide a natural representation of heterogeneity in a finite number of latent classes. In addition, finite mixture models are also known as latent class models or unsupervised learning models. Recently, fitting finite mixture models by maximum likelihood estimation has drawn considerable attention from statisticians. The main reason is that maximum likelihood estimation is a powerful statistical method which provides consistent findings as the sample size increases to infinity. Thus, maximum likelihood estimation is used to fit a finite mixture model in the present paper in order to explore the relationship between nonlinear economic data. In this paper, a two-component normal mixture model is fitted by maximum likelihood estimation in order to investigate the relationship between stock market price and rubber price for the sampled countries. The results show a negative effect between rubber price and stock market price for Malaysia, Thailand, the Philippines and Indonesia.
A Solution to Separation and Multicollinearity in Multiple Logistic Regression

PubMed Central

Shen, Jianzhao; Gao, Sujuan

2010-01-01

In dementia screening tests, item selection for shortening an existing screening test can be achieved using multiple logistic regression. However, maximum likelihood estimates for such logistic regression models often experience serious bias or even non-existence because of separation and multicollinearity problems resulting from a large number of highly correlated items. Firth (1993, Biometrika, 80(1), 27–38) proposed a penalized likelihood estimator for generalized linear models and it was shown to reduce bias and the non-existence problems. Ridge regression has been used in logistic regression to stabilize the estimates in cases of multicollinearity. However, neither method solves the problem addressed by the other. In this paper, we propose a double penalized maximum likelihood estimator combining Firth’s penalized likelihood equation with a ridge parameter. We present a simulation study evaluating the empirical performance of the double penalized likelihood estimator in small to moderate sample sizes. We demonstrate the proposed approach using current screening data from a community-based dementia study. PMID:20376286
A Solution to Separation and Multicollinearity in Multiple Logistic Regression.

PubMed

Shen, Jianzhao; Gao, Sujuan

2008-10-01

In dementia screening tests, item selection for shortening an existing screening test can be achieved using multiple logistic regression. However, maximum likelihood estimates for such logistic regression models often experience serious bias or even non-existence because of separation and multicollinearity problems resulting from a large number of highly correlated items. Firth (1993, Biometrika, 80(1), 27-38) proposed a penalized likelihood estimator for generalized linear models and it was shown to reduce bias and the non-existence problems. Ridge regression has been used in logistic regression to stabilize the estimates in cases of multicollinearity. However, neither method solves the problem addressed by the other. In this paper, we propose a double penalized maximum likelihood estimator combining Firth's penalized likelihood equation with a ridge parameter. We present a simulation study evaluating the empirical performance of the double penalized likelihood estimator in small to moderate sample sizes. We demonstrate the proposed approach using current screening data from a community-based dementia study.
Maximum Likelihood Estimation with Emphasis on Aircraft Flight Data

NASA Technical Reports Server (NTRS)

Iliff, K. W.; Maine, R. E.

1985-01-01

Accurate modeling of flexible space structures is an important field that is currently under investigation. Parameter estimation, using methods such as maximum likelihood, is one of the ways that the model can be improved. The maximum likelihood estimator has been used to extract stability and control derivatives from flight data for many years. Most of the literature on aircraft estimation concentrates on new developments and applications, assuming familiarity with basic estimation concepts. Some of these basic concepts are presented. The maximum likelihood estimator and the aircraft equations of motion that the estimator uses are briefly discussed. The basic concepts of minimization and estimation are examined for a simple computed aircraft example. The cost functions that are to be minimized during estimation are defined and discussed. Graphic representations of the cost functions are given to help illustrate the minimization process. Finally, the basic concepts are generalized, and estimation from flight data is discussed. Specific examples of estimation of structural dynamics are included. Some of the major conclusions for the computed example are also developed for the analysis of flight data.
Channel Training for Analog FDD Repeaters: Optimal Estimators and Cramér-Rao Bounds

NASA Astrophysics Data System (ADS)

Wesemann, Stefan; Marzetta, Thomas L.

2017-12-01

For frequency division duplex channels, a simple pilot loop-back procedure has been proposed that allows the estimation of the UL & DL channels at an antenna array without relying on any digital signal processing at the terminal side. For this scheme, we derive the maximum likelihood (ML) estimators for the UL & DL channel subspaces, formulate the corresponding Cramér-Rao bounds and show the asymptotic efficiency of both (SVD-based) estimators by means of Monte Carlo simulations. In addition, we illustrate how to compute the underlying (rank-1) SVD with quadratic time complexity by employing the power iteration method. To enable power control for the data transmission, knowledge of the channel gains is needed. Assuming that the UL & DL channels have on average the same gain, we formulate the ML estimator for the channel norm, and illustrate its robustness against strong noise by means of simulations.
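The power iteration mentioned in this abstract can be sketched generically. The example below is an assumption-laden illustration rather than the paper's estimator: it works on a real-valued matrix stored as nested lists (the channel matrices in the paper would be complex), uses a fixed iteration count, and starts from a constant vector, which would fail only if that vector were exactly orthogonal to the dominant right singular vector. The function name `dominant_singular_triplet` is hypothetical.

```python
import math


def dominant_singular_triplet(a, n_iter=200):
    """Largest singular value and vectors of matrix `a` via power iteration.

    Each iteration applies A and then A^T, costing O(rows * cols) operations,
    so the total cost is quadratic in the matrix dimension for a fixed n_iter.
    """
    n_rows, n_cols = len(a), len(a[0])
    v = [1.0 / math.sqrt(n_cols)] * n_cols  # deterministic start vector

    for _ in range(n_iter):
        u = [sum(a[i][j] * v[j] for j in range(n_cols)) for i in range(n_rows)]  # A v
        w = [sum(a[i][j] * u[i] for i in range(n_rows)) for j in range(n_cols)]  # A^T (A v)
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # start vector lies in the null space of A
        v = [x / norm for x in w]

    u = [sum(a[i][j] * v[j] for j in range(n_cols)) for i in range(n_rows)]
    sigma = math.sqrt(sum(x * x for x in u))
    if sigma > 0.0:
        u = [x / sigma for x in u]
    return sigma, u, v
```

The product `sigma * u * v^T` is then the best rank-1 approximation of the matrix, which is all a rank-1 SVD needs; a full decomposition would cost cubic time.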
Estimating the exceedance probability of rain rate by logistic regression

NASA Technical Reports Server (NTRS)

Chiu, Long S.; Kedem, Benjamin

1990-01-01

Recent studies have shown that the fraction of an area with rain intensity above a fixed threshold is highly correlated with the area-averaged rain rate. To estimate the fractional rainy area, a logistic regression model, which estimates the conditional probability that rain rate over an area exceeds a fixed threshold given the values of related covariates, is developed. The problem of dependency in the data in the estimation procedure is bypassed by the method of partial likelihood. Analyses of simulated scanning multichannel microwave radiometer and observed electrically scanning microwave radiometer data during the Global Atlantic Tropical Experiment period show that the use of logistic regression in pixel classification is superior to multiple regression in predicting whether rain rate at each pixel exceeds a given threshold, even in the presence of noisy data. The potential of the logistic regression technique in satellite rain rate estimation is discussed.

Comparison of IRT Likelihood Ratio Test and Logistic Regression DIF Detection Procedures

ERIC Educational Resources Information Center

Atar, Burcu; Kamata, Akihito

2011-01-01

The Type I error rates and the power of IRT likelihood ratio test and cumulative logit ordinal logistic regression procedures in detecting differential item functioning (DIF) for polytomously scored items were investigated in this Monte Carlo simulation study. For this purpose, 54 simulation conditions (combinations of 3 sample sizes, 2 sample…
Identification of treatment responders based on multiple longitudinal outcomes with applications to multiple sclerosis patients.

PubMed

Kondo, Yumi; Zhao, Yinshan; Petkau, John

2017-05-30

Identification of treatment responders is a challenge in comparative studies where treatment efficacy is measured by multiple longitudinally collected continuous and count outcomes. Existing procedures often identify responders on the basis of only a single outcome. We propose a novel multiple longitudinal outcome mixture model that assumes that, conditionally on a cluster label, each longitudinal outcome is from a generalized linear mixed effect model. We utilize a Monte Carlo expectation-maximization algorithm to obtain the maximum likelihood estimates of our high-dimensional model and classify patients according to their estimated posterior probability of being a responder. We demonstrate the flexibility of our novel procedure on two multiple sclerosis clinical trial datasets with distinct data structures. Our simulation study shows that incorporating multiple outcomes improves the responder identification performance; this can occur even if some of the outcomes are ineffective. Our general procedure facilitates the identification of responders who are comprehensively defined by multiple outcomes from various distributions. Copyright © 2017 John Wiley & Sons, Ltd.
On the Relationships between Jeffreys Modal and Weighted Likelihood Estimation of Ability under Logistic IRT Models

ERIC Educational Resources Information Center

Magis, David; Raiche, Gilles

2012-01-01

This paper focuses on two estimators of ability with logistic item response theory models: the Bayesian modal (BM) estimator and the weighted likelihood (WL) estimator. For the BM estimator, Jeffreys' prior distribution is considered, and the corresponding estimator is referred to as the Jeffreys modal (JM) estimator. It is established that under…
Estimation Methods for Non-Homogeneous Regression - Minimum CRPS vs Maximum Likelihood

NASA Astrophysics Data System (ADS)

Gebetsberger, Manuel; Messner, Jakob W.; Mayr, Georg J.; Zeileis, Achim

2017-04-01

Non-homogeneous regression models are widely used to statistically post-process numerical weather prediction models. Such regression models correct for errors in mean and variance and are capable of forecasting a full probability distribution. In order to estimate the corresponding regression coefficients, CRPS minimization has been performed in many meteorological post-processing studies over the last decade. In contrast to maximum likelihood estimation, CRPS minimization is claimed to yield more calibrated forecasts. Theoretically, both scoring rules used as an optimization score should be able to locate a similar and unknown optimum. Discrepancies might result from a wrong distributional assumption of the observed quantity. To address this theoretical concept, this study compares maximum likelihood and minimum CRPS estimation for different distributional assumptions. First, a synthetic case study shows that, for an appropriate distributional assumption, both estimation methods yield similar regression coefficients. The log-likelihood estimator is slightly more efficient. A real-world case study for surface temperature forecasts at different sites in Europe confirms these results but shows that surface temperature does not always follow the classical assumption of a Gaussian distribution. KEYWORDS: ensemble post-processing, maximum likelihood estimation, CRPS minimization, probabilistic temperature forecasting, distributional regression models
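The claim that both scores locate a similar optimum under a correct distributional assumption can be checked on synthetic data. The sketch below is a deliberately reduced version of the comparison, not the study's setup: it fits only a constant Gaussian mean and standard deviation (no regression on ensemble mean or spread), uses the closed-form CRPS of a normal distribution, CRPS = σ [ z (2Φ(z) - 1) + 2φ(z) - 1/√π ] with z = (y - μ)/σ, and minimises it with a simple compass search. All function names and the optimiser choice are assumptions.

```python
import math
import random


def crps_normal(mu, sigma, y):
    """Closed-form CRPS of a N(mu, sigma^2) forecast for observation y."""
    z = (y - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return sigma * (z * (2.0 * cdf - 1.0) + 2.0 * pdf - 1.0 / math.sqrt(math.pi))


def simulate_gaussian(n, mu, sd, seed=0):
    """Synthetic observations drawn from the assumed (correct) distribution."""
    rng = random.Random(seed)
    return [rng.gauss(mu, sd) for _ in range(n)]


def fit_ml(ys):
    """Gaussian maximum likelihood: sample mean and (1/n) standard deviation."""
    mu = sum(ys) / len(ys)
    sd = math.sqrt(sum((y - mu) ** 2 for y in ys) / len(ys))
    return mu, sd


def fit_min_crps(ys, mu0=0.0, log_sd0=0.0, step=1.0, tol=1e-6):
    """Minimise the mean CRPS over (mu, log sd) with a compass search."""
    def objective(mu, log_sd):
        sd = math.exp(log_sd)
        return sum(crps_normal(mu, sd, y) for y in ys) / len(ys)

    x = [mu0, log_sd0]
    best = objective(*x)
    while step > tol:
        improved = False
        for i in (0, 1):
            for delta in (step, -step):
                trial = list(x)
                trial[i] += delta
                value = objective(*trial)
                if value < best:
                    x, best, improved = trial, value, True
        if not improved:
            step *= 0.5  # shrink the search pattern once no move helps
    return x[0], math.exp(x[1])
```

Running both fits on `simulate_gaussian(1000, 5.0, 2.0)` gives two parameter pairs that should agree closely; repeating the experiment with data from a skewed or heavy-tailed distribution is one way to explore the discrepancies the abstract attributes to a wrong distributional assumption.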
Multiple-hit parameter estimation in monolithic detectors.

PubMed

Hunter, William C J; Barrett, Harrison H; Lewellen, Tom K; Miyaoka, Robert S

2013-02-01

We examine a maximum-a-posteriori method for estimating the primary interaction position of gamma rays with multiple interaction sites (hits) in a monolithic detector. In assessing the performance of a multiple-hit estimator over that of a conventional one-hit estimator, we consider a few different detector and readout configurations of a 50-mm-wide square cerium-doped lutetium oxyorthosilicate block. For this study, we use simulated data from SCOUT, a Monte-Carlo tool for photon tracking and modeling scintillation-camera output. With this tool, we determine estimate bias and variance for a multiple-hit estimator and compare these with similar metrics for a one-hit maximum-likelihood estimator, which assumes full energy deposition in one hit. We also examine the effect of event filtering on these metrics; for this purpose, we use a likelihood threshold to reject signals that are not likely to have been produced under the assumed likelihood model. Depending on detector design, we observe a 1%-12% improvement of intrinsic resolution for a 1-or-2-hit estimator as compared with a 1-hit estimator. We also observe improved differentiation of photopeak events using a 1-or-2-hit estimator as compared with the 1-hit estimator; more than 6% of photopeak events that were rejected by likelihood filtering for the 1-hit estimator were accurately identified as photopeak events and positioned without loss of resolution by a 1-or-2-hit estimator; for PET, this equates to at least a 12% improvement in coincidence-detection efficiency with likelihood filtering applied.
An evaluation of percentile and maximum likelihood estimators of Weibull parameters

Treesearch

Stanley J. Zarnoch; Tommy R. Dell

1985-01-01

Two methods of estimating the three-parameter Weibull distribution were evaluated by computer simulation and field data comparison. Maximum likelihood estimators (MLB) with bias correction were calculated with the computer routine FITTER (Bailey 1974); percentile estimators (PCT) were those proposed by Zanakis (1979). The MLB estimators had superior smaller bias and...
The Equivalence of Two Methods of Parameter Estimation for the Rasch Model.

ERIC Educational Resources Information Center

Blackwood, Larry G.; Bradley, Edwin L.

1989-01-01

Two methods of estimating parameters in the Rasch model are compared. The equivalence of likelihood estimations from the model of G. J. Mellenbergh and P. Vijn (1981) and from usual unconditional maximum likelihood (UML) estimation is demonstrated. Mellenbergh and Vijn's model is a convenient method of calculating UML estimates. (SLD)
SMURC: High-Dimension Small-Sample Multivariate Regression With Covariance Estimation.

PubMed

Bayar, Belhassen; Bouaynaya, Nidhal; Shterenberg, Roman

2017-03-01

We consider a high-dimension low sample-size multivariate regression problem that accounts for correlation of the response variables. The system is underdetermined as there are more parameters than samples. We show that the maximum likelihood approach with covariance estimation is senseless because the likelihood diverges. We subsequently propose a normalization of the likelihood function that guarantees convergence. We call this method small-sample multivariate regression with covariance (SMURC) estimation. We derive an optimization problem and its convex approximation to compute SMURC. Simulation results show that the proposed algorithm outperforms the regularized likelihood estimator with known covariance matrix and the sparse conditional Gaussian graphical model. We also apply SMURC to the inference of the wing-muscle gene network of the Drosophila melanogaster (fruit fly).
Planck intermediate results: XLVI. Reduction of large-scale systematic effects in HFI polarization maps and estimation of the reionization optical depth

DOE PAGES

Aghanim, N.; Ashdown, M.; Aumont, J.; ...

2016-12-12

This study describes the identification, modelling, and removal of previously unexplained systematic effects in the polarization data of the Planck High Frequency Instrument (HFI) on large angular scales, including new mapmaking and calibration procedures, new and more complete end-to-end simulations, and a set of robust internal consistency checks on the resulting maps. These maps, at 100, 143, 217, and 353 GHz, are early versions of those that will be released in final form later in 2016. The improvements allow us to determine the cosmic reionization optical depth τ using, for the first time, the low-multipole EE data from HFI, reducing significantly the central value and uncertainty, and hence the upper limit. Two different likelihood procedures are used to constrain τ from two estimators of the CMB E- and B-mode angular power spectra at 100 and 143 GHz, after debiasing the spectra from a small remaining systematic contamination. These all give fully consistent results. A further consistency test is performed using cross-correlations derived from the Low Frequency Instrument maps of the Planck 2015 data release and the new HFI data. For this purpose, end-to-end analyses of systematic effects from the two instruments are used to demonstrate the near independence of their dominant systematic error residuals. The tightest result comes from the HFI-based τ posterior distribution using the maximum likelihood power spectrum estimator from EE data only, giving a value 0.055 ± 0.009. Finally, in a companion paper these results are discussed in the context of the best-fit Planck ΛCDM cosmological model and recent models of reionization.
Planck intermediate results: XLVI. Reduction of large-scale systematic effects in HFI polarization maps and estimation of the reionization optical depth

DOE Office of Scientific and Technical Information (OSTI.GOV)

Aghanim, N.; Ashdown, M.; Aumont, J.

This study describes the identification, modelling, and removal of previously unexplained systematic effects in the polarization data of the Planck High Frequency Instrument (HFI) on large angular scales, including new mapmaking and calibration procedures, new and more complete end-to-end simulations, and a set of robust internal consistency checks on the resulting maps. These maps, at 100, 143, 217, and 353 GHz, are early versions of those that will be released in final form later in 2016. The improvements allow us to determine the cosmic reionization optical depth τ using, for the first time, the low-multipole EE data from HFI, reducing significantly the central value and uncertainty, and hence the upper limit. Two different likelihood procedures are used to constrain τ from two estimators of the CMB E- and B-mode angular power spectra at 100 and 143 GHz, after debiasing the spectra from a small remaining systematic contamination. These all give fully consistent results. A further consistency test is performed using cross-correlations derived from the Low Frequency Instrument maps of the Planck 2015 data release and the new HFI data. For this purpose, end-to-end analyses of systematic effects from the two instruments are used to demonstrate the near independence of their dominant systematic error residuals. The tightest result comes from the HFI-based τ posterior distribution using the maximum likelihood power spectrum estimator from EE data only, giving a value 0.055 ± 0.009. Finally, in a companion paper these results are discussed in the context of the best-fit Planck ΛCDM cosmological model and recent models of reionization.
Planck intermediate results. XLVI. Reduction of large-scale systematic effects in HFI polarization maps and estimation of the reionization optical depth

NASA Astrophysics Data System (ADS)

Planck Collaboration; Aghanim, N.; Ashdown, M.; Aumont, J.; Baccigalupi, C.; Ballardini, M.; Banday, A. J.; Barreiro, R. B.; Bartolo, N.; Basak, S.; Battye, R.; Benabed, K.; Bernard, J.-P.; Bersanelli, M.; Bielewicz, P.; Bock, J. J.; Bonaldi, A.; Bonavera, L.; Bond, J. R.; Borrill, J.; Bouchet, F. R.; Boulanger, F.; Bucher, M.; Burigana, C.; Butler, R. C.; Calabrese, E.; Cardoso, J.-F.; Carron, J.; Challinor, A.; Chiang, H. C.; Colombo, L. P. L.; Combet, C.; Comis, B.; Coulais, A.; Crill, B. P.; Curto, A.; Cuttaia, F.; Davis, R. J.; de Bernardis, P.; de Rosa, A.; de Zotti, G.; Delabrouille, J.; Delouis, J.-M.; Di Valentino, E.; Dickinson, C.; Diego, J. M.; Doré, O.; Douspis, M.; Ducout, A.; Dupac, X.; Efstathiou, G.; Elsner, F.; Enßlin, T. A.; Eriksen, H. K.; Falgarone, E.; Fantaye, Y.; Finelli, F.; Forastieri, F.; Frailis, M.; Fraisse, A. A.; Franceschi, E.; Frolov, A.; Galeotta, S.; Galli, S.; Ganga, K.; Génova-Santos, R. T.; Gerbino, M.; Ghosh, T.; González-Nuevo, J.; Górski, K. M.; Gratton, S.; Gruppuso, A.; Gudmundsson, J. E.; Hansen, F. K.; Helou, G.; Henrot-Versillé, S.; Herranz, D.; Hivon, E.; Huang, Z.; Ilić, S.; Jaffe, A. H.; Jones, W. C.; Keihänen, E.; Keskitalo, R.; Kisner, T. S.; Knox, L.; Krachmalnicoff, N.; Kunz, M.; Kurki-Suonio, H.; Lagache, G.; Lamarre, J.-M.; Langer, M.; Lasenby, A.; Lattanzi, M.; Lawrence, C. R.; Le Jeune, M.; Leahy, J. P.; Levrier, F.; Liguori, M.; Lilje, P. B.; López-Caniego, M.; Ma, Y.-Z.; Macías-Pérez, J. F.; Maggio, G.; Mangilli, A.; Maris, M.; Martin, P. G.; Martínez-González, E.; Matarrese, S.; Mauri, N.; McEwen, J. D.; Meinhold, P. R.; Melchiorri, A.; Mennella, A.; Migliaccio, M.; Miville-Deschênes, M.-A.; Molinari, D.; Moneti, A.; Montier, L.; Morgante, G.; Moss, A.; Mottet, S.; Naselsky, P.; Natoli, P.; Oxborrow, C. A.; Pagano, L.; Paoletti, D.; Partridge, B.; Patanchon, G.; Patrizii, L.; Perdereau, O.; Perotto, L.; Pettorino, V.; Piacentini, F.; Plaszczynski, S.; Polastri, L.; Polenta, G.; Puget, J.-L.; Rachen, J. P.; Racine, B.; Reinecke, M.; Remazeilles, M.; Renzi, A.; Rocha, G.; Rossetti, M.; Roudier, G.; Rubiño-Martín, J. A.; Ruiz-Granados, B.; Salvati, L.; Sandri, M.; Savelainen, M.; Scott, D.; Sirri, G.; Sunyaev, R.; Suur-Uski, A.-S.; Tauber, J. A.; Tenti, M.; Toffolatti, L.; Tomasi, M.; Tristram, M.; Trombetti, T.; Valiviita, J.; Van Tent, F.; Vibert, L.; Vielva, P.; Villa, F.; Vittorio, N.; Wandelt, B. D.; Watson, R.; Wehus, I. K.; White, M.; Zacchei, A.; Zonca, A.

2016-12-01

This paper describes the identification, modelling, and removal of previously unexplained systematic effects in the polarization data of the Planck High Frequency Instrument (HFI) on large angular scales, including new mapmaking and calibration procedures, new and more complete end-to-end simulations, and a set of robust internal consistency checks on the resulting maps. These maps, at 100, 143, 217, and 353 GHz, are early versions of those that will be released in final form later in 2016. The improvements allow us to determine the cosmic reionization optical depth τ using, for the first time, the low-multipole EE data from HFI, reducing significantly the central value and uncertainty, and hence the upper limit. Two different likelihood procedures are used to constrain τ from two estimators of the CMB E- and B-mode angular power spectra at 100 and 143 GHz, after debiasing the spectra from a small remaining systematic contamination. These all give fully consistent results. A further consistency test is performed using cross-correlations derived from the Low Frequency Instrument maps of the Planck 2015 data release and the new HFI data. For this purpose, end-to-end analyses of systematic effects from the two instruments are used to demonstrate the near independence of their dominant systematic error residuals. The tightest result comes from the HFI-based τ posterior distribution using the maximum likelihood power spectrum estimator from EE data only, giving a value 0.055 ± 0.009. In a companion paper these results are discussed in the context of the best-fit Planck ΛCDM cosmological model and recent models of reionization.
A class of Box-Cox transformation models for recurrent event data.

PubMed

Sun, Liuquan; Tong, Xingwei; Zhou, Xian

2011-04-01

In this article, we propose a class of Box-Cox transformation models for recurrent event data, which includes the proportional means models as special cases. The new model offers great flexibility in formulating the effects of covariates on the mean functions of counting processes while leaving the stochastic structure completely unspecified. For inference on the proposed models, we apply a profile pseudo-partial likelihood method to estimate the model parameters via estimating equation approaches. We establish large sample properties of the estimators and examine their performance in moderate-sized samples through simulation studies. In addition, some graphical and numerical procedures are presented for model checking. An application to a set of multiple-infection data taken from a clinical study on chronic granulomatous disease (CGD) is also illustrated.
An analysis and demonstration of clock synchronization by VLBI. [Very Long Baseline Interferometry for Deep Space Net

NASA Technical Reports Server (NTRS)

Hurd, W. J.

1974-01-01

A prototype of a semi-real time system for synchronizing the Deep Space Net station clocks by radio interferometry was successfully demonstrated on August 30, 1972. The system utilized an approximate maximum likelihood estimation procedure for processing the data, thereby achieving essentially optimum time sync estimates for a given amount of data, or equivalently, minimizing the amount of data required for reliable estimation. Synchronization accuracies as good as 100 ns rms were achieved between Deep Space Stations 11 and 12, both at Goldstone, Calif. The accuracy can be improved by increasing the system bandwidth until the fundamental limitations due to baseline and source position uncertainties and atmospheric effects are reached. These limitations are under 10 ns for transcontinental baselines.
Cleanroom certification model

NASA Technical Reports Server (NTRS)

Currit, P. A.

1983-01-01

The Cleanroom software development methodology is designed to take the gamble out of product releases for both suppliers and receivers of the software. The ingredients of this procedure are a life cycle of executable product increments, representative statistical testing, and a standard estimate of the MTTF (Mean Time To Failure) of the product at the time of its release. A statistical approach to software product testing using randomly selected samples of test cases is considered. A statistical model is defined for the certification process which uses the timing data recorded during test. A reasonableness argument for this model is provided that uses previously published data on software product execution. Also included is a derivation of the certification model estimators and a comparison of the proposed least squares technique with the more commonly used maximum likelihood estimators.
Maximum likelihood solution for inclination-only data in paleomagnetism

NASA Astrophysics Data System (ADS)

Arason, P.; Levi, S.

2010-08-01

We have developed a new robust maximum likelihood method for estimating the unbiased mean inclination from inclination-only data. In paleomagnetic analysis, the arithmetic mean of inclination-only data is known to introduce a shallowing bias. Several methods have been introduced to estimate the unbiased mean inclination of inclination-only data together with measures of the dispersion. Some inclination-only methods were designed to maximize the likelihood function of the marginal Fisher distribution. However, the exact analytical form of the maximum likelihood function is fairly complicated, and all the methods require various assumptions and approximations that are often inappropriate. For some steep and dispersed data sets, these methods provide estimates that are significantly displaced from the peak of the likelihood function to systematically shallower inclination. The problem of locating the maximum of the likelihood function is partly due to difficulties in accurately evaluating the function for all values of interest, because some elements of the likelihood function increase exponentially as precision parameters increase, leading to numerical instabilities. In this study, we succeeded in analytically cancelling exponential elements from the log-likelihood function, and we are now able to calculate its value anywhere in the parameter space and for any inclination-only data set. Furthermore, we can now calculate the partial derivatives of the log-likelihood function with desired accuracy, and locate the maximum likelihood without the assumptions required by previous methods. To assess the reliability and accuracy of our method, we generated large numbers of random Fisher-distributed data sets, for which we calculated mean inclinations and precision parameters. The comparisons show that our new robust Arason-Levi maximum likelihood method is the most reliable, and the mean inclination estimates are the least biased towards shallow values.
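The numerical difficulty mentioned in this abstract (terms that grow exponentially with the precision parameter) can be illustrated on the full Fisher density on the sphere, f(θ) = κ / (4π sinh κ) · exp(κ cos θ). This is only a sketch of the general idea of cancelling exponentials analytically; it is not the marginal inclination-only likelihood and not the authors' derivation. Writing log sinh κ = κ + log(1 − e^(−2κ)) − log 2 removes every growing exponential from the log-density.

```python
import math


def log_fisher_density_naive(kappa, cos_theta):
    """Direct evaluation; math.sinh overflows for large kappa."""
    return math.log(kappa / (4.0 * math.pi * math.sinh(kappa))) + kappa * cos_theta


def log_fisher_density_stable(kappa, cos_theta):
    """Same quantity with the exponential growth cancelled analytically.

    log f = log(kappa) - log(2*pi) - kappa*(1 - cos_theta) - log(1 - exp(-2*kappa))
    """
    return (math.log(kappa) - math.log(2.0 * math.pi)
            - kappa * (1.0 - cos_theta)
            - math.log1p(-math.exp(-2.0 * kappa)))


# Moderate precision: both forms agree.
print(log_fisher_density_naive(2.0, 0.3), log_fisher_density_stable(2.0, 0.3))

# Very high precision: only the stable form can be evaluated.
print(log_fisher_density_stable(1000.0, 0.9))
```

The stable form stays finite for any κ > 0, which is what makes gradient-based or grid-based maximization over the whole parameter space feasible.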
Regression analysis of mixed recurrent-event and panel-count data.

PubMed

Zhu, Liang; Tong, Xinwei; Sun, Jianguo; Chen, Manhua; Srivastava, Deo Kumar; Leisenring, Wendy; Robison, Leslie L

2014-07-01

In event history studies concerning recurrent events, two types of data have been extensively discussed. One is recurrent-event data (Cook and Lawless, 2007. The Analysis of Recurrent Event Data. New York: Springer), and the other is panel-count data (Zhao and others, 2010. Nonparametric inference based on panel-count data. Test 20, 1-42). In the former case, all study subjects are monitored continuously; thus, complete information is available for the underlying recurrent-event processes of interest. In the latter case, study subjects are monitored periodically; thus, only incomplete information is available for the processes of interest. In reality, however, a third type of data could occur in which some study subjects are monitored continuously, but others are monitored periodically. When this occurs, we have mixed recurrent-event and panel-count data. This paper discusses regression analysis of such mixed data and presents two estimation procedures for the problem. One is a maximum likelihood estimation procedure, and the other is an estimating equation procedure. The asymptotic properties of both resulting estimators of regression parameters are established. Also, the methods are applied to a set of mixed recurrent-event and panel-count data that arose from a Childhood Cancer Survivor Study and motivated this investigation. © The Author 2014. Published by Oxford University Press. All rights reserved.
Estimating parameter of Rayleigh distribution by using Maximum Likelihood method and Bayes method

NASA Astrophysics Data System (ADS)

Ardianti, Fitri; Sutarman

2018-01-01

In this paper, we use maximum likelihood estimation and the Bayes method under several loss functions to estimate the parameter of the Rayleigh distribution and to determine which method performs best. The prior used in the Bayes method is Jeffreys' non-informative prior. Maximum likelihood estimation is compared with the Bayes method under the precautionary loss function, the entropy loss function, and the L1 loss function. We compare these methods by bias and MSE, computed using R. The results are displayed in tables to facilitate the comparisons.
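A minimal sketch of the maximum likelihood side of such a comparison, assuming the common parameterization f(x; σ) = (x/σ²) exp(−x²/(2σ²)), for which the ML estimate of σ² is Σx²/(2n). The abstract does not state the parameterization or the simulation settings, so the values below are hypothetical, and the Bayes estimators under the three loss functions are not reproduced here. The study itself used R; Python is used only for illustration.

```python
import math
import random


def rayleigh_sample(sigma, n, rng):
    """Draw n Rayleigh(sigma) variates by inverse-transform sampling."""
    return [sigma * math.sqrt(-2.0 * math.log(1.0 - rng.random())) for _ in range(n)]


def rayleigh_mle_sigma2(xs):
    """Maximum likelihood estimate of sigma^2: sum(x^2) / (2n)."""
    return sum(x * x for x in xs) / (2.0 * len(xs))


def bias_and_mse(sigma, n, replications, seed=0):
    """Monte Carlo bias and mean squared error of the ML estimate of sigma^2."""
    rng = random.Random(seed)
    truth = sigma ** 2
    estimates = [rayleigh_mle_sigma2(rayleigh_sample(sigma, n, rng))
                 for _ in range(replications)]
    bias = sum(estimates) / replications - truth
    mse = sum((e - truth) ** 2 for e in estimates) / replications
    return bias, mse


bias, mse = bias_and_mse(sigma=2.0, n=30, replications=2000)
print("bias:", bias, "MSE:", mse)
```

Under this parameterization the ML estimator of σ² is unbiased with variance σ⁴/n, so the simulated bias should be near zero and the MSE near 16/30 for the settings above.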
Regression analysis of case K interval-censored failure time data in the presence of informative censoring.

PubMed

Wang, Peijie; Zhao, Hui; Sun, Jianguo

2016-12-01

Interval-censored failure time data occur in many fields such as demography, economics, medical research, and reliability and many inference procedures on them have been developed (Sun, 2006; Chen, Sun, and Peace, 2012). However, most of the existing approaches assume that the mechanism that yields interval censoring is independent of the failure time of interest and it is clear that this may not be true in practice (Zhang et al., 2007; Ma, Hu, and Sun, 2015). In this article, we consider regression analysis of case K interval-censored failure time data when the censoring mechanism may be related to the failure time of interest. For the problem, an estimated sieve maximum-likelihood approach is proposed for the data arising from the proportional hazards frailty model and for estimation, a two-step procedure is presented. In addition, the asymptotic properties of the proposed estimators of regression parameters are established and an extensive simulation study suggests that the method works well. Finally, we apply the method to a set of real interval-censored data that motivated this study. © 2016, The International Biometric Society.
Maximum Likelihood Estimations and EM Algorithms with Length-biased Data

PubMed Central

Qin, Jing; Ning, Jing; Liu, Hao; Shen, Yu

2012-01-01

SUMMARY Length-biased sampling has been well recognized in economics, industrial reliability, etiology applications, epidemiological, genetic and cancer screening studies. Length-biased right-censored data have a unique data structure different from traditional survival data. The nonparametric and semiparametric estimations and inference methods for traditional survival data are not directly applicable for length-biased right-censored data. We propose new expectation-maximization algorithms for estimations based on full likelihoods involving infinite dimensional parameters under three settings for length-biased data: estimating nonparametric distribution function, estimating nonparametric hazard function under an increasing failure rate constraint, and jointly estimating baseline hazards function and the covariate coefficients under the Cox proportional hazards model. Extensive empirical simulation studies show that the maximum likelihood estimators perform well with moderate sample sizes and lead to more efficient estimators compared to the estimating equation approaches. The proposed estimates are also more robust to various right-censoring mechanisms. We prove the strong consistency properties of the estimators, and establish the asymptotic normality of the semi-parametric maximum likelihood estimators under the Cox model using modern empirical processes theory. We apply the proposed methods to a prevalent cohort medical study. Supplemental materials are available online. PMID:22323840
Regression estimators for generic health-related quality of life and quality-adjusted life years.

PubMed

Basu, Anirban; Manca, Andrea

2012-01-01

To develop regression models for outcomes with truncated supports, such as health-related quality of life (HRQoL) data, and account for features typical of such data such as a skewed distribution, spikes at 1 or 0, and heteroskedasticity. Regression estimators based on features of the Beta distribution. First, both a single equation and a 2-part model are presented, along with estimation algorithms based on maximum-likelihood, quasi-likelihood, and Bayesian Markov-chain Monte Carlo methods. A novel Bayesian quasi-likelihood estimator is proposed. Second, a simulation exercise is presented to assess the performance of the proposed estimators against ordinary least squares (OLS) regression for a variety of HRQoL distributions that are encountered in practice. Finally, the performance of the proposed estimators is assessed by using them to quantify the treatment effect on QALYs in the EVALUATE hysterectomy trial. Overall model fit is studied using several goodness-of-fit tests such as Pearson's correlation test, link and reset tests, and a modified Hosmer-Lemeshow test. The simulation results indicate that the proposed methods are more robust in estimating covariate effects than OLS, especially when the effects are large or the HRQoL distribution has a large spike at 1. Quasi-likelihood techniques are more robust than maximum likelihood estimators. When applied to the EVALUATE trial, all but the maximum likelihood estimators produce unbiased estimates of the treatment effect. One and 2-part Beta regression models provide flexible approaches to regress the outcomes with truncated supports, such as HRQoL, on covariates, after accounting for many idiosyncratic features of the outcomes distribution. This work will provide applied researchers with a practical set of tools to model outcomes in cost-effectiveness analysis.

High-Performance Clock Synchronization Algorithms for Distributed Wireless Airborne Computer Networks with Applications to Localization and Tracking of Targets

DTIC Science & Technology

2010-06-01

GMKPF represents a better and more flexible alternative to the Gaussian Maximum Likelihood (GML), and Exponential Maximum Likelihood (EML) ...accurate results relative to GML and EML when the network delays are modeled in terms of a single non-Gaussian/non-exponential distribution or as a...to the Gaussian Maximum Likelihood (GML), and Exponential Maximum Likelihood (EML) estimators for clock offset estimation in non-Gaussian or non
Maximum-likelihood estimation of parameterized wavefronts from multifocal data

PubMed Central

Sakamoto, Julia A.; Barrett, Harrison H.

2012-01-01

A method for determining the pupil phase distribution of an optical system is demonstrated. Coefficients in a wavefront expansion were estimated using likelihood methods, where the data consisted of multiple irradiance patterns near focus. Proof-of-principle results were obtained in both simulation and experiment. Large-aberration wavefronts were handled in the numerical study. Experimentally, we discuss the handling of nuisance parameters. Fisher information matrices, Cramér-Rao bounds, and likelihood surfaces are examined. ML estimates were obtained by simulated annealing to deal with numerous local extrema in the likelihood function. Rapid processing techniques were employed to reduce the computational time. PMID:22772282
Adaptive pre-specification in randomized trials with and without pair-matching.

PubMed

Balzer, Laura B; van der Laan, Mark J; Petersen, Maya L

2016-11-10

In randomized trials, adjustment for measured covariates during the analysis can reduce variance and increase power. To avoid misleading inference, the analysis plan must be pre-specified. However, it is often unclear a priori which baseline covariates (if any) should be adjusted for in the analysis. Consider, for example, the Sustainable East Africa Research in Community Health (SEARCH) trial for HIV prevention and treatment. There are 16 matched pairs of communities and many potential adjustment variables, including region, HIV prevalence, male circumcision coverage, and measures of community-level viral load. In this paper, we propose a rigorous procedure to data-adaptively select the adjustment set, which maximizes the efficiency of the analysis. Specifically, we use cross-validation to select from a pre-specified library the candidate targeted maximum likelihood estimator (TMLE) that minimizes the estimated variance. For further gains in precision, we also propose a collaborative procedure for estimating the known exposure mechanism. Our small sample simulations demonstrate the promise of the methodology to maximize study power, while maintaining nominal confidence interval coverage. We show how our procedure can be tailored to the scientific question (intervention effect for the study sample vs. for the target population) and study design (pair-matched or not). Copyright © 2016 John Wiley & Sons, Ltd.
Blinded and unblinded internal pilot study designs for clinical trials with count data.

PubMed

Schneider, Simon; Schmidli, Heinz; Friede, Tim

2013-07-01

Internal pilot studies are a popular design feature to address uncertainties in the sample size calculations caused by vague information on nuisance parameters. Despite their popularity, only very recently blinded sample size reestimation procedures for trials with count data were proposed and their properties systematically investigated. Although blinded procedures are favored by regulatory authorities, practical application is somewhat limited by fears that blinded procedures are prone to bias if the treatment effect was misspecified in the planning. Here, we compare unblinded and blinded procedures with respect to bias, error rates, and sample size distribution. We find that both procedures maintain the desired power and that the unblinded procedure is slightly liberal whereas the actual significance level of the blinded procedure is close to the nominal level. Furthermore, we show that in situations where uncertainty about the assumed treatment effect exists, the blinded estimator of the control event rate is biased in contrast to the unblinded estimator, which results in differences in mean sample sizes in favor of the unblinded procedure. However, these differences are rather small compared to the deviations of the mean sample sizes from the sample size required to detect the true, but unknown effect. We demonstrate that the variation of the sample size resulting from the blinded procedure is in many practically relevant situations considerably smaller than the one of the unblinded procedures. The methods are extended to overdispersed counts using a quasi-likelihood approach and are illustrated by trials in relapsing multiple sclerosis. © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
A comparison of abundance estimates from extended batch-marking and Jolly–Seber-type experiments

PubMed Central

Cowen, Laura L E; Besbeas, Panagiotis; Morgan, Byron J T; Schwarz, Carl J

2014-01-01

Little attention has been paid to the use of multi-sample batch-marking studies, as it is generally assumed that an individual's capture history is necessary for fully efficient estimates. However, Huggins et al. (2010) recently presented a pseudo-likelihood for a multi-sample batch-marking study where they used estimating equations to solve for survival and capture probabilities and then derived abundance estimates using a Horvitz–Thompson-type estimator. We have developed and maximized the likelihood for batch-marking studies. We use data simulated from a Jolly–Seber-type study and convert this to what would have been obtained from an extended batch-marking study. We compare our abundance estimates obtained from the Crosbie–Manly–Arnason–Schwarz (CMAS) model with those of the extended batch-marking model to determine the efficiency of collecting and analyzing batch-marking data. We found that estimates of abundance were similar for all three estimators: CMAS, Huggins, and our likelihood. Gains are made when using unique identifiers and employing the CMAS model in terms of precision; however, the likelihood typically had lower mean square error than the pseudo-likelihood method of Huggins et al. (2010). When faced with designing a batch-marking study, researchers can be confident in obtaining unbiased abundance estimators. Furthermore, they can design studies in order to reduce mean square error by manipulating capture probabilities and sample size. PMID:24558576
A step-up test procedure to find the minimum effective dose.

PubMed

Wang, Weizhen; Peng, Jianan

2015-01-01

It is of great interest to find the minimum effective dose (MED) in dose-response studies. A sequence of decreasing null hypotheses to find the MED is formulated under the assumption of nondecreasing dose response means. A step-up multiple test procedure that controls the familywise error rate (FWER) is constructed based on the maximum likelihood estimators for the monotone normal means. When the MED is equal to one, the proposed test is uniformly more powerful than Hsu and Berger's test (1999). Also, a simulation study shows a substantial power improvement for the proposed test over four competitors. Three R-codes are provided in Supplemental Materials for this article. Go to the publisher's online edition of Journal of Biopharmaceutical Statistics to view the files.
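The maximum likelihood estimators of nondecreasing normal means referred to in this abstract can, under a common variance, be computed by weighted isotonic regression of the group sample means, with group sizes as weights. A minimal sketch using the pool-adjacent-violators algorithm follows; the function name and the example values are illustrative assumptions, and the step-up test itself is not reproduced.

```python
def isotonic_means(means, weights=None):
    """Pool-adjacent-violators fit of nondecreasing means.

    `means` are the group sample means in dose order; `weights` are the
    group sample sizes (equal weights if omitted).
    """
    if weights is None:
        weights = [1.0] * len(means)
    blocks = []  # each block holds [pooled mean, total weight, number of groups]
    for m, w in zip(means, weights):
        blocks.append([float(m), float(w), 1])
        # Merge backwards while the last two blocks violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            total = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / total, total, c1 + c2])
    fitted = []
    for m, _, count in blocks:
        fitted.extend([m] * count)
    return fitted


# Hypothetical dose-group means with one order violation (3 > 2).
print(isotonic_means([1, 3, 2, 4]))
```

Groups whose sample means violate the assumed ordering are pooled into a common estimate, so the fitted means are always nondecreasing in dose.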
Profile-Likelihood Approach for Estimating Generalized Linear Mixed Models with Factor Structures

ERIC Educational Resources Information Center

Jeon, Minjeong; Rabe-Hesketh, Sophia

2012-01-01

In this article, the authors suggest a profile-likelihood approach for estimating complex models by maximum likelihood (ML) using standard software and minimal programming. The method works whenever setting some of the parameters of the model to known constants turns the model into a standard model. An important class of models that can be…
Multiple-Hit Parameter Estimation in Monolithic Detectors

PubMed Central

Barrett, Harrison H.; Lewellen, Tom K.; Miyaoka, Robert S.

2014-01-01

We examine a maximum-a-posteriori method for estimating the primary interaction position of gamma rays with multiple interaction sites (hits) in a monolithic detector. In assessing the performance of a multiple-hit estimator over that of a conventional one-hit estimator, we consider a few different detector and readout configurations of a 50-mm-wide square cerium-doped lutetium oxyorthosilicate block. For this study, we use simulated data from SCOUT, a Monte-Carlo tool for photon tracking and modeling scintillation-camera output. With this tool, we determine estimate bias and variance for a multiple-hit estimator and compare these with similar metrics for a one-hit maximum-likelihood estimator, which assumes full energy deposition in one hit. We also examine the effect of event filtering on these metrics; for this purpose, we use a likelihood threshold to reject signals that are not likely to have been produced under the assumed likelihood model. Depending on detector design, we observe a 1%–12% improvement of intrinsic resolution for a 1-or-2-hit estimator as compared with a 1-hit estimator. We also observe improved differentiation of photopeak events using a 1-or-2-hit estimator as compared with the 1-hit estimator; more than 6% of photopeak events that were rejected by likelihood filtering for the 1-hit estimator were accurately identified as photopeak events and positioned without loss of resolution by a 1-or-2-hit estimator; for PET, this equates to at least a 12% improvement in coincidence-detection efficiency with likelihood filtering applied. PMID:23193231
On-Orbit Multi-Field Wavefront Control with a Kalman Filter

NASA Technical Reports Server (NTRS)

Lou, John; Sigrist, Norbert; Basinger, Scott; Redding, David

2008-01-01

A document describes a multi-field wavefront control (WFC) procedure for the James Webb Space Telescope (JWST) on-orbit optical telescope element (OTE) fine-phasing using wavefront measurements at the NIRCam pupil. The control is applied to JWST primary mirror (PM) segments and secondary mirror (SM) simultaneously with a carefully selected ordering. Through computer simulations, the multi-field WFC procedure shows that it can reduce the initial system wavefront error (WFE), as caused by random initial system misalignments within the JWST fine-phasing error budget, from a few dozen micrometers to below 50 nm across the entire NIRCam Field of View, and the WFC procedure is also computationally stable as the Monte-Carlo simulations indicate. With the incorporation of a Kalman Filter (KF) as an optical state estimator into the WFC process, the robustness of the JWST OTE alignment process can be further improved. In the presence of some large optical misalignments, the Kalman state estimator can provide a reasonable estimate of the optical state, especially for those degrees of freedom that have a significant impact on the system WFE. The state estimate allows for a few corrections to the optical state to push the system towards its nominal state, and the result is that a large part of the WFE can be eliminated in this step. When the multi-field WFC procedure is applied after Kalman state estimate and correction, the stability of fine-phasing control is much more certain. Kalman Filter has been successfully applied to diverse applications as a robust and optimal state estimator. In the context of space-based optical system alignment based on wavefront measurements, a KF state estimator can combine all available wavefront measurements, past and present, as well as measurement and actuation error statistics to generate a Maximum-Likelihood optimal state estimator. The strength and flexibility of the KF algorithm make it attractive for use in real-time optical system alignment when WFC alone cannot effectively align the system.
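As a rough illustration of how a Kalman filter folds past and present measurements together with their error statistics, the sketch below estimates a single constant scalar state from noisy readings. It is a toy under stated assumptions and not the JWST procedure: the function name `kalman_constant_state`, the measurement variance, the nearly flat prior, and the readings are all hypothetical.

```python
def kalman_constant_state(measurements, meas_var, prior_mean=0.0, prior_var=1e9):
    """Scalar Kalman filter for a constant state observed with Gaussian noise.

    Hypothetical helper for illustration only: no process noise, and the
    measurement model is z = x + noise with variance meas_var.
    """
    x_hat, p = prior_mean, prior_var
    for z in measurements:
        gain = p / (p + meas_var)            # Kalman gain
        x_hat = x_hat + gain * (z - x_hat)   # blend current estimate with new reading
        p = p * meas_var / (p + meas_var)    # posterior variance, equal to (1 - gain) * p
    return x_hat, p


readings = [1.2, 0.8, 1.1, 0.9, 1.0]         # made-up measurements
estimate, variance = kalman_constant_state(readings, meas_var=0.04)
```

With a nearly flat prior, the filtered estimate agrees with the sample mean of the readings, which is the maximum likelihood estimate for this simple Gaussian model, and the posterior variance approaches `meas_var / n`.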
Unified framework to evaluate panmixia and migration direction among multiple sampling locations.

PubMed

Beerli, Peter; Palczewski, Michal

2010-05-01

For many biological investigations, groups of individuals are genetically sampled from several geographic locations. These sampling locations often do not reflect the genetic population structure. We describe a framework using marginal likelihoods to compare and order structured population models, such as testing whether the sampling locations belong to the same randomly mating population or comparing unidirectional and multidirectional gene flow models. In the context of inferences employing Markov chain Monte Carlo methods, the accuracy of the marginal likelihoods depends heavily on the approximation method used to calculate the marginal likelihood. Two methods, modified thermodynamic integration and a stabilized harmonic mean estimator, are compared. With finite Markov chain Monte Carlo run lengths, the harmonic mean estimator may not be consistent. Thermodynamic integration, in contrast, delivers considerably better estimates of the marginal likelihood. The choice of prior distributions does not influence the order and choice of the better models when the marginal likelihood is estimated using thermodynamic integration, whereas with the harmonic mean estimator the influence of the prior is pronounced and the order of the models changes. The approximation of marginal likelihood using thermodynamic integration in MIGRATE allows the evaluation of complex population genetic models, not only of whether sampling locations belong to a single panmictic population, but also of competing complex structured population models.
Automated model selection in covariance estimation and spatial whitening of MEG and EEG signals.

PubMed

Engemann, Denis A; Gramfort, Alexandre

2015-03-01

Magnetoencephalography and electroencephalography (M/EEG) measure non-invasively the weak electromagnetic fields induced by post-synaptic neural currents. The estimation of the spatial covariance of the signals recorded on M/EEG sensors is a building block of modern data analysis pipelines. Such covariance estimates are used in brain-computer interfaces (BCI) systems, in nearly all source localization methods for spatial whitening as well as for data covariance estimation in beamformers. The rationale for such models is that the signals can be modeled by a zero mean Gaussian distribution. While maximizing the Gaussian likelihood seems natural, it leads to a covariance estimate known as empirical covariance (EC). It turns out that the EC is a poor estimate of the true covariance when the number of samples is small. To address this issue the estimation needs to be regularized. The most common approach downweights off-diagonal coefficients, while more advanced regularization methods are based on shrinkage techniques or generative models with low rank assumptions: probabilistic PCA (PPCA) and factor analysis (FA). Using cross-validation all of these models can be tuned and compared based on Gaussian likelihood computed on unseen data. We investigated these models on simulations, one electroencephalography (EEG) dataset as well as magnetoencephalography (MEG) datasets from the most common MEG systems. First, our results demonstrate that different models can be the best, depending on the number of samples, heterogeneity of sensor types and noise properties. Second, we show that the models tuned by cross-validation are superior to models with hand-selected regularization. Hence, we propose an automated solution to the often overlooked problem of covariance estimation of M/EEG signals. The relevance of the procedure is demonstrated here for spatial whitening and source localization of MEG signals. Copyright © 2015 Elsevier Inc. All rights reserved.
Profile-likelihood Confidence Intervals in Item Response Theory Models.

PubMed

Chalmers, R Philip; Pek, Jolynn; Liu, Yang

2017-01-01

Confidence intervals (CIs) are fundamental inferential devices which quantify the sampling variability of parameter estimates. In item response theory, CIs have been primarily obtained from large-sample Wald-type approaches based on standard error estimates, derived from the observed or expected information matrix, after parameters have been estimated via maximum likelihood. An alternative approach to constructing CIs is to quantify sampling variability directly from the likelihood function with a technique known as profile-likelihood confidence intervals (PL CIs). In this article, we introduce PL CIs for item response theory models, compare PL CIs to classical large-sample Wald-type CIs, and demonstrate important distinctions among these CIs. CIs are then constructed for parameters directly estimated in the specified model and for transformed parameters which are often obtained post-estimation. Monte Carlo simulation results suggest that PL CIs perform consistently better than Wald-type CIs for both non-transformed and transformed parameters.
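The contrast between profile-likelihood and Wald-type intervals can be sketched on a much simpler model than an item response model. The example below, which is only an assumption-laden illustration, treats the mean of a normal sample as the parameter of interest and the variance as a nuisance parameter; the function names and the data are hypothetical, and 3.8415 is the 95% point of the chi-square distribution with one degree of freedom.

```python
import math


def profile_loglik_mean(data, mu):
    """Profile log-likelihood of a normal mean, with the variance maximized out."""
    n = len(data)
    sigma2_hat = sum((x - mu) ** 2 for x in data) / n   # best variance for this mu
    return -0.5 * n * (math.log(2 * math.pi * sigma2_hat) + 1)


def profile_ci_mean(data, crit=3.841458820694124, tol=1e-10):
    """Interval of all mu whose likelihood-ratio statistic is at most crit.

    Assumes the data are not all identical.
    """
    mu_hat = sum(data) / len(data)
    l_max = profile_loglik_mean(data, mu_hat)

    def excess(mu):   # non-positive inside the interval, positive outside
        return 2 * (l_max - profile_loglik_mean(data, mu)) - crit

    def boundary(inside, outside):   # bisection between an inside and an outside point
        while abs(outside - inside) > tol:
            mid = 0.5 * (inside + outside)
            if excess(mid) <= 0:
                inside = mid
            else:
                outside = mid
        return 0.5 * (inside + outside)

    step = max(data) - min(data)
    while excess(mu_hat - step) <= 0 or excess(mu_hat + step) <= 0:
        step *= 2                    # widen until both ends lie outside the interval
    return boundary(mu_hat, mu_hat - step), boundary(mu_hat, mu_hat + step)


scores = [4.1, 5.3, 3.8, 6.0, 5.1, 4.7, 5.6, 4.4]   # made-up sample
lo, hi = profile_ci_mean(scores)

# Wald-type interval from the ML standard error, for comparison
mean = sum(scores) / len(scores)
se = math.sqrt(sum((x - mean) ** 2 for x in scores) / len(scores) ** 2)
wald = (mean - 1.959963984540054 * se, mean + 1.959963984540054 * se)
```

In this toy model the profile-likelihood interval is slightly wider than the Wald interval, and the gap shrinks as the sample grows.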
Expected versus Observed Information in SEM with Incomplete Normal and Nonnormal Data

ERIC Educational Resources Information Center

Savalei, Victoria

2010-01-01

Maximum likelihood is the most common estimation method in structural equation modeling. Standard errors for maximum likelihood estimates are obtained from the associated information matrix, which can be estimated from the sample using either expected or observed information. It is known that, with complete data, estimates based on observed or…
Robust inference in the negative binomial regression model with an application to falls data.

PubMed

Aeberhard, William H; Cantoni, Eva; Heritier, Stephane

2014-12-01

A popular way to model overdispersed count data, such as the number of falls reported during intervention studies, is by means of the negative binomial (NB) distribution. Classical estimating methods are well-known to be sensitive to model misspecifications, taking the form of patients falling much more than expected in such intervention studies where the NB regression model is used. We extend in this article two approaches for building robust M-estimators of the regression parameters in the class of generalized linear models to the NB distribution. The first approach achieves robustness in the response by applying a bounded function on the Pearson residuals arising in the maximum likelihood estimating equations, while the second approach achieves robustness by bounding the unscaled deviance components. For both approaches, we explore different choices for the bounding functions. Through a unified notation, we show how close these approaches may actually be as long as the bounding functions are chosen and tuned appropriately, and provide the asymptotic distributions of the resulting estimators. Moreover, we introduce a robust weighted maximum likelihood estimator for the overdispersion parameter, specific to the NB distribution. Simulations under various settings show that redescending bounding functions yield estimates with smaller biases under contamination while keeping high efficiency at the assumed model, and this for both approaches. We present an application to a recent randomized controlled trial measuring the effectiveness of an exercise program at reducing the number of falls among people suffering from Parkinson's disease to illustrate the diagnostic use of such robust procedures and their need for reliable inference. © 2014, The International Biometric Society.
Experimental Design for Parameter Estimation of Gene Regulatory Networks

PubMed Central

Timmer, Jens

2012-01-01

Systems biology aims for building quantitative models to address unresolved issues in molecular biology. In order to describe the behavior of biological cells adequately, gene regulatory networks (GRNs) are intensively investigated. As the validity of models built for GRNs depends crucially on the kinetic rates, various methods have been developed to estimate these parameters from experimental data. For this purpose, it is favorable to choose the experimental conditions yielding maximal information. However, existing experimental design principles often rely on unfulfilled mathematical assumptions or become computationally demanding with growing model complexity. To solve this problem, we combined advanced methods for parameter and uncertainty estimation with experimental design considerations. As a showcase, we optimized three simulated GRNs in one of the challenges from the Dialogue for Reverse Engineering Assessment and Methods (DREAM). This article presents our approach, which was awarded the best performing procedure at the DREAM6 Estimation of Model Parameters challenge. For fast and reliable parameter estimation, local deterministic optimization of the likelihood was applied. We analyzed identifiability and precision of the estimates by calculating the profile likelihood. Furthermore, the profiles provided a way to uncover a selection of most informative experiments, from which the optimal one was chosen using additional criteria at every step of the design process. In conclusion, we provide a strategy for optimal experimental design and show its successful application on three highly nonlinear dynamic models. Although presented in the context of the GRNs to be inferred for the DREAM6 challenge, the approach is generic and applicable to most types of quantitative models in systems biology and other disciplines. PMID:22815723
Bootstrap Standard Errors for Maximum Likelihood Ability Estimates When Item Parameters Are Unknown

ERIC Educational Resources Information Center

Patton, Jeffrey M.; Cheng, Ying; Yuan, Ke-Hai; Diao, Qi

2014-01-01

When item parameter estimates are used to estimate the ability parameter in item response models, the standard error (SE) of the ability estimate must be corrected to reflect the error carried over from item calibration. For maximum likelihood (ML) ability estimates, a corrected asymptotic SE is available, but it requires a long test and the…
Mortality table construction

NASA Astrophysics Data System (ADS)

Sutawanir

2015-12-01

Mortality tables play an important role in actuarial studies such as life annuities, premium determination, premium reserves, pension plan valuation, and pension funding. Some well-known mortality tables are the CSO mortality table, the Indonesian Mortality Table, the Bowers mortality table, and the Japan Mortality Table. For actuarial applications, tables are constructed for different environments, such as single decrement, double decrement, and multiple decrement. There are two approaches to mortality table construction: a mathematical approach and a statistical approach. Distribution models and estimation theory are the statistical concepts used in mortality table construction. This article discusses the statistical approach. The distributional assumptions are the uniform distribution of deaths (UDD) and constant force (exponential). Moment estimation and maximum likelihood are used to estimate the mortality parameter. Moment estimation methods are easier to manipulate than maximum likelihood estimation (MLE); however, they do not use the complete mortality data, whereas maximum likelihood exploits all available information. Some MLE equations are complicated and must be solved numerically. The article focuses on single decrement estimation using moment and maximum likelihood estimation. Some extensions to double decrement are introduced. A simple dataset is used to illustrate the mortality estimation and the mortality table.
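A minimal sketch of the two estimators in the simplest single decrement setting may help fix ideas. It assumes a unit age interval, no withdrawals or late entrants, and a constant force of mortality for the likelihood-based estimate; the function name and the data are hypothetical and are not taken from the article.

```python
import math


def single_decrement_estimates(n_lives, death_times):
    """Estimate the one-year death probability q on a unit age interval.

    death_times holds the fraction of the year lived by each life that died;
    the remaining lives are assumed to survive the whole year.
    """
    deaths = len(death_times)
    exposure = (n_lives - deaths) + sum(death_times)   # exact time lived by all lives
    mu_hat = deaths / exposure                         # MLE of the constant force
    q_ml = 1.0 - math.exp(-mu_hat)                     # death probability implied by mu_hat
    q_moment = deaths / n_lives                        # moment estimator, uses counts only
    return q_ml, q_moment


# Ten lives observed for one year, two deaths at quarter-year and three-quarter-year
q_ml, q_moment = single_decrement_estimates(10, [0.25, 0.75])
```

The moment estimator uses only the number of deaths, while the likelihood-based estimator also uses when each death occurred, which is the sense in which maximum likelihood exploits more of the data.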
Multivariate meta-analysis: a robust approach based on the theory of U-statistic.

PubMed

Ma, Yan; Mazumdar, Madhu

2011-10-30

Meta-analysis is the methodology for combining findings from similar research studies asking the same question. When the question of interest involves multiple outcomes, multivariate meta-analysis is used to synthesize the outcomes simultaneously, taking into account the correlation between the outcomes. Likelihood-based approaches, in particular the restricted maximum likelihood (REML) method, are commonly utilized in this context. REML assumes a multivariate normal distribution for the random-effects model. This assumption is difficult to verify, especially for meta-analysis with a small number of component studies. The use of REML also requires iterative estimation between parameters, needing moderately high computation time, especially when the dimension of outcomes is large. A multivariate method of moments (MMM) is available and is shown to perform as well as REML. However, there is a lack of information on the performance of these two methods when the true data distribution is far from normality. In this paper, we propose a new nonparametric and non-iterative method for multivariate meta-analysis on the basis of the theory of U-statistic and compare the properties of these three procedures under both normal and skewed data through simulation studies. It is shown that the effect on estimates from REML because of non-normal data distribution is marginal and that the estimates from MMM and U-statistic-based approaches are very similar. Therefore, we conclude that for performing multivariate meta-analysis, the U-statistic estimation procedure is a viable alternative to REML and MMM. Easy implementation of all three methods is illustrated by their application to data from two published meta-analyses from the fields of hip fracture and periodontal disease. We discuss ideas for future research based on U-statistic for testing significance of between-study heterogeneity and for extending the work to meta-regression setting. Copyright © 2011 John Wiley & Sons, Ltd.
The Maximum Likelihood Estimation of Signature Transformation /MLEST/ algorithm. [for affine transformation of crop inventory data]

NASA Technical Reports Server (NTRS)

Thadani, S. G.

1977-01-01

The Maximum Likelihood Estimation of Signature Transformation (MLEST) algorithm is used to obtain maximum likelihood estimates (MLE) of affine transformation. The algorithm has been evaluated for three sets of data: simulated (training and recognition segment pairs), consecutive-day (data gathered from Landsat images), and geographical-extension (large-area crop inventory experiment) data sets. For each set, MLEST signature extension runs were made to determine MLE values and the affine-transformed training segment signatures were used to classify the recognition segments. The classification results were used to estimate wheat proportions at 0 and 1% threshold values.
Asymptotic Properties of Induced Maximum Likelihood Estimates of Nonlinear Models for Item Response Variables: The Finite-Generic-Item-Pool Case.

ERIC Educational Resources Information Center

Jones, Douglas H.

The progress of modern mental test theory depends very much on the techniques of maximum likelihood estimation, and many popular applications make use of likelihoods induced by logistic item response models. While, in reality, item responses are nonreplicate within a single examinee and the logistic models are only ideal, practitioners make…

Computation of nonparametric convex hazard estimators via profile methods.

PubMed

Jankowski, Hanna K; Wellner, Jon A

2009-05-01

This paper proposes a profile likelihood algorithm to compute the nonparametric maximum likelihood estimator of a convex hazard function. The maximisation is performed in two steps: First the support reduction algorithm is used to maximise the likelihood over all hazard functions with a given point of minimum (or antimode). Then it is shown that the profile (or partially maximised) likelihood is quasi-concave as a function of the antimode, so that a bisection algorithm can be applied to find the maximum of the profile likelihood, and hence also the global maximum. The new algorithm is illustrated using both artificial and real data, including lifetime data for Canadian males and females.
Applying a Weighted Maximum Likelihood Latent Trait Estimator to the Generalized Partial Credit Model

ERIC Educational Resources Information Center

Penfield, Randall D.; Bergeron, Jennifer M.

2005-01-01

This article applies a weighted maximum likelihood (WML) latent trait estimator to the generalized partial credit model (GPCM). The relevant equations required to obtain the WML estimator using the Newton-Raphson algorithm are presented, and a simulation study is described that compared the properties of the WML estimator to those of the maximum…
Rate of convergence of k-step Newton estimators to efficient likelihood estimators

Treesearch

Steve Verrill

2007-01-01

We make use of Cramér conditions together with the well-known local quadratic convergence of Newton's method to establish the asymptotic closeness of k-step Newton estimators to efficient likelihood estimators. In Verrill and Johnson [2007. Confidence bounds and hypothesis tests for normal distribution coefficients of variation. USDA Forest Products Laboratory Research...
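The idea of a k-step Newton estimator can be sketched with a textbook case that is not taken from the report: the location parameter of a Cauchy sample. The sample median serves as the consistent starting value, and each Newton step on the score equation moves it toward the likelihood root. The function names, the seed, the sample size, and the true location of 3.0 are all assumptions made for illustration.

```python
import math
import random
import statistics


def score(data, theta):
    """First derivative of the Cauchy location log-likelihood at theta."""
    return sum(2 * (x - theta) / (1 + (x - theta) ** 2) for x in data)


def score_slope(data, theta):
    """Second derivative of the Cauchy location log-likelihood at theta."""
    return sum(2 * ((x - theta) ** 2 - 1) / (1 + (x - theta) ** 2) ** 2 for x in data)


def k_step_newton(data, k):
    """Start at the sample median and take k Newton steps on the score equation."""
    theta = statistics.median(data)
    for _ in range(k):
        theta -= score(data, theta) / score_slope(data, theta)
    return theta


random.seed(0)
# Simulated Cauchy sample centered at 3.0, drawn by inverting the CDF
sample = [3.0 + math.tan(math.pi * (random.random() - 0.5)) for _ in range(201)]
one_step = k_step_newton(sample, 1)
five_step = k_step_newton(sample, 5)
```

Because Newton's method converges quadratically near the root, a handful of steps from a good starting value is typically enough for the score to vanish to numerical precision.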
Collinear Latent Variables in Multilevel Confirmatory Factor Analysis: A Comparison of Maximum Likelihood and Bayesian Estimations.

PubMed

Can, Seda; van de Schoot, Rens; Hox, Joop

2015-06-01

Because variables may be correlated in the social and behavioral sciences, multicollinearity might be problematic. This study investigates the effect of collinearity manipulated in within and between levels of a two-level confirmatory factor analysis by Monte Carlo simulation. Furthermore, the influence on the convergence rate of the size of the intraclass correlation coefficient (ICC) and of the estimation method (maximum likelihood estimation with robust chi-squares and standard errors, or Bayesian estimation) is investigated. The other variables of interest were the rate of inadmissible solutions and the relative parameter and standard error bias on the between level. The results showed that inadmissible solutions were obtained when there was between level collinearity and the estimation method was maximum likelihood. In the within level multicollinearity condition, all of the solutions were admissible but the bias values were higher compared with the between level collinearity condition. Bayesian estimation appeared to be robust in obtaining admissible parameters but the relative bias was higher than for maximum likelihood estimation. Finally, as expected, high ICC produced less biased results compared to medium ICC conditions.
Fast maximum likelihood estimation using continuous-time neural point process models.

PubMed

Lepage, Kyle Q; MacDonald, Christopher J

2015-06-01

A recent report estimates that the number of simultaneously recorded neurons is growing exponentially. A commonly employed statistical paradigm using discrete-time point process models of neural activity involves the computation of a maximum-likelihood estimate. The time to compute this estimate, per neuron, is proportional to the number of bins in a finely spaced discretization of time. By using continuous-time models of neural activity and the optimally efficient Gaussian quadrature, memory requirements and computation times are dramatically decreased in the commonly encountered situation where the number of parameters p is much less than the number of time-bins n. In this regime, with q equal to the quadrature order, memory requirements are decreased from O(np) to O(qp), and the number of floating-point operations is decreased from O(np^2) to O(qp^2). Accuracy of the proposed estimates is assessed based upon physiological consideration, error bounds, and mathematical results describing the relation between numerical integration error and numerical error affecting both parameter estimates and the observed Fisher information. A check is provided which is used to adapt the order of numerical integration. The procedure is verified in simulation and for hippocampal recordings. It is found that in 95% of hippocampal recordings a q of 60 yields numerical error negligible with respect to parameter estimate standard error. Statistical inference using the proposed methodology is a fast and convenient alternative to statistical inference performed using a discrete-time point process model of neural activity. It enables the employment of the statistical methodology available with discrete-time inference, but is faster, uses less memory, and avoids any error due to discretization.
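The saving comes from the integral term of the point process log-likelihood, which is the sum of log-intensities at the spike times minus the integral of the intensity over the recording. The sketch below compares a q-point Gauss-Legendre rule with a fine binning of that integral. It is an illustration under stated assumptions, not the authors' implementation: the log-linear intensity, the parameter values, the spike times, and the choice q = 5 (with hard-coded nodes and weights) are all hypothetical.

```python
import math

# 5-point Gauss-Legendre nodes and weights on [-1, 1]
GL_NODES = [-0.9061798459386640, -0.5384693101056831, 0.0,
            0.5384693101056831, 0.9061798459386640]
GL_WEIGHTS = [0.2369268850561891, 0.4786286704993665, 0.5688888888888889,
              0.4786286704993665, 0.2369268850561891]


def intensity(t, b0, b1):
    """Hypothetical log-linear intensity: lambda(t) = exp(b0 + b1 * t)."""
    return math.exp(b0 + b1 * t)


def integral_quadrature(b0, b1, T):
    """Integral of the intensity over [0, T] from q = 5 function evaluations."""
    half = 0.5 * T
    return half * sum(w * intensity(half * (u + 1.0), b0, b1)
                      for u, w in zip(GL_NODES, GL_WEIGHTS))


def integral_binned(b0, b1, T, n_bins):
    """Same integral from a discrete-time approximation with n_bins bins."""
    width = T / n_bins
    return sum(intensity(i * width, b0, b1) * width for i in range(n_bins))


def loglik(spike_times, b0, b1, integral):
    """Point process log-likelihood given a value for the integral term."""
    return sum(b0 + b1 * t for t in spike_times) - integral


b0, b1, T = 0.0, 0.5, 2.0
spikes = [0.3, 0.9, 1.4, 1.8]                       # made-up spike times
quad = integral_quadrature(b0, b1, T)               # 5 evaluations
binned = integral_binned(b0, b1, T, n_bins=1000)    # 1000 evaluations
ll_quad = loglik(spikes, b0, b1, quad)
```

For this smooth intensity, five quadrature nodes reproduce the integral far more accurately than a thousand bins, which mirrors the O(n) to O(q) reduction described in the abstract.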
Inference for finite-sample trajectories in dynamic multi-state site-occupancy models using hidden Markov model smoothing

USGS Publications Warehouse

Fiske, Ian J.; Royle, J. Andrew; Gross, Kevin

2014-01-01

Ecologists and wildlife biologists increasingly use latent variable models to study patterns of species occurrence when detection is imperfect. These models have recently been generalized to accommodate both a more expansive description of state than simple presence or absence, and Markovian dynamics in the latent state over successive sampling seasons. In this paper, we write these multi-season, multi-state models as hidden Markov models to find both maximum likelihood estimates of model parameters and finite-sample estimators of the trajectory of the latent state over time. These estimators are especially useful for characterizing population trends in species of conservation concern. We also develop parametric bootstrap procedures that allow formal inference about latent trend. We examine model behavior through simulation, and we apply the model to data from the North American Amphibian Monitoring Program.
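How a hidden Markov model yields both a likelihood and smoothed estimates of the latent state can be sketched with the forward and backward recursions. The example below is a simplified two-state (unoccupied, occupied) illustration rather than the multi-state model of the paper; the function name, the probabilities, and the detection history are hypothetical, and the recursions are left unscaled, which is adequate only for short series.

```python
def forward_backward(init, trans, emit, obs):
    """Return the likelihood of obs and the smoothed state probabilities.

    init[i]: P(state at time 0 = i); trans[i][j]: P(next = j | current = i);
    emit[i][y]: P(observation = y | state = i).
    """
    n_states, n_steps = len(init), len(obs)
    # Forward pass: alpha[t][j] = P(obs[0..t], state_t = j)
    alpha = [[init[i] * emit[i][obs[0]] for i in range(n_states)]]
    for t in range(1, n_steps):
        alpha.append([
            emit[j][obs[t]] * sum(alpha[t - 1][i] * trans[i][j] for i in range(n_states))
            for j in range(n_states)
        ])
    # Backward pass: beta[t][i] = P(obs[t+1..] | state_t = i)
    beta = [[1.0] * n_states for _ in range(n_steps)]
    for t in range(n_steps - 2, -1, -1):
        beta[t] = [
            sum(trans[i][j] * emit[j][obs[t + 1]] * beta[t + 1][j] for j in range(n_states))
            for i in range(n_states)
        ]
    likelihood = sum(alpha[-1])
    smoothed = [[alpha[t][i] * beta[t][i] / likelihood for i in range(n_states)]
                for t in range(n_steps)]
    return likelihood, smoothed


# State 0 = unoccupied, state 1 = occupied; observation 0 = not detected, 1 = detected
init = [0.5, 0.5]
trans = [[0.8, 0.2], [0.3, 0.7]]
emit = [[1.0, 0.0], [0.6, 0.4]]    # no false detections at unoccupied sites
obs = [0, 1, 0]                    # made-up detection history over three seasons
likelihood, smoothed = forward_backward(init, trans, emit, obs)
```

Maximizing `likelihood` over the model parameters would give the maximum likelihood estimates, and `smoothed[t]` gives the probability of each latent state in season t given the whole detection history.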
IRT Item Parameter Recovery with Marginal Maximum Likelihood Estimation Using Loglinear Smoothing Models

ERIC Educational Resources Information Center

Casabianca, Jodi M.; Lewis, Charles

2015-01-01

Loglinear smoothing (LLS) estimates the latent trait distribution while making fewer assumptions about its form and maintaining parsimony, thus leading to more precise item response theory (IRT) item parameter estimates than standard marginal maximum likelihood (MML). This article provides the expectation-maximization algorithm for MML estimation…
An EM Algorithm for Maximum Likelihood Estimation of Process Factor Analysis Models

ERIC Educational Resources Information Center

Lee, Taehun

2010-01-01

In this dissertation, an Expectation-Maximization (EM) algorithm is developed and implemented to obtain maximum likelihood estimates of the parameters and the associated standard error estimates characterizing temporal flows for the latent variable time series following stationary vector ARMA processes, as well as the parameters defining the…
Maximum Likelihood Estimation of Spectra Information from Multiple Independent Astrophysics Data Sets

NASA Technical Reports Server (NTRS)

Howell, Leonard W., Jr.; Six, N. Frank (Technical Monitor)

2002-01-01

The Maximum Likelihood (ML) statistical theory required to estimate spectra information from an arbitrary number of astrophysics data sets produced by vastly different science instruments is developed in this paper. This theory and its successful implementation will facilitate the interpretation of spectral information from multiple astrophysics missions and thereby permit the derivation of superior spectral information based on the combination of data sets. The procedure is of significant value to both existing data sets and those to be produced by future astrophysics missions consisting of two or more detectors by allowing instrument developers to optimize each detector's design parameters through simulation studies in order to design and build complementary detectors that will maximize the precision with which the science objectives may be obtained. The benefits of this ML theory and its application are measured in terms of the reduction of the statistical errors (standard deviations) of the spectra information using the multiple data sets in concert as compared to the statistical errors of the spectra information when the data sets are considered separately, as well as any biases resulting from poor statistics in one or more of the individual data sets that might be reduced when the data sets are combined.
ASA grade and Charlson Comorbidity Index of spinal surgery patients: correlation with complications and societal costs.

PubMed

Whitmore, Robert G; Stephen, James H; Vernick, Coleen; Campbell, Peter G; Yadla, Sanjay; Ghobrial, George M; Maltenfort, Mitchell G; Ratliff, John K

2014-01-01

The Charlson Comorbidity Index (CCI) and the American Society of Anesthesiologists (ASA) Physical Status Classification System (ASA grade) are useful for predicting morbidity and mortality for a variety of disease processes. To evaluate CCI and ASA grade as predictors of complications after spinal surgery and examine the correlation between these comorbidity indices and the cost of care. Prospective observational study. All patients undergoing any spine surgery at a single academic tertiary center over a 6-month period. Direct health-care costs estimated from diagnosis related group and Current Procedural Terminology (CPT) codes. Demographic data, including all patient comorbidities, procedural data, and all complications, occurring within 30 days of the index procedure were prospectively recorded. Charlson Comorbidity Index was calculated from International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) codes and ASA grades determined from the operative record. Diagnosis related group and CPT codes were captured for each patient. Direct costs were estimated from a societal perspective using Medicare rates of reimbursement. A multivariable analysis was performed to assess the association of the CCI and ASA grade to the rate of complication and direct health-care costs. Two hundred twenty-six cases were analyzed. The average CCI score for the patient cohort was 0.92, and the average ASA grade was 2.65. The CCI and ASA grade were significantly correlated, with Spearman ρ of 0.458 (p<.001). Both CCI and ASA grade were associated with increasing body mass index (p<.01) and increasing patient age (p<.0001). Increasing CCI was associated with an increasing likelihood of occurrence of any complication (p=.0093) and of minor complications (p=.0032). Increasing ASA grade was significantly associated with an increasing likelihood of occurrence of a major complication (p=.0035). Increasing ASA grade showed a significant association with increasing direct costs (p=.0062). American Society of Anesthesiologists and CCI scores are useful comorbidity indices for the spine patient population, although neither was completely predictive of complication occurrence. A spine-specific comorbidity index, based on ICD-9-CM coding that could be easily captured from patient records, and which is predictive of patient likelihood of complications and mortality, would be beneficial in patient counseling and choice of operative intervention. Copyright © 2014 Elsevier Inc. All rights reserved.
Flight-determined correction terms for angle of attack and sideslip

NASA Technical Reports Server (NTRS)

Shafer, M. F.

1982-01-01

The effects of local flow, upwash, and sidewash on angle of attack and sideslip (measured with boom-mounted vanes) were determined for subsonic, transonic, and supersonic flight using a maximum likelihood estimator. The correction terms accounting for these effects were determined using a series of maneuvers flown at a large number of flight conditions in both augmented and unaugmented control modes. The correction terms provide improved angle-of-attack and sideslip values for use in the estimation of stability and control derivatives. In addition to detailing the procedure used to determine these correction terms, this paper discusses various effects, such as those related to Mach number, on the correction terms. The use of maneuvers flown in augmented and unaugmented control modes is also discussed.
The use of auxiliary variables in capture-recapture and removal experiments

USGS Publications Warehouse

Pollock, K.H.; Hines, J.E.; Nichols, J.D.

1984-01-01

The dependence of animal capture probabilities on auxiliary variables is an important practical problem which has not been considered in the development of estimation procedures for capture-recapture and removal experiments. In this paper the linear logistic binary regression model is used to relate the probability of capture to continuous auxiliary variables. The auxiliary variables could be environmental quantities such as air or water temperature, or characteristics of individual animals, such as body length or weight. Maximum likelihood estimators of the population parameters are considered for a variety of models which all assume a closed population. Testing between models is also considered. The models can also be used when one auxiliary variable is a measure of the effort expended in obtaining the sample.
Multiple robustness in factorized likelihood models.

PubMed

Molina, J; Rotnitzky, A; Sued, M; Robins, J M

2017-09-01

We consider inference under a nonparametric or semiparametric model with likelihood that factorizes as the product of two or more variation-independent factors. We are interested in a finite-dimensional parameter that depends on only one of the likelihood factors and whose estimation requires the auxiliary estimation of one or several nuisance functions. We investigate general structures conducive to the construction of so-called multiply robust estimating functions, whose computation requires postulating several dimension-reducing models but which have mean zero at the true parameter value provided one of these models is correct.
Recovery of Item Parameters in the Nominal Response Model: A Comparison of Marginal Maximum Likelihood Estimation and Markov Chain Monte Carlo Estimation.

ERIC Educational Resources Information Center

Wollack, James A.; Bolt, Daniel M.; Cohen, Allan S.; Lee, Young-Sun

2002-01-01

Compared the quality of item parameter estimates for marginal maximum likelihood (MML) and Markov Chain Monte Carlo (MCMC) with the nominal response model using simulation. The quality of item parameter recovery was nearly identical for MML and MCMC, and both methods tended to produce good estimates. (SLD)
Phylogenetic evidence for cladogenetic polyploidization in land plants.

PubMed

Zhan, Shing H; Drori, Michal; Goldberg, Emma E; Otto, Sarah P; Mayrose, Itay

2016-07-01

Polyploidization is a common and recurring phenomenon in plants and is often thought to be a mechanism of "instant speciation". Whether polyploidization is associated with the formation of new species (cladogenesis) or simply occurs over time within a lineage (anagenesis), however, has never been assessed systematically. We tested this hypothesis using phylogenetic and karyotypic information from 235 plant genera (mostly angiosperms). We first constructed a large database of combined sequence and chromosome number data sets using an automated procedure. We then applied likelihood models (ClaSSE) that estimate the degree of synchronization between polyploidization and speciation events in maximum likelihood and Bayesian frameworks. Our maximum likelihood analysis indicated that 35 genera supported a model that includes cladogenetic transitions over a model with only anagenetic transitions, whereas three genera supported a model that incorporates anagenetic transitions over one with only cladogenetic transitions. Furthermore, the Bayesian analysis supported a preponderance of cladogenetic change in four genera but did not support a preponderance of anagenetic change in any genus. Overall, these phylogenetic analyses provide the first broad confirmation that polyploidization is temporally associated with speciation events, suggesting that it is indeed a major speciation mechanism in plants, at least in some genera. © 2016 Botanical Society of America.
Clinical metric and medication persistency effects: evidence from a Medicaid care management program.

PubMed

Berg, Gregory D; Leary, Fredric; Medina, Wendie; Donnelly, Shawn; Warnick, Kathleen

2015-02-01

The objective was to estimate clinical metric and medication persistency impacts of a care management program. The data sources were Medicaid administrative claims for a sample population of 32,334 noninstitutionalized Medicaid-only aged, blind, or disabled patients with diagnosed conditions of asthma, coronary artery disease, chronic obstructive pulmonary disease, diabetes, or heart failure between 2005 and 2009. Multivariate regression analysis was used to test the hypothesis that exposure to a care management intervention increased the likelihood of having the appropriate medication or procedures performed, as well as increased medication persistency. Statistically significant clinical metric improvements occurred in each of the 5 conditions studied. Increased medication persistency was found for beta-blocker medication for members with coronary artery disease, angiotensin-converting enzyme inhibitor/angiotensin receptor blocker and diuretic medications for members with heart failure, bronchodilator and corticosteroid medications for members with chronic obstructive pulmonary disease, and aspirin/antiplatelet medications for members with diabetes. This study demonstrates that a care management program increases the likelihood of having an appropriate medication dispensed and/or an appropriate clinical test performed, as well as increased likelihood of medication persistency, in people with chronic conditions.
Deep Unfolding for Topic Models.

PubMed

Chien, Jen-Tzung; Lee, Chao-Hsi

2018-02-01

Deep unfolding provides an approach to integrate probabilistic generative models with deterministic neural networks. Such an approach benefits from deep representation, easy interpretation, flexible learning and stochastic modeling. This study develops the unsupervised and supervised learning of deep unfolded topic models for document representation and classification. Conventionally, the unsupervised and supervised topic models are inferred via the variational inference algorithm, where the model parameters are estimated by maximizing the lower bound of the logarithm of the marginal likelihood using input documents without and with class labels, respectively. The representation capability or classification accuracy is constrained by the variational lower bound and by the model parameters being tied across the inference procedure. This paper aims to relax these constraints by directly maximizing the end performance criterion and continuously untying the parameters in the learning process via deep unfolding inference (DUI). The inference procedure is treated as layer-wise learning in a deep neural network. The end performance is iteratively improved by using the estimated topic parameters according to the exponentiated updates. Deep learning of topic models is therefore implemented through a back-propagation procedure. Experimental results show the merits of DUI with an increasing number of layers compared with variational inference in unsupervised as well as supervised topic models.
Univariate and bivariate likelihood-based meta-analysis methods performed comparably when marginal sensitivity and specificity were the targets of inference.

PubMed

Dahabreh, Issa J; Trikalinos, Thomas A; Lau, Joseph; Schmid, Christopher H

2017-03-01

To compare statistical methods for meta-analysis of sensitivity and specificity of medical tests (e.g., diagnostic or screening tests). We constructed a database of PubMed-indexed meta-analyses of test performance from which 2 × 2 tables for each included study could be extracted. We reanalyzed the data using univariate and bivariate random effects models fit with inverse variance and maximum likelihood methods. Analyses were performed using both normal and binomial likelihoods to describe within-study variability. The bivariate model using the binomial likelihood was also fit using a fully Bayesian approach. We use two worked examples (thoracic computerized tomography to detect aortic injury and rapid prescreening of Papanicolaou smears to detect cytological abnormalities) to highlight that different meta-analysis approaches can produce different results. We also present results from reanalysis of 308 meta-analyses of sensitivity and specificity. Models using the normal approximation produced sensitivity and specificity estimates closer to 50% and smaller standard errors compared to models using the binomial likelihood; absolute differences of 5% or greater were observed in 12% and 5% of meta-analyses for sensitivity and specificity, respectively. Results from univariate and bivariate random effects models were similar, regardless of estimation method. Maximum likelihood and Bayesian methods produced almost identical summary estimates under the bivariate model; however, Bayesian analyses indicated greater uncertainty around those estimates. Bivariate models produced imprecise estimates of the between-study correlation of sensitivity and specificity. Differences between methods were larger with increasing proportion of studies that were small or required a continuity correction. The binomial likelihood should be used to model within-study variability. Univariate and bivariate models give similar estimates of the marginal distributions for sensitivity and specificity.
Bayesian methods fully quantify uncertainty and their ability to incorporate external evidence may be useful for imprecisely estimated parameters. Copyright © 2017 Elsevier Inc. All rights reserved.
Maximum likelihood estimation for Cox's regression model under nested case-control sampling.

PubMed

Scheike, Thomas H; Juul, Anders

2004-04-01

Nested case-control sampling is designed to reduce the costs of large cohort studies. It is important to estimate the parameters of interest as efficiently as possible. We present a new maximum likelihood estimator (MLE) for nested case-control sampling in the context of Cox's proportional hazards model. The MLE is computed by the EM-algorithm, which is easy to implement in the proportional hazards setting. Standard errors are estimated by a numerical profile likelihood approach based on EM aided differentiation. The work was motivated by a nested case-control study that hypothesized that insulin-like growth factor I was associated with ischemic heart disease. The study was based on a population of 3784 Danes and 231 cases of ischemic heart disease where controls were matched on age and gender. We illustrate the use of the MLE for these data and show how the maximum likelihood framework can be used to obtain information additional to the relative risk estimates of covariates.
Adaptive OFDM Radar Waveform Design for Improved Micro-Doppler Estimation

DOE Office of Scientific and Technical Information (OSTI.GOV)

Sen, Satyabrata

Here we analyze the performance of a wideband orthogonal frequency division multiplexing (OFDM) signal in estimating the micro-Doppler frequency of a rotating target having multiple scattering centers. The use of a frequency-diverse OFDM signal enables us to independently analyze the micro-Doppler characteristics with respect to a set of orthogonal subcarrier frequencies. We characterize the accuracy of micro-Doppler frequency estimation by computing the Cramer-Rao bound (CRB) on the angular-velocity estimate of the target. Additionally, to improve the accuracy of the estimation procedure, we formulate and solve an optimization problem by minimizing the CRB on the angular-velocity estimate with respect to the OFDM spectral coefficients. We present several numerical examples to demonstrate the CRB variations with respect to the signal-to-noise ratios, number of temporal samples, and number of OFDM subcarriers. We also analysed numerically the improvement in estimation accuracy due to the adaptive waveform design. A grid-based maximum likelihood estimation technique is applied to evaluate the corresponding mean-squared error performance.

Five Methods for Estimating Angoff Cut Scores with IRT

ERIC Educational Resources Information Center

Wyse, Adam E.

2017-01-01

This article illustrates five different methods for estimating Angoff cut scores using item response theory (IRT) models. These include maximum likelihood (ML), expected a posteriori (EAP), modal a posteriori (MAP), and weighted maximum likelihood (WML) estimators, as well as the most commonly used approach based on translating ratings through the test…
Beyond Roughness: Maximum-Likelihood Estimation of Topographic "Structure" on Venus and Elsewhere in the Solar System

NASA Astrophysics Data System (ADS)

Simons, F. J.; Eggers, G. L.; Lewis, K. W.; Olhede, S. C.

2015-12-01

What numbers "capture" topography? If stationary, white, and Gaussian: mean and variance. But "whiteness" is strong; we are led to a "baseline" over which to compute means and variances. We then have subscribed to topography as a correlated process, and to the estimation (noisy, affected by edge effects) of the parameters of a spatial or spectral covariance function. What if the covariance function or the point process itself aren't Gaussian? What if the region under study isn't regularly shaped or sampled? How can results from differently sized patches be compared robustly? We present a spectral-domain "Whittle" maximum-likelihood procedure that circumvents these difficulties and answers the above questions. The key is the Matérn form, whose parameters (variance, range, differentiability) define the shape of the covariance function (Gaussian, exponential, ..., are all special cases). We treat edge effects in simulation and in estimation. Data tapering allows for the irregular regions. We determine the estimation variance of all parameters. And the "best" estimate may not be "good enough": we test whether the "model" itself warrants rejection. We illustrate our methodology on geologically mapped patches of Venus. Surprisingly few numbers capture planetary topography. We derive them, with uncertainty bounds, and we simulate "new" realizations of patches that look to the geologists exactly as if they were derived from similar processes. Our approach holds in 1, 2, and 3 spatial dimensions, and generalizes to multiple variables, e.g. when topography and gravity are being considered jointly (perhaps linked by flexural rigidity, erosion, or other surface and sub-surface modifying processes). Our results have widespread implications for the study of planetary topography in the Solar System, and are interpreted in the light of trying to derive "process" from "parameters", the end goal being to assign likely formation histories for the patches under consideration. Our results should also be relevant for whoever needs to perform spatial interpolation or out-of-sample extension (e.g. kriging), machine learning and feature detection, on geological data. We present procedural details but focus on high-level results that have real-world implications for the study of Venus, Earth, other planets, and moons.
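For readers unfamiliar with the covariance family named above, one common parameterization is written out below. This is an assumed textbook convention, not necessarily the one used by the authors: here σ² is the variance, ρ the range, ν the differentiability (smoothness) parameter, r the separation distance, and K_ν the modified Bessel function of the second kind.

```latex
C(r) = \sigma^{2}\,\frac{2^{1-\nu}}{\Gamma(\nu)}
       \left(\frac{\sqrt{2\nu}\,r}{\rho}\right)^{\nu}
       K_{\nu}\!\left(\frac{\sqrt{2\nu}\,r}{\rho}\right),
\qquad
C(r)\big|_{\nu=1/2} = \sigma^{2} e^{-r/\rho},
\qquad
\lim_{\nu\to\infty} C(r) = \sigma^{2} e^{-r^{2}/(2\rho^{2})}.
```

Under this convention the exponential and Gaussian covariances appear as the ν = 1/2 and ν → ∞ special cases, which is the sense in which three numbers can describe a whole family of shapes.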
Experimental Procedures for Sensitive and Reproducible In Situ EPR Tooth Dosimetry

PubMed Central

Williams, Benjamin B.; Sucheta, Artur; Dong, Ruhong; Sakata, Yasuko; Iwasaki, Akinori; Burke, Gregory; Grinberg, Oleg; Lesniewski, Piotr; Kmiec, Maciej; Swartz, Harold M.

2007-01-01

In vivo electron paramagnetic resonance (EPR) tooth dosimetry provides a means for non-invasive retrospective assessment of personal radiation exposure. While there is a clear need for such capabilities following radiation accidents, the most pressing need for the development of this technology is the heightened likelihood of terrorist events or nuclear conflicts. This technique will enable such measurements to be made at the site of an incident, while the subject is present, to assist emergency personnel as they perform triage for the affected population. At Dartmouth Medical School this development is currently being tested with normal volunteers with irradiated teeth placed in their mouths and with patients who have undergone radiation therapy. Here we describe progress in practical procedures to provide accurate and reproducible in vivo dose estimates. PMID:18591989
Control of Risks Through the Use of Procedures: A Method for Evaluating the Change in Risk

NASA Technical Reports Server (NTRS)

Praino, Gregory T.; Sharit, Joseph

2010-01-01

This paper considers how procedures can be used to control risks faced by an organization and proposes a means of recognizing if a particular procedure reduces risk or contributes to the organization's exposure. The proposed method was developed out of the review of work documents and the governing procedures performed in the wake of the Columbia accident by NASA and the Space Shuttle prime contractor, United Space Alliance, LLC. A technique was needed to understand the rules, or procedural controls, in place at the time in the context of how important the role of each rule was. The proposed method assesses procedural risks, the residual risk associated with a hazard after a procedure's influence is accounted for, by considering each clause of a procedure as a unique procedural control that may be beneficial or harmful. For procedural risks with consequences severe enough to threaten the survival of the organization, the method measures the characteristics of each risk on a scale that is an alternative to the traditional consequence/likelihood couple. The dual benefits of the substitute scales are that they eliminate both the need to quantify a relationship between different consequence types and the need for the extensive history a probabilistic risk assessment would require. Control Value is used as an analog for the consequence, where the value of a rule is based on how well the control reduces the severity of the consequence when operating successfully. This value is composed of two parts: the inevitability of the consequence in the absence of the control, and the opportunity to intervene before the consequence is realized. High value controls will be ones where there is minimal need for intervention but maximum opportunity to actively prevent the outcome. Failure Likelihood is used as the substitute for the conventional likelihood of the outcome. 
For procedural controls, a failure is considered to be any non-malicious violation of the rule, whether intended or not. The model used for describing the Failure Likelihood considers how well a task was established by evaluating that task on five components. The components selected to define a well established task are: that it be defined, assigned to someone capable, that they be trained appropriately, that the actions be organized to enable proper completion and that some form of independent monitoring be performed. Validation of the method was based on the information provided by a group of experts in Space Shuttle ground processing when they were presented with 5 scenarios that identified a clause from a procedure. For each scenario, they recorded their perception of how important the associated rule was and how likely it was to fail. They then rated the components of Control Value and Failure Likelihood for all the scenarios. The order in which each reviewer ranked the scenarios' Control Value and Failure Likelihood was compared to the order in which they ranked the scenarios for each of the associated components: inevitability and opportunity for Control Value, and definition, assignment, training, organization and monitoring for Failure Likelihood. This order comparison showed how the components contributed to a relative relationship to the substitute risk element. With the relationship established for Space Shuttle ground processing, this method can be used to gauge if the introduction or removal of a particular rule will increase or decrease the risk associated with the hazard it is intended to control.
Explaining the effect of event valence on unrealistic optimism.

PubMed

Gold, Ron S; Brown, Mark G

2009-05-01

People typically exhibit 'unrealistic optimism' (UO): they believe they have a lower chance of experiencing negative events and a higher chance of experiencing positive events than does the average person. UO has been found to be greater for negative than positive events. This 'valence effect' has been explained in terms of motivational processes. An alternative explanation is provided by the 'numerosity model', which views the valence effect simply as a by-product of a tendency for likelihood estimates pertaining to the average member of a group to increase with the size of the group. Predictions made by the numerosity model were tested in two studies. In each, UO for a single event was assessed. In Study 1 (n = 115 students), valence was manipulated by framing the event either negatively or positively, and participants estimated their own likelihood and that of the average student at their university. In Study 2 (n = 139 students), valence was again manipulated and participants again estimated their own likelihood; additionally, group size was manipulated by having participants estimate the likelihood of the average student in a small, medium-sized, or large group. In each study, the valence effect was found, but was due to an effect on estimates of own likelihood, not the average person's likelihood. In Study 2, valence did not interact with group size. The findings contradict the numerosity model, but are in accord with the motivational explanation. Implications for health education are discussed.
Matched samples logistic regression in case-control studies with missing values: when to break the matches.

PubMed

Hansson, Lisbeth; Khamis, Harry J

2008-12-01

Simulated data sets are used to evaluate conditional and unconditional maximum likelihood estimation in an individual case-control design with continuous covariates when there are different rates of excluded cases and different levels of other design parameters. The effectiveness of the estimation procedures is measured by method bias, variance of the estimators, root mean square error (RMSE) for logistic regression and the percentage of explained variation. Conditional estimation leads to higher RMSE than unconditional estimation in the presence of missing observations, especially for 1:1 matching. The RMSE is higher for the smaller stratum size, especially for the 1:1 matching. The percentage of explained variation appears to be insensitive to missing data, but is generally higher for the conditional estimation than for the unconditional estimation. It is particularly good for the 1:2 matching design. For minimizing RMSE, a high matching ratio is recommended; in this case, conditional and unconditional logistic regression models yield comparable levels of effectiveness. For maximizing the percentage of explained variation, the 1:2 matching design with the conditional logistic regression model is recommended.
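The bias, variance, and RMSE criteria mentioned above are linked by the identity RMSE² = bias² + variance. The short sketch below computes all three from a set of simulated estimates; the function name and the numbers are hypothetical and are not results from the study.

```python
import math
import statistics

def bias_variance_rmse(estimates, true_value):
    """Summarize simulated estimates of one parameter.

    Returns (bias, variance, rmse), where variance is the population
    variance of the estimates around their own mean.
    """
    mean_est = statistics.fmean(estimates)
    bias = mean_est - true_value
    variance = statistics.pvariance(estimates)
    mse = statistics.fmean([(e - true_value) ** 2 for e in estimates])
    return bias, variance, math.sqrt(mse)

# Hypothetical estimates of a log-odds-ratio whose true value is 1.0.
simulated = [0.9, 1.1, 1.2, 1.0]
bias, variance, rmse = bias_variance_rmse(simulated, true_value=1.0)
```

Because of the identity, an estimator with small bias can still have a large RMSE when its variance is high, which is why the two are reported together.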
Genealogical Working Distributions for Bayesian Model Testing with Phylogenetic Uncertainty

PubMed Central

Baele, Guy; Lemey, Philippe; Suchard, Marc A.

2016-01-01

Marginal likelihood estimates to compare models using Bayes factors frequently accompany Bayesian phylogenetic inference. Approaches to estimate marginal likelihoods have garnered increased attention over the past decade. In particular, the introduction of path sampling (PS) and stepping-stone sampling (SS) into Bayesian phylogenetics has tremendously improved the accuracy of model selection. These sampling techniques are now used to evaluate complex evolutionary and population genetic models on empirical data sets, but considerable computational demands hamper their widespread adoption. Further, when very diffuse, but proper priors are specified for model parameters, numerical issues complicate the exploration of the priors, a necessary step in marginal likelihood estimation using PS or SS. To avoid such instabilities, generalized SS (GSS) has recently been proposed, introducing the concept of “working distributions” to facilitate—or shorten—the integration process that underlies marginal likelihood estimation. However, the need to fix the tree topology currently limits GSS in a coalescent-based framework. Here, we extend GSS by relaxing the fixed underlying tree topology assumption. To this purpose, we introduce a “working” distribution on the space of genealogies, which enables estimating marginal likelihoods while accommodating phylogenetic uncertainty. We propose two different “working” distributions that help GSS to outperform PS and SS in terms of accuracy when comparing demographic and evolutionary models applied to synthetic data and real-world examples. Further, we show that the use of very diffuse priors can lead to a considerable overestimation in marginal likelihood when using PS and SS, while still retrieving the correct marginal likelihood using both GSS approaches. The methods used in this article are available in BEAST, a powerful user-friendly software package to perform Bayesian evolutionary analyses. PMID:26526428
Estimating a Logistic Discrimination Functions When One of the Training Samples Is Subject to Misclassification: A Maximum Likelihood Approach.

PubMed

Nagelkerke, Nico; Fidler, Vaclav

2015-01-01

The problem of discrimination and classification is central to much of epidemiology. Here we consider the estimation of a logistic regression/discrimination function from training samples, when one of the training samples is subject to misclassification or mislabeling, e.g. diseased individuals are incorrectly classified/labeled as healthy controls. We show that this leads to a zero-inflated binomial model with a defective logistic regression or discrimination function, whose parameters can be estimated using standard statistical methods such as maximum likelihood. These parameters can be used to estimate the probability of true group membership among those, possibly erroneously, classified as controls. Two examples are analyzed and discussed. A simulation study explores properties of the maximum likelihood parameter estimates and the estimates of the number of mislabeled observations.
Sequential Bayesian Filters for Estimating Time Series of Wrapped and Unwrapped Angles with Hyperparameter Estimation

NASA Astrophysics Data System (ADS)

Umehara, Hiroaki; Okada, Masato; Naruse, Yasushi

2018-03-01

The estimation of angular time series data is a widespread issue relating to various situations involving rotational motion and moving objects. There are two kinds of problem settings: the estimation of wrapped angles, which are principal values in a circular coordinate system (e.g., the direction of an object), and the estimation of unwrapped angles in an unbounded coordinate system such as for the positioning and tracking of moving objects measured by the signal-wave phase. Wrapped angles have been estimated in previous studies by sequential Bayesian filtering; however, the hyperparameters that are to be solved and that control the properties of the estimation model were given a priori. The present study establishes a procedure of hyperparameter estimation from the observation data of angles only, using the framework of Bayesian inference completely as the maximum likelihood estimation. Moreover, the filter model is modified to estimate the unwrapped angles. It is proved that without noise our model reduces to the existing algorithm of Itoh's unwrapping transform. It is numerically confirmed that our model is an extension of unwrapping estimation from Itoh's unwrapping transform to the case with noise.
Cosmic shear measurement with maximum likelihood and maximum a posteriori inference

NASA Astrophysics Data System (ADS)

Hall, Alex; Taylor, Andy

2017-06-01

We investigate the problem of noise bias in maximum likelihood and maximum a posteriori estimators for cosmic shear. We derive the leading and next-to-leading order biases and compute them in the context of galaxy ellipticity measurements, extending previous work on maximum likelihood inference for weak lensing. We show that a large part of the bias on these point estimators can be removed using information already contained in the likelihood when a galaxy model is specified, without the need for external calibration. We test these bias-corrected estimators on simulated galaxy images similar to those expected from planned space-based weak lensing surveys, with promising results. We find that the introduction of an intrinsic shape prior can help with mitigation of noise bias, such that the maximum a posteriori estimate can be made less biased than the maximum likelihood estimate. Second-order terms offer a check on the convergence of the estimators, but are largely subdominant. We show how biases propagate to shear estimates, demonstrating in our simple set-up that shear biases can be reduced by orders of magnitude and potentially to within the requirements of planned space-based surveys at mild signal-to-noise ratio. We find that second-order terms can exhibit significant cancellations at low signal-to-noise ratio when Gaussian noise is assumed, which has implications for inferring the performance of shear-measurement algorithms from simplified simulations. We discuss the viability of our point estimators as tools for lensing inference, arguing that they allow for the robust measurement of ellipticity and shear.
Identification and Inference for Econometric Models

NASA Astrophysics Data System (ADS)

Andrews, Donald W. K.; Stock, James H.

2005-07-01

This volume contains the papers presented in honor of the lifelong achievements of Thomas J. Rothenberg on the occasion of his retirement. The authors of the chapters include many of the leading econometricians of our day, and the chapters address topics of current research significance in econometric theory. The chapters cover four themes: identification and efficient estimation in econometrics, asymptotic approximations to the distributions of econometric estimators and tests, inference involving potentially nonstationary time series, such as processes that might have a unit autoregressive root, and nonparametric and semiparametric inference. Several of the chapters provide overviews and treatments of basic conceptual issues, while others advance our understanding of the properties of existing econometric procedures and/or propose new ones. Specific topics include identification in nonlinear models, inference with weak instruments, tests for nonstationarity in time series and panel data, generalized empirical likelihood estimation, and the bootstrap.
Finite-size analysis of continuous-variable measurement-device-independent quantum key distribution

NASA Astrophysics Data System (ADS)

Zhang, Xueying; Zhang, Yichen; Zhao, Yijia; Wang, Xiangyu; Yu, Song; Guo, Hong

2017-10-01

We study the impact of the finite-size effect on the continuous-variable measurement-device-independent quantum key distribution (CV-MDI QKD) protocol, mainly considering the finite-size effect on the parameter estimation procedure. The central-limit theorem and maximum likelihood estimation theorem are used to estimate the parameters. We also analyze the relationship between the number of exchanged signals and the optimal modulation variance in the protocol. It is proved that when Charlie's position is close to Bob, the CV-MDI QKD protocol has the farthest transmission distance in the finite-size scenario. Finally, we discuss the impact of finite-size effects related to the practical detection in the CV-MDI QKD protocol. The overall results indicate that the finite-size effect has a great influence on the secret-key rate of the CV-MDI QKD protocol and should not be ignored.
Comparison of two non-convex mixed-integer nonlinear programming algorithms applied to autoregressive moving average model structure and parameter estimation

NASA Astrophysics Data System (ADS)

Uilhoorn, F. E.

2016-10-01

In this article, the stochastic modelling approach proposed by Box and Jenkins is treated as a mixed-integer nonlinear programming (MINLP) problem solved with a mesh adaptive direct search and a real-coded genetic class of algorithms. The aim is to estimate the real-valued parameters and non-negative integer, correlated structure of stationary autoregressive moving average (ARMA) processes. The maximum likelihood function of the stationary ARMA process is embedded in Akaike's information criterion and the Bayesian information criterion, whereas the estimation procedure is based on Kalman filter recursions. The constraints imposed on the objective function enforce stability and invertibility. The best ARMA model is regarded as the global minimum of the non-convex MINLP problem. The robustness and computational performance of the MINLP solvers are compared with brute-force enumeration. Numerical experiments are done for existing time series and one new data set.
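The brute-force baseline mentioned above can be pictured as scoring every candidate order with an information criterion and keeping the smallest value. The sketch below does this for AIC = 2k − 2 ln L and BIC = k ln n − 2 ln L. The log-likelihood values, the sample size, and the parameter count (AR and MA coefficients plus the innovation variance, no mean term) are assumptions made for illustration; in the article the likelihood comes from Kalman filter recursions, which are not reproduced here.

```python
import math

def aic(log_lik, k):
    """Akaike's information criterion for k free parameters."""
    return 2 * k - 2 * log_lik

def bic(log_lik, k, n):
    """Bayesian information criterion for k parameters and n observations."""
    return k * math.log(n) - 2 * log_lik

def n_params(p, q):
    """Assumed count: p AR terms, q MA terms, plus the innovation variance."""
    return p + q + 1

# Hypothetical maximized log-likelihoods for candidate ARMA(p, q) orders.
candidates = {(1, 0): -152.3, (1, 1): -148.9, (2, 1): -148.5}
n_obs = 100

# Brute-force enumeration: keep the order with the smallest criterion value.
best_aic = min(candidates, key=lambda o: aic(candidates[o], n_params(*o)))
best_bic = min(candidates, key=lambda o: bic(candidates[o], n_params(*o), n_obs))
```

With these illustrative numbers the extra parameter of ARMA(2, 1) does not buy enough likelihood to offset its penalty, so both criteria select ARMA(1, 1).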
A Comparison of Pseudo-Maximum Likelihood and Asymptotically Distribution-Free Dynamic Factor Analysis Parameter Estimation in Fitting Covariance Structure Models to Block-Toeplitz Matrices Representing Single-Subject Multivariate Time-Series.

ERIC Educational Resources Information Center

Molenaar, Peter C. M.; Nesselroade, John R.

1998-01-01

Pseudo-Maximum Likelihood (p-ML) and Asymptotically Distribution Free (ADF) estimation methods for estimating dynamic factor model parameters within a covariance structure framework were compared through a Monte Carlo simulation. Both methods appear to give consistent model parameter estimates, but only ADF gives standard errors and chi-square…
The CLASSY clustering algorithm: Description, evaluation, and comparison with the iterative self-organizing clustering system (ISOCLS). [used for LACIE data]

NASA Technical Reports Server (NTRS)

Lennington, R. K.; Malek, H.

1978-01-01

A clustering method, CLASSY, was developed, which alternates maximum likelihood iteration with a procedure for splitting, combining, and eliminating the resulting statistics. The method maximizes the fit of a mixture of normal distributions to the observed first through fourth central moments of the data and produces an estimate of the proportions, means, and covariances in this mixture. The mathematical model that is the basis for CLASSY and the actual operation of the algorithm are described. Data comparing the performances of CLASSY and ISOCLS on simulated and actual LACIE data are presented.
Cosmological parameters from a re-analysis of the WMAP 7 year low-resolution maps

NASA Astrophysics Data System (ADS)

Finelli, F.; De Rosa, A.; Gruppuso, A.; Paoletti, D.

2013-06-01

Cosmological parameters from Wilkinson Microwave Anisotropy Probe (WMAP) 7 year data are re-analysed by substituting a pixel-based likelihood estimator for the one delivered publicly by the WMAP team. Our pixel-based estimator handles intensity and polarization exactly and jointly, allowing us to use low-resolution maps and noise covariance matrices in T, Q, U at the same resolution, which in this work is 3.6°. We describe the features and the performance of the code implementing our pixel-based likelihood estimator. We perform a battery of tests on the application of our pixel-based likelihood routine to WMAP publicly available low-resolution foreground-cleaned products, in combination with the WMAP high-ℓ likelihood, reporting the differences in cosmological parameters relative to those evaluated by the full WMAP likelihood public package. The differences are due not only to the treatment of polarization, but also to the marginalization over monopole and dipole uncertainties present in the WMAP pixel likelihood code for temperature. The credible central values of the cosmological parameters change by less than 1σ with respect to the evaluation by the full WMAP 7 year likelihood code, with the largest difference being a shift to smaller values of the scalar spectral index nS.
Modeling of 2D diffusion processes based on microscopy data: parameter estimation and practical identifiability analysis.

PubMed

Hock, Sabrina; Hasenauer, Jan; Theis, Fabian J

2013-01-01

Diffusion is a key component of many biological processes such as chemotaxis, developmental differentiation and tissue morphogenesis. It has recently become possible to assess the spatial gradients caused by diffusion in vitro and in vivo using microscopy-based imaging techniques. The resulting time series of two-dimensional, high-resolution images, in combination with mechanistic models, enable the quantitative analysis of the underlying mechanisms. However, such a model-based analysis is still challenging due to measurement noise and sparse observations, which result in uncertainties in the model parameters. We introduce a likelihood function for image-based measurements with log-normally distributed noise. Based upon this likelihood function we formulate the maximum likelihood estimation problem, which is solved using PDE-constrained optimization methods. To assess the uncertainty and practical identifiability of the parameters we introduce profile likelihoods for diffusion processes. As proof of concept, we model certain aspects of the guidance of dendritic cells towards lymphatic vessels, an example of haptotaxis. Using a realistic set of artificial measurement data, we estimate the five kinetic parameters of this model and compute profile likelihoods. Our novel approach for the estimation of model parameters from image data, as well as the proposed identifiability analysis approach, is widely applicable to diffusion processes. The profile-likelihood-based method provides more rigorous uncertainty bounds than local approximation methods.
Bayesian Image Segmentations by Potts Prior and Loopy Belief Propagation

NASA Astrophysics Data System (ADS)

Tanaka, Kazuyuki; Kataoka, Shun; Yasuda, Muneki; Waizumi, Yuji; Hsu, Chiou-Ting

2014-12-01

This paper presents a Bayesian image segmentation model based on a Potts prior and loopy belief propagation. The proposed Bayesian model involves several terms, including the pairwise interactions of Potts models, and the average vectors and covariance matrices of Gaussian distributions in color image modeling. These terms are often referred to as hyperparameters in statistical machine learning theory. In order to determine these hyperparameters, we propose a new scheme for hyperparameter estimation based on conditional maximization of entropy in the Potts prior. The algorithm is based on loopy belief propagation. In addition, we compare our conditional maximum entropy framework with the conventional maximum likelihood framework, and also clarify how the first-order phase transitions in loopy belief propagation for Potts models influence our hyperparameter estimation procedures.
Some Small Sample Results for Maximum Likelihood Estimation in Multidimensional Scaling.

ERIC Educational Resources Information Center

Ramsay, J. O.

1980-01-01

Some aspects of the small sample behavior of maximum likelihood estimates in multidimensional scaling are investigated with Monte Carlo techniques. In particular, the chi square test for dimensionality is examined and a correction for bias is proposed and evaluated. (Author/JKS)
A spatially explicit capture-recapture estimator for single-catch traps.

PubMed

Distiller, Greg; Borchers, David L

2015-11-01

Single-catch traps are frequently used in live-trapping studies of small mammals. Thus far, a likelihood for single-catch traps has proven elusive and usually the likelihood for multicatch traps is used for spatially explicit capture-recapture (SECR) analyses of such data. Previous work found the multicatch likelihood to provide a robust estimator of average density. We build on a recently developed continuous-time model for SECR to derive a likelihood for single-catch traps. We use this to develop an estimator based on observed capture times and compare its performance by simulation to that of the multicatch estimator for various scenarios with nonconstant density surfaces. While the multicatch estimator is found to be a surprisingly robust estimator of average density, its performance deteriorates with high trap saturation and increasing density gradients. Moreover, it is found to be a poor estimator of the height of the detection function. By contrast, the single-catch estimators of density, distribution, and detection function parameters are found to be unbiased or nearly unbiased in all scenarios considered. This gain comes at the cost of higher variance. If there is no interest in interpreting the detection function parameters themselves, and if density is expected to be fairly constant over the survey region, then the multicatch estimator performs well with single-catch traps. However if accurate estimation of the detection function is of interest, or if density is expected to vary substantially in space, then there is merit in using the single-catch estimator when trap saturation is above about 60%. The estimator's performance is improved if care is taken to place traps so as to span the range of variables that affect animal distribution. As a single-catch likelihood with unknown capture times remains intractable for now, researchers using single-catch traps should aim to incorporate timing devices with their traps.

Maintained Individual Data Distributed Likelihood Estimation (MIDDLE)

PubMed Central

Boker, Steven M.; Brick, Timothy R.; Pritikin, Joshua N.; Wang, Yang; von Oertzen, Timo; Brown, Donald; Lach, John; Estabrook, Ryne; Hunter, Michael D.; Maes, Hermine H.; Neale, Michael C.

2015-01-01

Maintained Individual Data Distributed Likelihood Estimation (MIDDLE) is a novel paradigm for research in the behavioral, social, and health sciences. The MIDDLE approach is based on the seemingly impossible idea that data can be privately maintained by participants and never revealed to researchers, while still enabling statistical models to be fit and scientific hypotheses tested. MIDDLE rests on the assumption that participant data should belong to, be controlled by, and remain in the possession of the participants themselves. Distributed likelihood estimation refers to fitting statistical models by sending an objective function and vector of parameters to each participant’s personal device (e.g., smartphone, tablet, computer), where the likelihood of that individual’s data is calculated locally. Only the likelihood value is returned to the central optimizer. The optimizer aggregates likelihood values from responding participants and chooses new vectors of parameters until the model converges. A MIDDLE study provides significantly greater privacy for participants, automatic management of opt-in and opt-out consent, lower cost for the researcher and funding institute, and faster determination of results. Furthermore, if a participant opts into several studies simultaneously and opts into data sharing, these studies automatically have access to individual-level longitudinal data linked across all studies. PMID:26717128
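The send-parameters, return-likelihood loop described in this abstract can be sketched in a few lines. This is a toy under stated assumptions, not the MIDDLE implementation: the model is a normal distribution with unknown mean and unit variance, the names `Device`, `local_loglik`, and `central_fit` are hypothetical, and a golden-section search stands in for whatever optimizer a real system would use.

```python
import math


class Device:
    """Hypothetical participant device: keeps its data private, returns only a likelihood value."""

    def __init__(self, data):
        self._data = list(data)  # raw data never leaves this object

    def local_loglik(self, mu):
        # Log-likelihood of this participant's data under Normal(mu, 1).
        return sum(-0.5 * math.log(2 * math.pi) - 0.5 * (x - mu) ** 2 for x in self._data)


def central_fit(devices, lo=-10.0, hi=10.0, tol=1e-8):
    """Central optimizer: maximizes the summed log-likelihood by golden-section search."""

    def total(mu):
        # Aggregate the scalar values returned by the devices.
        return sum(d.local_loglik(mu) for d in devices)

    g = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    while b - a > tol:
        c, d = b - g * (b - a), a + g * (b - a)
        if total(c) > total(d):
            b = d  # maximum lies in [a, d]
        else:
            a = c  # maximum lies in [c, b]
    return (a + b) / 2


# Hypothetical data split across three participants.
devices = [Device([1.0, 2.0]), Device([3.0]), Device([4.0, 5.0, 6.0])]
mu_hat = central_fit(devices)
```

For this model the result should agree with the pooled sample mean, even though the optimizer only ever sees summed likelihood values.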
Generalized weighted likelihood density estimators with application to finite mixture of exponential family distributions

PubMed Central

Zhan, Tingting; Chervoneva, Inna; Iglewicz, Boris

2010-01-01

The family of weighted likelihood estimators largely overlaps with minimum divergence estimators. They are more robust to data contamination than the MLE. We define the class of generalized weighted likelihood estimators (GWLE), provide its influence function and discuss the efficiency requirements. We introduce a new truncated cubic-inverse weight, which is both first- and second-order efficient and more robust than previously reported weights. We also discuss new ways of selecting the smoothing bandwidth and weighted starting values for the iterative algorithm. The advantage of the truncated cubic-inverse weight is illustrated in a simulation study of a three-component normal mixture model with large overlap and heavy contamination. A real data example is also provided. PMID:20835375
Bayesian structural equation modeling in sport and exercise psychology.

PubMed

Stenling, Andreas; Ivarsson, Andreas; Johnson, Urban; Lindwall, Magnus

2015-08-01

Bayesian statistics is on the rise in mainstream psychology, but applications in sport and exercise psychology research are scarce. In this article, the foundations of Bayesian analysis are introduced, and we will illustrate how to apply Bayesian structural equation modeling in a sport and exercise psychology setting. More specifically, we contrasted a confirmatory factor analysis on the Sport Motivation Scale II estimated with the most commonly used estimator, maximum likelihood, and a Bayesian approach with weakly informative priors for cross-loadings and correlated residuals. The results indicated that the model with Bayesian estimation and weakly informative priors provided a good fit to the data, whereas the model estimated with a maximum likelihood estimator did not produce a well-fitting model. The reasons for this discrepancy between maximum likelihood and Bayesian estimation are discussed as well as potential advantages and caveats with the Bayesian approach.
Extreme deconvolution: Inferring complete distribution functions from noisy, heterogeneous and incomplete observations

NASA Astrophysics Data System (ADS)

Bovy, Jo; Hogg, David W.; Roweis, Sam T.

2011-06-01

We generalize the well-known mixtures of Gaussians approach to density estimation and the accompanying Expectation-Maximization technique for finding the maximum likelihood parameters of the mixture to the case where each data point carries an individual d-dimensional uncertainty covariance and has unique missing data properties. This algorithm reconstructs the error-deconvolved or "underlying" distribution function common to all samples, even when the individual data points are samples from different distributions, obtained by convolving the underlying distribution with the heteroskedastic uncertainty distribution of the data point and projecting out the missing data directions. We show how this basic algorithm can be extended with conjugate priors on all of the model parameters and a "split-and-merge" procedure designed to avoid local maxima of the likelihood. We demonstrate the full method by applying it to the problem of inferring the three-dimensional velocity distribution of stars near the Sun from noisy two-dimensional, transverse velocity measurements from the Hipparcos satellite.
New estimates of the CMB angular power spectra from the WMAP 5 year low-resolution data

NASA Astrophysics Data System (ADS)

Gruppuso, A.; de Rosa, A.; Cabella, P.; Paci, F.; Finelli, F.; Natoli, P.; de Gasperis, G.; Mandolesi, N.

2009-11-01

A quadratic maximum likelihood (QML) estimator is applied to the Wilkinson Microwave Anisotropy Probe (WMAP) 5 year low-resolution maps to compute the cosmic microwave background angular power spectra (APS) at large scales for both temperature and polarization. Estimates and error bars for the six APS are provided up to l = 32 and compared, when possible, to those obtained by the WMAP team, without finding any inconsistency. The conditional likelihood slices are also computed for the $C_\ell$ of all six power spectra from l = 2 to 10 through a pixel-based likelihood code. Both codes treat the covariance for (T, Q, U) in a single matrix without employing any approximation. The inputs to both codes (foreground-reduced maps, related covariances and masks) are provided by the WMAP team. The peaks of the likelihood slices are always consistent with the QML estimates within the error bars; however, an excellent agreement occurs when the QML estimates are used as a fiducial power spectrum instead of the best-fitting theoretical power spectrum. By the full computation of the conditional likelihood on the estimated spectra, the value of the temperature quadrupole $C^{TT}_{\ell=2}$ is found to be less than 2σ away from the WMAP 5 year Λ cold dark matter best-fitting value. The BB spectrum is found to be consistent with zero, and upper limits on the B modes are provided. The parity-odd signals TB and EB are found to be consistent with zero.
Fuzzy multinomial logistic regression analysis: A multi-objective programming approach

NASA Astrophysics Data System (ADS)

Abdalla, Hesham A.; El-Sayed, Amany A.; Hamed, Ramadan

2017-05-01

Parameter estimation for multinomial logistic regression is usually based on maximizing the likelihood function. For large, well-balanced datasets, maximum likelihood (ML) estimation is a satisfactory approach. Unfortunately, ML can fail completely, or at least produce poor results in terms of estimated probabilities and confidence intervals of parameters, especially for small datasets. In this study, a new approach based on fuzzy concepts is proposed to estimate the parameters of multinomial logistic regression. The study assumes that the parameters of multinomial logistic regression are fuzzy. Based on the extension principle stated by Zadeh and Bárdossy's proposition, a multi-objective programming approach is suggested to estimate these fuzzy parameters. A simulation study is used to evaluate the performance of the new approach against the ML approach. Results show that the proposed model outperforms ML for small datasets.
Computing maximum-likelihood estimates for parameters of the National Descriptive Model of Mercury in Fish

USGS Publications Warehouse

Donato, David I.

2012-01-01

This report presents the mathematical expressions and the computational techniques required to compute maximum-likelihood estimates for the parameters of the National Descriptive Model of Mercury in Fish (NDMMF), a statistical model used to predict the concentration of methylmercury in fish tissue. The expressions and techniques reported here were prepared to support the development of custom software capable of computing NDMMF parameter estimates more quickly and using less computer memory than is currently possible with available general-purpose statistical software. Computation of maximum-likelihood estimates for the NDMMF by numerical solution of a system of simultaneous equations through repeated Newton-Raphson iterations is described. This report explains the derivation of the mathematical expressions required for computational parameter estimation in sufficient detail to facilitate future derivations for any revised versions of the NDMMF that may be developed.
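Newton-Raphson iteration for a maximum-likelihood estimate can be shown on a much smaller problem than the NDMMF. The sketch below is a one-parameter toy, assuming a zero-truncated Poisson model (not the NDMMF, which requires a system of simultaneous equations); the function name and the data are hypothetical.

```python
import math


def newton_mle_truncated_poisson(xs, tol=1e-12, max_iter=100):
    """Newton-Raphson MLE of the rate of a zero-truncated Poisson distribution.

    Log-likelihood (up to a constant): s*log(lam) - n*lam - n*log(1 - exp(-lam)),
    where s = sum(xs) and n = len(xs). Requires a sample mean greater than 1.
    """
    n, s = len(xs), sum(xs)
    lam = s / n  # starting value: the sample mean
    for _ in range(max_iter):
        q = 1.0 - math.exp(-lam)
        score = s / lam - n / q                          # first derivative
        hess = -s / lam**2 + n * math.exp(-lam) / q**2   # second derivative
        step = score / hess
        lam -= step                                      # Newton-Raphson update
        if abs(step) < tol:
            return lam
    raise RuntimeError("Newton-Raphson did not converge")


# Hypothetical counts (all at least 1, as the truncated model requires).
counts = [1, 1, 2, 3, 1, 2, 4, 1, 2, 3]
lam_hat = newton_mle_truncated_poisson(counts)
```

At convergence the score is zero, which for this model means the fitted mean lam / (1 - exp(-lam)) matches the sample mean.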
Forest inventory using multistage sampling with probability proportional to size. [Brazil]

NASA Technical Reports Server (NTRS)

Parada, N. D. J. (Principal Investigator); Lee, D. C. L.; Hernandezfilho, P.; Shimabukuro, Y. E.; Deassis, O. R.; Demedeiros, J. S.

1984-01-01

A multistage sampling technique, with probability proportional to size, for forest volume inventory using remote sensing data is developed and evaluated. The study area is located in southeastern Brazil. The LANDSAT 4 digital data of the study area are used in the first stage for automatic classification of reforested areas. Four classes of pine and eucalypt with different tree volumes are classified utilizing a maximum likelihood classification algorithm. Color infrared aerial photographs are utilized in the second stage of sampling. In the third stage (ground level) the time volume of each class is determined. The total time volume of each class is expanded through a statistical procedure taking into account all three stages of sampling. This procedure results in an accurate time volume estimate with a smaller number of aerial photographs and reduced time in field work.
Predicting bulk permeability using outcrop fracture attributes: The benefits of a Maximum Likelihood Estimator

NASA Astrophysics Data System (ADS)

Rizzo, R. E.; Healy, D.; De Siena, L.

2015-12-01

The success of any model prediction is largely dependent on the accuracy with which its parameters are known. In characterising fracture networks in naturally fractured rocks, the main issues are related to the difficulties in accurately up- and down-scaling the parameters governing the distribution of fracture attributes. Optimal characterisation and analysis of fracture attributes (fracture lengths, apertures, orientations and densities) represents a fundamental step which can aid the estimation of permeability and fluid flow, which are of primary importance in a number of contexts ranging from hydrocarbon production in fractured reservoirs and reservoir stimulation by hydrofracturing, to geothermal energy extraction and deeper Earth systems, such as earthquakes and ocean floor hydrothermal venting. This work focuses on linking fracture data collected directly from outcrops to permeability estimation and fracture network modelling. Outcrop studies can supplement the limited data inherent to natural fractured systems in the subsurface. The study area is a highly fractured upper Miocene biosiliceous mudstone formation cropping out along the coastline north of Santa Cruz (California, USA). These unique outcrops expose a recently active bitumen-bearing formation representing a geological analogue of a fractured top seal. In order to validate field observations as useful analogues of subsurface reservoirs, we describe a methodology of statistical analysis for more accurate probability distributions of fracture attributes, using Maximum Likelihood Estimators. These procedures aim to understand whether the average permeability of a fracture network can be predicted with reduced uncertainty, and whether outcrop measurements of fracture attributes can be used directly to generate statistically identical fracture network models.
Maximum Likelihood Estimation of Nonlinear Structural Equation Models.

ERIC Educational Resources Information Center

Lee, Sik-Yum; Zhu, Hong-Tu

2002-01-01

Developed an EM type algorithm for maximum likelihood estimation of a general nonlinear structural equation model in which the E-step is completed by a Metropolis-Hastings algorithm. Illustrated the methodology with results from a simulation study and two real examples using data from previous studies. (SLD)
Time-series analyses of air pollution and mortality in the United States: a subsampling approach.

PubMed

Moolgavkar, Suresh H; McClellan, Roger O; Dewanji, Anup; Turim, Jay; Luebeck, E Georg; Edwards, Melanie

2013-01-01

Hierarchical Bayesian methods have been used in previous papers to estimate national mean effects of air pollutants on daily deaths in time-series analyses. We obtained maximum likelihood estimates of the common national effects of the criteria pollutants on mortality based on time-series data from ≤ 108 metropolitan areas in the United States. We used a subsampling bootstrap procedure to obtain the maximum likelihood estimates and confidence bounds for common national effects of the criteria pollutants, as measured by the percentage increase in daily mortality associated with a unit increase in daily 24-hr mean pollutant concentration on the previous day, while controlling for weather and temporal trends. We considered five pollutants [PM10, ozone (O3), carbon monoxide (CO), nitrogen dioxide (NO2), and sulfur dioxide (SO2)] in single- and multipollutant analyses. Flexible ambient concentration-response models for the pollutant effects were considered as well. We performed limited sensitivity analyses with different degrees of freedom for time trends. In single-pollutant models, we observed significant associations of daily deaths with all pollutants. The O3 coefficient was highly sensitive to the degree of smoothing of time trends. Among the gases, SO2 and NO2 were most strongly associated with mortality. The flexible ambient concentration-response curve for O3 showed evidence of nonlinearity and a threshold at about 30 ppb. Differences between the results of our analyses and those reported from using the Bayesian approach suggest that estimates of the quantitative impact of pollutants depend on the choice of statistical approach, although results are not directly comparable because they are based on different data. In addition, the estimate of the O3-mortality coefficient depends on the amount of smoothing of time trends.
A New Monte Carlo Method for Estimating Marginal Likelihoods.

PubMed

Wang, Yu-Bo; Chen, Ming-Hui; Kuo, Lynn; Lewis, Paul O

2018-06-01

Evaluating the marginal likelihood in Bayesian analysis is essential for model selection. Estimators based on a single Markov chain Monte Carlo sample from the posterior distribution include the harmonic mean estimator and the inflated density ratio estimator. We propose a new class of Monte Carlo estimators based on this single Markov chain Monte Carlo sample. This class can be thought of as a generalization of the harmonic mean and inflated density ratio estimators using a partition weighted kernel (likelihood times prior). We show that our estimator is consistent and has better theoretical properties than the harmonic mean and inflated density ratio estimators. In addition, we provide guidelines on choosing optimal weights. Simulation studies were conducted to examine the empirical performance of the proposed estimator. We further demonstrate the desirable features of the proposed estimator with two real data sets: one is from a prostate cancer study using an ordinal probit regression model with latent variables; the other is for the power prior construction from two Eastern Cooperative Oncology Group phase III clinical trials using the cure rate survival model with similar objectives.
Mixture Rasch Models with Joint Maximum Likelihood Estimation

ERIC Educational Resources Information Center

Willse, John T.

2011-01-01

This research provides a demonstration of the utility of mixture Rasch models. Specifically, a model capable of estimating a mixture partial credit model using joint maximum likelihood is presented. Like the partial credit model, the mixture partial credit model has the beneficial feature of being appropriate for analysis of assessment data…
Bayesian Monte Carlo and Maximum Likelihood Approach for Uncertainty Estimation and Risk Management: Application to Lake Oxygen Recovery Model

EPA Science Inventory

Model uncertainty estimation and risk assessment is essential to environmental management and informed decision making on pollution mitigation strategies. In this study, we apply a probabilistic methodology, which combines Bayesian Monte Carlo simulation and Maximum Likelihood e...
The Effects of Model Misspecification and Sample Size on LISREL Maximum Likelihood Estimates.

ERIC Educational Resources Information Center

Baldwin, Beatrice

The robustness of LISREL computer program maximum likelihood estimates under specific conditions of model misspecification and sample size was examined. The population model used in this study contains one exogenous variable; three endogenous variables; and eight indicator variables, two for each latent variable. Conditions of model…
Propagating Water Quality Analysis Uncertainty Into Resource Management Decisions Through Probabilistic Modeling

NASA Astrophysics Data System (ADS)

Gronewold, A. D.; Wolpert, R. L.; Reckhow, K. H.

2007-12-01

Most probable number (MPN) and colony-forming-unit (CFU) are two estimates of fecal coliform bacteria concentration commonly used as measures of water quality in United States shellfish harvesting waters. The MPN is the maximum likelihood estimate (or MLE) of the true fecal coliform concentration based on counts of non-sterile tubes in serial dilution of a sample aliquot, indicating bacterial metabolic activity. The CFU is the MLE of the true fecal coliform concentration based on the number of bacteria colonies emerging on a growth plate after inoculation from a sample aliquot. Each estimating procedure has intrinsic variability and is subject to additional uncertainty arising from minor variations in experimental protocol. Several versions of each procedure (using different sized aliquots or different numbers of tubes, for example) are in common use, each with its own levels of probabilistic and experimental error and uncertainty. It has been observed empirically that the MPN procedure is more variable than the CFU procedure, and that MPN estimates are somewhat higher on average than CFU estimates, on split samples from the same water bodies. We construct a probabilistic model that provides a clear theoretical explanation for the observed variability in, and discrepancy between, MPN and CFU measurements. We then explore how this variability and uncertainty might propagate into shellfish harvesting area management decisions through a two-phased modeling strategy. First, we apply our probabilistic model in a simulation-based analysis of future water quality standard violation frequencies under alternative land use scenarios, such as those evaluated under guidelines of the total maximum daily load (TMDL) program. 
Second, we apply our model to water quality data from shellfish harvesting areas which at present are closed (either conditionally or permanently) to shellfishing, to determine if alternative laboratory analysis procedures might have led to different management decisions. Our research results indicate that the (often large) observed differences between MPN and CFU values for the same water body are well within the ranges predicted by our probabilistic model. Our research also indicates that the probability of violating current water quality guidelines at specified true fecal coliform concentrations depends on the laboratory procedure used. As a result, quality-based management decisions, such as opening or closing a shellfishing area, may also depend on the laboratory procedure used.
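The statement that the MPN is a maximum likelihood estimate can be made concrete with the usual serial-dilution model, in which a tube inoculated with volume v is positive with probability 1 - exp(-c v) at concentration c. The sketch below is an illustration under that assumption, not the authors' probabilistic model; the function name `mpn_mle` and the bracketing bounds are hypothetical.

```python
import math


def mpn_mle(volumes, tubes, positives, lo=1e-9, hi=1e6, iters=200):
    """MLE of concentration (organisms per unit volume) from a serial dilution.

    volumes[i]   : sample volume inoculated into each tube at dilution i
    tubes[i]     : number of tubes at dilution i
    positives[i] : number of positive (non-sterile) tubes at dilution i
    """
    if sum(positives) == 0 or all(g == n for g, n in zip(positives, tubes)):
        raise ValueError("MLE is 0 or infinite when no tubes or all tubes are positive")

    def score(c):
        # Derivative of the log-likelihood in c, rearranged as
        # sum(g*v / (1 - exp(-c*v))) - sum(n*v); it decreases monotonically in c.
        return sum(g * v / -math.expm1(-c * v) - n * v
                   for v, n, g in zip(volumes, tubes, positives))

    # Bisection on a logarithmic scale between the assumed bounds lo and hi.
    for _ in range(iters):
        mid = math.sqrt(lo * hi)
        if score(mid) > 0:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)


# Hypothetical three-dilution series: 10, 1 and 0.1 mL with 5 tubes each.
estimate = mpn_mle([10.0, 1.0, 0.1], [5, 5, 5], [5, 3, 0])
```

With a single dilution the estimate has the closed form -ln(1 - g/n) / v, which gives a quick check on the solver.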
Maximum likelihood estimates, from censored data, for mixed-Weibull distributions

NASA Astrophysics Data System (ADS)

Jiang, Siyuan; Kececioglu, Dimitri

1992-06-01

A new algorithm for estimating the parameters of mixed-Weibull distributions from censored data is presented. The algorithm follows the principle of maximum likelihood estimation (MLE) through the expectation-maximization (EM) algorithm, and it is derived for both postmortem and nonpostmortem time-to-failure data. It is concluded that the concept of the EM algorithm is easy to understand and apply (only elementary statistics and calculus are required). The log-likelihood function cannot decrease after an EM sequence; this important feature was observed in all of the numerical calculations. The MLEs of the nonpostmortem data were obtained successfully for mixed-Weibull distributions with up to 14 parameters in a 5-subpopulation, mixed-Weibull distribution. Numerical examples indicate that some of the log-likelihood functions of the mixed-Weibull distributions have multiple local maxima; therefore, the algorithm should start at several initial guesses of the parameter set.
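Two points in this abstract, that the log-likelihood cannot decrease across EM iterations and that several starting values should be tried, can be illustrated on a simpler case. The sketch below assumes a two-component exponential mixture (a Weibull mixture with both shape parameters fixed at 1) and uncensored data; it is not the authors' algorithm, and all names and data are hypothetical.

```python
import math


def loglik(xs, pi, lam1, lam2):
    """Log-likelihood of a two-component exponential mixture."""
    return sum(math.log(pi * lam1 * math.exp(-lam1 * x)
                        + (1 - pi) * lam2 * math.exp(-lam2 * x)) for x in xs)


def em_exponential_mixture(xs, pi, lam1, lam2, n_iter=50):
    """Run EM; return final parameters and the log-likelihood after every iteration."""
    trace = [loglik(xs, pi, lam1, lam2)]
    for _ in range(n_iter):
        # E-step: posterior probability that each point came from component 1.
        r = []
        for x in xs:
            a = pi * lam1 * math.exp(-lam1 * x)
            b = (1 - pi) * lam2 * math.exp(-lam2 * x)
            r.append(a / (a + b))
        # M-step: closed-form weighted updates.
        s = sum(r)
        pi = s / len(xs)
        lam1 = s / sum(ri * x for ri, x in zip(r, xs))
        lam2 = (len(xs) - s) / sum((1 - ri) * x for ri, x in zip(r, xs))
        trace.append(loglik(xs, pi, lam1, lam2))
    return pi, lam1, lam2, trace


# Hypothetical times to failure with a short-lived and a long-lived subpopulation.
data = [0.1, 0.2, 0.15, 0.3, 0.25, 5.0, 7.0, 6.5, 8.0, 4.0]

# Several initial guesses; keep the fit with the highest final log-likelihood.
starts = [(0.5, 2.0, 0.5), (0.3, 10.0, 0.1), (0.8, 1.0, 0.2)]
fits = [em_exponential_mixture(data, *start) for start in starts]
best = max(fits, key=lambda fit: fit[3][-1])
```

The `trace` list makes the monotonicity property directly checkable, and comparing the final values across `fits` shows whether the starting points reached the same maximum.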
Log Pearson type 3 quantile estimators with regional skew information and low outlier adjustments

USGS Publications Warehouse

Griffis, V.W.; Stedinger, Jery R.; Cohn, T.A.

2004-01-01

The recently developed expected moments algorithm (EMA) [Cohn et al., 1997] does as well as maximum likelihood estimations at estimating log‐Pearson type 3 (LP3) flood quantiles using systematic and historical flood information. Needed extensions include use of a regional skewness estimator and its precision to be consistent with Bulletin 17B. Another issue addressed by Bulletin 17B is the treatment of low outliers. A Monte Carlo study compares the performance of Bulletin 17B using the entire sample with and without regional skew with estimators that use regional skew and censor low outliers, including an extended EMA estimator, the conditional probability adjustment (CPA) from Bulletin 17B, and an estimator that uses probability plot regression (PPR) to compute substitute values for low outliers. Estimators that neglect regional skew information do much worse than estimators that use an informative regional skewness estimator. For LP3 data the low outlier rejection procedure generally results in no loss of overall accuracy, and the differences between the MSEs of the estimators that used an informative regional skew are generally modest in the skewness range of real interest. Samples contaminated to model actual flood data demonstrate that estimators which give special treatment to low outliers significantly outperform estimators that make no such adjustment.
Log Pearson type 3 quantile estimators with regional skew information and low outlier adjustments

NASA Astrophysics Data System (ADS)

Griffis, V. W.; Stedinger, J. R.; Cohn, T. A.

2004-07-01

The recently developed expected moments algorithm (EMA) [Cohn et al., 1997] does as well as maximum likelihood estimations at estimating log-Pearson type 3 (LP3) flood quantiles using systematic and historical flood information. Needed extensions include use of a regional skewness estimator and its precision to be consistent with Bulletin 17B. Another issue addressed by Bulletin 17B is the treatment of low outliers. A Monte Carlo study compares the performance of Bulletin 17B using the entire sample with and without regional skew with estimators that use regional skew and censor low outliers, including an extended EMA estimator, the conditional probability adjustment (CPA) from Bulletin 17B, and an estimator that uses probability plot regression (PPR) to compute substitute values for low outliers. Estimators that neglect regional skew information do much worse than estimators that use an informative regional skewness estimator. For LP3 data the low outlier rejection procedure generally results in no loss of overall accuracy, and the differences between the MSEs of the estimators that used an informative regional skew are generally modest in the skewness range of real interest. Samples contaminated to model actual flood data demonstrate that estimators which give special treatment to low outliers significantly outperform estimators that make no such adjustment.
Likelihood-based confidence intervals for estimating floods with given return periods

NASA Astrophysics Data System (ADS)

Martins, Eduardo Sávio P. R.; Clarke, Robin T.

1993-06-01

This paper discusses aspects of the calculation of likelihood-based confidence intervals for T-year floods, with particular reference to (1) the two-parameter gamma distribution; (2) the Gumbel distribution; (3) the two-parameter log-normal distribution, and other distributions related to the normal by Box-Cox transformations. Calculation of the confidence limits is straightforward using the Nelder-Mead algorithm with a constraint incorporated, although care is necessary to ensure convergence either of the Nelder-Mead algorithm, or of the Newton-Raphson calculation of maximum-likelihood estimates. Methods are illustrated using records from 18 gauging stations in the basin of the River Itajai-Acu, State of Santa Catarina, southern Brazil. A small and restricted simulation compared likelihood-based confidence limits with those given by use of the central limit theorem; for the same confidence probability, the confidence limits of the simulation were wider than those of the central limit theorem, which failed more frequently to contain the true quantile being estimated. The paper discusses possible applications of likelihood-based confidence intervals in other areas of hydrological analysis.
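As an illustrative sketch (not taken from the paper), the constraint used in a likelihood-based interval for a T-year flood can be written out for the Gumbel case. The location parameter is replaced by the quantile x_T, so that the likelihood can be maximized over the scale for each trial value of x_T. The function names, the parameterization, and the 95% cutoff of 3.841 (chi-square, 1 degree of freedom) are assumptions made here; the maximization itself (for example by Nelder-Mead, as the abstract mentions) is left out.

```python
import math

def gumbel_quantile(mu, beta, T):
    """T-year flood: the value exceeded with probability 1/T in any one year."""
    return mu - beta * math.log(-math.log(1.0 - 1.0 / T))

def gumbel_loglik(data, mu, beta):
    """Gumbel (maxima) log-likelihood with location mu and scale beta."""
    if beta <= 0:
        return -math.inf
    total = 0.0
    for x in data:
        z = (x - mu) / beta
        total += -math.log(beta) - z - math.exp(-z)
    return total

def gumbel_loglik_given_quantile(data, x_T, beta, T):
    """Same log-likelihood, with mu eliminated via the constraint on x_T."""
    mu = x_T + beta * math.log(-math.log(1.0 - 1.0 / T))
    return gumbel_loglik(data, mu, beta)

def in_confidence_set(max_loglik, profile_loglik, chi2_crit=3.841):
    """x_T is inside the interval if the likelihood-ratio statistic is small."""
    return 2.0 * (max_loglik - profile_loglik) <= chi2_crit
```

For each candidate x_T, one would maximize `gumbel_loglik_given_quantile` over `beta` to get the profile value, then keep the x_T values for which `in_confidence_set` holds.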

Three methods to construct predictive models using logistic regression and likelihood ratios to facilitate adjustment for pretest probability give similar results.

PubMed

Chan, Siew Foong; Deeks, Jonathan J; Macaskill, Petra; Irwig, Les

2008-01-01

This study compared three predictive models based on logistic regression for estimating adjusted likelihood ratios that allow for interdependency between diagnostic variables (tests). It comprised a review of the theoretical basis, assumptions, and limitations of published models, together with a statistical extension of methods and an application to a case study of the diagnosis of obstructive airways disease based on history and clinical examination. Albert's method includes an offset term to estimate an adjusted likelihood ratio for combinations of tests. The Spiegelhalter and Knill-Jones method uses the unadjusted likelihood ratio for each test as a predictor and computes shrinkage factors to allow for interdependence. Knottnerus' method differs from the other methods because it requires sequencing of tests, which limits its application to situations where there are few tests and substantial data. Although parameter estimates differed between the models, predicted "posttest" probabilities were generally similar. Construction of predictive models using logistic regression is preferred to the independence Bayes' approach when it is important to adjust for dependency of test errors. Methods to estimate adjusted likelihood ratios from predictive models should be considered in preference to a standard logistic regression model to facilitate ease of interpretation and application. Albert's method provides the most straightforward approach.
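A minimal sketch of the pretest-to-posttest adjustment that likelihood ratios are meant to facilitate: posttest odds equal pretest odds multiplied by the likelihood ratio. The function name is hypothetical, and the likelihood ratio passed in is assumed to be already adjusted (by whichever of the three methods); the sketch does not estimate it.

```python
def posttest_probability(pretest_prob, likelihood_ratio):
    """Convert a pretest probability into a posttest probability via odds."""
    if not 0.0 < pretest_prob < 1.0:
        raise ValueError("pretest_prob must be strictly between 0 and 1")
    if likelihood_ratio < 0.0:
        raise ValueError("likelihood_ratio must be non-negative")
    pretest_odds = pretest_prob / (1.0 - pretest_prob)
    posttest_odds = pretest_odds * likelihood_ratio
    return posttest_odds / (1.0 + posttest_odds)
```

With a pretest probability of 0.2 (odds 1:4) and a likelihood ratio of 4, the posttest odds are 1:1, so the posttest probability is 0.5.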
Performance of time-varying predictors in multilevel models under an assumption of fixed or random effects.

PubMed

Baird, Rachel; Maxwell, Scott E

2016-06-01

Time-varying predictors in multilevel models are a useful tool for longitudinal research, whether they are the research variable of interest or they are controlling for variance to allow greater power for other variables. However, standard recommendations to fix the effect of time-varying predictors may make an assumption that is unlikely to hold in reality and may influence results. A simulation study illustrates that treating the time-varying predictor as fixed may allow analyses to converge, but the analyses have poor coverage of the true fixed effect when the time-varying predictor has a random effect in reality. A second simulation study shows that treating the time-varying predictor as random may have poor convergence, except when allowing negative variance estimates. Although negative variance estimates are uninterpretable, results of the simulation show that estimates of the fixed effect of the time-varying predictor are as accurate for these cases as for cases with positive variance estimates, and that treating the time-varying predictor as random and allowing negative variance estimates performs well whether the time-varying predictor is fixed or random in reality. Because of the difficulty of interpreting negative variance estimates, 2 procedures are suggested for selection between fixed-effect and random-effect models: comparing between fixed-effect and constrained random-effect models with a likelihood ratio test or fitting a fixed-effect model when an unconstrained random-effect model produces negative variance estimates. The performance of these 2 procedures is compared. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
Neural Mechanisms for Integrating Prior Knowledge and Likelihood in Value-Based Probabilistic Inference

PubMed Central

Ting, Chih-Chung; Yu, Chia-Chen; Maloney, Laurence T.

2015-01-01

In Bayesian decision theory, knowledge about the probabilities of possible outcomes is captured by a prior distribution and a likelihood function. The prior reflects past knowledge and the likelihood summarizes current sensory information. The two combined (integrated) form a posterior distribution that allows estimation of the probability of different possible outcomes. In this study, we investigated the neural mechanisms underlying Bayesian integration using a novel lottery decision task in which both prior knowledge and likelihood information about reward probability were systematically manipulated on a trial-by-trial basis. Consistent with Bayesian integration, as sample size increased, subjects tended to weigh likelihood information more compared with prior information. Using fMRI in humans, we found that the medial prefrontal cortex (mPFC) correlated with the mean of the posterior distribution, a statistic that reflects the integration of prior knowledge and likelihood of reward probability. Subsequent analysis revealed that both prior and likelihood information were represented in mPFC and that the neural representations of prior and likelihood in mPFC reflected changes in the behaviorally estimated weights assigned to these different sources of information in response to changes in the environment. Together, these results establish the role of mPFC in prior-likelihood integration and highlight its involvement in representing and integrating these distinct sources of information. PMID:25632152
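The shift in weight from prior to likelihood as sample size grows can be illustrated with a conjugate Beta-binomial model. This is an assumption chosen here for simplicity, not the model reported in the study, and the function names are hypothetical. The posterior mean is a weighted average of the prior mean and the observed proportion, and the weight on the data is n / (a + b + n).

```python
def posterior_mean(prior_a, prior_b, successes, trials):
    """Posterior mean of a reward probability under a Beta(prior_a, prior_b) prior."""
    return (prior_a + successes) / (prior_a + prior_b + trials)

def likelihood_weight(prior_a, prior_b, trials):
    """Weight given to the observed proportion; grows toward 1 as trials increase."""
    return trials / (prior_a + prior_b + trials)
```

For example, with a Beta(2, 2) prior and 3 rewards in 4 samples, the data weight is 0.5 and the posterior mean is 0.5 * 0.75 + 0.5 * 0.5 = 0.625.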
Maximum likelihood estimation of label imperfections and its use in the identification of mislabeled patterns

NASA Technical Reports Server (NTRS)

Chittineni, C. B.

1979-01-01

The problem of estimating label imperfections and the use of the estimation in identifying mislabeled patterns is presented. Expressions for the maximum likelihood estimates of classification errors and a priori probabilities are derived from the classification of a set of labeled patterns. Expressions also are given for the asymptotic variances of probability of correct classification and proportions. Simple models are developed for imperfections in the labels and for classification errors and are used in the formulation of a maximum likelihood estimation scheme. Schemes are presented for the identification of mislabeled patterns in terms of threshold on the discriminant functions for both two-class and multiclass cases. Expressions are derived for the probability that the imperfect label identification scheme will result in a wrong decision and are used in computing thresholds. The results of practical applications of these techniques in the processing of remotely sensed multispectral data are presented.
Value-based decision-making battery: A Bayesian adaptive approach to assess impulsive and risky behavior.

PubMed

Pooseh, Shakoor; Bernhardt, Nadine; Guevara, Alvaro; Huys, Quentin J M; Smolka, Michael N

2018-02-01

Using simple mathematical models of choice behavior, we present a Bayesian adaptive algorithm to assess measures of impulsive and risky decision making. Practically, these measures are characterized by discounting rates and are used to classify individuals or population groups, to distinguish unhealthy behavior, and to predict developmental courses. However, a constant demand for improved tools to assess these constructs remains unanswered. The algorithm is based on trial-by-trial observations. At each step, a choice is made between immediate (certain) and delayed (risky) options. Then the current parameter estimates are updated by the likelihood of observing the choice, and the next offers are provided from the indifference point, so that they will acquire the most informative data based on the current parameter estimates. The procedure continues for a certain number of trials in order to reach a stable estimation. The algorithm is discussed in detail for the delay discounting case, and results from decision making under risk for gains, losses, and mixed prospects are also provided. Simulated experiments using prescribed parameter values were performed to justify the algorithm in terms of the reproducibility of its parameters for individual assessments, and to test the reliability of the estimation procedure in a group-level analysis. The algorithm was implemented as an experimental battery to measure temporal and probability discounting rates together with loss aversion, and was tested on a healthy participant sample.
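The trial-by-trial update described above can be sketched for the delay discounting case. The hyperbolic discounting form V = A / (1 + kD), the logistic choice rule with sensitivity `beta`, and the discrete grid over k are all assumptions made for illustration; the abstract does not specify them, and every function name here is hypothetical.

```python
import math

def discounted_value(amount, delay, k):
    """Hyperbolic discounting (assumed form): value of a delayed amount."""
    return amount / (1.0 + k * delay)

def p_choose_delayed(immediate, delayed, delay, k, beta=1.0):
    """Logistic choice rule (assumed): probability of picking the delayed option."""
    diff = beta * (discounted_value(delayed, delay, k) - immediate)
    return 1.0 / (1.0 + math.exp(-diff))

def update_posterior(grid, weights, immediate, delayed, delay, chose_delayed, beta=1.0):
    """Multiply each grid weight by the likelihood of the observed choice."""
    updated = []
    for k, w in zip(grid, weights):
        p = p_choose_delayed(immediate, delayed, delay, k, beta)
        updated.append(w * (p if chose_delayed else 1.0 - p))
    total = sum(updated)
    if total == 0.0:
        return list(weights)  # choice impossible under every grid value; keep prior
    return [w / total for w in updated]

def posterior_mean(grid, weights):
    """Current point estimate of the discounting rate k."""
    return sum(k * w for k, w in zip(grid, weights))

def next_immediate_offer(delayed, delay, k_hat):
    """Immediate amount at the indifference point for the current estimate."""
    return discounted_value(delayed, delay, k_hat)
```

Each trial would call `update_posterior` with the observed choice and then `next_immediate_offer` with the updated estimate, so that the next offer sits where the two options are predicted to be equally attractive.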
On the Performance of Maximum Likelihood versus Means and Variance Adjusted Weighted Least Squares Estimation in CFA

ERIC Educational Resources Information Center

Beauducel, Andre; Herzberg, Philipp Yorck

2006-01-01

This simulation study compared maximum likelihood (ML) estimation with weighted least squares means and variance adjusted (WLSMV) estimation. The study was based on confirmatory factor analyses with 1, 2, 4, and 8 factors, based on 250, 500, 750, and 1,000 cases, and on 5, 10, 20, and 40 variables with 2, 3, 4, 5, and 6 categories. There was no…
A maximum pseudo-profile likelihood estimator for the Cox model under length-biased sampling

PubMed Central

Huang, Chiung-Yu; Qin, Jing; Follmann, Dean A.

2012-01-01

This paper considers semiparametric estimation of the Cox proportional hazards model for right-censored and length-biased data arising from prevalent sampling. To exploit the special structure of length-biased sampling, we propose a maximum pseudo-profile likelihood estimator, which can handle time-dependent covariates and is consistent under covariate-dependent censoring. Simulation studies show that the proposed estimator is more efficient than its competitors. A data analysis illustrates the methods and theory. PMID:23843659
32 CFR 191.6 - Procedures.

Code of Federal Regulations, 2014 CFR

2014-07-01

... and a likelihood of vacancies (e.g., science and engineering positions). (11) Develop procedures for... computer support of employees with disabilities consistent with DoD participation in activities of the...
32 CFR 191.6 - Procedures.

Code of Federal Regulations, 2011 CFR

2011-07-01

... and a likelihood of vacancies (e.g., science and engineering positions). (11) Develop procedures for... computer support of employees with disabilities consistent with DoD participation in activities of the...
32 CFR 191.6 - Procedures.

Code of Federal Regulations, 2013 CFR

2013-07-01

... and a likelihood of vacancies (e.g., science and engineering positions). (11) Develop procedures for... computer support of employees with disabilities consistent with DoD participation in activities of the...
32 CFR 191.6 - Procedures.

Code of Federal Regulations, 2012 CFR

2012-07-01

... and a likelihood of vacancies (e.g., science and engineering positions). (11) Develop procedures for... computer support of employees with disabilities consistent with DoD participation in activities of the...
Quasi-Maximum Likelihood Estimation of Structural Equation Models with Multiple Interaction and Quadratic Effects

ERIC Educational Resources Information Center

Klein, Andreas G.; Muthen, Bengt O.

2007-01-01

In this article, a nonlinear structural equation model is introduced and a quasi-maximum likelihood method for simultaneous estimation and testing of multiple nonlinear effects is developed. The focus of the new methodology lies on efficiency, robustness, and computational practicability. Monte-Carlo studies indicate that the method is highly…
Likelihood-Based Confidence Intervals in Exploratory Factor Analysis

ERIC Educational Resources Information Center

Oort, Frans J.

2011-01-01

In exploratory or unrestricted factor analysis, all factor loadings are free to be estimated. In oblique solutions, the correlations between common factors are free to be estimated as well. The purpose of this article is to show how likelihood-based confidence intervals can be obtained for rotated factor loadings and factor correlations, by…
Estimation of Complex Generalized Linear Mixed Models for Measurement and Growth

ERIC Educational Resources Information Center

Jeon, Minjeong

2012-01-01

Maximum likelihood (ML) estimation of generalized linear mixed models (GLMMs) is technically challenging because of the intractable likelihoods that involve high dimensional integrations over random effects. The problem is magnified when the random effects have a crossed design and thus the data cannot be reduced to small independent clusters. A…
The Neural Bases of Difficult Speech Comprehension and Speech Production: Two Activation Likelihood Estimation (ALE) Meta-Analyses

ERIC Educational Resources Information Center

Adank, Patti

2012-01-01

The role of speech production mechanisms in difficult speech comprehension is the subject of on-going debate in speech science. Two Activation Likelihood Estimation (ALE) analyses were conducted on neuroimaging studies investigating difficult speech comprehension or speech production. Meta-analysis 1 included 10 studies contrasting comprehension…
Uncertainty estimation of Intensity-Duration-Frequency relationships: A regional analysis

NASA Astrophysics Data System (ADS)

Mélèse, Victor; Blanchet, Juliette; Molinié, Gilles

2018-03-01

We propose in this article a regional study of uncertainties in IDF curves derived from point-rainfall maxima. We develop two generalized extreme value models based on the simple scaling assumption, first in the frequentist framework and second in the Bayesian framework. Within the frequentist framework, uncertainties are obtained i) from the Gaussian density stemming from the asymptotic normality of the maximum likelihood estimator and ii) with a bootstrap procedure. Within the Bayesian framework, uncertainties are obtained from the posterior densities. We compare these two frameworks on the same database covering a large region of 100,000 km² in southern France with contrasting rainfall regimes, in order to be able to draw conclusions that are not specific to the data. The two frameworks are applied to 405 hourly stations with data back to the 1980s, accumulated in the range 3 h-120 h. We show that i) the Bayesian framework is more robust than the frequentist one to the starting point of the estimation procedure, ii) the posterior and the bootstrap densities are able to better adjust uncertainty estimation to the data than the Gaussian density, and iii) the bootstrap density gives unreasonable confidence intervals, in particular for return levels associated with large return periods. Therefore our recommendation goes towards the use of the Bayesian framework to compute uncertainty.
Make the most of your samples: Bayes factor estimators for high-dimensional models of sequence evolution.

PubMed

Baele, Guy; Lemey, Philippe; Vansteelandt, Stijn

2013-03-06

Accurate model comparison requires extensive computation times, especially for parameter-rich models of sequence evolution. In the Bayesian framework, model selection is typically performed through the evaluation of a Bayes factor, the ratio of two marginal likelihoods (one for each model). Recently introduced techniques to estimate (log) marginal likelihoods, such as path sampling and stepping-stone sampling, offer increased accuracy over the traditional harmonic mean estimator at an increased computational cost. Most often, each model's marginal likelihood will be estimated individually, which leads the resulting Bayes factor to suffer from errors associated with each of these independent estimation processes. We here assess the original 'model-switch' path sampling approach for direct Bayes factor estimation in phylogenetics, as well as an extension that uses more samples, to construct a direct path between two competing models, thereby eliminating the need to calculate each model's marginal likelihood independently. Further, we provide a competing Bayes factor estimator using an adaptation of the recently introduced stepping-stone sampling algorithm and set out to determine appropriate settings for accurately calculating such Bayes factors, with context-dependent evolutionary models as an example. While we show that modest efforts are required to roughly identify the increase in model fit, only drastically increased computation times ensure the accuracy needed to detect more subtle details of the evolutionary process. We show that our adaptation of stepping-stone sampling for direct Bayes factor calculation outperforms the original path sampling approach as well as an extension that exploits more samples. Our proposed approach for Bayes factor estimation also has preferable statistical properties over the use of individual marginal likelihood estimates for both models under comparison. 
Assuming a sigmoid function to determine the path between two competing models, we provide evidence that a single well-chosen sigmoid shape value requires less computational efforts in order to approximate the true value of the (log) Bayes factor compared to the original approach. We show that the (log) Bayes factors calculated using path sampling and stepping-stone sampling differ drastically from those estimated using either of the harmonic mean estimators, supporting earlier claims that the latter systematically overestimate the performance of high-dimensional models, which we show can lead to erroneous conclusions. Based on our results, we argue that highly accurate estimation of differences in model fit for high-dimensional models requires much more computational effort than suggested in recent studies on marginal likelihood estimation.
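For orientation, the quantities named in this abstract can be written in their standard textbook forms. The notation below (data D, models M_0 and M_1, parameters θ, power β) is chosen here and is not taken from the paper; the paper's direct model-switch estimators build a single path between the two models rather than treating each model separately as shown.

```latex
% Bayes factor as a ratio of marginal likelihoods
B_{10} = \frac{p(D \mid M_1)}{p(D \mid M_0)}, \qquad
\log B_{10} = \log p(D \mid M_1) - \log p(D \mid M_0)

% Power posterior along the path, 0 \le \beta \le 1
p_\beta(\theta) \propto p(D \mid \theta, M)^{\beta}\, p(\theta \mid M)

% Path sampling (thermodynamic integration)
\log p(D \mid M) = \int_0^1 \mathbb{E}_{\theta \sim p_\beta}\!\left[\log p(D \mid \theta, M)\right] d\beta

% Stepping-stone sampling, with 0 = \beta_0 < \beta_1 < \dots < \beta_K = 1
p(D \mid M) = \prod_{k=1}^{K} \mathbb{E}_{\theta \sim p_{\beta_{k-1}}}\!\left[p(D \mid \theta, M)^{\beta_k - \beta_{k-1}}\right]

% Harmonic mean estimator, with \theta_i drawn from the posterior
\hat{p}(D \mid M) = \left[\frac{1}{n} \sum_{i=1}^{n} \frac{1}{p(D \mid \theta_i, M)}\right]^{-1}
```

Estimating each marginal likelihood separately and then taking the difference combines the errors of two independent runs, which is the motivation the abstract gives for estimating the Bayes factor directly.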
Make the most of your samples: Bayes factor estimators for high-dimensional models of sequence evolution

PubMed Central

2013-01-01

Background Accurate model comparison requires extensive computation times, especially for parameter-rich models of sequence evolution. In the Bayesian framework, model selection is typically performed through the evaluation of a Bayes factor, the ratio of two marginal likelihoods (one for each model). Recently introduced techniques to estimate (log) marginal likelihoods, such as path sampling and stepping-stone sampling, offer increased accuracy over the traditional harmonic mean estimator at an increased computational cost. Most often, each model’s marginal likelihood will be estimated individually, which leads the resulting Bayes factor to suffer from errors associated with each of these independent estimation processes. Results We here assess the original ‘model-switch’ path sampling approach for direct Bayes factor estimation in phylogenetics, as well as an extension that uses more samples, to construct a direct path between two competing models, thereby eliminating the need to calculate each model’s marginal likelihood independently. Further, we provide a competing Bayes factor estimator using an adaptation of the recently introduced stepping-stone sampling algorithm and set out to determine appropriate settings for accurately calculating such Bayes factors, with context-dependent evolutionary models as an example. While we show that modest efforts are required to roughly identify the increase in model fit, only drastically increased computation times ensure the accuracy needed to detect more subtle details of the evolutionary process. Conclusions We show that our adaptation of stepping-stone sampling for direct Bayes factor calculation outperforms the original path sampling approach as well as an extension that exploits more samples. Our proposed approach for Bayes factor estimation also has preferable statistical properties over the use of individual marginal likelihood estimates for both models under comparison. 
Assuming a sigmoid function to determine the path between two competing models, we provide evidence that a single well-chosen sigmoid shape value requires less computational efforts in order to approximate the true value of the (log) Bayes factor compared to the original approach. We show that the (log) Bayes factors calculated using path sampling and stepping-stone sampling differ drastically from those estimated using either of the harmonic mean estimators, supporting earlier claims that the latter systematically overestimate the performance of high-dimensional models, which we show can lead to erroneous conclusions. Based on our results, we argue that highly accurate estimation of differences in model fit for high-dimensional models requires much more computational effort than suggested in recent studies on marginal likelihood estimation. PMID:23497171
Estimation After a Group Sequential Trial.

PubMed

Milanzi, Elasma; Molenberghs, Geert; Alonso, Ariel; Kenward, Michael G; Tsiatis, Anastasios A; Davidian, Marie; Verbeke, Geert

2015-10-01

Group sequential trials are one important instance of studies for which the sample size is not fixed a priori but rather takes one of a finite set of pre-specified values, dependent on the observed data. Much work has been devoted to the inferential consequences of this design feature. Molenberghs et al (2012) and Milanzi et al (2012) reviewed and extended the existing literature, focusing on a collection of seemingly disparate, but related, settings, namely completely random sample sizes, group sequential studies with deterministic and random stopping rules, incomplete data, and random cluster sizes. They showed that the ordinary sample average is a viable option for estimation following a group sequential trial, for a wide class of stopping rules and for random outcomes with a distribution in the exponential family. Their results are somewhat surprising in the sense that the sample average is not optimal, and further, there does not exist an optimal, or even unbiased, linear estimator. However, the sample average is asymptotically unbiased, both conditionally upon the observed sample size as well as marginalized over it. By exploiting ignorability they showed that the sample average is the conventional maximum likelihood estimator. They also showed that a conditional maximum likelihood estimator is finite sample unbiased, but is less efficient than the sample average and has the larger mean squared error. Asymptotically, the sample average and the conditional maximum likelihood estimator are equivalent. This previous work is restricted, however, to the situation in which the random sample size can take only two values, N = n or N = 2n. In this paper, we consider the more practically useful setting of sample sizes in the finite set {n_1, n_2, …, n_L}. It is shown that the sample average is then a justifiable estimator, in the sense that it follows from joint likelihood estimation, and it is consistent and asymptotically unbiased.
We also show why simulations can give the false impression of bias in the sample average when considered conditional upon the sample size. The consequence is that no corrections need to be made to estimators following sequential trials. When small-sample bias is of concern, the conditional likelihood estimator provides a relatively straightforward modification to the sample average. Finally, it is shown that classical likelihood-based standard errors and confidence intervals can be applied, obviating the need for technical corrections.
Cosmological Parameters and Hyper-Parameters: The Hubble Constant from Boomerang and Maxima

NASA Astrophysics Data System (ADS)

Lahav, Ofer

Recently several studies have jointly analysed data from different cosmological probes with the motivation of estimating cosmological parameters. Here we generalise this procedure to allow freedom in the relative weights of various probes. This is done by including in the joint likelihood function a set of 'Hyper-Parameters', which are dealt with using Bayesian considerations. The resulting algorithm, which assumes uniform priors on the log of the Hyper-Parameters, is very simple to implement. We illustrate the method by estimating the Hubble constant H0 from different sets of recent CMB experiments (including Saskatoon, Python V, MSAM1, TOCO, Boomerang and Maxima). The approach can be generalised for a combination of cosmic probes, and for other priors on the Hyper-Parameters. Reference: Lahav, Bridle, Hobson, Lasenby & Sodre, 2000, MNRAS, in press (astro-ph/9912105)
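A hedged sketch of how simple the resulting algorithm is. In the form usually quoted for this approach, marginalizing over hyper-parameters with uniform priors on their logarithms replaces the usual sum of chi-square values by a sum of N_j ln(chi2_j), where N_j is the number of data points in data set j. That expression is recalled here rather than stated in the abstract, so it should be checked against the cited reference; the function names are hypothetical.

```python
import math

def standard_joint_objective(chi2_values):
    """Conventional joint analysis: minimize the plain sum of chi-square values."""
    return sum(chi2_values)

def hyperparameter_objective(chi2_values, n_points):
    """Hyper-parameter version (assumed form): minimize sum of N_j * ln(chi2_j)."""
    if len(chi2_values) != len(n_points):
        raise ValueError("need one data-point count per chi-square value")
    return sum(n * math.log(c) for c, n in zip(chi2_values, n_points))
```

Both functions would be evaluated over a grid of cosmological parameter values (for example H0), with each chi-square computed per experiment at that parameter value.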

Empirical Bayes Gaussian likelihood estimation of exposure distributions from pooled samples in human biomonitoring.

PubMed

Li, Xiang; Kuk, Anthony Y C; Xu, Jinfeng

2014-12-10

Human biomonitoring of exposure to environmental chemicals is important. Individual monitoring is not viable because of low individual exposure level or insufficient volume of materials and the prohibitive cost of taking measurements from many subjects. Pooling of samples is an efficient and cost-effective way to collect data. Estimation is, however, complicated as individual values within each pool are not observed but are only known up to their average or weighted average. The distribution of such averages is intractable when the individual measurements are lognormally distributed, which is a common assumption. We propose to replace the intractable distribution of the pool averages by a Gaussian likelihood to obtain parameter estimates. If the pool size is large, this method produces statistically efficient estimates, but regardless of pool size, the method yields consistent estimates as the number of pools increases. An empirical Bayes (EB) Gaussian likelihood approach, as well as its Bayesian analog, is developed to pool information from various demographic groups by using a mixed-effect formulation. We also discuss methods to estimate the underlying mean-variance relationship and to select a good model for the means, which can be incorporated into the proposed EB or Bayes framework. By borrowing strength across groups, the EB estimator is more efficient than the individual group-specific estimator. Simulation results show that the EB Gaussian likelihood estimates outperform a previous method proposed for the National Health and Nutrition Examination Surveys with much smaller bias and better coverage in interval estimation, especially after correction of bias. Copyright © 2014 John Wiley & Sons, Ltd.
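A minimal sketch of the Gaussian-likelihood idea in its simplest setting, under assumptions made here rather than in the paper: every pool is an unweighted average of the same number of independent lognormal(mu, sigma²) measurements, and there is a single demographic group. The pool average then has mean m = exp(mu + sigma²/2) and variance v / k with v = (exp(sigma²) − 1) exp(2 mu + sigma²). Fitting a Gaussian to the pool averages gives m and v / k in closed form, and these are mapped back to mu and sigma². The function name is hypothetical.

```python
import math

def gaussian_fit_pooled_lognormal(pool_means, pool_size):
    """Estimate lognormal (mu, sigma2) from equal-sized pool averages."""
    n = len(pool_means)
    m = sum(pool_means) / n                                   # Gaussian ML mean
    var_of_means = sum((y - m) ** 2 for y in pool_means) / n  # Gaussian ML variance
    if m <= 0.0 or var_of_means <= 0.0:
        raise ValueError("need a positive mean and non-degenerate pool averages")
    v = pool_size * var_of_means       # implied variance of one individual value
    sigma2 = math.log(1.0 + v / m ** 2)
    mu = math.log(m) - sigma2 / 2.0
    return mu, sigma2
```

With unequal pool sizes or weighted averages, the closed form no longer applies and the Gaussian log-likelihood would have to be maximized numerically.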
The exponential-Poisson model for recurrent event data: an application to a set of data on malaria in Brazil.

PubMed

Macera, Márcia A C; Louzada, Francisco; Cancho, Vicente G; Fontes, Cor J F

2015-03-01

In this paper, we introduce a new model for recurrent event data characterized by a fully parametric baseline rate function, which is based on the exponential-Poisson distribution. The model arises from a latent competing risk scenario, in the sense that there is no information about which cause was responsible for the event occurrence. Then, the time of each recurrence is given by the minimum lifetime value among all latent causes. The new model has a particular case, which is the classical homogeneous Poisson process. The properties of the proposed model are discussed, including its hazard rate function, survival function, and ordinary moments. The inferential procedure is based on the maximum likelihood approach. We consider an important issue of model selection between the proposed model and its particular case by the likelihood ratio test and score test. Goodness of fit of the recurrent event models is assessed using Cox-Snell residuals. A simulation study evaluates the performance of the estimation procedure in the presence of small and moderate sample sizes. Applications on two real data sets are provided to illustrate the proposed methodology. One of them, first analyzed by our team of researchers, considers the data concerning the recurrence of malaria, which is an infectious disease caused by a protozoan parasite that infects red blood cells. © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
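The latent competing risk construction can be made concrete with a short sketch. It assumes the usual exponential-Poisson setup: the number of latent causes is zero-truncated Poisson with parameter `lam`, each latent lifetime is exponential with rate `beta`, and the observed time is their minimum. The parameter names and the exact parameterization are assumptions here and may differ from the paper's. Under these assumptions the survival function is S(t) = (exp(lam e^{−beta t}) − 1) / (e^{lam} − 1), which tends to the exponential survival e^{−beta t} as lam goes to 0.

```python
import math

def ep_survival(t, lam, beta):
    """Survival function of the minimum of a zero-truncated Poisson number of
    exponential(beta) lifetimes; expm1 keeps precision when lam is small."""
    return math.expm1(lam * math.exp(-beta * t)) / math.expm1(lam)

def ep_hazard(t, lam, beta):
    """Hazard rate f(t) / S(t); decreases from above beta toward beta as t grows.
    Not defined once lam * exp(-beta * t) underflows to 0 for very large t."""
    u = lam * math.exp(-beta * t)
    return beta * u * math.exp(u) / math.expm1(u)
```

The constant-hazard limit as `lam` approaches 0 corresponds to the homogeneous special case mentioned in the abstract, which is why a likelihood ratio or score test on that parameter is the natural model-selection check.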
Climate reconstruction analysis using coexistence likelihood estimation (CRACLE): a method for the estimation of climate using vegetation.

PubMed

Harbert, Robert S; Nixon, Kevin C

2015-08-01

• Plant distributions have long been understood to be correlated with the environmental conditions to which species are adapted. Climate is one of the major components driving species distributions. Therefore, it is expected that the plants coexisting in a community are reflective of the local environment, particularly climate. • Presented here is a method for the estimation of climate from local plant species coexistence data. The method, Climate Reconstruction Analysis using Coexistence Likelihood Estimation (CRACLE), is a likelihood-based method that employs specimen collection data at a global scale for the inference of species climate tolerance. CRACLE calculates the maximum joint likelihood of coexistence given individual species climate tolerance characterization to estimate the expected climate. • Plant distribution data for more than 4000 species were used to show that this method accurately infers expected climate profiles for 165 sites with diverse climatic conditions. Estimates differ from the WorldClim global climate model by less than 1.5°C on average for mean annual temperature and less than ∼250 mm for mean annual precipitation. This is a significant improvement upon other plant-based climate-proxy methods. • CRACLE validates long hypothesized interactions between climate and local associations of plant species. Furthermore, CRACLE successfully estimates climate that is consistent with the widely used WorldClim model and therefore may be applied to the quantitative estimation of paleoclimate in future studies. © 2015 Botanical Society of America, Inc.
Maximum-likelihood methods in wavefront sensing: stochastic models and likelihood functions

PubMed Central

Barrett, Harrison H.; Dainty, Christopher; Lara, David

2008-01-01

Maximum-likelihood (ML) estimation in wavefront sensing requires careful attention to all noise sources and all factors that influence the sensor data. We present detailed probability density functions for the output of the image detector in a wavefront sensor, conditional not only on wavefront parameters but also on various nuisance parameters. Practical ways of dealing with nuisance parameters are described, and final expressions for likelihoods and Fisher information matrices are derived. The theory is illustrated by discussing Shack–Hartmann sensors, and computational requirements are discussed. Simulation results show that ML estimation can significantly increase the dynamic range of a Shack–Hartmann sensor with four detectors and that it can reduce the residual wavefront error when compared with traditional methods. PMID:17206255
Statistical Analysis of Q-matrix Based Diagnostic Classification Models

PubMed Central

Chen, Yunxiao; Liu, Jingchen; Xu, Gongjun; Ying, Zhiliang

2014-01-01

Diagnostic classification models have recently gained prominence in educational assessment, psychiatric evaluation, and many other disciplines. Central to the model specification is the so-called Q-matrix that provides a qualitative specification of the item-attribute relationship. In this paper, we develop theories on the identifiability for the Q-matrix under the DINA and the DINO models. We further propose an estimation procedure for the Q-matrix through the regularized maximum likelihood. The applicability of this procedure is not limited to the DINA or the DINO model and it can be applied to essentially all Q-matrix based diagnostic classification models. Simulation studies are conducted to illustrate its performance. Furthermore, two case studies are presented. The first case is a data set on fraction subtraction (educational application) and the second case is a subsample of the National Epidemiological Survey on Alcohol and Related Conditions concerning the social anxiety disorder (psychiatric application). PMID:26294801
Impact of Violation of the Missing-at-Random Assumption on Full-Information Maximum Likelihood Method in Multidimensional Adaptive Testing

ERIC Educational Resources Information Center

Han, Kyung T.; Guo, Fanmin

2014-01-01

The full-information maximum likelihood (FIML) method makes it possible to estimate and analyze structural equation models (SEM) even when data are partially missing, enabling incomplete data to contribute to model estimation. The cornerstone of FIML is the missing-at-random (MAR) assumption. In (unidimensional) computerized adaptive testing…
Constrained Maximum Likelihood Estimation for Two-Level Mean and Covariance Structure Models

ERIC Educational Resources Information Center

Bentler, Peter M.; Liang, Jiajuan; Tang, Man-Lai; Yuan, Ke-Hai

2011-01-01

Maximum likelihood is commonly used for the estimation of model parameters in the analysis of two-level structural equation models. Constraints on model parameters could be encountered in some situations such as equal factor loadings for different factors. Linear constraints are the most common ones and they are relatively easy to handle in…
Computing Maximum Likelihood Estimates of Loglinear Models from Marginal Sums with Special Attention to Loglinear Item Response Theory.

ERIC Educational Resources Information Center

Kelderman, Henk

1992-01-01

Describes algorithms used in the computer program LOGIMO for obtaining maximum likelihood estimates of the parameters in loglinear models. These algorithms are also useful for the analysis of loglinear item-response theory models. Presents modified versions of the iterative proportional fitting and Newton-Raphson algorithms. Simulated data…
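As a rough illustration of the iterative proportional fitting idea mentioned above, the following minimal sketch fits a two-way table to given row and column sums. It is the textbook form of the algorithm, not the modified version implemented in LOGIMO, and the function name `ipf` and the example margins are hypothetical.

```python
def ipf(seed, row_targets, col_targets, tol=1e-10, max_iter=1000):
    """Rescale `seed` until its row and column sums match the targets."""
    table = [row[:] for row in seed]
    for _ in range(max_iter):
        # Row step: rescale each row to its target sum.
        for i, target in enumerate(row_targets):
            s = sum(table[i])
            table[i] = [x * target / s for x in table[i]]
        # Column step: rescale each column to its target sum.
        for j, target in enumerate(col_targets):
            s = sum(row[j] for row in table)
            for row in table:
                row[j] *= target / s
        # The column step may disturb the row sums; stop once they are back on target.
        err = max(abs(sum(table[i]) - t) for i, t in enumerate(row_targets))
        if err < tol:
            break
    return table


# Hypothetical margins; a seed of ones corresponds to the independence model.
fitted = ipf([[1.0, 1.0], [1.0, 1.0]], [30.0, 70.0], [40.0, 60.0])
```

With a seed of ones, the fitted cells equal (row sum × column sum) / total, which are the maximum likelihood fitted values under the independence loglinear model.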
Recovery of Graded Response Model Parameters: A Comparison of Marginal Maximum Likelihood and Markov Chain Monte Carlo Estimation

ERIC Educational Resources Information Center

Kieftenbeld, Vincent; Natesan, Prathiba

2012-01-01

Markov chain Monte Carlo (MCMC) methods enable a fully Bayesian approach to parameter estimation of item response models. In this simulation study, the authors compared the recovery of graded response model parameters using marginal maximum likelihood (MML) and Gibbs sampling (MCMC) under various latent trait distributions, test lengths, and…
Bayesian logistic regression approaches to predict incorrect DRG assignment.

PubMed

Suleiman, Mani; Demirhan, Haydar; Boyd, Leanne; Girosi, Federico; Aksakalli, Vural

2018-05-07

Episodes of care involving similar diagnoses and treatments and requiring similar levels of resource utilisation are grouped to the same Diagnosis-Related Group (DRG). In jurisdictions which implement DRG based payment systems, DRGs are a major determinant of funding for inpatient care. Hence, service providers often dedicate auditing staff to the task of checking that episodes have been coded to the correct DRG. The use of statistical models to estimate an episode's probability of DRG error can significantly improve the efficiency of clinical coding audits. This study implements Bayesian logistic regression models with weakly informative prior distributions to estimate the likelihood that episodes require a DRG revision, comparing these models with each other and to classical maximum likelihood estimates. All Bayesian approaches had more stable model parameters than maximum likelihood. The best performing Bayesian model improved overall classification performance by 6% compared to maximum likelihood, with a 34% gain compared to random classification, respectively. We found that the original DRG, coder and the day of coding all have a significant effect on the likelihood of DRG error. Use of Bayesian approaches has improved model parameter stability and classification accuracy. This method has already led to improved audit efficiency in an operational capacity.
A Study of Item Bias for Attitudinal Measurement Using Maximum Likelihood Factor Analysis.

ERIC Educational Resources Information Center

Mayberry, Paul W.

A technique for detecting item bias that is responsive to attitudinal measurement considerations is a maximum likelihood factor analysis procedure comparing multivariate factor structures across various subpopulations, often referred to as SIFASP. The SIFASP technique allows for factorial model comparisons in the testing of various hypotheses…
Implementing informative priors for heterogeneity in meta-analysis using meta-regression and pseudo data.

PubMed

Rhodes, Kirsty M; Turner, Rebecca M; White, Ian R; Jackson, Dan; Spiegelhalter, David J; Higgins, Julian P T

2016-12-20

Many meta-analyses combine results from only a small number of studies, a situation in which the between-study variance is imprecisely estimated when standard methods are applied. Bayesian meta-analysis allows incorporation of external evidence on heterogeneity, providing the potential for more robust inference on the effect size of interest. We present a method for performing Bayesian meta-analysis using data augmentation, in which we represent an informative conjugate prior for between-study variance by pseudo data and use meta-regression for estimation. To assist in this, we derive predictive inverse-gamma distributions for the between-study variance expected in future meta-analyses. These may serve as priors for heterogeneity in new meta-analyses. In a simulation study, we compare approximate Bayesian methods using meta-regression and pseudo data against fully Bayesian approaches based on importance sampling techniques and Markov chain Monte Carlo (MCMC). We compare the frequentist properties of these Bayesian methods with those of the commonly used frequentist DerSimonian and Laird procedure. The method is implemented in standard statistical software and provides a less complex alternative to standard MCMC approaches. An importance sampling approach produces almost identical results to standard MCMC approaches, and results obtained through meta-regression and pseudo data are very similar. On average, data augmentation provides closer results to MCMC, if implemented using restricted maximum likelihood estimation rather than DerSimonian and Laird or maximum likelihood estimation. The methods are applied to real datasets, and an extension to network meta-analysis is described. The proposed method facilitates Bayesian meta-analysis in a way that is accessible to applied researchers. © 2016 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd.
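For orientation, the DerSimonian and Laird procedure used as the frequentist comparator above can be sketched in a few lines. This is the standard moment estimator of the between-study variance, not the Bayesian data-augmentation method of the paper; the function name and the two-study input are hypothetical.

```python
def dersimonian_laird(effects, variances):
    """Moment estimate of between-study variance and the random-effects pooled mean."""
    k = len(effects)
    if k < 2:
        raise ValueError("at least two studies are needed")
    w = [1.0 / v for v in variances]              # fixed-effect (inverse-variance) weights
    sw = sum(w)
    fixed_mean = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - fixed_mean) ** 2 for wi, yi in zip(w, effects))  # Cochran's Q
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c)            # truncated at zero
    w_star = [1.0 / (v + tau2) for v in variances]  # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    pooled_var = 1.0 / sum(w_star)
    return tau2, pooled, pooled_var


# Hypothetical data: two studies with unit within-study variance.
tau2, pooled, pooled_var = dersimonian_laird([0.0, 2.0], [1.0, 1.0])
```

With only a handful of studies, Q rests on very few degrees of freedom, which is the imprecision in the between-study variance that the abstract refers to.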
Estimation for general birth-death processes

PubMed Central

Crawford, Forrest W.; Minin, Vladimir N.; Suchard, Marc A.

2013-01-01

Birth-death processes (BDPs) are continuous-time Markov chains that track the number of “particles” in a system over time. While widely used in population biology, genetics and ecology, statistical inference of the instantaneous particle birth and death rates remains largely limited to restrictive linear BDPs in which per-particle birth and death rates are constant. Researchers often observe the number of particles at discrete times, necessitating data augmentation procedures such as expectation-maximization (EM) to find maximum likelihood estimates. For BDPs on finite state-spaces, there are powerful matrix methods for computing the conditional expectations needed for the E-step of the EM algorithm. For BDPs on infinite state-spaces, closed-form solutions for the E-step are available for some linear models, but most previous work has resorted to time-consuming simulation. Remarkably, we show that the E-step conditional expectations can be expressed as convolutions of computable transition probabilities for any general BDP with arbitrary rates. This important observation, along with a convenient continued fraction representation of the Laplace transforms of the transition probabilities, allows for novel and efficient computation of the conditional expectations for all BDPs, eliminating the need for truncation of the state-space or costly simulation. We use this insight to derive EM algorithms that yield maximum likelihood estimation for general BDPs characterized by various rate models, including generalized linear models. We show that our Laplace convolution technique outperforms competing methods when they are available and demonstrate a technique to accelerate EM algorithm convergence. We validate our approach using synthetic data and then apply our methods to cancer cell growth and estimation of mutation parameters in microsatellite evolution. PMID:25328261
Estimation for general birth-death processes.

PubMed

Crawford, Forrest W; Minin, Vladimir N; Suchard, Marc A

2014-04-01

Birth-death processes (BDPs) are continuous-time Markov chains that track the number of "particles" in a system over time. While widely used in population biology, genetics and ecology, statistical inference of the instantaneous particle birth and death rates remains largely limited to restrictive linear BDPs in which per-particle birth and death rates are constant. Researchers often observe the number of particles at discrete times, necessitating data augmentation procedures such as expectation-maximization (EM) to find maximum likelihood estimates. For BDPs on finite state-spaces, there are powerful matrix methods for computing the conditional expectations needed for the E-step of the EM algorithm. For BDPs on infinite state-spaces, closed-form solutions for the E-step are available for some linear models, but most previous work has resorted to time-consuming simulation. Remarkably, we show that the E-step conditional expectations can be expressed as convolutions of computable transition probabilities for any general BDP with arbitrary rates. This important observation, along with a convenient continued fraction representation of the Laplace transforms of the transition probabilities, allows for novel and efficient computation of the conditional expectations for all BDPs, eliminating the need for truncation of the state-space or costly simulation. We use this insight to derive EM algorithms that yield maximum likelihood estimation for general BDPs characterized by various rate models, including generalized linear models. We show that our Laplace convolution technique outperforms competing methods when they are available and demonstrate a technique to accelerate EM algorithm convergence. We validate our approach using synthetic data and then apply our methods to cancer cell growth and estimation of mutation parameters in microsatellite evolution.
Additive hazards regression and partial likelihood estimation for ecological monitoring data across space.

PubMed

Lin, Feng-Chang; Zhu, Jun

2012-01-01

We develop continuous-time models for the analysis of environmental or ecological monitoring data such that subjects are observed at multiple monitoring time points across space. Of particular interest are additive hazards regression models where the baseline hazard function can take on flexible forms. We consider time-varying covariates and take into account spatial dependence via autoregression in space and time. We develop statistical inference for the regression coefficients via partial likelihood. Asymptotic properties, including consistency and asymptotic normality, are established for parameter estimates under suitable regularity conditions. Feasible algorithms utilizing existing statistical software packages are developed for computation. We also consider a simpler additive hazards model with homogeneous baseline hazard and develop hypothesis testing for homogeneity. A simulation study demonstrates that the statistical inference using partial likelihood has sound finite-sample properties and offers a viable alternative to maximum likelihood estimation. For illustration, we analyze data from an ecological study that monitors bark beetle colonization of red pines in a plantation of Wisconsin.
Bayesian model selection: Evidence estimation based on DREAM simulation and bridge sampling

NASA Astrophysics Data System (ADS)

Volpi, Elena; Schoups, Gerrit; Firmani, Giovanni; Vrugt, Jasper A.

2017-04-01

Bayesian inference has found widespread application in Earth and Environmental Systems Modeling, providing an effective tool for prediction, data assimilation, parameter estimation, uncertainty analysis and hypothesis testing. Under multiple competing hypotheses, the Bayesian approach also provides an attractive alternative to traditional information criteria (e.g. AIC, BIC) for model selection. The key variable for Bayesian model selection is the evidence (or marginal likelihood) that is the normalizing constant in the denominator of Bayes theorem; while it is fundamental for model selection, the evidence is not required for Bayesian inference. It is computed for each hypothesis (model) by averaging the likelihood function over the prior parameter distribution, rather than maximizing it as by information criteria; the larger a model's evidence the more support it receives among a collection of hypotheses as the simulated values assign relatively high probability density to the observed data. Hence, the evidence naturally acts as an Occam's razor, preferring simpler and more constrained models against the selection of over-fitted ones by information criteria that incorporate only the likelihood maximum. Since it is not particularly easy to estimate the evidence in practice, Bayesian model selection via the marginal likelihood has not yet found mainstream use. We illustrate here the properties of a new estimator of the Bayesian model evidence, which provides robust and unbiased estimates of the marginal likelihood; the method is coined Gaussian Mixture Importance Sampling (GMIS). GMIS uses multidimensional numerical integration of the posterior parameter distribution via bridge sampling (a generalization of importance sampling) of a mixture distribution fitted to samples of the posterior distribution derived from the DREAM algorithm (Vrugt et al., 2008; 2009). Some illustrative examples are presented to show the robustness and superiority of the GMIS estimator with respect to other commonly used approaches in the literature.
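The distinction between averaging and maximizing the likelihood can be seen in a toy case. The sketch below is not GMIS or bridge sampling; it uses a hypothetical Bernoulli model with a uniform prior, chosen only because the evidence then has a closed form against which a simple midpoint-rule integral can be compared.

```python
import math


def log_likelihood(theta, k, n):
    """Log-likelihood of one specific sequence with k successes in n Bernoulli trials."""
    return k * math.log(theta) + (n - k) * math.log(1.0 - theta)


def evidence_uniform_prior(k, n, grid=10000):
    """Average the likelihood over a Uniform(0, 1) prior with the midpoint rule."""
    h = 1.0 / grid
    total = 0.0
    for i in range(grid):
        theta = (i + 0.5) * h
        total += math.exp(log_likelihood(theta, k, n))
    return total * h


k, n = 7, 10  # hypothetical data
evidence = evidence_uniform_prior(k, n)
# Closed form: the Beta function B(k + 1, n - k + 1).
exact = math.factorial(k) * math.factorial(n - k) / math.factorial(n + 1)
# The maximized likelihood, the only likelihood quantity entering AIC or BIC.
max_likelihood = math.exp(log_likelihood(k / n, k, n))
```

The evidence is necessarily smaller than the maximized likelihood because it also averages over parameter values that fit poorly, which is the source of the Occam's razor effect described in the abstract.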
Maximum Likelihood Shift Estimation Using High Resolution Polarimetric SAR Clutter Model

NASA Astrophysics Data System (ADS)

Harant, Olivier; Bombrun, Lionel; Vasile, Gabriel; Ferro-Famil, Laurent; Gay, Michel

2011-03-01

This paper deals with a Maximum Likelihood (ML) shift estimation method in the context of High Resolution (HR) Polarimetric SAR (PolSAR) clutter. Texture modeling is exposed and the generalized ML texture tracking method is extended to the merging of various sensors. Some results on displacement estimation on the Argentiere glacier in the Mont Blanc massif using dual-pol TerraSAR-X (TSX) and quad-pol RADARSAT-2 (RS2) sensors are finally discussed.
Estimating Cosmic-Ray Spectral Parameters from Simulated Detector Responses with Detector Design Implications

NASA Technical Reports Server (NTRS)

Howell, L. W.

2001-01-01

A simple power law model consisting of a single spectral index (alpha_1) is believed to be an adequate description of the galactic cosmic-ray (GCR) proton flux at energies below 10^13 eV, with a transition at knee energy (E_k) to a steeper spectral index alpha_2 > alpha_1 above E_k. The maximum likelihood procedure is developed for estimating these three spectral parameters of the broken power law energy spectrum from simulated detector responses. These estimates and their surrounding statistical uncertainty are being used to derive the requirements in energy resolution, calorimeter size, and energy response of a proposed sampling calorimeter for the Advanced Cosmic-ray Composition Experiment for the Space Station (ACCESS). This study thereby permits instrument developers to make important trade studies in design parameters as a function of the science objectives, which is particularly important for space-based detectors where physical parameters, such as dimension and weight, impose rigorous practical limits to the design envelope.
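To make the three-parameter likelihood concrete, here is a sketch of the broken power law density and its log-likelihood for an idealized detector that measures each energy exactly. The lower threshold $E_{\min}$ and the perfect-resolution assumption are introduced here for illustration only; the report itself works with simulated detector responses, which would replace $f$ by its convolution with the response function. The symbols $\alpha_1$, $\alpha_2$ and $E_k$ denote the two spectral indices and the knee energy.

```latex
f(E \mid \alpha_1, \alpha_2, E_k) =
\begin{cases}
A\, E^{-\alpha_1}, & E_{\min} \le E \le E_k, \\[4pt]
A\, E_k^{\alpha_2 - \alpha_1}\, E^{-\alpha_2}, & E > E_k,
\end{cases}
\qquad
A^{-1} = \frac{E_{\min}^{1-\alpha_1} - E_k^{1-\alpha_1}}{\alpha_1 - 1}
       + \frac{E_k^{1-\alpha_1}}{\alpha_2 - 1},
\quad \alpha_1 \neq 1,\ \alpha_2 > 1 .

\ell(\alpha_1, \alpha_2, E_k) = \sum_{i=1}^{n} \ln f(E_i)
= n \ln A
- \alpha_1 \sum_{E_i \le E_k} \ln E_i
+ n_{>}\,(\alpha_2 - \alpha_1) \ln E_k
- \alpha_2 \sum_{E_i > E_k} \ln E_i ,
```

where $n_{>}$ is the number of events with $E_i > E_k$. The factor $E_k^{\alpha_2 - \alpha_1}$ makes the density continuous at the knee, and maximizing $\ell$ jointly over $(\alpha_1, \alpha_2, E_k)$ gives the maximum likelihood estimates in this idealized setting.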
Bias correction of risk estimates in vaccine safety studies with rare adverse events using a self-controlled case series design.

PubMed

Zeng, Chan; Newcomer, Sophia R; Glanz, Jason M; Shoup, Jo Ann; Daley, Matthew F; Hambidge, Simon J; Xu, Stanley

2013-12-15

The self-controlled case series (SCCS) method is often used to examine the temporal association between vaccination and adverse events using only data from patients who experienced such events. Conditional Poisson regression models are used to estimate incidence rate ratios, and these models perform well with large or medium-sized case samples. However, in some vaccine safety studies, the adverse events studied are rare and the maximum likelihood estimates may be biased. Several bias correction methods have been examined in case-control studies using conditional logistic regression, but none of these methods have been evaluated in studies using the SCCS design. In this study, we used simulations to evaluate 2 bias correction approaches (the Firth penalized maximum likelihood method and Cordeiro and McCullagh's bias reduction after maximum likelihood estimation) with small sample sizes in studies using the SCCS design. The simulations showed that the bias under the SCCS design with a small number of cases can be large and is also sensitive to a short risk period. The Firth correction method provides finite and less biased estimates than the maximum likelihood method and Cordeiro and McCullagh's method. However, limitations still exist when the risk period in the SCCS design is short relative to the entire observation period.
Nonparametric probability density estimation by optimization theoretic techniques

NASA Technical Reports Server (NTRS)

Scott, D. W.

1976-01-01

Two nonparametric probability density estimators are considered. The first is the kernel estimator. The problem of choosing the kernel scaling factor based solely on a random sample is addressed. An interactive mode is discussed and an algorithm proposed to choose the scaling factor automatically. The second nonparametric probability estimate uses penalty function techniques with the maximum likelihood criterion. A discrete maximum penalized likelihood estimator is proposed and is shown to be consistent in the mean square error. A numerical implementation technique for the discrete solution is discussed and examples displayed. An extensive simulation study compares the integrated mean square error of the discrete and kernel estimators. The robustness of the discrete estimator is demonstrated graphically.
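A minimal sketch of the first estimator follows: a Gaussian kernel density estimate with scaling factor `h`. The bandwidth selection shown, leave-one-out likelihood cross-validation over a small grid, is a common generic approach swapped in for illustration; it is not claimed to be the automatic algorithm proposed in the report, and the sample and candidate values are hypothetical.

```python
import math


def gaussian_kde(x, sample, h):
    """Kernel density estimate at x with a Gaussian kernel and scaling factor h."""
    norm = 1.0 / (len(sample) * h * math.sqrt(2.0 * math.pi))
    return norm * sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in sample)


def loo_log_likelihood(sample, h):
    """Sum of log densities at each point, estimated from the remaining points."""
    total = 0.0
    for i, xi in enumerate(sample):
        rest = sample[:i] + sample[i + 1:]
        total += math.log(gaussian_kde(xi, rest, h))
    return total


# Hypothetical sample and candidate scaling factors.
sample = [-1.2, -0.4, 0.0, 0.3, 0.9, 1.5]
candidates = [0.05, 0.1, 0.2, 0.4, 0.8, 1.6]
best_h = max(candidates, key=lambda h: loo_log_likelihood(sample, h))
```

Leaving each point out matters: without it the likelihood grows without bound as `h` shrinks toward zero, which is also why the second estimator in the abstract adds a penalty term to the maximum likelihood criterion.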

A comparison of maximum likelihood and other estimators of eigenvalues from several correlated Monte Carlo samples

DOE Office of Scientific and Technical Information (OSTI.GOV)

Beer, M.

1980-12-01

The maximum likelihood method for the multivariate normal distribution is applied to the case of several individual eigenvalues. Correlated Monte Carlo estimates of the eigenvalue are assumed to follow this prescription and aspects of the assumption are examined. Monte Carlo cell calculations using the SAM-CE and VIM codes for the TRX-1 and TRX-2 benchmark reactors, and SAM-CE full core results are analyzed with this method. Variance reductions of a few percent to a factor of 2 are obtained from maximum likelihood estimation as compared with the simple average and the minimum variance individual eigenvalue. The numerical results verify that the use of sample variances and correlation coefficients in place of the corresponding population statistics still leads to nearly minimum variance estimation for a sufficient number of histories and aggregates.
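When several correlated estimates of one quantity are jointly normal with a common mean and known covariance matrix C, the maximum likelihood estimate of that mean is the weighted average with weights proportional to C⁻¹1. The sketch below illustrates this under those stated assumptions; the helper names and the numerical inputs are hypothetical and are not taken from the report.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def combine(estimates, cov):
    """ML (minimum variance) combination of correlated estimates of a common mean."""
    u = solve(cov, [1.0] * len(estimates))   # u = C^{-1} 1
    total = sum(u)                           # 1^T C^{-1} 1
    weights = [ui / total for ui in u]
    combined = sum(w * e for w, e in zip(weights, estimates))
    return combined, 1.0 / total, weights


# Hypothetical inputs: two uncorrelated estimates with variances 1 and 4.
combined, variance, weights = combine([1.0, 3.0], [[1.0, 0.0], [0.0, 4.0]])
```

In practice C is unknown and is replaced by sample variances and correlations, which is the substitution whose effect the abstract reports on.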
Cosmological parameter estimation using Particle Swarm Optimization

NASA Astrophysics Data System (ADS)

Prasad, J.; Souradeep, T.

2014-03-01

Constraining parameters of a theoretical model from observational data is an important exercise in cosmology. There are many theoretically motivated models that demand a greater number of cosmological parameters than the standard model of cosmology uses, which makes the problem of parameter estimation challenging. It is common practice to employ the Bayesian formalism for parameter estimation, for which, in general, the likelihood surface is probed. For the standard cosmological model with six parameters, the likelihood surface is quite smooth and does not have local maxima, and sampling-based methods like the Markov Chain Monte Carlo (MCMC) method are quite successful. However, when there are a large number of parameters or the likelihood surface is not smooth, other methods may be more effective. In this paper, we have demonstrated the application of another method inspired by artificial intelligence, called Particle Swarm Optimization (PSO), for estimating cosmological parameters from Cosmic Microwave Background (CMB) data taken from the WMAP satellite.
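The basic particle swarm update can be sketched as follows. This is a generic global-best PSO applied to a hypothetical two-parameter Gaussian negative log-likelihood, not the authors' implementation or a CMB likelihood; the inertia and acceleration coefficients are commonly quoted constriction values assumed here for illustration.

```python
import random


def pso(objective, bounds, n_particles=30, n_iter=200, seed=0,
        inertia=0.7298, c_personal=1.49618, c_global=1.49618):
    """Minimize `objective` with a basic global-best particle swarm."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]               # each particle's best position so far
    best_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]   # swarm-wide best
    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity blends momentum, pull to own best, and pull to swarm best.
                vel[i][d] = (inertia * vel[i][d]
                             + c_personal * r1 * (best_pos[i][d] - pos[i][d])
                             + c_global * r2 * (g_pos[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = objective(pos[i])
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < g_val:
                    g_val, g_pos = val, pos[i][:]
    return g_pos, g_val


def neg_log_like(theta):
    """Hypothetical Gaussian negative log-likelihood with maximum at (1, -2)."""
    return 0.5 * ((theta[0] - 1.0) ** 2 + (theta[1] + 2.0) ** 2 / 4.0)


best, value = pso(neg_log_like, [(-10.0, 10.0), (-10.0, 10.0)])
```

Unlike MCMC, the swarm returns only a point estimate of the likelihood maximum, so uncertainties would have to be obtained separately, for example from the curvature of the likelihood around the best-fit point.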
The Extended-Image Tracking Technique Based on the Maximum Likelihood Estimation

NASA Technical Reports Server (NTRS)

Tsou, Haiping; Yan, Tsun-Yee

2000-01-01

This paper describes an extended-image tracking technique based on the maximum likelihood estimation. The target image is assumed to have a known profile covering more than one element of a focal plane detector array. It is assumed that the relative position between the imager and the target is changing with time and the received target image has each of its pixels disturbed by an independent additive white Gaussian noise. When a rotation-invariant movement between imager and target is considered, the maximum likelihood based image tracking technique described in this paper is a closed-loop structure capable of providing iterative update of the movement estimate by calculating the loop feedback signals from a weighted correlation between the currently received target image and the previously estimated reference image in the transform domain. The movement estimate is then used to direct the imager to closely follow the moving target. This image tracking technique has many potential applications, including free-space optical communications and astronomy where accurate and stabilized optical pointing is essential.
Extended Kalman Doppler tracking and model determination for multi-sensor short-range radar

NASA Astrophysics Data System (ADS)

Mittermaier, Thomas J.; Siart, Uwe; Eibert, Thomas F.; Bonerz, Stefan

2016-09-01

A tracking solution for collision avoidance in industrial machine tools based on short-range millimeter-wave radar Doppler observations is presented. At the core of the tracking algorithm there is an Extended Kalman Filter (EKF) that provides dynamic estimation and localization in real-time. The underlying sensor platform consists of several homodyne continuous wave (CW) radar modules. Based on In-phase-Quadrature (IQ) processing and down-conversion, they provide only Doppler shift information about the observed target. Localization with Doppler shift estimates is a nonlinear problem that needs to be linearized before the linear KF can be applied. The accuracy of state estimation depends highly on the introduced linearization errors, the initialization and the models that represent the true physics as well as the stochastic properties. The important issue of filter consistency is addressed and an initialization procedure based on data fitting and maximum likelihood estimation is suggested. Models for both measurement and process noise are developed. Tracking results from typical three-dimensional courses of movement at short distances in front of a multi-sensor radar platform are presented.
Estimating relative risks for common outcome using PROC NLP.

PubMed

Yu, Binbing; Wang, Zhuoqiao

2008-05-01

In cross-sectional or cohort studies with binary outcomes, it is biologically interpretable and of interest to estimate the relative risk or prevalence ratio, especially when the response rates are not rare. Several methods have been used to estimate the relative risk, among which the log-binomial models yield the maximum likelihood estimate (MLE) of the parameters. Because of restrictions on the parameter space, the log-binomial models often run into convergence problems. Some remedies, e.g., the Poisson and Cox regressions, have been proposed. However, these methods may give out-of-bound predicted response probabilities. In this paper, a new computation method using the SAS Nonlinear Programming (NLP) procedure is proposed to find the MLEs. The proposed NLP method was compared to the COPY method, a modified method to fit the log-binomial model. Issues in the implementation are discussed. For illustration, both methods were applied to data on the prevalence of microalbuminuria (micro-protein leakage into urine) for kidney disease patients from the Diabetes Control and Complications Trial. The sample SAS macro for calculating relative risk is provided in the appendix.
An Iterative Maximum a Posteriori Estimation of Proficiency Level to Detect Multiple Local Likelihood Maxima

ERIC Educational Resources Information Center

Magis, David; Raiche, Gilles

2010-01-01

In this article the authors focus on the issue of the nonuniqueness of the maximum likelihood (ML) estimator of proficiency level in item response theory (with special attention to logistic models). The usual maximum a posteriori (MAP) method offers a good alternative within that framework; however, this article highlights some drawbacks of its…
Theoretical Analysis of Penalized Maximum-Likelihood Patlak Parametric Image Reconstruction in Dynamic PET for Lesion Detection.

PubMed

Yang, Li; Wang, Guobao; Qi, Jinyi

2016-04-01

Detecting cancerous lesions is a major clinical application of emission tomography. In a previous work, we studied penalized maximum-likelihood (PML) image reconstruction for lesion detection in static PET. Here we extend our theoretical analysis of static PET reconstruction to dynamic PET. We study both the conventional indirect reconstruction and direct reconstruction for Patlak parametric image estimation. In indirect reconstruction, Patlak parametric images are generated by first reconstructing a sequence of dynamic PET images, and then performing Patlak analysis on the time activity curves (TACs) pixel-by-pixel. In direct reconstruction, Patlak parametric images are estimated directly from raw sinogram data by incorporating the Patlak model into the image reconstruction procedure. PML reconstruction is used in both the indirect and direct reconstruction methods. We use a channelized Hotelling observer (CHO) to assess lesion detectability in Patlak parametric images. Simplified expressions for evaluating the lesion detectability have been derived and applied to the selection of the regularization parameter value to maximize detection performance. The proposed method is validated using computer-based Monte Carlo simulations. Good agreements between the theoretical predictions and the Monte Carlo results are observed. Both theoretical predictions and Monte Carlo simulation results show the benefit of the indirect and direct methods under optimized regularization parameters in dynamic PET reconstruction for lesion detection, when compared with the conventional static PET reconstruction.
Mapping grey matter reductions in schizophrenia: an anatomical likelihood estimation analysis of voxel-based morphometry studies.

PubMed

Fornito, A; Yücel, M; Patti, J; Wood, S J; Pantelis, C

2009-03-01

Voxel-based morphometry (VBM) is a popular tool for mapping neuroanatomical changes in schizophrenia patients. Several recent meta-analyses have identified the brain regions in which patients most consistently show grey matter reductions, although they have not examined whether such changes reflect differences in grey matter concentration (GMC) or grey matter volume (GMV). These measures assess different aspects of grey matter integrity, and may therefore reflect different pathological processes. In this study, we used the Anatomical Likelihood Estimation procedure to analyse significant differences reported in 37 VBM studies of schizophrenia patients, incorporating data from 1646 patients and 1690 controls, and compared the findings of studies using either GMC or GMV to index grey matter differences. Analysis of all studies combined indicated that grey matter reductions in a network of frontal, temporal, thalamic and striatal regions are among the most frequently reported in literature. GMC reductions were generally larger and more consistent than GMV reductions, and were more frequent in the insula, medial prefrontal, medial temporal and striatal regions. GMV reductions were more frequent in dorso-medial frontal cortex, and lateral and orbital frontal areas. These findings support the primacy of frontal, limbic, and subcortical dysfunction in the pathophysiology of schizophrenia, and suggest that the grey matter changes observed with MRI may not necessarily result from a unitary pathological process.
Outcome-Dependent Sampling with Interval-Censored Failure Time Data

PubMed Central

Zhou, Qingning; Cai, Jianwen; Zhou, Haibo

2017-01-01

Epidemiologic studies and disease prevention trials often seek to relate an exposure variable to a failure time that suffers from interval-censoring. When the failure rate is low and the time intervals are wide, a large cohort is often required so as to yield reliable precision on the exposure-failure-time relationship. However, large cohort studies with simple random sampling could be prohibitive for investigators with a limited budget, especially when the exposure variables are expensive to obtain. Alternative cost-effective sampling designs and inference procedures are therefore desirable. We propose an outcome-dependent sampling (ODS) design with interval-censored failure time data, where we enrich the observed sample by selectively including certain more informative failure subjects. We develop a novel sieve semiparametric maximum empirical likelihood approach for fitting the proportional hazards model to data from the proposed interval-censoring ODS design. This approach employs the empirical likelihood and sieve methods to deal with the infinite-dimensional nuisance parameters, which greatly reduces the dimensionality of the estimation problem and eases the computation difficulty. The consistency and asymptotic normality of the resulting regression parameter estimator are established. The results from our extensive simulation study show that the proposed design and method work well for practical situations and are more efficient than the alternative designs and competing approaches. An example from the Atherosclerosis Risk in Communities (ARIC) study is provided for illustration. PMID:28771664
Generalized linear mixed models with varying coefficients for longitudinal data.

PubMed

Zhang, Daowen

2004-03-01

The routinely assumed parametric functional form in the linear predictor of a generalized linear mixed model for longitudinal data may be too restrictive to represent true underlying covariate effects. We relax this assumption by representing these covariate effects by smooth but otherwise arbitrary functions of time, with random effects used to model the correlation induced by among-subject and within-subject variation. Due to the usually intractable integration involved in evaluating the quasi-likelihood function, the double penalized quasi-likelihood (DPQL) approach of Lin and Zhang (1999, Journal of the Royal Statistical Society, Series B 61, 381-400) is used to estimate the varying coefficients and the variance components simultaneously by representing a nonparametric function by a linear combination of fixed effects and random effects. A scaled chi-squared test based on the mixed model representation of the proposed model is developed to test whether an underlying varying coefficient is a polynomial of certain degree. We evaluate the performance of the procedures through simulation studies and illustrate their application with Indonesian children infectious disease data.
Information-theoretic model selection for optimal prediction of stochastic dynamical systems from data

NASA Astrophysics Data System (ADS)

Darmon, David

2018-03-01

In the absence of mechanistic or phenomenological models of real-world systems, data-driven models become necessary. The discovery of various embedding theorems in the 1980s and 1990s motivated a powerful set of tools for analyzing deterministic dynamical systems via delay-coordinate embeddings of observations of their component states. However, in many branches of science, the condition of operational determinism is not satisfied, and stochastic models must be brought to bear. For such stochastic models, the tool set developed for delay-coordinate embedding is no longer appropriate, and a new toolkit must be developed. We present an information-theoretic criterion, the negative log-predictive likelihood, for selecting the embedding dimension for a predictively optimal data-driven model of a stochastic dynamical system. We develop a nonparametric estimator for the negative log-predictive likelihood and compare its performance to a recently proposed criterion based on active information storage. Finally, we show how the output of the model selection procedure can be used to compare candidate predictors for a stochastic system to an information-theoretic lower bound.
A Likelihood-Based Framework for Association Analysis of Allele-Specific Copy Numbers.

PubMed

Hu, Y J; Lin, D Y; Sun, W; Zeng, D

2014-10-01

Copy number variants (CNVs) and single nucleotide polymorphisms (SNPs) co-exist throughout the human genome and jointly contribute to phenotypic variations. Thus, it is desirable to consider both types of variants, as characterized by allele-specific copy numbers (ASCNs), in association studies of complex human diseases. Current SNP genotyping technologies capture the CNV and SNP information simultaneously via fluorescent intensity measurements. The common practice of calling ASCNs from the intensity measurements and then using the ASCN calls in downstream association analysis has important limitations. First, the association tests are prone to false-positive findings when differential measurement errors between cases and controls arise from differences in DNA quality or handling. Second, the uncertainties in the ASCN calls are ignored. We present a general framework for the integrated analysis of CNVs and SNPs, including the analysis of total copy numbers as a special case. Our approach combines the ASCN calling and the association analysis into a single step while allowing for differential measurement errors. We construct likelihood functions that properly account for case-control sampling and measurement errors. We establish the asymptotic properties of the maximum likelihood estimators and develop EM algorithms to implement the corresponding inference procedures. The advantages of the proposed methods over the existing ones are demonstrated through realistic simulation studies and an application to a genome-wide association study of schizophrenia. Extensions to next-generation sequencing data are discussed.
The Maximum Likelihood Solution for Inclination-only Data

NASA Astrophysics Data System (ADS)

Arason, P.; Levi, S.

2006-12-01

The arithmetic means of inclination-only data are known to introduce a shallowing bias. Several methods have been proposed to estimate unbiased means of the inclination along with measures of the precision. Most of the inclination-only methods were designed to maximize the likelihood function of the marginal Fisher distribution. However, the exact analytical form of the maximum likelihood function is fairly complicated, and all these methods require various assumptions and approximations that are inappropriate for many data sets. For some steep and dispersed data sets, the estimates provided by these methods are significantly displaced from the peak of the likelihood function to systematically shallower inclinations. The problem in locating the maximum of the likelihood function is partly due to difficulties in accurately evaluating the function for all values of interest. This is because some elements of the log-likelihood function increase exponentially as precision parameters increase, leading to numerical instabilities. In this study we succeeded in analytically cancelling exponential elements from the likelihood function, and we are now able to calculate its value for any location in the parameter space and for any inclination-only data set, with full accuracy. Furthermore, we can now calculate the partial derivatives of the likelihood function with desired accuracy. Locating the maximum likelihood without the assumptions required by previous methods is now straightforward. The information to separate the mean inclination from the precision parameter will be lost for very steep and dispersed data sets. It is worth noting that the likelihood function always has a maximum value. However, for some dispersed and steep data sets with few samples, the likelihood function takes its highest value on the boundary of the parameter space, i.e. at inclinations of +/- 90 degrees, but with relatively well defined dispersion.
Our simulations indicate that this occurs quite frequently for certain data sets, and relatively small perturbations in the data will drive the maxima to the boundary. We interpret this to indicate that, for such data sets, the information needed to separate the mean inclination and the precision parameter is permanently lost. To assess the reliability and accuracy of our method we generated a large number of random Fisher-distributed data sets and used seven methods to estimate the mean inclination and precision parameter. These comparisons are described by Levi and Arason at the 2006 AGU Fall meeting. The results of the various methods are very favourable to our new robust maximum likelihood method, which, on average, is the most reliable, and the mean inclination estimates are the least biased toward shallow values. Further information on our inclination-only analysis can be obtained from: http://www.vedur.is/~arason/paleomag
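The numerical difficulty described in this abstract (terms that grow exponentially with the precision parameter) can be illustrated with the normalizing constant of the Fisher distribution, kappa / (4 pi sinh kappa). The sketch below is not the authors' derivation for the marginal inclination-only likelihood; it only shows, as an assumed and simplified analogue, how an exponential factor can be cancelled analytically so that the logarithm stays finite for large kappa. The function names are hypothetical.

```python
import math

def log_sinh(kappa):
    """Return log(sinh(kappa)) for kappa > 0 without ever forming exp(kappa)."""
    # sinh(k) = exp(k) * (1 - exp(-2k)) / 2, so after taking logs the
    # exponentially growing factor exp(k) becomes the additive term k.
    return kappa + math.log1p(-math.exp(-2.0 * kappa)) - math.log(2.0)

def log_fisher_norm(kappa):
    """Return log(kappa / (4 * pi * sinh(kappa))), the log normalizing constant."""
    return math.log(kappa) - math.log(4.0 * math.pi) - log_sinh(kappa)

# math.sinh(1000.0) overflows a double, but the rearranged form stays finite.
large_kappa_value = log_fisher_norm(1000.0)
```

For moderate kappa the rearranged form agrees with the direct computation math.log(math.sinh(kappa)); the difference only matters once exp(kappa) no longer fits in floating point.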
Occupancy Modeling Species-Environment Relationships with Non-ignorable Survey Designs.

PubMed

Irvine, Kathryn M; Rodhouse, Thomas J; Wright, Wilson J; Olsen, Anthony R

2018-05-26

Statistical models supporting inferences about species occurrence patterns in relation to environmental gradients are fundamental to ecology and conservation biology. A common implicit assumption is that the sampling design is ignorable and does not need to be formally accounted for in analyses. The analyst assumes data are representative of the desired population and statistical modeling proceeds. However, if datasets from probability and non-probability surveys are combined or unequal selection probabilities are used, the design may be non-ignorable. We outline the use of pseudo-maximum likelihood estimation for site-occupancy models to account for such non-ignorable survey designs. This estimation method accounts for the survey design by properly weighting the pseudo-likelihood equation. In our empirical example, legacy and newer randomly selected locations were surveyed for bats to bridge a historic statewide effort with an ongoing nationwide program. We provide a worked example using bat acoustic detection/non-detection data and show how analysts can diagnose whether their design is ignorable. Using simulations we assessed whether our approach is viable for modeling datasets composed of sites contributed outside of a probability design. Pseudo-maximum likelihood estimates differed from the usual maximum likelihood occupancy estimates for some bat species. Using simulations we show the maximum likelihood estimator of species-environment relationships with non-ignorable sampling designs was biased, whereas the pseudo-likelihood estimator was design-unbiased. However, in our simulation study the designs composed of a large proportion of legacy or non-probability sites resulted in estimation issues for standard errors. These issues were likely a result of highly variable weights confounded by small sample sizes (5% or 10% sampling intensity and 4 revisits).
Aggregating datasets from multiple sources logically supports larger sample sizes and potentially increases spatial extents for statistical inferences. Our results suggest that ignoring the mechanism for how locations were selected for data collection (e.g., the sampling design) could result in erroneous model-based conclusions. Therefore, in order to ensure robust and defensible recommendations for evidence-based conservation decision-making, the survey design information in addition to the data themselves must be available for analysts. Details for constructing the weights used in estimation and code for implementation are provided. This article is protected by copyright. All rights reserved.
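The weighting idea in this abstract can be sketched in a deliberately simplified setting. The example below assumes perfect detection (the site-occupancy models in the paper also model detection, which is omitted here) and assumes the weights are inverse inclusion probabilities. Under those assumptions the weighted pseudo-log-likelihood sum of w_i [y_i log(psi) + (1 - y_i) log(1 - psi)] is maximized at the weighted proportion. The data, weights, and function names are hypothetical.

```python
import math

def pseudo_loglik(psi, y, w):
    """Design-weighted Bernoulli log-likelihood for occupancy probability psi."""
    return sum(
        wi * (yi * math.log(psi) + (1 - yi) * math.log(1.0 - psi))
        for yi, wi in zip(y, w)
    )

def pseudo_mle(y, w):
    """Closed-form maximizer of pseudo_loglik: the weighted proportion occupied."""
    return sum(wi * yi for yi, wi in zip(y, w)) / sum(w)

# Hypothetical data: three occupied sites that were oversampled (weight 1)
# and one unoccupied site standing in for nine similar sites (weight 9).
y = [1, 1, 1, 0]
w = [1.0, 1.0, 1.0, 9.0]

unweighted_estimate = sum(y) / len(y)   # ignores the design
weighted_estimate = pseudo_mle(y, w)    # accounts for the design
```

With these made-up numbers the unweighted estimate is 0.75 while the weighted estimate is 0.25, which shows how ignoring unequal selection probabilities can shift the estimate.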
Collinear Latent Variables in Multilevel Confirmatory Factor Analysis

PubMed Central

van de Schoot, Rens; Hox, Joop

2014-01-01

Because variables may be correlated in the social and behavioral sciences, multicollinearity might be problematic. This study investigates the effect of collinearity manipulated in within and between levels of a two-level confirmatory factor analysis by Monte Carlo simulation. Furthermore, the influence on the convergence rate of the size of the intraclass correlation coefficient (ICC) and of the estimation method (maximum likelihood estimation with robust chi-squares and standard errors, or Bayesian estimation) is investigated. The other variables of interest were rate of inadmissible solutions and the relative parameter and standard error bias on the between level. The results showed that inadmissible solutions were obtained when there was between level collinearity and the estimation method was maximum likelihood. In the within level multicollinearity condition, all of the solutions were admissible but the bias values were higher compared with the between level collinearity condition. Bayesian estimation appeared to be robust in obtaining admissible parameters but the relative bias was higher than for maximum likelihood estimation. Finally, as expected, high ICC produced less biased results compared to medium ICC conditions. PMID:29795827
How much to trust the senses: Likelihood learning

PubMed Central

Sato, Yoshiyuki; Kording, Konrad P.

2014-01-01

Our brain often needs to estimate unknown variables from imperfect information. Our knowledge about the statistical distributions of quantities in our environment (called priors) and currently available information from sensory inputs (called likelihood) are the basis of all Bayesian models of perception and action. While we know that priors are learned, most studies of prior-likelihood integration simply assume that subjects know about the likelihood. However, as the quality of sensory inputs changes over time, we also need to learn about new likelihoods. Here, we show that human subjects readily learn the distribution of visual cues (likelihood function) in a way that can be predicted by models of statistically optimal learning. Using a likelihood that depended on color context, we found that a learned likelihood generalized to new priors. Thus, we conclude that subjects learn about likelihood. PMID:25398975
Novel scheme for rapid parallel parameter estimation of gravitational waves from compact binary coalescences

NASA Astrophysics Data System (ADS)

Pankow, C.; Brady, P.; Ochsner, E.; O'Shaughnessy, R.

2015-07-01

We introduce a highly parallelizable architecture for estimating parameters of compact binary coalescence using gravitational-wave data and waveform models. Using a spherical harmonic mode decomposition, the waveform is expressed as a sum over modes that depend on the intrinsic parameters (e.g., masses) with coefficients that depend on the observer dependent extrinsic parameters (e.g., distance, sky position). The data is then prefiltered against those modes, at fixed intrinsic parameters, enabling efficient evaluation of the likelihood for generic source positions and orientations, independent of waveform length or generation time. We efficiently parallelize our intrinsic space calculation by integrating over all extrinsic parameters using a Monte Carlo integration strategy. Since the waveform generation and prefiltering happen only once, the cost of integration dominates the procedure. Also, we operate hierarchically, using information from existing gravitational-wave searches to identify the regions of parameter space to emphasize in our sampling. As proof of concept and verification of the result, we have implemented this algorithm using standard time-domain waveforms, processing each event in less than one hour on recent computing hardware. For most events we evaluate the marginalized likelihood (evidence) with statistical errors of ≲5 %, and even smaller in many cases. With a bounded runtime independent of the waveform model starting frequency, a nearly unchanged strategy could estimate neutron star (NS)-NS parameters in the 2018 advanced LIGO era. Our algorithm is usable with any noise curve and existing time-domain model at any mass, including some waveforms which are computationally costly to evolve.
A Note on Three Statistical Tests in the Logistic Regression DIF Procedure

ERIC Educational Resources Information Center

Paek, Insu

2012-01-01

Although logistic regression became one of the well-known methods in detecting differential item functioning (DIF), its three statistical tests, the Wald, likelihood ratio (LR), and score tests, which are readily available under the maximum likelihood, do not seem to be consistently distinguished in DIF literature. This paper provides a clarifying…
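The distinction among the three tests can be made concrete in a much simpler setting than logistic regression DIF. The sketch below assumes a single binomial proportion and the null hypothesis p = p0; it is an illustration of the Wald, likelihood ratio, and score statistics in general, not the procedure from the paper. The function name and the counts are hypothetical.

```python
import math

def three_tests(successes, n, p0):
    """Wald, likelihood ratio, and score statistics for H0: p = p0 (binomial).

    Assumes 0 < successes < n so that every logarithm is defined.
    Each statistic is referred to a chi-squared distribution with 1 df.
    """
    p_hat = successes / n  # maximum likelihood estimate

    def loglik(p):
        return successes * math.log(p) + (n - successes) * math.log(1.0 - p)

    # Wald: variance evaluated at the unrestricted estimate p_hat.
    wald = (p_hat - p0) ** 2 / (p_hat * (1.0 - p_hat) / n)
    # Likelihood ratio: twice the log-likelihood gap between p_hat and p0.
    lr = 2.0 * (loglik(p_hat) - loglik(p0))
    # Score: variance evaluated under the null value p0.
    score = (p_hat - p0) ** 2 / (p0 * (1.0 - p0) / n)
    return wald, lr, score

# Hypothetical data: 60 successes in 100 trials, tested against p0 = 0.5.
wald, lr, score = three_tests(60, 100, 0.5)
```

The three statistics are asymptotically equivalent but differ in finite samples because they evaluate the curvature of the likelihood at different points.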
Parameter estimation method that directly compares gravitational wave observations to numerical relativity

NASA Astrophysics Data System (ADS)

Lange, J.; O'Shaughnessy, R.; Boyle, M.; Calderón Bustillo, J.; Campanelli, M.; Chu, T.; Clark, J. A.; Demos, N.; Fong, H.; Healy, J.; Hemberger, D. A.; Hinder, I.; Jani, K.; Khamesra, B.; Kidder, L. E.; Kumar, P.; Laguna, P.; Lousto, C. O.; Lovelace, G.; Ossokine, S.; Pfeiffer, H.; Scheel, M. A.; Shoemaker, D. M.; Szilagyi, B.; Teukolsky, S.; Zlochower, Y.

2017-11-01

We present and assess a Bayesian method to interpret gravitational wave signals from binary black holes. Our method directly compares gravitational wave data to numerical relativity (NR) simulations. In this study, we present a detailed investigation of the systematic and statistical parameter estimation errors of this method. This procedure bypasses approximations used in semianalytical models for compact binary coalescence. In this work, we use the full posterior parameter distribution for only generic nonprecessing binaries, drawing inferences away from the set of NR simulations used, via interpolation of a single scalar quantity (the marginalized log likelihood, ln L ) evaluated by comparing data to nonprecessing binary black hole simulations. We also compare the data to generic simulations, and discuss the effectiveness of this procedure for generic sources. We specifically assess the impact of higher order modes, repeating our interpretation with both l ≤2 as well as l ≤3 harmonic modes. Using the l ≤3 higher modes, we gain more information from the signal and can better constrain the parameters of the gravitational wave signal. We assess and quantify several sources of systematic error that our procedure could introduce, including simulation resolution and duration; most are negligible. We show through examples that our method can recover the parameters for equal mass, zero spin, GW150914-like, and unequal mass, precessing spin sources. Our study of this new parameter estimation method demonstrates that we can quantify and understand the systematic and statistical error. This method allows us to use higher order modes from numerical relativity simulations to better constrain the black hole binary parameters.
Time series segmentation: a new approach based on Genetic Algorithm and Hidden Markov Model

NASA Astrophysics Data System (ADS)

Toreti, A.; Kuglitsch, F. G.; Xoplaki, E.; Luterbacher, J.

2009-04-01

The subdivision of a time series into homogeneous segments has been performed using various methods applied to different disciplines. In climatology, for example, it is accompanied by the well-known homogenization problem and the detection of artificial change points. In this context, we present a new method (GAMM) based on Hidden Markov Model (HMM) and Genetic Algorithm (GA), applicable to series of independent observations (and easily adaptable to autoregressive processes). A left-to-right hidden Markov model, estimating the parameters and the best-state sequence, respectively, with the Baum-Welch and Viterbi algorithms, was applied. In order to avoid the well-known dependence of the Baum-Welch algorithm on the initial condition, a Genetic Algorithm was developed. This algorithm is characterized by mutation, elitism and a crossover procedure implemented with some restrictive rules. Moreover the function to be minimized was derived following the approach of Kehagias (2004), i.e. it is the so-called complete log-likelihood. The number of states was determined applying a two-fold cross-validation procedure (Celeux and Durand, 2008). Being aware that the last issue is complex, and it influences all the analysis, a Multi Response Permutation Procedure (MRPP; Mielke et al., 1981) was inserted. It tests the model with K+1 states (where K is the state number of the best model) if its likelihood is close to that of the K-state model. Finally, an evaluation of the GAMM performance, applied as a break detection method in the field of climate time series homogenization, is shown. 1. G. Celeux and J.B. Durand, Comput Stat 2008. 2. A. Kehagias, Stoch Envir Res 2004. 3. P.W. Mielke, K.J. Berry, G.W. Brier, Monthly Wea Rev 1981.
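The best-state-sequence step mentioned in this abstract can be sketched with a log-space Viterbi decoder. This is not the GAMM method: the genetic algorithm and Baum-Welch estimation are omitted, the parameters are fixed by hand, and the Gaussian emissions, the two-state left-to-right transition matrix, and the toy series are all assumptions made for illustration.

```python
import math

NEG_INF = float("-inf")

def safe_log(p):
    """log(p) that maps a zero probability to -inf instead of raising."""
    return math.log(p) if p > 0.0 else NEG_INF

def gaussian_logpdf(x, mean, sd):
    return -0.5 * math.log(2.0 * math.pi * sd * sd) - 0.5 * ((x - mean) / sd) ** 2

def viterbi(log_init, log_trans, log_emit):
    """Most probable state path; log_emit[t][k] is log p(obs_t | state k)."""
    n_states = len(log_init)
    delta = [log_init[k] + log_emit[0][k] for k in range(n_states)]
    backpointers = []
    for t in range(1, len(log_emit)):
        prev, delta, ptr = delta, [], []
        for k in range(n_states):
            best_j = max(range(n_states), key=lambda j: prev[j] + log_trans[j][k])
            ptr.append(best_j)
            delta.append(prev[best_j] + log_trans[best_j][k] + log_emit[t][k])
        backpointers.append(ptr)
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda k: delta[k])
    path = [state]
    for ptr in reversed(backpointers):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path

# Toy series with one change point; left-to-right chain (state 1 is absorbing).
series = [0.1, -0.2, 0.3, 5.1, 4.8, 5.2]
means = [0.0, 5.0]
log_init = [safe_log(1.0), safe_log(0.0)]
log_trans = [[safe_log(0.9), safe_log(0.1)], [safe_log(0.0), safe_log(1.0)]]
log_emit = [[gaussian_logpdf(x, m, 1.0) for m in means] for x in series]
path = viterbi(log_init, log_trans, log_emit)
```

In a left-to-right model the decoded path can only move forward through the states, so each change of state in the path marks a candidate segment boundary.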

Utility and Safety of Endoscopic Ultrasound With Bronchoscope-Guided Fine-Needle Aspiration in Mediastinal Lymph Node Sampling: Systematic Review and Meta-Analysis.

PubMed

Dhooria, Sahajal; Aggarwal, Ashutosh N; Gupta, Dheeraj; Behera, Digambar; Agarwal, Ritesh

2015-07-01

The use of endoscopic ultrasound with bronchoscope-guided fine-needle aspiration (EUS-B-FNA) has been described in the evaluation of mediastinal lymphadenopathy. Herein, we conduct a meta-analysis to estimate the overall diagnostic yield and safety of EUS-B-FNA combined with endobronchial ultrasound-guided transbronchial needle aspiration (EBUS-TBNA), in the diagnosis of mediastinal lymphadenopathy. The PubMed and EmBase databases were searched for studies reporting the outcomes of EUS-B-FNA in diagnosis of mediastinal lymphadenopathy. The study quality was assessed using the QualSyst tool. The yield of EBUS-TBNA alone and the combined procedure (EBUS-TBNA and EUS-B-FNA) were analyzed by calculating the sensitivity, specificity, positive likelihood ratio, negative likelihood ratio, and diagnostic odds ratio for each study, and pooling the study results using a random effects model. Heterogeneity and publication bias were assessed for individual outcomes. The additional diagnostic gain of EUS-B-FNA over EBUS-TBNA was calculated using proportion meta-analysis. Our search yielded 10 studies (1,080 subjects with mediastinal lymphadenopathy). The sensitivity of the combined procedure was significantly higher than EBUS-TBNA alone (91% vs 80%, P = .004), in staging of lung cancer (4 studies, 465 subjects). The additional diagnostic gain of EUS-B-FNA over EBUS-TBNA was 7.6% in the diagnosis of mediastinal adenopathy. No serious complication of EUS-B-FNA procedure was reported. Clinical and statistical heterogeneity was present without any evidence of publication bias. Combining EBUS-TBNA and EUS-B-FNA is an effective and safe method, superior to EBUS-TBNA alone, in the diagnosis of mediastinal lymphadenopathy. Good quality randomized controlled trials are required to confirm the results of this systematic review. Copyright © 2015 by Daedalus Enterprises.
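The per-study accuracy measures named in this abstract follow directly from a 2x2 table of test results against the reference standard. The sketch below uses hypothetical counts, not data from the review, and does not attempt the random-effects pooling step. The function name is an assumption.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Accuracy measures from a 2x2 table.

    tp/fn: diseased subjects with positive/negative test results.
    fp/tn: non-diseased subjects with positive/negative test results.
    Assumes no cell makes a denominator zero (e.g. fp > 0 and fn > 0).
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    lr_positive = sensitivity / (1.0 - specificity)
    lr_negative = (1.0 - sensitivity) / specificity
    diagnostic_odds_ratio = lr_positive / lr_negative
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "lr_positive": lr_positive,
        "lr_negative": lr_negative,
        "dor": diagnostic_odds_ratio,
    }

# Hypothetical single-study counts.
metrics = diagnostic_metrics(tp=80, fp=10, fn=20, tn=90)
```

The diagnostic odds ratio computed this way equals (tp * tn) / (fp * fn), which is a convenient cross-check on the arithmetic.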
A Penalized Likelihood Framework For High-Dimensional Phylogenetic Comparative Methods And An Application To New-World Monkeys Brain Evolution.

PubMed

Julien, Clavel; Leandro, Aristide; Hélène, Morlon

2018-06-19

Working with high-dimensional phylogenetic comparative datasets is challenging because likelihood-based multivariate methods suffer from low statistical performances as the number of traits p approaches the number of species n and because some computational complications occur when p exceeds n. Alternative phylogenetic comparative methods have recently been proposed to deal with the large p small n scenario but their use and performances are limited. Here we develop a penalized likelihood framework to deal with high-dimensional comparative datasets. We propose various penalizations and methods for selecting the intensity of the penalties. We apply this general framework to the estimation of parameters (the evolutionary trait covariance matrix and parameters of the evolutionary model) and model comparison for the high-dimensional multivariate Brownian (BM), Early-burst (EB), Ornstein-Uhlenbeck (OU) and Pagel's lambda models. We show using simulations that our penalized likelihood approach dramatically improves the estimation of evolutionary trait covariance matrices and model parameters when p approaches n, and allows for their accurate estimation when p equals or exceeds n. In addition, we show that penalized likelihood models can be efficiently compared using Generalized Information Criterion (GIC). We implement these methods, as well as the related estimation of ancestral states and the computation of phylogenetic PCA in the R package RPANDA and mvMORPH. Finally, we illustrate the utility of the new proposed framework by evaluating evolutionary models fit, analyzing integration patterns, and reconstructing evolutionary trajectories for a high-dimensional 3-D dataset of brain shape in the New World monkeys. We find a clear support for an Early-burst model suggesting an early diversification of brain morphology during the ecological radiation of the clade. Penalized likelihood offers an efficient way to deal with high-dimensional multivariate comparative data.
On Nonequivalence of Several Procedures of Structural Equation Modeling

ERIC Educational Resources Information Center

Yuan, Ke-Hai; Chan, Wai

2005-01-01

The normal theory based maximum likelihood procedure is widely used in structural equation modeling. Three alternatives are: the normal theory based generalized least squares, the normal theory based iteratively reweighted least squares, and the asymptotically distribution-free procedure. When data are normally distributed and the model structure…
Regression analysis of informative current status data with the additive hazards model.

PubMed

Zhao, Shishun; Hu, Tao; Ma, Ling; Wang, Peijie; Sun, Jianguo

2015-04-01

This paper discusses regression analysis of current status failure time data arising from the additive hazards model in the presence of informative censoring. Many methods have been developed for regression analysis of current status data under various regression models if the censoring is noninformative, and also there exists a large literature on parametric analysis of informative current status data in the context of tumorigenicity experiments. In this paper, a semiparametric maximum likelihood estimation procedure is presented and in the method, the copula model is employed to describe the relationship between the failure time of interest and the censoring time. Furthermore, I-splines are used to approximate the nonparametric functions involved and the asymptotic consistency and normality of the proposed estimators are established. A simulation study is conducted and indicates that the proposed approach works well for practical situations. An illustrative example is also provided.
Decomposition of conditional probability for high-order symbolic Markov chains.

PubMed

Melnik, S S; Usatenko, O V

2017-07-01

The main goal of this paper is to develop an estimate for the conditional probability function of random stationary ergodic symbolic sequences with elements belonging to a finite alphabet. We elaborate on a decomposition procedure for the conditional probability function of sequences considered to be high-order Markov chains. We represent the conditional probability function as the sum of multilinear memory function monomials of different orders (from zero up to the chain order). This allows us to introduce a family of Markov chain models and to construct artificial sequences via a method of successive iterations, taking into account at each step increasingly high correlations among random elements. At weak correlations, the memory functions are uniquely expressed in terms of the high-order symbolic correlation functions. The proposed method fills the gap between two approaches, namely the likelihood estimation and the additive Markov chains. The obtained results may have applications for sequential approximation of artificial neural network training.
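For context, the plain likelihood-estimation approach that this abstract contrasts with additive Markov chains can be sketched as a count-based estimator of the conditional probability of the next symbol given the previous N symbols. This is the standard maximum-likelihood estimate, not the memory-function decomposition proposed in the paper; the function name and the toy sequence are hypothetical.

```python
from collections import Counter, defaultdict

def conditional_probabilities(sequence, order):
    """Count-based (maximum likelihood) estimate of P(next symbol | context).

    The context is the tuple of the `order` preceding symbols.
    Returns {context: {symbol: probability}} for contexts seen in the data.
    """
    counts = defaultdict(Counter)
    for i in range(order, len(sequence)):
        context = tuple(sequence[i - order:i])
        counts[context][sequence[i]] += 1
    return {
        context: {symbol: c / sum(counter.values()) for symbol, c in counter.items()}
        for context, counter in counts.items()
    }

# Toy binary sequence that strictly alternates.
probs = conditional_probabilities(list("0101010101"), order=1)
```

The number of possible contexts grows exponentially with the chain order, so for high-order chains most contexts are rarely or never observed, which is one reason a more economical representation of the conditional probability is attractive.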
Decomposition of conditional probability for high-order symbolic Markov chains

NASA Astrophysics Data System (ADS)

Melnik, S. S.; Usatenko, O. V.

2017-07-01

The main goal of this paper is to develop an estimate for the conditional probability function of random stationary ergodic symbolic sequences with elements belonging to a finite alphabet. We elaborate on a decomposition procedure for the conditional probability function of sequences considered to be high-order Markov chains. We represent the conditional probability function as the sum of multilinear memory function monomials of different orders (from zero up to the chain order). This allows us to introduce a family of Markov chain models and to construct artificial sequences via a method of successive iterations, taking into account at each step increasingly high correlations among random elements. At weak correlations, the memory functions are uniquely expressed in terms of the high-order symbolic correlation functions. The proposed method fills the gap between two approaches, namely the likelihood estimation and the additive Markov chains. The obtained results may have applications for sequential approximation of artificial neural network training.
Method and system for diagnostics of apparatus

NASA Technical Reports Server (NTRS)

Gorinevsky, Dimitry (Inventor)

2012-01-01

Proposed is a method, implemented in software, for estimating fault state of an apparatus outfitted with sensors. At each execution period the method processes sensor data from the apparatus to obtain a set of parity parameters, which are further used for estimating fault state. The estimation method formulates a convex optimization problem for each fault hypothesis and employs a convex solver to compute fault parameter estimates and fault likelihoods for each fault hypothesis. The highest likelihoods and corresponding parameter estimates are transmitted to a display device or an automated decision and control system. The obtained accurate estimate of fault state can be used to improve safety, performance, or maintenance processes for the apparatus.
dPIRPLE: a joint estimation framework for deformable registration and penalized-likelihood CT image reconstruction using prior images

NASA Astrophysics Data System (ADS)

Dang, H.; Wang, A. S.; Sussman, Marc S.; Siewerdsen, J. H.; Stayman, J. W.

2014-09-01

Sequential imaging studies are conducted in many clinical scenarios. Prior images from previous studies contain a great deal of patient-specific anatomical information and can be used in conjunction with subsequent imaging acquisitions to maintain image quality while enabling radiation dose reduction (e.g., through sparse angular sampling, reduction in fluence, etc). However, patient motion between images in such sequences results in misregistration between the prior image and current anatomy. Existing prior-image-based approaches often include only a simple rigid registration step that can be insufficient for capturing complex anatomical motion, introducing detrimental effects in subsequent image reconstruction. In this work, we propose a joint framework that estimates the 3D deformation between an unregistered prior image and the current anatomy (based on a subsequent data acquisition) and reconstructs the current anatomical image using a model-based reconstruction approach that includes regularization based on the deformed prior image. This framework is referred to as deformable prior image registration, penalized-likelihood estimation (dPIRPLE). Central to this framework is the inclusion of a 3D B-spline-based free-form-deformation model into the joint registration-reconstruction objective function. The proposed framework is solved using a maximization strategy whereby alternating updates to the registration parameters and image estimates are applied allowing for improvements in both the registration and reconstruction throughout the optimization process. Cadaver experiments were conducted on a cone-beam CT testbench emulating a lung nodule surveillance scenario. 
Superior reconstruction accuracy and image quality were demonstrated using the dPIRPLE algorithm as compared to more traditional reconstruction methods including filtered backprojection, penalized-likelihood estimation (PLE), prior image penalized-likelihood estimation (PIPLE) without registration, and prior image penalized-likelihood estimation with rigid registration of a prior image (PIRPLE) over a wide range of sampling sparsity and exposure levels.
Likelihood ratios for glaucoma diagnosis using spectral-domain optical coherence tomography.

PubMed

Lisboa, Renato; Mansouri, Kaweh; Zangwill, Linda M; Weinreb, Robert N; Medeiros, Felipe A

2013-11-01

To present a methodology for calculating likelihood ratios for glaucoma diagnosis for continuous retinal nerve fiber layer (RNFL) thickness measurements from spectral-domain optical coherence tomography (spectral-domain OCT). Observational cohort study. A total of 262 eyes of 187 patients with glaucoma and 190 eyes of 100 control subjects were included in the study. Subjects were recruited from the Diagnostic Innovations Glaucoma Study. Eyes with preperimetric and perimetric glaucomatous damage were included in the glaucoma group. The control group was composed of healthy eyes with normal visual fields from subjects recruited from the general population. All eyes underwent RNFL imaging with Spectralis spectral-domain OCT. Likelihood ratios for glaucoma diagnosis were estimated for specific global RNFL thickness measurements using a methodology based on estimating the tangents to the receiver operating characteristic (ROC) curve. Likelihood ratios could be determined for continuous values of average RNFL thickness. Average RNFL thickness values lower than 86 μm were associated with positive likelihood ratios (ie, likelihood ratios greater than 1), whereas RNFL thickness values higher than 86 μm were associated with negative likelihood ratios (ie, likelihood ratios smaller than 1). A modified Fagan nomogram was provided to assist calculation of posttest probability of disease from the calculated likelihood ratios and pretest probability of disease. The methodology allowed calculation of likelihood ratios for specific RNFL thickness values. By avoiding arbitrary categorization of test results, it potentially allows for an improved integration of test results into diagnostic clinical decision making. Copyright © 2013. Published by Elsevier Inc.
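The posttest probability that a Fagan nomogram reads off graphically follows from the odds form of Bayes' theorem: posttest odds equal pretest odds multiplied by the likelihood ratio. The sketch below shows that arithmetic only; the function name and the numbers are hypothetical and are not taken from the study.

```python
def posttest_probability(pretest_probability, likelihood_ratio):
    """Convert a pretest probability and a likelihood ratio to a posttest probability.

    Assumes 0 < pretest_probability < 1 and likelihood_ratio >= 0.
    """
    pretest_odds = pretest_probability / (1.0 - pretest_probability)
    posttest_odds = pretest_odds * likelihood_ratio
    return posttest_odds / (1.0 + posttest_odds)

# Hypothetical case: 20% pretest probability and a likelihood ratio of 4.
example = posttest_probability(0.20, 4.0)
```

A likelihood ratio of 1 leaves the probability unchanged, values above 1 raise it, and values below 1 lower it, which matches the interpretation of the 86 μm threshold given in the abstract.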
Multimodal Likelihoods in Educational Assessment: Will the Real Maximum Likelihood Score Please Stand up?

ERIC Educational Resources Information Center

Wothke, Werner; Burket, George; Chen, Li-Sue; Gao, Furong; Shu, Lianghua; Chia, Mike

2011-01-01

It has been known for some time that item response theory (IRT) models may exhibit a likelihood function of a respondent's ability which may have multiple modes, flat modes, or both. These conditions, often associated with guessing of multiple-choice (MC) questions, can introduce uncertainty and bias to ability estimation by maximum likelihood…
F-8C adaptive flight control extensions. [for maximum likelihood estimation]

NASA Technical Reports Server (NTRS)

Stein, G.; Hartmann, G. L.

1977-01-01

An adaptive concept which combines gain-scheduled control laws with explicit maximum likelihood estimation (MLE) identification to provide the scheduling values is described. The MLE algorithm was improved by incorporating attitude data, estimating gust statistics for setting filter gains, and improving parameter tracking during changing flight conditions. A lateral MLE algorithm was designed to improve true air speed and angle of attack estimates during lateral maneuvers. Relationships between the pitch axis sensors inherent in the MLE design were examined and used for sensor failure detection. Design details and simulation performance are presented for each of the three areas investigated.
Quantum state estimation when qubits are lost: a no-data-left-behind approach

DOE PAGES

Williams, Brian P.; Lougovski, Pavel

2017-04-06

We present an approach to Bayesian mean estimation of quantum states using hyperspherical parametrization and an experiment-specific likelihood which allows utilization of all available data, even when qubits are lost. With this method, we report the first closed-form Bayesian mean and maximum likelihood estimates for the ideal single qubit. Due to computational constraints, we utilize numerical sampling to determine the Bayesian mean estimate for a photonic two-qubit experiment in which our novel analysis reduces burdens associated with experimental asymmetries and inefficiencies. This method can be applied to quantum states of any dimension and experimental complexity.
Estimation of Dynamic Discrete Choice Models by Maximum Likelihood and the Simulated Method of Moments

PubMed Central

Eisenhauer, Philipp; Heckman, James J.; Mosso, Stefano

2015-01-01

We compare the performance of maximum likelihood (ML) and simulated method of moments (SMM) estimation for dynamic discrete choice models. We construct and estimate a simplified dynamic structural model of education that captures some basic features of educational choices in the United States in the 1980s and early 1990s. We use estimates from our model to simulate a synthetic dataset and assess the ability of ML and SMM to recover the model parameters on this sample. We investigate the performance of alternative tuning parameters for SMM. PMID:26494926
Calibration of two complex ecosystem models with different likelihood functions

NASA Astrophysics Data System (ADS)

Hidy, Dóra; Haszpra, László; Pintér, Krisztina; Nagy, Zoltán; Barcza, Zoltán

2014-05-01

The biosphere is a sensitive carbon reservoir. Terrestrial ecosystems were approximately carbon neutral during the past centuries, but they became net carbon sinks due to climate change induced environmental change and associated CO2 fertilization effect of the atmosphere. Model studies and measurements indicate that the biospheric carbon sink can saturate in the future due to ongoing climate change which can act as a positive feedback. Robustness of carbon cycle models is a key issue when trying to choose the appropriate model for decision support. The input parameters of the process-based models are decisive regarding the model output. At the same time there are several input parameters for which accurate values are hard to obtain directly from experiments or no local measurements are available. Due to the uncertainty associated with the unknown model parameters significant bias can be experienced if the model is used to simulate the carbon and nitrogen cycle components of different ecosystems. In order to improve model performance the unknown model parameters have to be estimated. We developed a multi-objective, two-step calibration method based on a Bayesian approach in order to estimate the unknown parameters of the PaSim and Biome-BGC models. Biome-BGC and PaSim are widely used biogeochemical models that simulate the storage and flux of water, carbon, and nitrogen between the ecosystem and the atmosphere, and within the components of the terrestrial ecosystems (in this research the developed version of Biome-BGC is used, which is referred to as BBGC MuSo). Both models were calibrated regardless of the simulated processes and type of model parameters. The calibration procedure is based on the comparison of measured data with simulated results via calculating a likelihood function (degree of goodness-of-fit between simulated and measured data).
In our research different likelihood function formulations were used in order to examine the effect of the different model goodness metrics on calibration. The different likelihoods are different functions of RMSE (root mean squared error) weighted by measurement uncertainty: exponential / linear / quadratic / linear normalized by correlation. As a first calibration step sensitivity analysis was performed in order to select the influential parameters which have a strong effect on the output data. In the second calibration step only the sensitive parameters were calibrated (optimal values and confidence intervals were calculated). In the case of PaSim more parameters were found responsible for 95% of the output data variance than in the case of BBGC MuSo. Analysis of the results of the optimized models revealed that the exponential likelihood estimation proved to be the most robust (best model simulation with optimized parameter, highest confidence interval increase). The cross-validation of the model simulations can help in constraining the highly uncertain greenhouse gas budget of grasslands.
Semiparametric Bayesian analysis of gene-environment interactions with error in measurement of environmental covariates and missing genetic data.

PubMed

Lobach, Iryna; Mallick, Bani; Carroll, Raymond J

2011-01-01

Case-control studies are widely used to detect gene-environment interactions in the etiology of complex diseases. Many variables that are of interest to biomedical researchers are difficult to measure on an individual level, e.g. nutrient intake, cigarette smoking exposure, long-term toxic exposure. Measurement error causes bias in parameter estimates, thus masking key features of data and leading to loss of power and spurious/masked associations. We develop a Bayesian methodology for analysis of case-control studies for the case when measurement error is present in an environmental covariate and the genetic variable has missing data. This approach offers several advantages. It allows prior information to enter the model to make estimation and inference more precise. The environmental covariates measured exactly are modeled completely nonparametrically. Further, information about the probability of disease can be incorporated in the estimation procedure to improve the quality of parameter estimates, which cannot be done in conventional case-control studies. A unique feature of the procedure under investigation is that the analysis is based on a pseudo-likelihood function; therefore, conventional Bayesian techniques may not be technically correct. We propose an approach using Markov Chain Monte Carlo sampling as well as a computationally simple method based on an asymptotic posterior distribution. Simulation experiments demonstrated that our method produced parameter estimates that are nearly unbiased even for small sample sizes. An application of our method is illustrated using a population-based case-control study of the association between calcium intake and the risk of colorectal adenoma development.
Lindley frailty model for a class of compound Poisson processes

NASA Astrophysics Data System (ADS)

Kadilar, Gamze Özel; Ata, Nihal

2013-10-01

The Lindley distribution has gained importance in survival analysis because of its similarity to the exponential distribution and its allowance for different shapes of the hazard function. Frailty models provide an alternative to the proportional hazards model where misspecified or omitted covariates are described by an unobservable random variable. Although the distribution of the frailty is generally assumed to be continuous, it is appropriate to consider discrete frailty distributions in some circumstances. In this paper, frailty models with a discrete compound Poisson process for Lindley distributed failure times are introduced. Survival functions are derived and maximum likelihood estimation procedures for the parameters are studied. Then, the fit of the models to the earthquake data set of Turkey is examined.
ANALYZING COHORT MORTALITY DATA

EPA Science Inventory

Several methods for analyzing data from mortality studies of occupationally or environmentally exposed cohorts are shown to be special cases of a single procedure. The procedure assumes a proportional hazards model for exposure effects and represents the log-likelihood kernel for...
On the Existence and Uniqueness of JML Estimates for the Partial Credit Model

ERIC Educational Resources Information Center

Bertoli-Barsotti, Lucio

2005-01-01

A necessary and sufficient condition is given in this paper for the existence and uniqueness of the maximum likelihood (the so-called joint maximum likelihood) estimate of the parameters of the Partial Credit Model. This condition is stated in terms of a structural property of the pattern of the data matrix that can be easily verified on the basis…
Use of Bayes theorem to correct size-specific sampling bias in growth data.

PubMed

Troynikov, V S

1999-03-01

The Bayesian decomposition of the posterior distribution was used to develop a likelihood function to correct bias in the estimates of population parameters from data collected randomly with size-specific selectivity. Positive distributions with time as a parameter were used for parametrization of growth data. Numerical illustrations are provided. The alternative applications of the likelihood to estimate selectivity parameters are discussed.
ATAC Autocuer Modeling Analysis.

DTIC Science & Technology

1981-01-01

the analysis of the simple rectangular segmentation (1) is based on detection and estimation theory (2). This approach uses the concept of maximum ...continuous wave forms. In order to develop the principles of maximum likelihood, it is convenient to develop the principles for the "classical...the concept of maximum likelihood is significant in that it provides the optimum performance of the detection/estimation problem. With a knowledge of

Quantifying the Establishment Likelihood of Invasive Alien Species Introductions Through Ports with Application to Honeybees in Australia.

PubMed

Heersink, Daniel K; Caley, Peter; Paini, Dean R; Barry, Simon C

2016-05-01

The cost of an uncontrolled incursion of invasive alien species (IAS) arising from undetected entry through ports can be substantial, and knowledge of port-specific risks is needed to help allocate limited surveillance resources. Quantifying the establishment likelihood of such an incursion requires quantifying the ability of a species to enter, establish, and spread. Estimation of the approach rate of IAS into ports provides a measure of likelihood of entry. Data on the approach rate of IAS are typically sparse, and the combinations of risk factors relating to country of origin and port of arrival diverse. This presents challenges to making formal statistical inference on establishment likelihood. Here we demonstrate how these challenges can be overcome with judicious use of mixed-effects models when estimating the incursion likelihood into Australia of the European (Apis mellifera) and Asian (A. cerana) honeybees, along with the invasive parasites of biosecurity concern they host (e.g., Varroa destructor). Our results demonstrate how skewed the establishment likelihood is, with one-tenth of the ports accounting for 80% or more of the likelihood for both species. These results have been utilized by biosecurity agencies in the allocation of resources to the surveillance of maritime ports. © 2015 Society for Risk Analysis.
An evaluation of portion size estimation aids: precision, ease of use and likelihood of future use.

PubMed

Faulkner, Gemma P; Livingstone, M Barbara E; Pourshahidi, L Kirsty; Spence, Michelle; Dean, Moira; O'Brien, Sinead; Gibney, Eileen R; Wallace, Julie Mw; McCaffrey, Tracy A; Kerr, Maeve A

2016-09-01

The present study aimed to evaluate the precision, ease of use and likelihood of future use of portion size estimation aids (PSEA). A range of PSEA were used to estimate the serving sizes of a range of commonly eaten foods and rated for ease of use and likelihood of future usage. For each food, participants selected their preferred PSEA from a range of options including: quantities and measures; reference objects; measuring; and indicators on food packets. These PSEA were used to serve out various foods (e.g. liquid, amorphous, and composite dishes). Ease of use and likelihood of future use were noted. The foods were weighed to determine the precision of each PSEA. Males and females aged 18-64 years (n 120). The quantities and measures were the most precise PSEA (lowest range of weights for estimated portion sizes). However, participants preferred household measures (e.g. 200 ml disposable cup) - deemed easy to use (median rating of 5), likely to use again in future (all scored either 4 or 5 on a scale from 1='not very likely' to 5='very likely to use again') and precise (narrow range of weights for estimated portion sizes). The majority indicated they would most likely use the PSEA preparing a meal (94 %), particularly dinner (86 %) in the home (89 %; all P<0·001) for amorphous grain foods. Household measures may be precise, easy to use and acceptable aids for estimating the appropriate portion size of amorphous grain foods.
Empirical likelihood inference in randomized clinical trials.

PubMed

Zhang, Biao

2017-01-01

In individually randomized controlled trials, in addition to the primary outcome, information is often available on a number of covariates prior to randomization. This information is frequently utilized to undertake adjustment for baseline characteristics in order to increase precision of the estimation of average treatment effects; such adjustment is usually performed via covariate adjustment in outcome regression models. Although the use of covariate adjustment is widely seen as desirable for making treatment effect estimates more precise and the corresponding hypothesis tests more powerful, there are considerable concerns that objective inference in randomized clinical trials can potentially be compromised. In this paper, we study an empirical likelihood approach to covariate adjustment and propose two unbiased estimating functions that automatically decouple evaluation of average treatment effects from regression modeling of covariate-outcome relationships. The resulting empirical likelihood estimator of the average treatment effect is as efficient as the existing efficient adjusted estimators when separate treatment-specific working regression models are correctly specified, yet is at least as efficient as the existing efficient adjusted estimators for any given treatment-specific working regression models whether or not they coincide with the true treatment-specific covariate-outcome relationships. We present a simulation study to compare the finite sample performance of various methods along with some results on analysis of a data set from an HIV clinical trial. The simulation results indicate that the proposed empirical likelihood approach is more efficient and powerful than its competitors when the working covariate-outcome relationships by treatment status are misspecified.
Gaussian Decomposition of Laser Altimeter Waveforms

NASA Technical Reports Server (NTRS)

Hofton, Michelle A.; Minster, J. Bernard; Blair, J. Bryan

1999-01-01

We develop a method to decompose a laser altimeter return waveform into its Gaussian components assuming that the position of each Gaussian within the waveform can be used to calculate the mean elevation of a specific reflecting surface within the laser footprint. We estimate the number of Gaussian components from the number of inflection points of a smoothed copy of the laser waveform, and obtain initial estimates of the Gaussian half-widths and positions from the positions of its consecutive inflection points. Initial amplitude estimates are obtained using a non-negative least-squares method. To reduce the likelihood of fitting the background noise within the waveform and to minimize the number of Gaussians needed in the approximation, we rank the "importance" of each Gaussian in the decomposition using its initial half-width and amplitude estimates. The initial parameter estimates of all Gaussians ranked "important" are optimized using the Levenberg-Marquardt method. If the sum of the Gaussians does not approximate the return waveform to a prescribed accuracy, then additional Gaussians are included in the optimization procedure. The Gaussian decomposition method is demonstrated on data collected by the airborne Laser Vegetation Imaging Sensor (LVIS) in October 1997 over the Sequoia National Forest, California.
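The initialization stage described above can be illustrated with a minimal sketch. It relies on the fact that a single Gaussian has its inflection points at the mean plus and minus one standard deviation, so each consecutive pair of inflection points yields a position (midpoint) and a half-width (half the separation). The pairing rule, the function names, and the synthetic waveform are assumptions made for illustration. The authors' smoothing, amplitude estimation, ranking, and nonlinear refinement steps are not reproduced.

```python
import math


def inflection_points(x, y):
    """Locate sign changes of the discrete second difference of y."""
    d2 = [y[i - 1] - 2.0 * y[i] + y[i + 1] for i in range(1, len(y) - 1)]
    points = []
    for j in range(len(d2) - 1):
        if d2[j] * d2[j + 1] < 0.0:
            # d2[j] belongs to x[j + 1]; interpolate linearly to the zero.
            frac = d2[j] / (d2[j] - d2[j + 1])
            x0, x1 = x[j + 1], x[j + 2]
            points.append(x0 + frac * (x1 - x0))
    return points


def initial_gaussians(x, y):
    """Pair consecutive inflection points into (position, half_width)."""
    pts = inflection_points(x, y)
    guesses = []
    for k in range(0, len(pts) - 1, 2):
        left, right = pts[k], pts[k + 1]
        guesses.append(((left + right) / 2.0, (right - left) / 2.0))
    return guesses


def gaussian(t, amplitude, position, half_width):
    return amplitude * math.exp(-0.5 * ((t - position) / half_width) ** 2)


# Synthetic, noise-free waveform with two well-separated returns.
bins = [float(i) for i in range(101)]
waveform = [gaussian(t, 1.0, 30.0, 4.0) + gaussian(t, 0.5, 70.0, 6.0) for t in bins]
guesses = initial_gaussians(bins, waveform)
```

On real waveforms the input would first need smoothing, since noise produces spurious sign changes in the second difference; the resulting guesses would then seed a nonlinear least-squares refinement.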
Development of advanced techniques for rotorcraft state estimation and parameter identification

NASA Technical Reports Server (NTRS)

Hall, W. E., Jr.; Bohn, J. G.; Vincent, J. H.

1980-01-01

An integrated methodology for rotorcraft system identification consists of rotorcraft mathematical modeling, three distinct data processing steps, and a technique for designing inputs to improve the identifiability of the data. These elements are as follows: (1) a Kalman filter smoother algorithm which estimates states and sensor errors from error corrupted data. Gust time histories and statistics may also be estimated; (2) a model structure estimation algorithm for isolating a model which adequately explains the data; (3) a maximum likelihood algorithm for estimating the parameters and estimates for the variance of these estimates; and (4) an input design algorithm, based on a maximum likelihood approach, which provides inputs to improve the accuracy of parameter estimates. Each step is discussed with examples to both flight and simulated data cases.
Empirical Likelihood in Nonignorable Covariate-Missing Data Problems.

PubMed

Xie, Yanmei; Zhang, Biao

2017-04-20

Missing covariate data occurs often in regression analysis, which frequently arises in the health and social sciences as well as in survey sampling. We study methods for the analysis of a nonignorable covariate-missing data problem in an assumed conditional mean function when some covariates are completely observed but other covariates are missing for some subjects. We adopt the semiparametric perspective of Bartlett et al. (Improving upon the efficiency of complete case analysis when covariates are MNAR. Biostatistics 2014;15:719-30) on regression analyses with nonignorable missing covariates, in which they have introduced the use of two working models, the working probability model of missingness and the working conditional score model. In this paper, we study an empirical likelihood approach to nonignorable covariate-missing data problems with the objective of effectively utilizing the two working models in the analysis of covariate-missing data. We propose a unified approach to constructing a system of unbiased estimating equations, where there are more equations than unknown parameters of interest. One useful feature of these unbiased estimating equations is that they naturally incorporate the incomplete data into the data analysis, making it possible to seek efficient estimation of the parameter of interest even when the working regression function is not specified to be the optimal regression function. We apply the general methodology of empirical likelihood to optimally combine these unbiased estimating equations. We propose three maximum empirical likelihood estimators of the underlying regression parameters and compare their efficiencies with other existing competitors. We present a simulation study to compare the finite-sample performance of various methods with respect to bias, efficiency, and robustness to model misspecification. 
The proposed empirical likelihood method is also illustrated by an analysis of a data set from the US National Health and Nutrition Examination Survey (NHANES).
A comparison of two indices for the intraclass correlation coefficient.

PubMed

Shieh, Gwowen

2012-12-01

In the present study, we examined the behavior of two indices for measuring the intraclass correlation in the one-way random effects model: the prevailing ICC(1) (Fisher, 1938) and the corrected eta-squared (Bliese & Halverson, 1998). These two procedures differ both in their methods of estimating the variance components that define the intraclass correlation coefficient and in their performance of bias and mean squared error in the estimation of the intraclass correlation coefficient. In contrast with the natural unbiased principle used to construct ICC(1), in the present study it was analytically shown that the corrected eta-squared estimator is identical to the maximum likelihood estimator and the pairwise estimator under equal group sizes. Moreover, the empirical results obtained from the present Monte Carlo simulation study across various group structures revealed the mutual dominance relationship between their truncated versions for negative values. The corrected eta-squared estimator performs better than the ICC(1) estimator when the underlying population intraclass correlation coefficient is small. Conversely, ICC(1) has a clear advantage over the corrected eta-squared for medium and large magnitudes of population intraclass correlation coefficient. The conceptual description and numerical investigation provide guidelines to help researchers choose between the two indices for more accurate reliability analysis in multilevel research.
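For orientation, the ANOVA-based ICC(1) estimator for a balanced one-way random effects design can be sketched as below. This shows only the standard mean-squares form of ICC(1); the corrected eta-squared estimator and the truncation for negative values discussed in the abstract are not reproduced, and the function name and data are illustrative assumptions.

```python
def icc1(groups):
    """ICC(1) = (MSB - MSW) / (MSB + (k - 1) * MSW) for equal group sizes k."""
    n_groups = len(groups)
    k = len(groups[0])
    if n_groups < 2 or k < 2:
        raise ValueError("need at least two groups of at least two observations")
    if any(len(g) != k for g in groups):
        raise ValueError("this sketch assumes equal group sizes")
    group_means = [sum(g) / k for g in groups]
    grand_mean = sum(group_means) / n_groups
    # Between-group and within-group mean squares from one-way ANOVA.
    ms_between = k * sum((m - grand_mean) ** 2 for m in group_means) / (n_groups - 1)
    ms_within = sum(
        (value - m) ** 2 for g, m in zip(groups, group_means) for value in g
    ) / (n_groups * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Illustrative data: three groups of three ratings each.
example_icc = icc1([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
```

Here the between-group mean square is 27 and the within-group mean square is 1, so the estimate is 26/29. The estimator can be negative when the within-group mean square exceeds the between-group one, which is the situation the truncated versions address.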
Robust analysis of semiparametric renewal process models

PubMed Central

Lin, Feng-Chang; Truong, Young K.; Fine, Jason P.

2013-01-01

A rate model is proposed for a modulated renewal process comprising a single long sequence, where the covariate process may not capture the dependencies in the sequence as in standard intensity models. We consider partial likelihood-based inferences under a semiparametric multiplicative rate model, which has been widely studied in the context of independent and identical data. Under an intensity model, gap times in a single long sequence may be used naively in the partial likelihood with variance estimation utilizing the observed information matrix. Under a rate model, the gap times cannot be treated as independent and studying the partial likelihood is much more challenging. We employ a mixing condition in the application of limit theory for stationary sequences to obtain consistency and asymptotic normality. The estimator's variance is quite complicated owing to the unknown gap times dependence structure. We adapt block bootstrapping and cluster variance estimators to the partial likelihood. Simulation studies and an analysis of a semiparametric extension of a popular model for neural spike train data demonstrate the practical utility of the rate approach in comparison with the intensity approach. PMID:24550568
Efficient estimators for likelihood ratio sensitivity indices of complex stochastic dynamics.

PubMed

Arampatzis, Georgios; Katsoulakis, Markos A; Rey-Bellet, Luc

2016-03-14

We demonstrate that centered likelihood ratio estimators for the sensitivity indices of complex stochastic dynamics are highly efficient with low, constant in time variance and consequently they are suitable for sensitivity analysis in long-time and steady-state regimes. These estimators rely on a new covariance formulation of the likelihood ratio that includes as a submatrix a Fisher information matrix for stochastic dynamics and can also be used for fast screening of insensitive parameters and parameter combinations. The proposed methods are applicable to broad classes of stochastic dynamics such as chemical reaction networks, Langevin-type equations and stochastic models in finance, including systems with a high dimensional parameter space and/or disparate decorrelation times between different observables. Furthermore, they are simple to implement as a standard observable in any existing simulation algorithm without additional modifications.
A single-index threshold Cox proportional hazard model for identifying a treatment-sensitive subset based on multiple biomarkers.

PubMed

He, Ye; Lin, Huazhen; Tu, Dongsheng

2018-06-04

In this paper, we introduce a single-index threshold Cox proportional hazard model to select and combine biomarkers to identify patients who may be sensitive to a specific treatment. A penalized smoothed partial likelihood is proposed to estimate the parameters in the model. A simple, efficient, and unified algorithm is presented to maximize this likelihood function. The estimators based on this likelihood function are shown to be consistent and asymptotically normal. Under mild conditions, the proposed estimators also achieve the oracle property. The proposed approach is evaluated through simulation analyses and application to the analysis of data from two clinical trials, one involving patients with locally advanced or metastatic pancreatic cancer and one involving patients with resectable lung cancer. Copyright © 2018 John Wiley & Sons, Ltd.
Efficient estimators for likelihood ratio sensitivity indices of complex stochastic dynamics

NASA Astrophysics Data System (ADS)

Arampatzis, Georgios; Katsoulakis, Markos A.; Rey-Bellet, Luc

2016-03-01

We demonstrate that centered likelihood ratio estimators for the sensitivity indices of complex stochastic dynamics are highly efficient with low, constant in time variance and consequently they are suitable for sensitivity analysis in long-time and steady-state regimes. These estimators rely on a new covariance formulation of the likelihood ratio that includes as a submatrix a Fisher information matrix for stochastic dynamics and can also be used for fast screening of insensitive parameters and parameter combinations. The proposed methods are applicable to broad classes of stochastic dynamics such as chemical reaction networks, Langevin-type equations and stochastic models in finance, including systems with a high dimensional parameter space and/or disparate decorrelation times between different observables. Furthermore, they are simple to implement as a standard observable in any existing simulation algorithm without additional modifications.
Efficient estimators for likelihood ratio sensitivity indices of complex stochastic dynamics

DOE Office of Scientific and Technical Information (OSTI.GOV)

Arampatzis, Georgios; Katsoulakis, Markos A.; Rey-Bellet, Luc

2016-03-14

We demonstrate that centered likelihood ratio estimators for the sensitivity indices of complex stochastic dynamics are highly efficient with low, constant in time variance and consequently they are suitable for sensitivity analysis in long-time and steady-state regimes. These estimators rely on a new covariance formulation of the likelihood ratio that includes as a submatrix a Fisher information matrix for stochastic dynamics and can also be used for fast screening of insensitive parameters and parameter combinations. The proposed methods are applicable to broad classes of stochastic dynamics such as chemical reaction networks, Langevin-type equations and stochastic models in finance, including systems with a high dimensional parameter space and/or disparate decorrelation times between different observables. Furthermore, they are simple to implement as a standard observable in any existing simulation algorithm without additional modifications.
Estimating river discharge uncertainty by applying the Rating Curve Model

NASA Astrophysics Data System (ADS)

Barbetta, S.; Melone, F.; Franchini, M.; Moramarco, T.

2012-04-01

The knowledge of the flow discharge at a river site is necessary for planning and management of water resources as well as for monitoring and real-time forecasting purposes when significant flood events occur. In hydrological practice, operational discharge measurement in medium and large rivers is mostly based on indirect approaches that convert the observed stage into discharge values using steady-flow rating curves. However, the stage-discharge relationship can be unknown for hydrometric sections where flow velocity measurements, particularly during high floods, are not available. To overcome this issue, a simplified approach named the Rating Curve Model (RCM), proposed by Moramarco et al. (Moramarco, T., Barbetta, S., Melone, F. & Singh, V.P., Relating local stage and remote discharge with significant lateral inflow, J. Hydrol. Engng ASCE, 10[1], 58-69, 2005), can be conveniently used. RCM proved able to assess, with a high level of accuracy, the discharge hydrograph at a river site where only the stage is monitored while the flow is recorded at a different section along the river, even when significant lateral flows occur. The simple structure of the model depends on three parameters, two of which can be considered characteristic of the river reach and one of the wave travel time of floods. Considering that RCM lends itself well to predicting the stage-discharge relationship at a river site where only stages are recorded, an uncertainty analysis of the river discharge estimate is of clear interest for hydrological practice. To this aim, the uncertainty characterizing the RCM outcomes is addressed in this work by considering two different procedures based on the Monte Carlo approach and the Generalized Likelihood Uncertainty Estimation (GLUE) method, respectively. The statistical distribution of parameters is found and a random re-sampling of parameters is done for assessing the 90% confidence interval (CI) of discharge estimates.
In particular, for the latter approach the Nash-Sutcliffe coefficient is used as the likelihood measure. Two equipped river reaches of the Upper-Middle Tiber River basin, central Italy, are investigated as case studies. The results provided by the selected methodologies are discussed and compared, showing that all the computed CIs are satisfactory in terms of the percentage of included observed discharges, with similar percentages characterizing the bands assessed by both the Monte Carlo approach and the GLUE procedure.
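A GLUE-style analysis with the Nash-Sutcliffe coefficient as an informal likelihood measure can be sketched as follows. This is a generic illustration under stated assumptions: the power-law stage-discharge model is a hypothetical stand-in (it is not the RCM), and the uniform parameter ranges, the behavioural threshold of 0.5, and all function names are chosen for the example only.

```python
import random


def nash_sutcliffe(observed, simulated):
    """NSE = 1 - SSE / (sum of squared deviations of observations from their mean)."""
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    sst = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - sse / sst


def weighted_quantile(values, weights, q):
    """Smallest value whose cumulative normalized weight reaches q."""
    total = sum(weights)
    cumulative = 0.0
    ordered = sorted(zip(values, weights))
    for value, weight in ordered:
        cumulative += weight / total
        if cumulative >= q:
            return value
    return ordered[-1][0]


def glue_bounds(model, sample_params, inputs, observed,
                n_samples=2000, threshold=0.5, seed=0):
    """Return lower/upper 90% bands from behavioural (NSE > threshold) runs."""
    rng = random.Random(seed)
    runs = []
    for _ in range(n_samples):
        params = sample_params(rng)
        simulated = [model(value, params) for value in inputs]
        score = nash_sutcliffe(observed, simulated)
        if score > threshold:
            runs.append((score, simulated))
    if not runs:
        raise ValueError("no behavioural parameter sets; relax the threshold")
    weights = [score for score, _ in runs]
    lower, upper = [], []
    for t in range(len(inputs)):
        values = [simulated[t] for _, simulated in runs]
        lower.append(weighted_quantile(values, weights, 0.05))
        upper.append(weighted_quantile(values, weights, 0.95))
    return lower, upper


# Hypothetical power-law model Q = a * h**b, used only as a toy example.
def toy_model(stage, params):
    a, b = params
    return a * stage ** b


def toy_prior(rng):
    return rng.uniform(5.0, 15.0), rng.uniform(1.0, 2.0)


stages = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
observed_q = [10.0 * h ** 1.5 for h in stages]
lower, upper = glue_bounds(toy_model, toy_prior, stages, observed_q)
```

The fraction of observed discharges falling between `lower` and `upper` is the kind of coverage percentage the abstract uses to compare the two procedures.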
Empirical likelihood based detection procedure for change point in mean residual life functions under random censorship.

PubMed

Chen, Ying-Ju; Ning, Wei; Gupta, Arjun K

2016-05-01

The mean residual life (MRL) function is one of the basic parameters of interest in survival analysis that describes the expected remaining time of an individual after a certain age. The study of changes in the MRL function is practical and interesting because it may help us to identify some factors such as age and gender that may influence the remaining lifetimes of patients after receiving a certain surgery. In this paper, we propose a detection procedure based on the empirical likelihood for the changes in MRL functions with right censored data. Two real examples are also given: Veterans' administration lung cancer study and Stanford heart transplant to illustrate the detecting procedure. Copyright © 2016 John Wiley & Sons, Ltd.
Approximate likelihood approaches for detecting the influence of primordial gravitational waves in cosmic microwave background polarization

NASA Astrophysics Data System (ADS)

Pan, Zhen; Anderes, Ethan; Knox, Lloyd

2018-05-01

One of the major targets for next-generation cosmic microwave background (CMB) experiments is the detection of the primordial B-mode signal. Planning is under way for Stage-IV experiments that are projected to have instrumental noise small enough to make lensing and foregrounds the dominant source of uncertainty for estimating the tensor-to-scalar ratio r from polarization maps. This makes delensing a crucial part of future CMB polarization science. In this paper we present a likelihood method for estimating the tensor-to-scalar ratio r from CMB polarization observations, which combines the benefits of a full-scale likelihood approach with the tractability of the quadratic delensing technique. This method is a pixel space, all order likelihood analysis of the quadratic delensed B modes, and it essentially builds upon the quadratic delenser by taking into account all order lensing and pixel space anomalies. Its tractability relies on a crucial factorization of the pixel space covariance matrix of the polarization observations which allows one to compute the full Gaussian approximate likelihood profile, as a function of r , at the same computational cost of a single likelihood evaluation.
Effect of formal and informal likelihood functions on uncertainty assessment in a single event rainfall-runoff model

NASA Astrophysics Data System (ADS)

Nourali, Mahrouz; Ghahraman, Bijan; Pourreza-Bilondi, Mohsen; Davary, Kamran

2016-09-01

In the present study, DREAM(ZS), Differential Evolution Adaptive Metropolis combined with both formal and informal likelihood functions, is used to investigate the uncertainty of parameters of the HEC-HMS model in the Tamar watershed, Golestan province, Iran. In order to assess the uncertainty of 24 parameters used in HMS, three flood events were used to calibrate and one flood event was used to validate the posterior distributions. Moreover, the performance of seven different likelihood functions (L1-L7) was assessed by means of the DREAM(ZS) approach. Four likelihood functions, L1-L4 (Nash-Sutcliffe (NS) efficiency, normalized absolute error (NAE), index of agreement (IOA), and Chiew-McMahon efficiency (CM)), are considered informal, whereas the remaining three (L5-L7) fall in the formal category. L5 focuses on the relationship between traditional least squares fitting and Bayesian inference, and L6 is a heteroscedastic maximum likelihood error (HMLE) estimator. Finally, in likelihood function L7, serial dependence of residual errors is accounted for using a first-order autoregressive (AR) model of the residuals. According to the results, sensitivities of the parameters strongly depend on the likelihood function and vary for different likelihood functions. Most of the parameters were better defined by formal likelihood functions L5 and L7 and showed a high sensitivity to model performance. Posterior cumulative distributions corresponding to the informal likelihood functions L1, L2, L3, L4 and the formal likelihood function L6 are approximately the same for most of the sub-basins, and these likelihood functions have an almost identical effect on the sensitivity of parameters. The 95% total prediction uncertainty bounds bracketed most of the observed data. 
Considering all the statistical indicators and criteria of uncertainty assessment, including RMSE, KGE, NS, P-factor and R-factor, results showed that the DREAM(ZS) algorithm performed better under formal likelihood functions L5 and L7, but likelihood function L5 may result in biased and unreliable estimation of parameters due to violation of the residual error assumptions. Thus, likelihood function L7 provides a credible posterior distribution of model parameters and can therefore be employed for further applications.
The effect of high leverage points on the logistic ridge regression estimator having multicollinearity

NASA Astrophysics Data System (ADS)

Ariffin, Syaiba Balqish; Midi, Habshah

2014-06-01

This article is concerned with the performance of the logistic ridge regression estimation technique in the presence of multicollinearity and high leverage points. In logistic regression, multicollinearity exists among predictors and in the information matrix. The maximum likelihood estimator suffers a huge setback in the presence of multicollinearity, which causes regression estimates to have unduly large standard errors. To remedy this problem, a logistic ridge regression estimator is put forward. It is evident that the logistic ridge regression estimator outperforms the maximum likelihood approach for handling multicollinearity. The effect of high leverage points on the performance of the logistic ridge regression estimator is then investigated through a real data set and a simulation study. The findings signify that the logistic ridge regression estimator fails to provide better parameter estimates in the presence of both high leverage points and multicollinearity.
Predicting Rotator Cuff Tears Using Data Mining and Bayesian Likelihood Ratios

PubMed Central

Lu, Hsueh-Yi; Huang, Chen-Yuan; Su, Chwen-Tzeng; Lin, Chen-Chiang

2014-01-01

Objectives Rotator cuff tear is a common cause of shoulder diseases. Correct diagnosis of rotator cuff tears can save patients from further invasive, costly and painful tests. This study used predictive data mining and Bayesian theory to improve the accuracy of diagnosing rotator cuff tears by clinical examination alone. Methods In this retrospective study, 169 patients who had a preliminary diagnosis of rotator cuff tear on the basis of clinical evaluation followed by confirmatory MRI between 2007 and 2011 were identified. MRI was used as a reference standard to classify rotator cuff tears. The predictor variable was the clinical assessment results, which consisted of 16 attributes. This study employed 2 data mining methods (ANN and the decision tree) and a statistical method (logistic regression) to classify the rotator cuff diagnosis into “tear” and “no tear” groups. Likelihood ratio and Bayesian theory were applied to estimate the probability of rotator cuff tears based on the results of the prediction models. Results Our proposed data mining procedures outperformed the classic statistical method. The correction rate, sensitivity, specificity and area under the ROC curve of predicting a rotator cuff tear were statistically better in the ANN and decision tree models compared to logistic regression. Based on likelihood ratios derived from our prediction models, Fagan's nomogram could be constructed to assess the probability that a patient has a rotator cuff tear using a pretest probability and a prediction result (tear or no tear). Conclusions Our predictive data mining models, combined with likelihood ratios and Bayesian theory, appear to be good tools to classify rotator cuff tears as well as determine the probability of the presence of the disease to enhance diagnostic decision making for rotator cuff tears. PMID:24733553
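The arithmetic that Fagan's nomogram performs graphically can be sketched in a few lines: convert the pretest probability to odds, multiply by the likelihood ratio, and convert back. The sensitivity, specificity, and pretest probability below are illustrative values only, not results from the study, and the function names are hypothetical.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) for a binary test or classifier."""
    lr_positive = sensitivity / (1.0 - specificity)
    lr_negative = (1.0 - sensitivity) / specificity
    return lr_positive, lr_negative


def posttest_probability(pretest_probability, likelihood_ratio):
    """Convert probability to odds, apply the likelihood ratio, convert back."""
    pretest_odds = pretest_probability / (1.0 - pretest_probability)
    posttest_odds = pretest_odds * likelihood_ratio
    return posttest_odds / (1.0 + posttest_odds)


# Illustrative numbers only (not taken from the study).
lr_pos, lr_neg = likelihood_ratios(sensitivity=0.9, specificity=0.8)
after_positive = posttest_probability(0.5, lr_pos)  # model predicts "tear"
after_negative = posttest_probability(0.5, lr_neg)  # model predicts "no tear"
```

A likelihood ratio of 1 leaves the pretest probability unchanged, which is a quick sanity check on any implementation.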
Race Differences in Cardiac Catheterization: The Role of Social Contextual Variables

PubMed Central

Kressin, Nancy R.

2010-01-01

BACKGROUND Race differences in the receipt of invasive cardiac procedures are well-documented but the etiology remains poorly understood. OBJECTIVE We examined how social contextual variables were related to race differences in the likelihood of receiving cardiac catheterization in a sample of veterans who were recommended to undergo the procedure by a physician. DESIGN Prospective observational cohort study. PARTICIPANTS A subsample from a study examining race disparities in cardiac catheterization of 48 Black/African American and 189 White veterans who were recommended by a physician to undergo cardiac catheterization. MEASURES We assessed social contextual variables (e.g., knowing somebody who had the procedure, being encouraged by family or friends), clinical variables (e.g., hypertension, maximal medical therapy), and if participants received cardiac catheterization at any point during the study. KEY RESULTS Blacks/African Americans were less likely to undergo cardiac catheterization compared to Whites even after controlling for age, education, and clinical variables (OR = 0.31; 95% CI, 0.13, 0.75). After controlling for demographic and clinical variables, three social contextual variables were significantly related to increased likelihood of receiving catheterization: knowing someone who had undergone the procedure (OR = 3.14; 95% CI, 1.70, 8.74), social support (OR = 2.05; 95% CI, 1.17, 2.78), and being encouraged by family to have procedure (OR = 1.45; 95% CI, 1.08, 1.90). After adding the social contextual variables, race was no longer significantly related to the likelihood of receiving catheterization, thus suggesting that social context plays an important role in the relationship between race and cardiac catheterization. CONCLUSIONS Our results suggest that social contextual factors are related to the likelihood of receiving recommended care. 
In addition, accounting for these relationships attenuated the observed race disparities between Whites and Blacks/African Americans who were recommended to undergo cardiac catheterization by their physicians. PMID:20383600
Estimation of the Arrival Time and Duration of a Radio Signal with Unknown Amplitude and Initial Phase

NASA Astrophysics Data System (ADS)

Trifonov, A. P.; Korchagin, Yu. E.; Korol'kov, S. V.

2018-05-01

We synthesize the quasi-likelihood, maximum-likelihood, and quasioptimal algorithms for estimating the arrival time and duration of a radio signal with unknown amplitude and initial phase. The discrepancies between the hardware and software realizations of the estimation algorithm are shown. The characteristics of the synthesized-algorithm operation efficiency are obtained. Asymptotic expressions for the biases, variances, and the correlation coefficient of the arrival-time and duration estimates, which hold true for large signal-to-noise ratios, are derived. The accuracy losses of the estimates of the radio-signal arrival time and duration because of the a priori ignorance of the amplitude and initial phase are determined.

Species delimitation using Bayes factors: simulations and application to the Sceloporus scalaris species group (Squamata: Phrynosomatidae).

PubMed

Grummer, Jared A; Bryson, Robert W; Reeder, Tod W

2014-03-01

Current molecular methods of species delimitation are limited by the types of species delimitation models and scenarios that can be tested. Bayes factors allow for more flexibility in testing non-nested species delimitation models and hypotheses of individual assignment to alternative lineages. Here, we examined the efficacy of Bayes factors in delimiting species through simulations and empirical data from the Sceloporus scalaris species group. Marginal-likelihood scores of competing species delimitation models, from which Bayes factor values were compared, were estimated with four different methods: harmonic mean estimation (HME), smoothed harmonic mean estimation (sHME), path-sampling/thermodynamic integration (PS), and stepping-stone (SS) analysis. We also performed model selection using a posterior simulation-based analog of the Akaike information criterion through Markov chain Monte Carlo analysis (AICM). Bayes factor species delimitation results from the empirical data were then compared with results from the reversible-jump MCMC (rjMCMC) coalescent-based species delimitation method Bayesian Phylogenetics and Phylogeography (BP&P). Simulation results show that HME and sHME perform poorly compared with PS and SS marginal-likelihood estimators when identifying the true species delimitation model. Furthermore, Bayes factor delimitation (BFD) of species showed improved performance when species limits are tested by reassigning individuals between species, as opposed to either lumping or splitting lineages. In the empirical data, BFD through PS and SS analyses, as well as the rjMCMC method, each provide support for the recognition of all scalaris group taxa as independent evolutionary lineages. Bayes factor species delimitation and BP&P also support the recognition of three previously undescribed lineages. 
In both simulated and empirical data sets, harmonic and smoothed harmonic mean marginal-likelihood estimators provided much higher marginal-likelihood estimates than PS and SS estimators. The AICM displayed poor repeatability in both simulated and empirical data sets, and produced inconsistent model rankings across replicate runs with the empirical data. Our results suggest that species delimitation through the use of Bayes factors with marginal-likelihood estimates via PS or SS analyses provides a useful and complementary alternative to existing species delimitation methods.
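For readers unfamiliar with the harmonic mean estimator (HME) that this record reports as performing poorly, the sketch below shows what it computes: the harmonic mean of the likelihood values of posterior samples, evaluated in log space for numerical stability, together with the usual 2 ln Bayes factor. It is a minimal sketch that assumes the input is a list of log-likelihoods of MCMC samples from the posterior; the function names are hypothetical, and it illustrates the estimator rather than recommending it.

```python
import math


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    peak = max(values)
    return peak + math.log(sum(math.exp(v - peak) for v in values))


def log_harmonic_mean_estimate(log_likelihoods):
    """Log marginal likelihood via the harmonic mean of posterior-sample likelihoods."""
    n = len(log_likelihoods)
    return math.log(n) - log_sum_exp([-ll for ll in log_likelihoods])


def two_ln_bayes_factor(log_marginal_a, log_marginal_b):
    """2 ln BF of model A over model B; positive values favour A."""
    return 2.0 * (log_marginal_a - log_marginal_b)
```

Path sampling and stepping-stone estimators need likelihoods from a series of tempered chains and are not covered by this sketch.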
Estimating population genetic parameters and comparing model goodness-of-fit using DNA sequences with error

PubMed Central

Liu, Xiaoming; Fu, Yun-Xin; Maxwell, Taylor J.; Boerwinkle, Eric

2010-01-01

It is known that sequencing error can bias estimation of evolutionary or population genetic parameters. This problem is more prominent in deep resequencing studies because of their large sample size n, and a higher probability of error at each nucleotide site. We propose a new method based on the composite likelihood of the observed SNP configurations to infer population mutation rate θ = 4Neμ, population exponential growth rate R, and error rate ɛ, simultaneously. Using simulation, we show the combined effects of the parameters, θ, n, ɛ, and R on the accuracy of parameter estimation. We compared our maximum composite likelihood estimator (MCLE) of θ with other θ estimators that take into account the error. The results show the MCLE performs well when the sample size is large or the error rate is high. Using parametric bootstrap, composite likelihood can also be used as a statistic for testing the model goodness-of-fit of the observed DNA sequences. The MCLE method is applied to sequence data on the ANGPTL4 gene in 1832 African American and 1045 European American individuals. PMID:19952140
INFERRING THE ECCENTRICITY DISTRIBUTION

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hogg, David W.; Bovy, Jo; Myers, Adam D., E-mail: david.hogg@nyu.ed

2010-12-20

Standard maximum-likelihood estimators for binary-star and exoplanet eccentricities are biased high, in the sense that the estimated eccentricity tends to be larger than the true eccentricity. As with most non-trivial observables, a simple histogram of estimated eccentricities is not a good estimate of the true eccentricity distribution. Here, we develop and test a hierarchical probabilistic method for performing the relevant meta-analysis, that is, inferring the true eccentricity distribution, taking as input the likelihood functions for the individual star eccentricities, or samplings of the posterior probability distributions for the eccentricities (under a given, uninformative prior). The method is a simple implementation of a hierarchical Bayesian model; it can also be seen as a kind of heteroscedastic deconvolution. It can be applied to any quantity measured with finite precision (other orbital parameters, or indeed any astronomical measurements of any kind, including magnitudes, distances, or photometric redshifts), so long as the measurements have been communicated as a likelihood function or a posterior sampling.
Precision Parameter Estimation and Machine Learning

NASA Astrophysics Data System (ADS)

Wandelt, Benjamin D.

2008-12-01

I discuss the strategy of "Acceleration by Parallel Precomputation and Learning" (APPLe) that can vastly accelerate parameter estimation in high-dimensional parameter spaces and costly likelihood functions, using trivially parallel computing to speed up sequential exploration of parameter space. This strategy combines the power of distributed computing with machine learning and Markov-Chain Monte Carlo techniques efficiently to explore a likelihood function, posterior distribution or χ²-surface. This strategy is particularly successful in cases where computing the likelihood is costly and the number of parameters is moderate or large. We apply this technique to two central problems in cosmology: the solution of the cosmological parameter estimation problem with sufficient accuracy for the Planck data using PICo; and the detailed calculation of cosmological helium and hydrogen recombination with RICO. Since the APPLe approach is designed to be able to use massively parallel resources to speed up problems that are inherently serial, we can bring the power of distributed computing to bear on parameter estimation problems. We have demonstrated this with the Cosmology@Home project.
Methods for estimating drought streamflow probabilities for Virginia streams

USGS Publications Warehouse

Austin, Samuel H.

2014-01-01

Maximum likelihood logistic regression model equations used to estimate drought flow probabilities for Virginia streams are presented for 259 hydrologic basins in Virginia. Winter streamflows were used to estimate the likelihood of streamflows during the subsequent drought-prone summer months. The maximum likelihood logistic regression models identify probable streamflows from 5 to 8 months in advance. More than 5 million streamflow daily values collected over the period of record (January 1, 1900 through May 16, 2012) were compiled and analyzed over a minimum 10-year (maximum 112-year) period of record. The analysis yielded the 46,704 equations with statistically significant fit statistics and parameter ranges published in two tables in this report. These model equations produce summer month (July, August, and September) drought flow threshold probabilities as a function of streamflows during the previous winter months (November, December, January, and February). Example calculations are provided, demonstrating how to use the equations to estimate probable streamflows as much as 8 months in advance.
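A logistic regression equation of the kind this record describes maps a predictor to a probability through the logistic function. The sketch below shows that mapping for a single winter-streamflow predictor; the coefficients, units, and the single-predictor linear form are hypothetical and chosen only to show the shape of the curve, since the published equations and their exact functional form are given in the report's tables rather than here.

```python
import math


def drought_probability(intercept, slope, winter_flow):
    """Logistic model: P = 1 / (1 + exp(-(intercept + slope * winter_flow)))."""
    linear_predictor = intercept + slope * winter_flow
    return 1.0 / (1.0 + math.exp(-linear_predictor))


# Hypothetical coefficients, not values from the report.
low_winter_flow = drought_probability(intercept=2.0, slope=-0.05, winter_flow=20.0)
high_winter_flow = drought_probability(intercept=2.0, slope=-0.05, winter_flow=80.0)
```

With a negative slope, as assumed here, lower winter streamflow gives a higher probability that summer flow falls below the drought threshold.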
Massive optimal data compression and density estimation for scalable, likelihood-free inference in cosmology

NASA Astrophysics Data System (ADS)

Alsing, Justin; Wandelt, Benjamin; Feeney, Stephen

2018-07-01

Many statistical models in cosmology can be simulated forwards but have intractable likelihood functions. Likelihood-free inference methods allow us to perform Bayesian inference from these models using only forward simulations, free from any likelihood assumptions or approximations. Likelihood-free inference generically involves simulating mock data and comparing to the observed data; this comparison in data space suffers from the curse of dimensionality and requires compression of the data to a small number of summary statistics to be tractable. In this paper, we use massive asymptotically optimal data compression to reduce the dimensionality of the data space to just one number per parameter, providing a natural and optimal framework for summary statistic choice for likelihood-free inference. Secondly, we present the first cosmological application of Density Estimation Likelihood-Free Inference (DELFI), which learns a parametrized model for joint distribution of data and parameters, yielding both the parameter posterior and the model evidence. This approach is conceptually simple, requires less tuning than traditional Approximate Bayesian Computation approaches to likelihood-free inference and can give high-fidelity posteriors from orders of magnitude fewer forward simulations. As an additional bonus, it enables parameter inference and Bayesian model comparison simultaneously. We demonstrate DELFI with massive data compression on an analysis of the joint light-curve analysis supernova data, as a simple validation case study. We show that high-fidelity posterior inference is possible for full-scale cosmological data analyses with as few as ~10^4 simulations, with substantial scope for further improvement, demonstrating the scalability of likelihood-free inference to large and complex cosmological data sets.
Effects of time-shifted data on flight determined stability and control derivatives

NASA Technical Reports Server (NTRS)

Steers, S. T.; Iliff, K. W.

1975-01-01

Flight data were shifted in time by various increments to assess the effects of time shifts on estimates of stability and control derivatives produced by a maximum likelihood estimation method. Derivatives could be extracted from flight data with the maximum likelihood estimation method even if there was a considerable time shift in the data. Time shifts degraded the estimates of the derivatives, but the degradation was in a consistent rather than a random pattern. Time shifts in the control variables caused the most degradation, and the lateral-directional rotary derivatives were affected the most by time shifts in any variable.
Beyond valence in the perception of likelihood: the role of emotion specificity.

PubMed

DeSteno, D; Petty, R E; Wegener, D T; Rucker, D D

2000-03-01

Positive and negative moods have been shown to increase likelihood estimates of future events matching these states in valence (e.g., E. J. Johnson & A. Tversky, 1983). In the present article, 4 studies provide evidence that this congruency bias (a) is not limited to valence but functions in an emotion-specific manner, (b) derives from the informational value of emotions, and (c) is not the inevitable outcome of likelihood assessment under heightened emotion. Specifically, Study 1 demonstrates that sadness and anger, 2 distinct, negative emotions, differentially bias likelihood estimates of sad and angering events. Studies 2 and 3 replicate this finding in addition to supporting an emotion-as-information (cf. N. Schwarz & G. L. Clore, 1983), as opposed to a memory-based, mediating process for the bias. Finally, Study 4 shows that when the source of the emotion is salient, a reversal of the bias can occur given greater cognitive effort aimed at accuracy.
Pointwise nonparametric maximum likelihood estimator of stochastically ordered survivor functions

PubMed Central

Park, Yongseok; Taylor, Jeremy M. G.; Kalbfleisch, John D.

2012-01-01

In this paper, we consider estimation of survivor functions from groups of observations with right-censored data when the groups are subject to a stochastic ordering constraint. Many methods and algorithms have been proposed to estimate distribution functions under such restrictions, but none have completely satisfactory properties when the observations are censored. We propose a pointwise constrained nonparametric maximum likelihood estimator, which is defined at each time t by the estimates of the survivor functions subject to constraints applied at time t only. We also propose an efficient method to obtain the estimator. The estimator of each constrained survivor function is shown to be nonincreasing in t, and its consistency and asymptotic distribution are established. A simulation study suggests better small and large sample properties than for alternative estimators. An example using prostate cancer data illustrates the method. PMID:23843661
Closed-loop carrier phase synchronization techniques motivated by likelihood functions

NASA Technical Reports Server (NTRS)

Tsou, H.; Hinedi, S.; Simon, M.

1994-01-01

This article reexamines the notion of closed-loop carrier phase synchronization motivated by the theory of maximum a posteriori phase estimation with emphasis on the development of new structures based on both maximum-likelihood and average-likelihood functions. The criterion of performance used for comparison of all the closed-loop structures discussed is the mean-squared phase error for a fixed-loop bandwidth.
Approximate likelihood calculation on a phylogeny for Bayesian estimation of divergence times.

PubMed

dos Reis, Mario; Yang, Ziheng

2011-07-01

The molecular clock provides a powerful way to estimate species divergence times. If information on some species divergence times is available from the fossil or geological record, it can be used to calibrate a phylogeny and estimate divergence times for all nodes in the tree. The Bayesian method provides a natural framework to incorporate different sources of information concerning divergence times, such as information in the fossil and molecular data. Current models of sequence evolution are intractable in a Bayesian setting, and Markov chain Monte Carlo (MCMC) is used to generate the posterior distribution of divergence times and evolutionary rates. This method is computationally expensive, as it involves the repeated calculation of the likelihood function. Here, we explore the use of Taylor expansion to approximate the likelihood during MCMC iteration. The approximation is much faster than conventional likelihood calculation. However, the approximation is expected to be poor when the proposed parameters are far from the likelihood peak. We explore the use of parameter transforms (square root, logarithm, and arcsine) to improve the approximation to the likelihood curve. We found that the new methods, particularly the arcsine-based transform, provided very good approximations under relaxed clock models and also under the global clock model when the global clock is not seriously violated. The approximation is poorer for analysis under the global clock when the global clock is seriously wrong and should thus not be used. The results suggest that the approximate method may be useful for Bayesian dating analysis using large data sets.
Maximum likelihood estimation of signal detection model parameters for the assessment of two-stage diagnostic strategies.

PubMed

Lirio, R B; Dondériz, I C; Pérez Abalo, M C

1992-08-01

The methodology of Receiver Operating Characteristic curves based on the signal detection model is extended to evaluate the accuracy of two-stage diagnostic strategies. A computer program is developed for the maximum likelihood estimation of parameters that characterize the sensitivity and specificity of two-stage classifiers according to this extended methodology. Its use is briefly illustrated with data collected in a two-stage screening for auditory defects.
Computing Maximum Likelihood Estimates of Loglinear Models from Marginal Sums with Special Attention to Loglinear Item Response Theory. [Project Psychometric Aspects of Item Banking No. 53.] Research Report 91-1.

ERIC Educational Resources Information Center

Kelderman, Henk

In this paper, algorithms are described for obtaining the maximum likelihood estimates of the parameters in log-linear models. Modified versions of the iterative proportional fitting and Newton-Raphson algorithms are described that work on the minimal sufficient statistics rather than on the usual counts in the full contingency table. This is…
Model selection and parameter estimation in structural dynamics using approximate Bayesian computation

NASA Astrophysics Data System (ADS)

Ben Abdessalem, Anis; Dervilis, Nikolaos; Wagg, David; Worden, Keith

2018-01-01

This paper will introduce the use of the approximate Bayesian computation (ABC) algorithm for model selection and parameter estimation in structural dynamics. ABC is a likelihood-free method typically used when the likelihood function is either intractable or cannot be approached in a closed form. To circumvent the evaluation of the likelihood function, simulation from a forward model is at the core of the ABC algorithm. The algorithm offers the possibility to use different metrics and summary statistics representative of the data to carry out Bayesian inference. The efficacy of the algorithm in structural dynamics is demonstrated through three different illustrative examples of nonlinear system identification: cubic and cubic-quintic models, the Bouc-Wen model and the Duffing oscillator. The obtained results suggest that ABC is a promising alternative to deal with model selection and parameter estimation issues, specifically for systems with complex behaviours.
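The core of ABC described in this record, replacing likelihood evaluation with forward simulation and a distance between summary statistics, can be sketched with the simplest rejection variant. The toy forward model (normal data with unknown mean), the flat prior, the sample mean as summary statistic, and the tolerance are all assumptions for illustration; the paper's structural dynamics models and its specific ABC algorithm are not reproduced here.

```python
import random
import statistics


def abc_rejection(observed, simulate, summary, prior_draw, tolerance, n_draws, rng):
    """Keep prior draws whose simulated summary lies within tolerance of the observed one."""
    target = summary(observed)
    accepted = []
    for _ in range(n_draws):
        theta = prior_draw(rng)
        simulated = simulate(theta, len(observed), rng)
        if abs(summary(simulated) - target) <= tolerance:
            accepted.append(theta)
    return accepted


def simulate_normal(theta, size, rng):
    """Toy forward model: independent normal draws with mean theta and unit variance."""
    return [rng.gauss(theta, 1.0) for _ in range(size)]


def uniform_prior(rng):
    """Flat prior on [-5, 5]."""
    return rng.uniform(-5.0, 5.0)


rng = random.Random(0)
observed = simulate_normal(1.5, 20, rng)
posterior_sample = abc_rejection(
    observed, simulate_normal, statistics.mean, uniform_prior,
    tolerance=0.2, n_draws=5000, rng=rng,
)
```

The accepted values approximate the posterior of the mean; tightening the tolerance improves the approximation at the cost of a lower acceptance rate, which is the trade-off that more elaborate ABC schemes try to manage.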
Aircraft parameter estimation

NASA Technical Reports Server (NTRS)

Iliff, Kenneth W.

1987-01-01

The aircraft parameter estimation problem is used to illustrate the utility of parameter estimation, which applies to many engineering and scientific fields. Maximum likelihood estimation has been used to extract stability and control derivatives from flight data for many years. This paper presents some of the basic concepts of aircraft parameter estimation and briefly surveys the literature in the field. The maximum likelihood estimator is discussed, and the basic concepts of minimization and estimation are examined for a simple simulated aircraft example. The cost functions that are to be minimized during estimation are defined and discussed. Graphic representations of the cost functions are given to illustrate the minimization process. Finally, the basic concepts are generalized, and estimation from flight data is discussed. Some of the major conclusions for the simulated example are also developed for the analysis of flight data from the F-14, highly maneuverable aircraft technology (HiMAT), and space shuttle vehicles.
Estimation of submarine mass failure probability from a sequence of deposits with age dates

USGS Publications Warehouse

Geist, Eric L.; Chaytor, Jason D.; Parsons, Thomas E.; ten Brink, Uri S.

2013-01-01

The empirical probability of submarine mass failure is quantified from a sequence of dated mass-transport deposits. Several different techniques are described to estimate the parameters for a suite of candidate probability models. The techniques, previously developed for analyzing paleoseismic data, include maximum likelihood and Type II (Bayesian) maximum likelihood methods derived from renewal process theory and Monte Carlo methods. The estimated mean return time from these methods, unlike estimates from a simple arithmetic mean of the center age dates and standard likelihood methods, includes the effects of age-dating uncertainty and of open time intervals before the first and after the last event. The likelihood techniques are evaluated using Akaike’s Information Criterion (AIC) and Akaike’s Bayesian Information Criterion (ABIC) to select the optimal model. The techniques are applied to mass transport deposits recorded in two Integrated Ocean Drilling Program (IODP) drill sites located in the Ursa Basin, northern Gulf of Mexico. Dates of the deposits were constrained by regional bio- and magnetostratigraphy from a previous study. Results of the analysis indicate that submarine mass failures in this location occur primarily according to a Poisson process in which failures are independent and return times follow an exponential distribution. However, some of the model results suggest that submarine mass failures may occur quasiperiodically at one of the sites (U1324). The suite of techniques described in this study provides quantitative probability estimates of submarine mass failure occurrence, for any number of deposits and age uncertainty distributions.
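For the Poisson case mentioned in this record, where return times follow an exponential distribution, the maximum likelihood rate has a closed form, and an open interval after the last event simply adds exposure time without adding an event. The sketch below shows that calculation and the corresponding AIC. It is a simplified illustration that ignores age-dating uncertainty, which the study's methods account for; the interval values and function names are hypothetical.

```python
import math


def exponential_rate_mle(intervals, open_time=0.0):
    """MLE of the Poisson rate: event count over total exposure time.

    open_time is the event-free span after the last dated event; it adds
    exposure but no event (a right-censored interval).
    """
    return len(intervals) / (sum(intervals) + open_time)


def exponential_log_likelihood(rate, intervals, open_time=0.0):
    """Log-likelihood of complete intervals plus one censored open interval."""
    return len(intervals) * math.log(rate) - rate * (sum(intervals) + open_time)


def aic(log_likelihood, n_parameters):
    """Akaike's Information Criterion: 2k - 2 ln L (lower is better)."""
    return 2.0 * n_parameters - 2.0 * log_likelihood


# Hypothetical return times (arbitrary time units), not data from the study.
intervals = [2.0, 4.0, 6.0]
rate = exponential_rate_mle(intervals, open_time=3.0)
mean_return_time = 1.0 / rate
model_aic = aic(exponential_log_likelihood(rate, intervals, open_time=3.0), 1)
```

Comparing this AIC with that of a quasiperiodic renewal model fitted to the same intervals is one way to carry out the kind of model selection the record describes.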
Fast and accurate estimation of the covariance between pairwise maximum likelihood distances.

PubMed

Gil, Manuel

2014-01-01

Pairwise evolutionary distances are a model-based summary statistic for a set of molecular sequences. They represent the leaf-to-leaf path lengths of the underlying phylogenetic tree. Estimates of pairwise distances with overlapping paths covary because of shared mutation events. It is desirable to take this covariance structure into account to increase precision in any process that compares or combines distances. This paper introduces a fast estimator for the covariance of two pairwise maximum likelihood distances, estimated under general Markov models. The estimator is based on a conjecture (going back to Nei & Jin, 1989) which links the covariance to path lengths. It is proven here under a simple symmetric substitution model. A simulation shows that the estimator outperforms previously published ones in terms of the mean squared error.
Fast and accurate estimation of the covariance between pairwise maximum likelihood distances

PubMed Central

2014-01-01

Pairwise evolutionary distances are a model-based summary statistic for a set of molecular sequences. They represent the leaf-to-leaf path lengths of the underlying phylogenetic tree. Estimates of pairwise distances with overlapping paths covary because of shared mutation events. It is desirable to take this covariance structure into account to increase precision in any process that compares or combines distances. This paper introduces a fast estimator for the covariance of two pairwise maximum likelihood distances, estimated under general Markov models. The estimator is based on a conjecture (going back to Nei & Jin, 1989) which links the covariance to path lengths. It is proven here under a simple symmetric substitution model. A simulation shows that the estimator outperforms previously published ones in terms of the mean squared error. PMID:25279263
Technical Note: Approximate Bayesian parameterization of a process-based tropical forest model

NASA Astrophysics Data System (ADS)

Hartig, F.; Dislich, C.; Wiegand, T.; Huth, A.

2014-02-01

Inverse parameter estimation of process-based models is a long-standing problem in many scientific disciplines. A key question for inverse parameter estimation is how to define the metric that quantifies how well model predictions fit to the data. This metric can be expressed by general cost or objective functions, but statistical inversion methods require a particular metric, the probability of observing the data given the model parameters, known as the likelihood. For technical and computational reasons, likelihoods for process-based stochastic models are usually based on general assumptions about variability in the observed data, and not on the stochasticity generated by the model. Only in recent years have new methods become available that allow the generation of likelihoods directly from stochastic simulations. Previous applications of these approximate Bayesian methods have concentrated on relatively simple models. Here, we report on the application of a simulation-based likelihood approximation for FORMIND, a parameter-rich individual-based model of tropical forest dynamics. We show that approximate Bayesian inference, based on a parametric likelihood approximation placed in a conventional Markov chain Monte Carlo (MCMC) sampler, performs well in retrieving known parameter values from virtual inventory data generated by the forest model. We analyze the results of the parameter estimation, examine its sensitivity to the choice and aggregation of model outputs and observed data (summary statistics), and demonstrate the application of this method by fitting the FORMIND model to field data from an Ecuadorian tropical forest. Finally, we discuss how this approach differs from approximate Bayesian computation (ABC), another method commonly used to generate simulation-based likelihood approximations. 
Our results demonstrate that simulation-based inference, which offers considerable conceptual advantages over more traditional methods for inverse parameter estimation, can be successfully applied to process-based models of high complexity. The methodology is particularly suitable for heterogeneous and complex data structures and can easily be adjusted to other model types, including most stochastic population and individual-based models. Our study therefore provides a blueprint for a fairly general approach to parameter estimation of stochastic process-based models.
Technical Note: Approximate Bayesian parameterization of a complex tropical forest model

NASA Astrophysics Data System (ADS)

Hartig, F.; Dislich, C.; Wiegand, T.; Huth, A.

2013-08-01

Inverse parameter estimation of process-based models is a long-standing problem in ecology and evolution. A key problem of inverse parameter estimation is to define a metric that quantifies how well model predictions fit to the data. Such a metric can be expressed by general cost or objective functions, but statistical inversion approaches are based on a particular metric, the probability of observing the data given the model, known as the likelihood. Deriving likelihoods for dynamic models requires making assumptions about the probability for observations to deviate from mean model predictions. For technical reasons, these assumptions are usually derived without explicit consideration of the processes in the simulation. Only in recent years have new methods become available that allow generating likelihoods directly from stochastic simulations. Previous applications of these approximate Bayesian methods have concentrated on relatively simple models. Here, we report on the application of a simulation-based likelihood approximation for FORMIND, a parameter-rich individual-based model of tropical forest dynamics. We show that approximate Bayesian inference, based on a parametric likelihood approximation placed in a conventional MCMC, performs well in retrieving known parameter values from virtual field data generated by the forest model. We analyze the results of the parameter estimation, examine the sensitivity towards the choice and aggregation of model outputs and observed data (summary statistics), and show results from using this method to fit the FORMIND model to field data from an Ecuadorian tropical forest. Finally, we discuss differences of this approach to Approximate Bayesian Computing (ABC), another commonly used method to generate simulation-based likelihood approximations. 
Our results demonstrate that simulation-based inference, which offers considerable conceptual advantages over more traditional methods for inverse parameter estimation, can successfully be applied to process-based models of high complexity. The methodology is particularly suited to heterogeneous and complex data structures and can easily be adjusted to other model types, including most stochastic population and individual-based models. Our study therefore provides a blueprint for a fairly general approach to parameter estimation of stochastic process-based models in ecology and evolution.

A Predictive Model to Estimate Cost Savings of a Novel Diagnostic Blood Panel for Diagnosis of Diarrhea-predominant Irritable Bowel Syndrome.

PubMed

Pimentel, Mark; Purdy, Chris; Magar, Raf; Rezaie, Ali

2016-07-01

A high incidence of irritable bowel syndrome (IBS) is associated with significant medical costs. Diarrhea-predominant IBS (IBS-D) is diagnosed on the basis of clinical presentation and diagnostic test results and procedures that exclude other conditions. This study was conducted to estimate the potential cost savings of a novel IBS diagnostic blood panel that tests for the presence of antibodies to cytolethal distending toxin B and anti-vinculin associated with IBS-D. A cost-minimization (CM) decision tree model was used to compare the costs of a novel IBS diagnostic blood panel pathway versus an exclusionary diagnostic pathway (ie, standard of care). The probability that patients proceed to treatment was modeled as a function of sensitivity, specificity, and likelihood ratios of the individual biomarker tests. One-way sensitivity analyses were performed for key variables, and a break-even analysis was performed for the pretest probability of IBS-D. Budget impact analysis of the CM model was extrapolated to a health plan with 1 million covered lives. The CM model (base-case) predicted $509 cost savings for the novel IBS diagnostic blood panel versus the exclusionary diagnostic pathway because of the avoidance of downstream testing (eg, colonoscopy, computed tomography scans). Sensitivity analysis indicated that an increase in both positive likelihood ratios modestly increased cost savings. Break-even analysis estimated that the pretest probability of disease would be 0.451 to attain cost neutrality. The budget impact analysis predicted a cost savings of $3,634,006 ($0.30 per member per month). The novel IBS diagnostic blood panel may yield significant cost savings by allowing patients to proceed to treatment earlier, thereby avoiding unnecessary testing. Copyright © 2016 The Authors. Published by Elsevier Inc. All rights reserved.
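The abstract above models the probability of proceeding to treatment as a function of sensitivity, specificity, and likelihood ratios. A minimal sketch of the standard relationships is given here; the function names and the numerical inputs are hypothetical and are not taken from the study.

```python
def likelihood_ratios(sensitivity, specificity):
    """Positive and negative likelihood ratios of a diagnostic test."""
    lr_pos = sensitivity / (1.0 - specificity)
    lr_neg = (1.0 - sensitivity) / specificity
    return lr_pos, lr_neg

def posttest_probability(pretest_prob, lr):
    """Update a pretest probability with a likelihood ratio via odds."""
    pre_odds = pretest_prob / (1.0 - pretest_prob)
    post_odds = pre_odds * lr
    return post_odds / (1.0 + post_odds)

# Hypothetical test characteristics (illustrative only).
lr_pos, lr_neg = likelihood_ratios(sensitivity=0.40, specificity=0.90)
p_after_positive = posttest_probability(0.50, lr_pos)
```

Under these assumed inputs, a positive result raises a pretest probability of 0.50 to about 0.80, which shows why a higher positive likelihood ratio moves more patients toward treatment.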
Reconstruction of far-field tsunami amplitude distributions from earthquake sources

USGS Publications Warehouse

Geist, Eric L.; Parsons, Thomas E.

2016-01-01

The probability distribution of far-field tsunami amplitudes is explained in relation to the distribution of seismic moment at subduction zones. Tsunami amplitude distributions at tide gauge stations follow a similar functional form, well described by a tapered Pareto distribution that is parameterized by a power-law exponent and a corner amplitude. Distribution parameters are first established for eight tide gauge stations in the Pacific, using maximum likelihood estimation. A procedure is then developed to reconstruct the tsunami amplitude distribution that consists of four steps: (1) define the distribution of seismic moment at subduction zones; (2) establish a source-station scaling relation from regression analysis; (3) transform the seismic moment distribution to a tsunami amplitude distribution for each subduction zone; and (4) mix the transformed distribution for all subduction zones to an aggregate tsunami amplitude distribution specific to the tide gauge station. The tsunami amplitude distribution is adequately reconstructed for four tide gauge stations using globally constant seismic moment distribution parameters established in previous studies. In comparisons to empirical tsunami amplitude distributions from maximum likelihood estimation, the reconstructed distributions consistently exhibit higher corner amplitude values, implying that in most cases, the empirical catalogs are too short to include the largest amplitudes. Because the reconstructed distribution is based on a catalog of earthquakes that is much larger than the tsunami catalog, it is less susceptible to the effects of record-breaking events and more indicative of the actual distribution of tsunami amplitudes.
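To make the maximum likelihood step above concrete, the sketch below assumes the commonly used tapered Pareto survival function S(a) = (a_min/a)^beta * exp((a_min - a)/a_corner) for a >= a_min and maximizes its log-likelihood over a coarse grid. The parameterization, the function names, the grid, and the amplitudes are assumptions made for illustration; the paper's own estimation details are not given in the abstract.

```python
import math

def tapered_pareto_loglik(amplitudes, a_min, beta, a_corner):
    """Log-likelihood under S(a) = (a_min/a)**beta * exp((a_min - a)/a_corner)."""
    total = 0.0
    for a in amplitudes:
        # Density is f(a) = (beta/a + 1/a_corner) * S(a).
        total += math.log(beta / a + 1.0 / a_corner)
        total += beta * math.log(a_min / a) + (a_min - a) / a_corner
    return total

def grid_mle(amplitudes, a_min, betas, corners):
    """Return (beta, corner, loglik) maximizing the log-likelihood on a grid."""
    best = max(
        (tapered_pareto_loglik(amplitudes, a_min, b, c), b, c)
        for b in betas for c in corners
    )
    return best[1], best[2], best[0]

# Hypothetical amplitudes in metres, all above the threshold a_min = 0.1.
amps = [0.12, 0.15, 0.11, 0.30, 0.22, 0.18, 0.75, 0.13, 0.41, 0.16]
betas = [0.2 + 0.1 * i for i in range(20)]
corners = [0.25 * j for j in range(1, 21)]
beta_hat, corner_hat, ll_hat = grid_mle(amps, 0.1, betas, corners)
```

A grid search is used only to keep the sketch dependency-free; with a short catalog the corner parameter is typically poorly constrained, which is consistent with the abstract's remark that empirical catalogs are often too short to include the largest amplitudes.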
Interpretable inference on the mixed effect model with the Box-Cox transformation.

PubMed

Maruo, K; Yamaguchi, Y; Noma, H; Gosho, M

2017-07-10

We derived results for inference on parameters of the marginal model of the mixed effect model with the Box-Cox transformation based on the asymptotic theory approach. We also provided a robust variance estimator of the maximum likelihood estimator of the parameters of this model in consideration of the model misspecifications. Using these results, we developed an inference procedure for the difference of the model median between treatment groups at the specified occasion in the context of mixed effects models for repeated measures analysis for randomized clinical trials, which provided interpretable estimates of the treatment effect. From simulation studies, it was shown that our proposed method controlled type I error of the statistical test for the model median difference in almost all the situations and had moderate or high performance for power compared with the existing methods. We illustrated our method with cluster of differentiation 4 (CD4) data in an AIDS clinical trial, where the interpretability of the analysis results based on our proposed method is demonstrated. Copyright © 2017 John Wiley & Sons, Ltd.
Computational Aspects of N-Mixture Models

PubMed Central

Dennis, Emily B; Morgan, Byron JT; Ridout, Martin S

2015-01-01

The N-mixture model is widely used to estimate the abundance of a population in the presence of unknown detection probability from only a set of counts subject to spatial and temporal replication (Royle, 2004, Biometrics 60, 105–115). We explain and exploit the equivalence of N-mixture and multivariate Poisson and negative-binomial models, which provides powerful new approaches for fitting these models. We show that particularly when detection probability and the number of sampling occasions are small, infinite estimates of abundance can arise. We propose a sample covariance as a diagnostic for this event, and demonstrate its good performance in the Poisson case. Infinite estimates may be missed in practice, due to numerical optimization procedures terminating at arbitrarily large values. It is shown that the use of a bound, K, for an infinite summation in the N-mixture likelihood can result in underestimation of abundance, so that default values of K in computer packages should be avoided. Instead we propose a simple automatic way to choose K. The methods are illustrated by analysis of data on Hermann's tortoise Testudo hermanni. PMID:25314629
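The role of the bound K in the N-mixture likelihood can be sketched as follows, assuming the basic Poisson form of the model (site abundance N ~ Poisson(lam), each count y ~ Binomial(N, p)). The function names, the counts, and the parameter values are hypothetical; this is an illustration of the truncated sum, not the authors' code or their method for choosing K.

```python
import math

def poisson_pmf(n, lam):
    return math.exp(n * math.log(lam) - lam - math.lgamma(n + 1))

def binom_pmf(y, n, p):
    return math.comb(n, y) * p**y * (1.0 - p)**(n - y)

def site_likelihood(counts, lam, p, K):
    """Sum over latent abundance N from max(counts) up to the bound K."""
    total = 0.0
    for n in range(max(counts), K + 1):
        detect = 1.0
        for y in counts:
            detect *= binom_pmf(y, n, p)
        total += poisson_pmf(n, lam) * detect
    return total

def log_likelihood(data, lam, p, K):
    return sum(math.log(site_likelihood(c, lam, p, K)) for c in data)

# Hypothetical replicated counts at three sites (illustrative only).
data = [[2, 1, 3], [0, 1, 0], [4, 2, 3]]
ll_small_K = log_likelihood(data, lam=20.0, p=0.1, K=10)
ll_large_K = log_likelihood(data, lam=20.0, p=0.1, K=200)
```

When detection probability is low and abundance is high, as in these assumed values, a small K discards a large share of the sum, so the truncated likelihood differs markedly from the one computed with a generous bound. This is the mechanism by which a default K can distort abundance estimates.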
Estimating the probability for major gene Alzheimer disease

DOE Office of Scientific and Technical Information (OSTI.GOV)

Farrer, L.A.; Cupples, L.A.

1994-02-01

Alzheimer disease (AD) is a neuropsychiatric illness caused by multiple etiologies. Prediction of whether AD is genetically based in a given family is problematic because of censoring bias among unaffected relatives as a consequence of the late onset of the disorder, diagnostic uncertainties, heterogeneity, and limited information in a single family. The authors have developed a method based on Bayesian probability to compute values for a continuous variable that ranks AD families as having a major gene form of AD (MGAD). In addition, they have compared the Bayesian method with a maximum-likelihood approach. These methods incorporate sex- and age-adjusted risk estimates and allow for phenocopies and familial clustering of age at onset. Agreement is high between the two approaches for ranking families as MGAD (Spearman rank [r] = .92). When either method is used, the numerical outcomes are sensitive to assumptions of the gene frequency and cumulative incidence of the disease in the population. Consequently, risk estimates should be used cautiously for counseling purposes; however, there are numerous valid applications of these procedures in genetic and epidemiological studies. 41 refs., 4 figs., 3 tabs.
Psychometric Properties of IRT Proficiency Estimates

ERIC Educational Resources Information Center

Kolen, Michael J.; Tong, Ye

2010-01-01

Psychometric properties of item response theory proficiency estimates are considered in this paper. Proficiency estimators based on summed scores and pattern scores include non-Bayes maximum likelihood and test characteristic curve estimators and Bayesian estimators. The psychometric properties investigated include reliability, conditional…
Quantum-state reconstruction by maximizing likelihood and entropy.

PubMed

Teo, Yong Siah; Zhu, Huangjun; Englert, Berthold-Georg; Řeháček, Jaroslav; Hradil, Zdeněk

2011-07-08

Quantum-state reconstruction on a finite number of copies of a quantum system with informationally incomplete measurements, as a rule, does not yield a unique result. We derive a reconstruction scheme where both the likelihood and the von Neumann entropy functionals are maximized in order to systematically select the most-likely estimator with the largest entropy, that is, the least-bias estimator, consistent with a given set of measurement data. This is equivalent to the joint consideration of our partial knowledge and ignorance about the ensemble to reconstruct its identity. An interesting structure of such estimators will also be explored.
Likelihoods for fixed rank nomination networks

PubMed Central

HOFF, PETER; FOSDICK, BAILEY; VOLFOVSKY, ALEX; STOVEL, KATHERINE

2014-01-01

Many studies that gather social network data use survey methods that lead to censored, missing, or otherwise incomplete information. For example, the popular fixed rank nomination (FRN) scheme, often used in studies of schools and businesses, asks study participants to nominate and rank at most a small number of contacts or friends, leaving the existence of other relations uncertain. However, most statistical models are formulated in terms of completely observed binary networks. Statistical analyses of FRN data with such models ignore the censored and ranked nature of the data and could potentially result in misleading statistical inference. To investigate this possibility, we compare Bayesian parameter estimates obtained from a likelihood for complete binary networks with those obtained from likelihoods that are derived from the FRN scheme, and therefore accommodate the ranked and censored nature of the data. We show analytically and via simulation that the binary likelihood can provide misleading inference, particularly for certain model parameters that relate network ties to characteristics of individuals and pairs of individuals. We also compare these different likelihoods in a data analysis of several adolescent social networks. For some of these networks, the parameter estimates from the binary and FRN likelihoods lead to different conclusions, indicating the importance of analyzing FRN data with a method that accounts for the FRN survey design. PMID:25110586
Indices estimated using REML/BLUP and introduction of a super-trait for the selection of progenies in popcorn.

PubMed

Vittorazzi, C; Amaral Junior, A T; Guimarães, A G; Viana, A P; Silva, F H L; Pena, G F; Daher, R F; Gerhardt, I F S; Oliveira, G H F; Pereira, M G

2017-09-27

Selection indices commonly utilize economic weights, which become arbitrary genetic gains. In popcorn, this is even more evident due to the negative correlation between the main characteristics of economic importance: grain yield and popping expansion. As an alternative to the use of classical biometrics as a selection index, the optimal procedure restricted maximum likelihood/best linear unbiased predictor (REML/BLUP) allows the simultaneous estimation of genetic parameters and the prediction of genotypic values. Based on the mixed model methodology, the objective of this study was to investigate the comparative efficiency of eight selection indices estimated by REML/BLUP for the effective selection of superior popcorn families in the eighth intrapopulation recurrent selection cycle. We also investigated the efficiency of the inclusion of the variable "expanded popcorn volume per hectare" in the most advantageous selection of superior progenies. In total, 200 full-sib families were evaluated in two different areas in the North and Northwest regions of the State of Rio de Janeiro, Brazil. The REML/BLUP procedure resulted in higher estimated gains than those obtained with classical biometric selection index methodologies and should be incorporated into the selection of progenies. The following indices resulted in higher gains in the characteristics of greatest economic importance: the classical selection index/values attributed by trial, via REML/BLUP, and the greatest genotypic values/expanded popcorn volume per hectare, via REML. The expanded popcorn volume per hectare characteristic enabled satisfactory gains in grain yield and popping expansion; this characteristic should be considered a super-trait in popcorn breeding programs.
Locally Weighted Score Estimation for Quantile Classification in Binary Regression Models

PubMed Central

Rice, John D.; Taylor, Jeremy M. G.

2016-01-01

One common use of binary response regression methods is classification based on an arbitrary probability threshold dictated by the particular application. Since this is given to us a priori, it is sensible to incorporate the threshold into our estimation procedure. Specifically, for the linear logistic model, we solve a set of locally weighted score equations, using a kernel-like weight function centered at the threshold. The bandwidth for the weight function is selected by cross validation of a novel hybrid loss function that combines classification error and a continuous measure of divergence between observed and fitted values; other possible cross-validation functions based on more common binary classification metrics are also examined. This work has much in common with robust estimation, but differs from previous approaches in this area in its focus on prediction, specifically classification into high- and low-risk groups. Simulation results are given showing the reduction in error rates that can be obtained with this method when compared with maximum likelihood estimation, especially under certain forms of model misspecification. Analysis of a melanoma data set is presented to illustrate the use of the method in practice. PMID:28018492
Soil moisture assimilation using a modified ensemble transform Kalman filter with water balance constraint

NASA Astrophysics Data System (ADS)

Wu, Guocan; Zheng, Xiaogu; Dan, Bo

2016-04-01

Shallow soil moisture observations are assimilated into the Common Land Model (CoLM) to estimate the soil moisture in different layers. The forecast error is inflated to improve the accuracy of the analysis state, and a water balance constraint is adopted to reduce the water budget residual in the assimilation procedure. The experimental results illustrate that adaptive forecast error inflation can reduce the analysis error, while the proper inflation layer can be selected based on the -2log-likelihood function of the innovation statistic. The water balance constraint can substantially reduce the water budget residual, at a low cost in assimilation accuracy. The assimilation scheme can potentially be applied to assimilate remote sensing data.
Avoiding overstating the strength of forensic evidence: Shrunk likelihood ratios/Bayes factors.

PubMed

Morrison, Geoffrey Stewart; Poh, Norman

2018-05-01

When strength of forensic evidence is quantified using sample data and statistical models, a concern may be raised as to whether the output of a model overestimates the strength of evidence. This is particularly the case when the amount of sample data is small, and hence sampling variability is high. This concern is related to concern about precision. This paper describes, explores, and tests three procedures which shrink the value of the likelihood ratio or Bayes factor toward the neutral value of one. The procedures are: (1) a Bayesian procedure with uninformative priors, (2) use of empirical lower and upper bounds (ELUB), and (3) a novel form of regularized logistic regression. As a benchmark, they are compared with linear discriminant analysis, and in some instances with non-regularized logistic regression. The behaviours of the procedures are explored using Monte Carlo simulated data, and tested on real data from comparisons of voice recordings, face images, and glass fragments. Copyright © 2018 The Authors. Published by Elsevier B.V. All rights reserved.
Lateral stability and control derivatives of a jet fighter airplane extracted from flight test data by utilizing maximum likelihood estimation

NASA Technical Reports Server (NTRS)

Parrish, R. V.; Steinmetz, G. G.

1972-01-01

A method of parameter extraction for stability and control derivatives of aircraft from flight test data, implementing maximum likelihood estimation, has been developed and successfully applied to actual lateral flight test data from a modern sophisticated jet fighter. This application demonstrates the important role played by the analyst in combining engineering judgment and estimator statistics to yield meaningful results. During the analysis, the problems of uniqueness of the extracted set of parameters and of longitudinal coupling effects were encountered and resolved. The results for all flight runs are presented in tabular form and as time history comparisons between the estimated states and the actual flight test data.
Effect of sampling rate and record length on the determination of stability and control derivatives

NASA Technical Reports Server (NTRS)

Brenner, M. J.; Iliff, K. W.; Whitman, R. K.

1978-01-01

Flight data from five aircraft were used to assess the effects of sampling rate and record length reductions on estimates of stability and control derivatives produced by a maximum likelihood estimation method. Derivatives could be extracted from flight data with the maximum likelihood estimation method even if there were considerable reductions in sampling rate and/or record length. Small amplitude pulse maneuvers showed greater degradation of the derivative estimates than large amplitude pulse maneuvers when these reductions were made. Reducing the sampling rate was found to be more desirable than reducing the record length as a method of lessening the total computation time required without greatly degrading the quality of the estimates.
Characterization, parameter estimation, and aircraft response statistics of atmospheric turbulence

NASA Technical Reports Server (NTRS)

Mark, W. D.

1981-01-01

A nonGaussian three component model of atmospheric turbulence is postulated that accounts for readily observable features of turbulence velocity records, their autocorrelation functions, and their spectra. Methods for computing probability density functions and mean exceedance rates of a generic aircraft response variable are developed using nonGaussian turbulence characterizations readily extracted from velocity recordings. A maximum likelihood method is developed for optimal estimation of the integral scale and intensity of records possessing von Karman transverse or longitudinal spectra. Formulas for the variances of such parameter estimates are developed. The maximum likelihood and least-square approaches are combined to yield a method for estimating the autocorrelation function parameters of a two component model for turbulence.
Objectively combining AR5 instrumental period and paleoclimate climate sensitivity evidence

NASA Astrophysics Data System (ADS)

Lewis, Nicholas; Grünwald, Peter

2018-03-01

Combining instrumental period evidence regarding equilibrium climate sensitivity with largely independent paleoclimate proxy evidence should enable a more constrained sensitivity estimate to be obtained. Previous, subjective Bayesian approaches involved selection of a prior probability distribution reflecting the investigators' beliefs about climate sensitivity. Here a recently developed approach employing two different statistical methods—objective Bayesian and frequentist likelihood-ratio—is used to combine instrumental period and paleoclimate evidence based on data presented and assessments made in the IPCC Fifth Assessment Report. Probabilistic estimates from each source of evidence are represented by posterior probability density functions (PDFs) of physically-appropriate form that can be uniquely factored into a likelihood function and a noninformative prior distribution. The three-parameter form is shown accurately to fit a wide range of estimated climate sensitivity PDFs. The likelihood functions relating to the probabilistic estimates from the two sources are multiplicatively combined and a prior is derived that is noninformative for inference from the combined evidence. A posterior PDF that incorporates the evidence from both sources is produced using a single-step approach, which avoids the order-dependency that would arise if Bayesian updating were used. Results are compared with an alternative approach using the frequentist signed root likelihood ratio method. Results from these two methods are effectively identical, and provide a 5-95% range for climate sensitivity of 1.1-4.05 K (median 1.87 K).
Maximal likelihood correspondence estimation for face recognition across pose.

PubMed

Li, Shaoxin; Liu, Xin; Chai, Xiujuan; Zhang, Haihong; Lao, Shihong; Shan, Shiguang

2014-10-01

Due to the misalignment of image features, the performance of many conventional face recognition methods degrades considerably in cross-pose scenarios. To address this problem, many image matching-based methods have been proposed to estimate semantic correspondence between faces in different poses. In this paper, we aim to solve two critical problems in previous image matching-based correspondence learning methods: 1) failure to fully exploit face-specific structure information in correspondence estimation and 2) failure to learn personalized correspondence for each probe image. To this end, we first build a model, termed the morphable displacement field (MDF), to encode face-specific structure information of semantic correspondence from a set of real samples of correspondences calculated from 3D face models. Then, we propose a maximal likelihood correspondence estimation (MLCE) method to learn personalized correspondence based on a maximal likelihood frontal face assumption. After obtaining the semantic correspondence encoded in the learned displacement, we can synthesize virtual frontal images of the profile faces for subsequent recognition. Using the linear discriminant analysis method with pixel-intensity features, state-of-the-art performance is achieved on three multipose benchmarks, i.e., the CMU-PIE, FERET, and MultiPIE databases. Owing to the rational MDF regularization and the use of a novel maximal likelihood objective, the proposed MLCE method can reliably learn correspondence between faces in different poses even in complex wild environments, i.e., the Labeled Faces in the Wild database.
Speech perception at positive signal-to-noise ratios using adaptive adjustment of time compression.

PubMed

Schlueter, Anne; Brand, Thomas; Lemke, Ulrike; Nitzschner, Stefan; Kollmeier, Birger; Holube, Inga

2015-11-01

Positive signal-to-noise ratios (SNRs) characterize listening situations most relevant for hearing-impaired listeners in daily life and should therefore be considered when evaluating hearing aid algorithms. For this, a speech-in-noise test was developed and evaluated, in which the background noise is presented at fixed positive SNRs and the speech rate (i.e., the time compression of the speech material) is adaptively adjusted. In total, 29 younger and 12 older normal-hearing, as well as 24 older hearing-impaired listeners took part in repeated measurements. Younger normal-hearing and older hearing-impaired listeners conducted one of two adaptive methods which differed in adaptive procedure and step size. Analysis of the measurements with regard to list length and estimation strategy for thresholds resulted in a practical method measuring the time compression for 50% recognition. This method uses time-compression adjustment and step sizes according to Versfeld and Dreschler [(2002). J. Acoust. Soc. Am. 111, 401-408], with sentence scoring, lists of 30 sentences, and a maximum likelihood method for threshold estimation. Evaluation of the procedure showed that older participants obtained higher test-retest reliability compared to younger participants. Depending on the group of listeners, one or two lists are required for training prior to data collection.
Shape reconstruction of irregular bodies with multiple complementary data sources

NASA Astrophysics Data System (ADS)

Kaasalainen, M.; Viikinkoski, M.

2012-07-01

We discuss inversion methods for shape reconstruction with complementary data sources. The current main sources are photometry, adaptive optics or other images, occultation timings, and interferometry, and the procedure can readily be extended to include range-Doppler radar and thermal infrared data as well. We introduce the octantoid, a generally applicable shape support that can be automatically used for surface types encountered in planetary research, including strongly nonconvex or non-starlike shapes. We present models of Kleopatra and Hermione from multimodal data as examples of this approach. An important concept in this approach is the optimal weighting of the various data modes. We define the maximum compatibility estimate, a multimodal generalization of the maximum likelihood estimate, for this purpose. We also present a specific version of the procedure for asteroid flyby missions, with which one can reconstruct the complete shape of the target by using the flyby-based map of a part of the surface together with other available data. Finally, we show that the relative volume error of a shape solution is usually approximately equal to the relative shape error rather than its multiple. Our algorithms are trivially parallelizable, so running the code on a CUDA-enabled graphics processing unit is some two orders of magnitude faster than the usual single-processor mode.
Hip Implant Modified To Increase Probability Of Retention

NASA Technical Reports Server (NTRS)

Canabal, Francisco, III

1995-01-01

Modification in design of hip implant proposed to increase likelihood of retention of implant in femur after hip-repair surgery. Decreases likelihood of patient distress and expense associated with repetition of surgery after failed implant procedure. Intended to provide more favorable flow of cement used to bind implant in proximal extreme end of femur, reducing structural flaws causing early failure of implant/femur joint.

Restricted maximum likelihood estimation of genetic principal components and smoothed covariance matrices

PubMed Central

Meyer, Karin; Kirkpatrick, Mark

2005-01-01

Principal component analysis is a widely used 'dimension reduction' technique, albeit generally at a phenotypic level. It is shown that we can estimate genetic principal components directly through a simple reparameterisation of the usual linear, mixed model. This is applicable to any analysis fitting multiple, correlated genetic effects, whether effects for individual traits or sets of random regression coefficients to model trajectories. Depending on the magnitude of genetic correlation, a subset of the principal components generally suffices to capture the bulk of genetic variation. Corresponding estimates of genetic covariance matrices are more parsimonious, have reduced rank and are smoothed, with the number of parameters required to model the dispersion structure reduced from k(k + 1)/2 to m(2k - m + 1)/2 for k effects and m principal components. Estimation of these parameters, the largest eigenvalues and pertaining eigenvectors of the genetic covariance matrix, via restricted maximum likelihood using derivatives of the likelihood, is described. It is shown that reduced rank estimation can reduce computational requirements of multivariate analyses substantially. An application to the analysis of eight traits recorded via live ultrasound scanning of beef cattle is given. PMID:15588566
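The parameter counts quoted in the abstract above can be checked with a short sketch. The function names are hypothetical, and the choice of m = 3 components for the eight-trait example is an assumption made only for illustration; the abstract does not say how many components were fitted.

```python
def full_rank_params(k: int) -> int:
    """Parameters in an unstructured k x k covariance matrix: k(k + 1)/2."""
    return k * (k + 1) // 2


def reduced_rank_params(k: int, m: int) -> int:
    """Parameters when only the m leading principal components are fitted:
    m(2k - m + 1)/2, as stated in the abstract."""
    if not 1 <= m <= k:
        raise ValueError("m must satisfy 1 <= m <= k")
    return m * (2 * k - m + 1) // 2


if __name__ == "__main__":
    k = 8  # eight traits, as in the beef cattle application
    m = 3  # assumed number of components, for illustration only
    print(full_rank_params(k), reduced_rank_params(k, m))
```

With m equal to k the two counts coincide, which is a quick sanity check on the formula.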
A simulation study on Bayesian Ridge regression models for several collinearity levels

NASA Astrophysics Data System (ADS)

Efendi, Achmad; Effrihan

2017-12-01

When data are analyzed with a multiple regression model and collinearity is present, one or several predictor variables are usually omitted from the model. Sometimes, however, there are reasons, for instance medical or economic ones, why all the predictors are important and should be included in the model. Ridge regression is commonly used in such research to cope with collinearity. In this modeling approach, weights for the predictor variables are used in estimating the parameters. The estimation can then follow the concept of likelihood. Nowadays, a Bayesian version of the estimation is an alternative. This estimation method is less popular than the likelihood one because of some difficulties, such as computation. Nevertheless, with recent improvements in computational methodology, this caveat should no longer be a problem. This paper discusses a simulation process for evaluating the characteristics of Bayesian Ridge regression parameter estimates. There are several simulation settings based on a variety of collinearity levels and sample sizes. The results show that the Bayesian method performs better for relatively small sample sizes, and for other settings it performs similarly to the likelihood method.
A practical method to test the validity of the standard Gumbel distribution in logit-based multinomial choice models of travel behavior

DOE PAGES

Ye, Xin; Garikapati, Venu M.; You, Daehyun; ...

2017-11-08

Most multinomial choice models (e.g., the multinomial logit model) adopted in practice assume an extreme-value Gumbel distribution for the random components (error terms) of utility functions. This distributional assumption offers a closed-form likelihood expression when the utility maximization principle is applied to model choice behaviors. As a result, model coefficients can be easily estimated using the standard maximum likelihood estimation method. However, maximum likelihood estimators are consistent and efficient only if distributional assumptions on the random error terms are valid. It is therefore critical to test the validity of underlying distributional assumptions on the error terms that form the basis of parameter estimation and policy evaluation. In this paper, a practical yet statistically rigorous method is proposed to test the validity of the distributional assumption on the random components of utility functions in both the multinomial logit (MNL) model and multiple discrete-continuous extreme value (MDCEV) model. Based on a semi-nonparametric approach, a closed-form likelihood function that nests the MNL or MDCEV model being tested is derived. The proposed method allows traditional likelihood ratio tests to be used to test violations of the standard Gumbel distribution assumption. Simulation experiments are conducted to demonstrate that the proposed test yields acceptable Type-I and Type-II error probabilities at commonly available sample sizes. The test is then applied to three real-world discrete and discrete-continuous choice models. For all three models, the proposed test rejects the validity of the standard Gumbel distribution in most utility functions, calling for the development of robust choice models that overcome adverse effects of violations of distributional assumptions on the error terms in random utility functions.
Estimation of Gutenberg-Richter b-value using instrumental earthquake catalog from the southern Korean Peninsula

NASA Astrophysics Data System (ADS)

Lee, H.; Sheen, D.; Kim, S.

2013-12-01

The b-value in the Gutenberg-Richter relation is an important parameter widely used not only in the interpretation of regional tectonic structure but also in seismic hazard analysis. In this study, we tested four methods for estimating a stable b-value from a small number of events using the Monte Carlo method. One is the Least-Squares method (LSM), which minimizes the observation error. The others are based on the Maximum Likelihood method (MLM), which maximizes the likelihood function: Utsu's (1965) method for continuous magnitudes and an infinite maximum magnitude, Page's (1968) for continuous magnitudes and a finite maximum magnitude, and Weichert's (1980) for interval magnitudes and a finite maximum magnitude. A synthetic parent population of an earthquake catalog of one million events from magnitude 2.0 to 7.0 with an interval of 0.1 was generated for the Monte Carlo simulation. Samples, whose size was increased from 25 to 1000, were extracted randomly from the parent population. The resampling procedure was applied 1000 times with different random seed numbers. The mean and the standard deviation of the b-value were estimated for each group of samples of the same size. As expected, the more samples were used, the more stable the b-value obtained. However, for a small number of events, the LSM generally gave low b-values with a large standard deviation, while the MLMs gave more accurate and stable values. It was found that Utsu (1965) gives the most accurate and stable b-value even for a small number of events. It was also found that the selection of the minimum magnitude could be critical for estimating the correct b-value with Utsu's (1965) and Page's (1968) methods if magnitudes were binned into an interval. Therefore, we applied Utsu (1965) to estimate the b-value using two instrumental earthquake catalogs, which contain events that occurred around the southern part of the Korean Peninsula from 1978 to 2011. 
By a careful choice of the minimum magnitude, the b-values of the earthquake catalogs of the Korea Meteorological Administration and Kim (2012) are estimated to be 0.72 and 0.74, respectively.
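As a rough illustration of the maximum likelihood b-value estimate discussed in the abstract above, the following sketch uses the widely cited Aki-Utsu form, b = log10(e) / (mean(M) - (M_min - dM/2)), where dM is the magnitude bin width. This is a minimal sketch, not the authors' code: the function names, the synthetic catalog generator, and all parameter values are assumptions chosen for illustration.

```python
import math
import random


def b_value_utsu(magnitudes, m_min, bin_width=0.0):
    """Aki-Utsu maximum likelihood b-value for magnitudes >= m_min.

    bin_width is the magnitude binning interval (0 for continuous magnitudes).
    Returns (b, sigma), where sigma = b / sqrt(N) is Aki's first-order
    approximation of the standard error.
    """
    selected = [m for m in magnitudes if m >= m_min - 1e-9]
    if not selected:
        raise ValueError("no events at or above m_min")
    mean_mag = sum(selected) / len(selected)
    denom = mean_mag - (m_min - bin_width / 2.0)
    if denom <= 0:
        raise ValueError("mean magnitude must exceed the corrected m_min")
    b = math.log10(math.e) / denom
    return b, b / math.sqrt(len(selected))


def synthetic_catalog(n, b_true, m_min, bin_width, seed):
    """Hypothetical helper: exponential magnitudes (no upper bound), binned."""
    rng = random.Random(seed)
    beta = b_true * math.log(10.0)
    lower = m_min - bin_width / 2.0  # lower edge of the first magnitude bin
    return [
        round((lower + rng.expovariate(beta)) / bin_width) * bin_width
        for _ in range(n)
    ]


if __name__ == "__main__":
    catalog = synthetic_catalog(n=1000, b_true=1.0, m_min=2.0, bin_width=0.1, seed=1)
    print(b_value_utsu(catalog, m_min=2.0, bin_width=0.1))
```

The half-bin correction in the denominator is what makes the choice of minimum magnitude matter for binned catalogs, which is the sensitivity the abstract points out.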
Assessing human health response in life cycle assessment using ED10s and DALYs: part 1--Cancer effects.

PubMed

Crettaz, Pierre; Pennington, David; Rhomberg, Lorenz; Brand, Kevin; Jolliet, Olivier

2002-10-01

Life cycle assessment (LCA) is a framework for comparing products according to their total estimated environmental impact, summed over all chemical emissions and activities associated with a product at all stages in its life cycle (from raw material acquisition, manufacturing, use, to final disposal). For each chemical involved, the exposure associated with the mass released into the environment, integrated over time and space, is multiplied by a toxicological measure to estimate the likelihood of effects and their potential consequences. In this article, we explore the use of quantitative methods drawn from conventional single-chemical regulatory risk assessments to create a procedure for the estimation of the cancer effect measure in the impact phase of LCA. The approach is based on the maximum likelihood estimate of the effect dose inducing a 10% response over background, ED10, and default linear low-dose extrapolation using the slope betaED10 (0.1/ED10). The calculated effects may correspond to residual risks below current regulatory compliance requirements that occur over multiple generations and at multiple locations; but at the very least they represent a "using up" of some portion of the human population's ability to accommodate emissions. Preliminary comparisons are performed with existing measures, such as the U.S. Environmental Protection Agency's (U.S. EPA's) slope factor measure q1*. By analyzing bioassay data for 44 chemicals drawn from the EPA's Integrated Risk Information System (IRIS) database, we explore estimating ED10 from more readily available information such as the median tumor dose rate TD50 and the median single lethal dose LD50. Based on the TD50, we then estimate the ED10 for more than 600 chemicals. Differences in potential consequences, or severity, are addressed by combining betaED10 with the measure disability adjusted life years per affected person, DALYp. 
Most of the variation among chemicals for cancer effects is found to be due to differences in the slope factors (betaED10), ranging from 10^-4 up to 10^4 (risk of cancer/mg/kg-day).
Maximum likelihood method for estimating airplane stability and control parameters from flight data in frequency domain

NASA Technical Reports Server (NTRS)

Klein, V.

1980-01-01

A frequency domain maximum likelihood method is developed for the estimation of airplane stability and control parameters from measured data. The model of an airplane is represented by a discrete-type steady state Kalman filter with time variables replaced by their Fourier series expansions. The likelihood function of innovations is formulated, and by its maximization with respect to unknown parameters the estimation algorithm is obtained. This algorithm is then simplified to the output error estimation method with the data in the form of transformed time histories, frequency response curves, or spectral and cross-spectral densities. The development is followed by a discussion on the equivalence of the cost function in the time and frequency domains, and on advantages and disadvantages of the frequency domain approach. The algorithm developed is applied in four examples to the estimation of longitudinal parameters of a general aviation airplane using computer generated and measured data in turbulent and still air. The cost functions in the time and frequency domains are shown to be equivalent; therefore, both approaches are complementary and not contradictory. Despite some computational advantages of parameter estimation in the frequency domain, this approach is limited to linear equations of motion with constant coefficients.
21 CFR 1301.90 - Employee screening procedures.

Code of Federal Regulations, 2011 CFR

2011-04-01

... 21 Food and Drugs 9 2011-04-01 2011-04-01 false Employee screening procedures. 1301.90 Section 1301.90 Food and Drugs DRUG ENFORCEMENT ADMINISTRATION, DEPARTMENT OF JUSTICE REGISTRATION OF... by non-practitioners is vital to fairly assess the likelihood of an employee committing a drug...
Anderson v. University of Wisconsin: Handicap and Race Discrimination in Readmission Procedures.

ERIC Educational Resources Information Center

Smith, Elizabeth R.

1989-01-01

"Anderson v. University of Wisconsin" gives important guidance to universities by detailing the components of race and handicap discrimination claims, and illustrating how these claims can succeed. Readmission procedures that could reduce the likelihood of charges of discrimination are suggested. (Author/MLW)
Exponential series approaches for nonparametric graphical models

NASA Astrophysics Data System (ADS)

Janofsky, Eric

Markov Random Fields (MRFs) or undirected graphical models are parsimonious representations of joint probability distributions. This thesis studies high-dimensional, continuous-valued pairwise Markov Random Fields. We are particularly interested in approximating pairwise densities whose logarithm belongs to a Sobolev space. For this problem we propose the method of exponential series which approximates the log density by a finite-dimensional exponential family with the number of sufficient statistics increasing with the sample size. We consider two approaches to estimating these models. The first is regularized maximum likelihood. This involves optimizing the sum of the log-likelihood of the data and a sparsity-inducing regularizer. We then propose a variational approximation to the likelihood based on tree-reweighted, nonparametric message passing. This approximation allows for upper bounds on risk estimates, leverages parallelization and is scalable to densities on hundreds of nodes. We show how the regularized variational MLE may be estimated using a proximal gradient algorithm. We then consider estimation using regularized score matching. This approach uses an alternative scoring rule to the log-likelihood, which obviates the need to compute the normalizing constant of the distribution. For general continuous-valued exponential families, we provide parameter and edge consistency results. As a special case we detail a new approach to sparse precision matrix estimation which has statistical performance competitive with the graphical lasso and computational performance competitive with the state-of-the-art glasso algorithm. We then describe results for model selection in the nonparametric pairwise model using exponential series. The regularized score matching problem is shown to be a convex program; we provide scalable algorithms based on consensus alternating direction method of multipliers (ADMM) and coordinate-wise descent. 
We use simulations to compare our method to others in the literature as well as the aforementioned TRW estimator.
Challenges in Species Tree Estimation Under the Multispecies Coalescent Model

PubMed Central

Xu, Bo; Yang, Ziheng

2016-01-01

The multispecies coalescent (MSC) model has emerged as a powerful framework for inferring species phylogenies while accounting for ancestral polymorphism and gene tree-species tree conflict. A number of methods have been developed in the past few years to estimate the species tree under the MSC. The full likelihood methods (including maximum likelihood and Bayesian inference) average over the unknown gene trees and accommodate their uncertainties properly but involve intensive computation. The approximate or summary coalescent methods are computationally fast and are applicable to genomic datasets with thousands of loci, but do not make an efficient use of information in the multilocus data. Most of them take the two-step approach of reconstructing the gene trees for multiple loci by phylogenetic methods and then treating the estimated gene trees as observed data, without accounting for their uncertainties appropriately. In this article we review the statistical nature of the species tree estimation problem under the MSC, and explore the conceptual issues and challenges of species tree estimation by focusing mainly on simple cases of three or four closely related species. We use mathematical analysis and computer simulation to demonstrate that large differences in statistical performance may exist between the two classes of methods. We illustrate that several counterintuitive behaviors may occur with the summary methods but they are due to inefficient use of information in the data by summary methods and vanish when the data are analyzed using full-likelihood methods. These include (i) unidentifiability of parameters in the model, (ii) inconsistency in the so-called anomaly zone, (iii) singularity on the likelihood surface, and (iv) deterioration of performance upon addition of more data. 
We discuss the challenges and strategies of species tree inference for distantly related species when the molecular clock is violated, and highlight the need for improving the computational efficiency and model realism of the likelihood methods as well as the statistical efficiency of the summary methods. PMID:27927902
Progress towards a National Cardiac Procedure Database--development of the Australasian Society of Cardiac and Thoracic Surgeons (ASCTS) and Melbourne Interventional Group (MIG) registries.

PubMed

Chan, William; Clark, David J; Ajani, Andrew E; Yap, Cheng-Hon; Andrianopoulos, Nick; Brennan, Angela L; Dinh, Diem T; Shardey, Gilbert C; Smith, Julian A; Reid, Christopher M; Duffy, Stephen J

2011-01-01

Since the call for a National Cardiac Procedures Database in 2001, much work has been accomplished in both cardiac surgery and interventional cardiology in an attempt to establish a unified, systematic approach to data collection, defining a common minimum dataset pertinent to the Australian context, and instituting quality control measures to ensure integrity and privacy of data. In this paper we outline the aims of the Australasian Society of Cardiac and Thoracic Surgeons (ASCTS) and the Melbourne Interventional Group (MIG) registries, and propose a comprehensive set of standardised data elements and their definitions to facilitate transparency in data collection, consistency between these and other data sets, and encourage ongoing peer-review. The aims are to improve outcomes for patients by determining key performance indicators and standards of performance for hospital units, to allow estimation of procedural risks and likelihood of outcomes for patients, and to report outcomes to relevant stake-holders and the public. Copyright © 2010 Australasian Society of Cardiac and Thoracic Surgeons and the Cardiac Society of Australia and New Zealand. Published by Elsevier B.V. All rights reserved.
ERS-1 scatterometer calibration and validation activities at ECMWF. B: From radar backscatter characteristics to wind vector solutions

NASA Technical Reports Server (NTRS)

Stoffelen, AD; Anderson, David L. T.; Woiceshyn, Peter M.

1992-01-01

Calibration and validation activities for the ERS-1 scatterometer were carried out at ECMWF (European Centre for Medium-Range Weather Forecasts), complementary to the 'Haltenbanken' field campaign off the coast of Norway. At a Numerical Weather Prediction (NWP) center a wealth of verifying data is available in both time and space. These data are used to redefine the wind retrieval procedure given the instrumental characteristics. It was found that a maximum likelihood estimation procedure to obtain the coefficients of a reformulated σ⁰-to-wind relationship should use radar measurements in logarithmic rather than physical space, and should represent winds as wind components rather than as wind speed and direction. In this way, a much more accurate transfer function than the one currently operated by ESA was derived. The σ⁰ measurement space shows no signature of a separation into an upwind solution cone and a downwind solution cone. As such a signature was anticipated in ESA's wind direction ambiguity removal algorithm, reconsideration of the procedure is necessary. Although revisions have to be made in the process of wind retrieval, a great potential is shown for scatterometry in meteorology and climatology.
Estimation of rates-across-sites distributions in phylogenetic substitution models.

PubMed

Susko, Edward; Field, Chris; Blouin, Christian; Roger, Andrew J

2003-10-01

Previous work has shown that it is often essential to account for the variation in rates at different sites in phylogenetic models in order to avoid phylogenetic artifacts such as long branch attraction. In most current models, the gamma distribution is used for the rates-across-sites distributions and is implemented as an equal-probability discrete gamma. In this article, we introduce discrete distribution estimates with large numbers of equally spaced rate categories allowing us to investigate the appropriateness of the gamma model. With large numbers of rate categories, these discrete estimates are flexible enough to approximate the shape of almost any distribution. Likelihood ratio statistical tests and a nonparametric bootstrap confidence-bound estimation procedure based on the discrete estimates are presented that can be used to test the fit of a parametric family. We applied the methodology to several different protein data sets, and found that although the gamma model often provides a good parametric model for this type of data, rate estimates from an equal-probability discrete gamma model with a small number of categories will tend to underestimate the largest rates. In cases when the gamma model assumption is in doubt, rate estimates coming from the discrete rate distribution estimate with a large number of rate categories provide a robust alternative to gamma estimates. An alternative implementation of the gamma distribution is proposed that, for equal numbers of rate categories, is computationally more efficient during optimization than the standard gamma implementation and can provide more accurate estimates of site rates.
COSMIC MICROWAVE BACKGROUND LIKELIHOOD APPROXIMATION FOR BANDED PROBABILITY DISTRIBUTIONS

DOE Office of Scientific and Technical Information (OSTI.GOV)

Gjerløw, E.; Mikkelsen, K.; Eriksen, H. K.

We investigate sets of random variables that can be arranged sequentially such that a given variable only depends conditionally on its immediate predecessor. For such sets, we show that the full joint probability distribution may be expressed exclusively in terms of uni- and bivariate marginals. Under the assumption that the cosmic microwave background (CMB) power spectrum likelihood only exhibits correlations within a banded multipole range, Δl_C, we apply this expression to two outstanding problems in CMB likelihood analysis. First, we derive a statistically well-defined hybrid likelihood estimator, merging two independent (e.g., low- and high-l) likelihoods into a single expression that properly accounts for correlations between the two. Applying this expression to the Wilkinson Microwave Anisotropy Probe (WMAP) likelihood, we verify that the effect of correlations in the transition region on cosmological parameters is negligible for WMAP; the largest relative shift seen for any parameter is 0.06σ. However, because this may not hold for other experimental setups (e.g., for different instrumental noise properties or analysis masks), but must rather be verified on a case-by-case basis, we recommend our new hybridization scheme for future experiments for statistical self-consistency reasons. Second, we use the same expression to improve the convergence rate of the Blackwell-Rao likelihood estimator, reducing the required number of Monte Carlo samples by several orders of magnitude, and thereby extend it to high-l applications.
Development of the Average Likelihood Function for Code Division Multiple Access (CDMA) Using BPSK and QPSK Symbols

DTIC Science & Technology

2015-01-01

This research aims to establish a foundation for new classification and estimation of CDMA signals. Keywords: DS/CDMA signals, BPSK, QPSK... Dates covered: OCT 2013 – OCT 2014.
Estimation of brood and nest survival: Comparative methods in the presence of heterogeneity

USGS Publications Warehouse

Manly, Bryan F.J.; Schmutz, Joel A.

2001-01-01

The Mayfield method has been widely used for estimating survival of nests and young animals, especially when data are collected at irregular observation intervals. However, this method assumes survival is constant throughout the study period, which often ignores biologically relevant variation and may lead to biased survival estimates. We examined the bias and accuracy of 1 modification to the Mayfield method that allows for temporal variation in survival, and we developed and similarly tested 2 additional methods. One of these 2 new methods is simply an iterative extension of Klett and Johnson's method, which we refer to as the Iterative Mayfield method and which bears similarity to Kaplan-Meier methods. The other method uses maximum likelihood techniques for estimation and is best applied to survival of animals in groups or families, rather than as independent individuals. We also examined how robust these estimators are to heterogeneity in the data, which can arise from such sources as dependent survival probabilities among siblings, inherent differences among families, and adoption. Testing of estimator performance with respect to bias, accuracy, and heterogeneity was done using simulations that mimicked a study of survival of emperor goose (Chen canagica) goslings. Assuming constant survival for inappropriately long periods of time or use of Klett and Johnson's methods resulted in large bias or poor accuracy (often >5% bias or root mean square error) compared to our Iterative Mayfield or maximum likelihood methods. Overall, estimator performance was slightly better with our Iterative Mayfield than our maximum likelihood method, but the maximum likelihood method provides a more rigorous framework for testing covariates and explicitly models a heterogeneity factor. We demonstrated use of all estimators with data from emperor goose goslings. 
We advocate that future studies use the new methods outlined here rather than the traditional Mayfield method or its previous modifications.
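For context, the traditional Mayfield estimator that the abstract above takes as its baseline can be sketched as follows. This is the classic constant-survival version only, not the Iterative Mayfield or maximum likelihood methods proposed by the authors. It assumes the usual convention that a failure occurs at the midpoint of its observation interval; the function names and the example data are hypothetical.

```python
def mayfield_daily_survival(intervals):
    """Classic Mayfield daily survival rate.

    intervals: iterable of (days, survived) pairs, one per observation
    interval. Intervals that end in failure contribute half their length
    to exposure (midpoint assumption).
    """
    exposure_days = 0.0
    failures = 0
    for days, survived in intervals:
        if survived:
            exposure_days += days
        else:
            exposure_days += days / 2.0
            failures += 1
    if exposure_days <= 0:
        raise ValueError("no exposure days recorded")
    return 1.0 - failures / exposure_days


def period_survival(daily_rate, period_days):
    """Survival over a whole period, assuming a constant daily rate."""
    return daily_rate ** period_days


if __name__ == "__main__":
    # Hypothetical observation intervals: (length in days, survived?)
    data = [(4, True), (4, True), (6, False), (5, True)]
    dsr = mayfield_daily_survival(data)
    print(dsr, period_survival(dsr, 30))
```

The exponentiation in `period_survival` is where the constant-survival assumption enters, which is the assumption the abstract identifies as a source of bias.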
A state space approach for piecewise-linear recurrent neural networks for identifying computational dynamics from neural measurements.

PubMed

Durstewitz, Daniel

2017-06-01

The computational and cognitive properties of neural systems are often thought to be implemented in terms of their (stochastic) network dynamics. Hence, recovering the system dynamics from experimentally observed neuronal time series, like multiple single-unit recordings or neuroimaging data, is an important step toward understanding its computations. Ideally, one would not only seek a (lower-dimensional) state space representation of the dynamics, but would wish to have access to its statistical properties and their generative equations for in-depth analysis. Recurrent neural networks (RNNs) are a computationally powerful and dynamically universal formal framework which has been extensively studied from both the computational and the dynamical systems perspective. Here we develop a semi-analytical maximum-likelihood estimation scheme for piecewise-linear RNNs (PLRNNs) within the statistical framework of state space models, which accounts for noise in both the underlying latent dynamics and the observation process. The Expectation-Maximization algorithm is used to infer the latent state distribution, through a global Laplace approximation, and the PLRNN parameters iteratively. After validating the procedure on toy examples, and using inference through particle filters for comparison, the approach is applied to multiple single-unit recordings from the rodent anterior cingulate cortex (ACC) obtained during performance of a classical working memory task, delayed alternation. Models estimated from kernel-smoothed spike time data were able to capture the essential computational dynamics underlying task performance, including stimulus-selective delay activity. The estimated models were rarely multi-stable, however, but rather were tuned to exhibit slow dynamics in the vicinity of a bifurcation point. 
In summary, the present work advances a semi-analytical (thus reasonably fast) maximum-likelihood estimation framework for PLRNNs that may enable the recovery of relevant aspects of the nonlinear dynamics underlying observed neuronal time series, and directly link these to computational properties.
Marginal and Random Intercepts Models for Longitudinal Binary Data With Examples From Criminology.

PubMed

Long, Jeffrey D; Loeber, Rolf; Farrington, David P

2009-01-01

Two models for the analysis of longitudinal binary data are discussed: the marginal model and the random intercepts model. In contrast to the linear mixed model (LMM), the two models for binary data are not subsumed under a single hierarchical model. The marginal model provides group-level information whereas the random intercepts model provides individual-level information including information about heterogeneity of growth. It is shown how a type of numerical averaging can be used with the random intercepts model to obtain group-level information, thus approximating individual and marginal aspects of the LMM. The types of inferences associated with each model are illustrated with longitudinal criminal offending data based on N = 506 males followed over a 22-year period. Violent offending indexed by official records and self-report was analyzed, with the marginal model estimated using generalized estimating equations and the random intercepts model estimated using maximum likelihood. The results show that the numerical averaging based on the random intercepts can produce prediction curves almost identical to those obtained directly from the marginal model parameter estimates. The results provide a basis for contrasting the models and the estimation procedures and key features are discussed to aid in selecting a method for empirical analysis.
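One common way to carry out the kind of numerical averaging mentioned in the abstract above is to integrate the individual-level logistic probability over a normal random-intercept distribution. The sketch below assumes a logistic link and a N(0, sigma^2) intercept and uses a simple midpoint rule; the function names, grid settings, and example values are illustrative assumptions, not the authors' procedure.

```python
import math


def expit(x):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-x))


def marginal_probability(eta, sigma, n_points=2001, width=8.0):
    """Average expit(eta + b) over b ~ N(0, sigma^2) by midpoint quadrature.

    eta is the fixed-effects linear predictor for one covariate pattern;
    sigma is the standard deviation of the random intercept.
    """
    if sigma <= 0:
        return expit(eta)
    step = 2.0 * width * sigma / n_points
    total = 0.0
    weight_sum = 0.0
    for i in range(n_points):
        b = -width * sigma + (i + 0.5) * step
        w = math.exp(-0.5 * (b / sigma) ** 2)  # unnormalised normal density
        total += w * expit(eta + b)
        weight_sum += w
    return total / weight_sum


if __name__ == "__main__":
    # Hypothetical values: individual-level vs. population-averaged probability
    print(expit(2.0), marginal_probability(2.0, sigma=2.0))
```

The averaged probability is pulled toward 0.5 relative to the individual-level one, which is why the coefficients of the two models are not directly comparable.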
The Benefits of Maximum Likelihood Estimators in Predicting Bulk Permeability and Upscaling Fracture Networks

NASA Astrophysics Data System (ADS)

Emanuele Rizzo, Roberto; Healy, David; De Siena, Luca

2016-04-01

The success of any predictive model is largely dependent on the accuracy with which its parameters are known. When characterising fracture networks in fractured rock, one of the main issues is accurately scaling the parameters governing the distribution of fracture attributes. Optimal characterisation and analysis of fracture attributes (lengths, apertures, orientations and densities) is fundamental to the estimation of permeability and fluid flow, which are of primary importance in a number of contexts including: hydrocarbon production from fractured reservoirs; geothermal energy extraction; and deeper Earth systems, such as earthquakes and ocean floor hydrothermal venting. Our work links outcrop fracture data to modelled fracture networks in order to numerically predict bulk permeability. We collected outcrop data from a highly fractured upper Miocene biosiliceous mudstone formation, cropping out along the coastline north of Santa Cruz (California, USA). Using outcrop fracture networks as analogues for subsurface fracture systems has several advantages, because key fracture attributes such as spatial arrangements and lengths can be effectively measured only on outcrops [1]. However, a limitation when dealing with outcrop data is the relative sparseness of natural data due to the intrinsic finite size of the outcrops. We make use of a statistical approach for the overall workflow, starting from data collection with the Circular Windows Method [2]. Then we analyse the data statistically using Maximum Likelihood Estimators, which provide greater accuracy compared to the more commonly used Least Squares linear regression when investigating distribution of fracture attributes. Finally, we estimate the bulk permeability of the fractured rock mass using Oda's tensorial approach [3]. 
The higher quality of this statistical analysis is fundamental: better statistics of the fracture attributes means more accurate permeability estimation, since the fracture attributes feed directly into the permeability calculations. The application of Maximum Likelihood Estimators can have important consequences, especially when we aim to predict the tendency of fracture attributes towards smaller and larger scales than those observed, in order to build consistent, useable models from outcrop observations. The procedures presented here aim to understand whether the average permeability of a fracture network can be predicted with reduced uncertainty, and whether outcrop measurements of fracture attributes can be used directly to generate statistically identical fracture network models, which can then be easily up-scaled into larger areas or volumes. [1] Gale et al. "Natural fractures in shale: A review and new observations", AAPG Bulletin 98.11 (2014). [2] Mauldon et al. "Circular scanlines and circular windows: new tools for characterizing the geometry of fracture traces", Journal of Structural Geology, 23 (2001). [3] Oda "Permeability tensor for discontinuous rock masses", Geotechnique 35.4 (1985).
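As a rough illustration of fitting a fracture attribute distribution by maximum likelihood, the sketch below assumes, purely for the sake of example, that fracture lengths follow a continuous power law above a known lower cutoff. The abstract does not state which distributions were fitted, so the distribution, the function name, and the synthetic data are all assumptions.

```python
import numpy as np


def power_law_mle(lengths, x_min):
    """MLE of alpha for a density proportional to x**(-alpha) on x >= x_min."""
    x = np.asarray(lengths, dtype=float)
    x = x[x >= x_min]                          # keep only values above the cutoff
    alpha = 1.0 + x.size / np.sum(np.log(x / x_min))
    std_err = (alpha - 1.0) / np.sqrt(x.size)  # asymptotic standard error
    return alpha, std_err


# Synthetic "fracture lengths" drawn by inverse-transform sampling (not field data).
rng = np.random.default_rng(0)
true_alpha, x_min = 2.5, 0.1
u = rng.random(5000)
lengths = x_min * (1.0 - u) ** (-1.0 / (true_alpha - 1.0))
alpha_hat, alpha_se = power_law_mle(lengths, x_min)
```

Unlike a least-squares line fitted to a log-log histogram, this estimator uses every observation directly and needs no binning choice.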

Estimating temporary emigration and breeding proportions using capture-recapture data with Pollock's robust design

USGS Publications Warehouse

Kendall, W.L.; Nichols, J.D.; Hines, J.E.

1997-01-01

Statistical inference for capture-recapture studies of open animal populations typically relies on the assumption that all emigration from the studied population is permanent. However, there are many instances in which this assumption is unlikely to be met. We define two general models for the process of temporary emigration, completely random and Markovian. We then consider effects of these two types of temporary emigration on Jolly-Seber (Seber 1982) estimators and on estimators arising from the full-likelihood approach of Kendall et al. (1995) to robust design data. Capture-recapture data arising from Pollock's (1982) robust design provide the basis for obtaining unbiased estimates of demographic parameters in the presence of temporary emigration and for estimating the probability of temporary emigration. We present a likelihood-based approach to dealing with temporary emigration that permits estimation under different models of temporary emigration and yields tests for completely random and Markovian emigration. In addition, we use the relationship between capture probability estimates based on closed and open models under completely random temporary emigration to derive three ad hoc estimators for the probability of temporary emigration, two of which should be especially useful in situations where capture probabilities are heterogeneous among individual animals. Ad hoc and full-likelihood estimators are illustrated for small mammal capture-recapture data sets. We believe that these models and estimators will be useful for testing hypotheses about the process of temporary emigration, for estimating demographic parameters in the presence of temporary emigration, and for estimating probabilities of temporary emigration. These latter estimates are frequently of ecological interest as indicators of animal movement and, in some sampling situations, as direct estimates of breeding probabilities and proportions.
Modified Maximum Likelihood Estimation Method for Completely Separated and Quasi-Completely Separated Data for a Dose-Response Model

DTIC Science & Technology

2015-08-01

McCullagh, P.; Nelder, J.A. Generalized Linear Model , 2nd ed.; Chapman and Hall: London, 1989. 7. Johnston, J. Econometric Methods, 3rd ed.; McGraw...FOR A DOSE-RESPONSE MODEL ECBC-TN-068 Kyong H. Park Steven J. Lagan RESEARCH AND TECHNOLOGY DIRECTORATE August 2015 Approved for public release...Likelihood Estimation Method for Completely Separated and Quasi-Completely Separated Data for a Dose-Response Model 5a. CONTRACT NUMBER 5b. GRANT
Zero-inflated Poisson model based likelihood ratio test for drug safety signal detection.

PubMed

Huang, Lan; Zheng, Dan; Zalkikar, Jyoti; Tiwari, Ram

2017-02-01

In recent decades, numerous methods have been developed for data mining of large drug safety databases, such as the Food and Drug Administration's (FDA's) Adverse Event Reporting System, where data matrices are formed with drugs as columns and adverse events as rows. Often, a large number of cells in these data matrices have zero counts. Some of them are "true zeros", indicating that the drug-adverse event pairs cannot occur; these are distinguished from the modeled zero counts, which simply indicate that the drug-adverse event pairs have not occurred yet or have not been reported yet. In this paper, a zero-inflated Poisson model based likelihood ratio test method is proposed to identify drug-adverse event pairs that have disproportionately high reporting rates, which are also called signals. The maximum likelihood estimates of the model parameters of the zero-inflated Poisson model based likelihood ratio test are obtained using the expectation-maximization algorithm. The zero-inflated Poisson model based likelihood ratio test is also modified to handle stratified analyses for binary and categorical covariates (e.g. gender and age) in the data. The proposed zero-inflated Poisson model based likelihood ratio test method is shown to asymptotically control the type I error and false discovery rate, and its finite sample performance for signal detection is evaluated through a simulation study. The simulation results show that the zero-inflated Poisson model based likelihood ratio test method performs similarly to the Poisson model based likelihood ratio test method when the estimated percentage of true zeros in the database is small. Both the zero-inflated Poisson model based and the Poisson model based likelihood ratio test methods are applied to six selected drugs, from the 2006 to 2011 Adverse Event Reporting System database, with varying percentages of observed zero-count cells.
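The role of the expectation-maximization algorithm here can be sketched for the simplest case: a single sample of counts from a zero-inflated Poisson distribution with one mixing proportion and one rate. This is a simplification and not the authors' model, which involves drug-by-event cells with their own expected counts. The function name and the simulated data are hypothetical.

```python
import numpy as np


def fit_zip_em(counts, n_iter=200):
    """EM estimates of (pi, lam) for a zero-inflated Poisson sample."""
    y = np.asarray(counts, dtype=float)
    pi, lam = 0.5, max(y.mean(), 1e-8)  # crude starting values
    for _ in range(n_iter):
        # E-step: probability that an observed zero is a structural ("true") zero.
        p_structural = pi / (pi + (1.0 - pi) * np.exp(-lam))
        z = np.where(y == 0, p_structural, 0.0)
        # M-step: update the mixing proportion and the Poisson rate.
        pi = z.mean()
        lam = y.sum() / (1.0 - z).sum()
    return pi, lam


# Simulated counts: 30% structural zeros, Poisson rate 2 otherwise (not real data).
rng = np.random.default_rng(1)
n = 20000
is_true_zero = rng.random(n) < 0.3
counts = np.where(is_true_zero, 0, rng.poisson(2.0, size=n))
pi_hat, lam_hat = fit_zip_em(counts)
```

The E-step only touches the zero cells, since a positive count can never be a structural zero.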
Has Metal-On-Metal Resurfacing Been a Cost-Effective Intervention for Health Care Providers?-A Registry Based Study.

PubMed

Pulikottil-Jacob, Ruth; Connock, Martin; Kandala, Ngianga-Bakwin; Mistry, Hema; Grove, Amy; Freeman, Karoline; Costa, Matthew; Sutcliffe, Paul; Clarke, Aileen

2016-01-01

Total hip replacement for end stage arthritis of the hip is currently the most common elective surgical procedure. In 2007 about 7.5% of UK implants were metal-on-metal joint resurfacing (MoM RS) procedures. Due to poor revision performance and concerns about metal debris, the use of RS had declined by 2012 to about a 1% share of UK hip procedures. This study estimated the lifetime cost-effectiveness of metal-on-metal resurfacing (RS) procedures versus commonly employed total hip replacement (THR) methods. We performed a cost-utility analysis using a well-established multi-state semi-Markov model from an NHS and personal and social services perspective. We used individual patient data (IPD) from the National Joint Registry (NJR) for England and Wales on RS and THR surgery for osteoarthritis recorded from April 2003 to December 2012. We used flexible parametric modelling of NJR RS data to guide identification of patient subgroups and RS devices which delivered revision rates within the NICE 5% revision rate benchmark at 10 years. RS procedures overall have an estimated revision rate of 13% at 10 years, compared to <4% for most THR devices. New NICE guidance now recommends a revision rate benchmark of <5% at 10 years. 60% of RS implants in men and 2% in women were predicted to be within the revision benchmark. RS devices satisfying the 5% benchmark were unlikely to be cost-effective compared to THR at a standard UK willingness to pay of £20,000 per quality-adjusted life-year. However, the probability of cost effectiveness was sensitive to small changes in the costs of devices or in quality of life or revision rate estimates. Our results imply that in most cases RS has not been a cost-effective resource and should probably not be adopted by decision makers concerned with the cost effectiveness of hip replacement, or by patients concerned about the likelihood of revision, regardless of patient age or gender.
A Game Theoretical Approach to Hacktivism: Is Attack Likelihood a Product of Risks and Payoffs?

PubMed

Bodford, Jessica E; Kwan, Virginia S Y

2018-02-01

The current study examines hacktivism (i.e., hacking to convey a moral, ethical, or social justice message) through a general game theoretic framework-that is, as a product of costs and benefits. Given the inherent risk of carrying out a hacktivist attack (e.g., legal action, imprisonment), it would be rational for the user to weigh these risks against perceived benefits of carrying out the attack. As such, we examined computer science students' estimations of risks, payoffs, and attack likelihood through a game theoretic design. Furthermore, this study aims at constructing a descriptive profile of potential hacktivists, exploring two predicted covariates of attack decision making, namely, peer prevalence of hacking and sex differences. Contrary to expectations, results suggest that participants' estimations of attack likelihood stemmed solely from expected payoffs, rather than subjective risks. Peer prevalence significantly predicted increased payoffs and attack likelihood, suggesting an underlying descriptive norm in social networks. Notably, we observed no sex differences in the decision to attack, nor in the factors predicting attack likelihood. Implications for policymakers and the understanding and prevention of hacktivism are discussed, as are the possible ramifications of widely communicated payoffs over potential risks in hacking communities.
MLESAC Based Localization of Needle Insertion Using 2D Ultrasound Images

NASA Astrophysics Data System (ADS)

Xu, Fei; Gao, Dedong; Wang, Shan; Zhanwen, A.

2018-04-01

In 2D ultrasound images of ultrasound-guided percutaneous needle insertions, it is difficult to determine the positions of the needle axis and tip because of artifacts and other noise. In this work the speckle is regarded as the noise of an ultrasound image, and a novel algorithm is presented to detect the needle in a 2D ultrasound image. Firstly, the wavelet soft thresholding technique based on the BayesShrink rule is used to remove speckle from the ultrasound image. Secondly, Otsu's thresholding method and morphological operations are applied to pre-process the ultrasound image. Finally, the needle is localized in the 2D ultrasound image using the maximum likelihood estimation sample consensus (MLESAC) algorithm. The experimental results show that the proposed algorithm is effective for estimating the position of the needle axis and tip in ultrasound images. The work may prove useful for path planning and robot-assisted needle insertion procedures.
Improving Anesthesia Nurse Compliance with Universal Precautions Using Group Goals and Public Feedback

ERIC Educational Resources Information Center

Stephens, Sara D.; Ludwig, Timothy D.

2005-01-01

Universal Precautions (UPs), procedures to reduce the likelihood of accidental exposure to blood-borne pathogens, were observed among seven Certified Nurse Anesthetists and one anesthesia technician during intravenous line procedures. After six weeks of baseline measures, nurses participated in training, goal setting, and feedback targeting hand…
32 CFR Appendix A to Part 223 - Procedures for Identifying and Controlling DoD UCNI

Code of Federal Regulations, 2011 CFR

2011-07-01

... security measures, including security plans, procedures, and equipment, for the physical protection of DoD... stand-alone personal computers, or shared-logic work processing systems, if protection from unauthorized... and security by increasing significantly the likelihood of the illegal production of nuclear weapons...
Scanning linear estimation: improvements over region of interest (ROI) methods

NASA Astrophysics Data System (ADS)

Kupinski, Meredith K.; Clarkson, Eric W.; Barrett, Harrison H.

2013-03-01

In tomographic medical imaging, a signal activity is typically estimated by summing voxels from a reconstructed image. We introduce an alternative estimation scheme that operates on the raw projection data and offers a substantial improvement, as measured by the ensemble mean-square error (EMSE), when compared to using voxel values from a maximum-likelihood expectation-maximization (MLEM) reconstruction. The scanning-linear (SL) estimator operates on the raw projection data and is derived as a special case of maximum-likelihood estimation with a series of approximations to make the calculation tractable. The approximated likelihood accounts for background randomness, measurement noise and variability in the parameters to be estimated. When signal size and location are known, the SL estimate of signal activity is unbiased, i.e. the average estimate equals the true value. By contrast, unpredictable bias arising from the null functions of the imaging system affects standard algorithms that operate on reconstructed data. The SL method is demonstrated for two different tasks: (1) simultaneously estimating a signal’s size, location and activity; (2) for a fixed signal size and location, estimating activity. Noisy projection data are realistically simulated using measured calibration data from the multi-module multi-resolution small-animal SPECT imaging system. For both tasks, the same set of images is reconstructed using the MLEM algorithm (80 iterations), and the average and maximum values within the region of interest (ROI) are calculated for comparison. This comparison shows dramatic improvements in EMSE for the SL estimates. To show that the bias in ROI estimates affects not only absolute values but also relative differences, such as those used to monitor the response to therapy, the activity estimation task is repeated for three different signal sizes.
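For readers unfamiliar with the MLEM reconstruction used as the comparison baseline, the sketch below shows the standard multiplicative MLEM update for Poisson projection data. It is not the scanning-linear estimator and not the authors' code. The small random system matrix, the function names, and the simulated counts are assumptions chosen only to make the example self-contained.

```python
import numpy as np


def mlem(system_matrix, projections, n_iter=80):
    """Standard MLEM iterations for y ~ Poisson(A x) with x >= 0."""
    A = np.asarray(system_matrix, dtype=float)
    y = np.asarray(projections, dtype=float)
    sensitivity = A.sum(axis=0)  # A^T 1, the per-voxel sensitivity
    x = np.ones(A.shape[1])      # strictly positive starting image
    for _ in range(n_iter):
        expected = A @ x         # forward projection of the current image
        x = x / sensitivity * (A.T @ (y / expected))
    return x


def poisson_log_likelihood(A, x, y):
    """Poisson log-likelihood up to a term that does not depend on x."""
    expected = A @ x
    return float(np.sum(y * np.log(expected) - expected))


# Toy problem: 30 detector bins, 10 voxels, strictly positive system matrix.
rng = np.random.default_rng(2)
A = rng.uniform(0.1, 1.0, size=(30, 10))
x_true = rng.uniform(1.0, 20.0, size=10)
y = rng.poisson(A @ x_true)
x_hat = mlem(A, y, n_iter=80)
```

Each update keeps the image non-negative and preserves the total number of expected counts, two properties that make the algorithm convenient as a reference reconstruction.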
Efficient Robust Regression via Two-Stage Generalized Empirical Likelihood

PubMed Central

Bondell, Howard D.; Stefanski, Leonard A.

2013-01-01

Large- and finite-sample efficiency and resistance to outliers are the key goals of robust statistics. Although often not simultaneously attainable, we develop and study a linear regression estimator that comes close. Efficiency obtains from the estimator’s close connection to generalized empirical likelihood, and its favorable robustness properties are obtained by constraining the associated sum of (weighted) squared residuals. We prove maximum attainable finite-sample replacement breakdown point, and full asymptotic efficiency for normal errors. Simulation evidence shows that compared to existing robust regression estimators, the new estimator has relatively high efficiency for small sample sizes, and comparable outlier resistance. The estimator is further illustrated and compared to existing methods via application to a real data set with purported outliers. PMID:23976805
Nonparametric Discrete Survival Function Estimation with Uncertain Endpoints Using an Internal Validation Subsample

PubMed Central

Zee, Jarcy; Xie, Sharon X.

2015-01-01

When a true survival endpoint cannot be assessed for some subjects, an alternative endpoint that measures the true endpoint with error may be collected, which often occurs when obtaining the true endpoint is too invasive or costly. We develop an estimated likelihood function for the situation where we have both uncertain endpoints for all participants and true endpoints for only a subset of participants. We propose a nonparametric maximum estimated likelihood estimator of the discrete survival function of time to the true endpoint. We show that the proposed estimator is consistent and asymptotically normal. We demonstrate through extensive simulations that the proposed estimator has little bias compared to the naïve Kaplan-Meier survival function estimator, which uses only uncertain endpoints, and is more efficient with moderate missingness compared to the complete-case Kaplan-Meier survival function estimator, which uses only available true endpoints. Finally, we apply the proposed method to a dataset for estimating the risk of developing Alzheimer's disease from the Alzheimer's Disease Neuroimaging Initiative. PMID:25916510
Maximum likelihood phase-retrieval algorithm: applications.

PubMed

Nahrstedt, D A; Southwell, W H

1984-12-01

The maximum likelihood estimator approach is shown to be effective in determining the wave front aberration in systems involving laser and flow field diagnostics and optical testing. The robustness of the algorithm enables convergence even in cases of severe wave front error and real, nonsymmetrical, obscured amplitude distributions.
Cramer-Rao Bound, MUSIC, and Maximum Likelihood. Effects of Temporal Phase Difference

DTIC Science & Technology

1990-11-01

Technical Report 1373 November 1990 Cramer-Rao Bound, MUSIC , And Maximum Likelihood Effects of Temporal Phase o Difference C. V. TranI OTIC Approved... MUSIC , and Maximum Likelihood (ML) asymptotic variances corresponding to the two-source direction-of-arrival estimation where sources were modeled as...1pI = 1.00, SNR = 20 dB ..................................... 27 2. MUSIC for two equipowered signals impinging on a 5-element ULA (a) IpI = 0.50, SNR
Bayesian experimental design for models with intractable likelihoods.

PubMed

Drovandi, Christopher C; Pettitt, Anthony N

2013-12-01

In this paper we present a methodology for designing experiments for efficiently estimating the parameters of models with computationally intractable likelihoods. The approach combines a commonly used methodology for robust experimental design, based on Markov chain Monte Carlo sampling, with approximate Bayesian computation (ABC) to ensure that no likelihood evaluations are required. The utility function considered for precise parameter estimation is based upon the precision of the ABC posterior distribution, which we form efficiently via the ABC rejection algorithm based on pre-computed model simulations. Our focus is on stochastic models and, in particular, we investigate the methodology for Markov process models of epidemics and macroparasite population evolution. The macroparasite example involves a multivariate process and we assess the loss of information from not observing all variables. © 2013, The International Biometric Society.
Robust Multi-Frame Adaptive Optics Image Restoration Algorithm Using Maximum Likelihood Estimation with Poisson Statistics.

PubMed

Li, Dongming; Sun, Changming; Yang, Jinhua; Liu, Huan; Peng, Jiaqi; Zhang, Lijuan

2017-04-06

An adaptive optics (AO) system provides real-time compensation for atmospheric turbulence. However, an AO image is usually of poor contrast because of the nature of the imaging process, meaning that the image contains information coming from both out-of-focus and in-focus planes of the object, which also brings about a loss in quality. In this paper, we present a robust multi-frame adaptive optics image restoration algorithm via maximum likelihood estimation. Our proposed algorithm uses a maximum likelihood method with image regularization as the basic principle, and constructs the joint log likelihood function for multi-frame AO images based on a Poisson distribution model. To begin with, a frame selection method based on image variance is applied to the observed multi-frame AO images to select images with better quality to improve the convergence of a blind deconvolution algorithm. Then, by combining the imaging conditions and the AO system properties, a point spread function estimation model is built. Finally, we develop our iterative solutions for AO image restoration addressing the joint deconvolution issue. We conduct a number of experiments to evaluate the performances of our proposed algorithm. Experimental results show that our algorithm produces accurate AO image restoration results and outperforms the current state-of-the-art blind deconvolution methods.
Robust Multi-Frame Adaptive Optics Image Restoration Algorithm Using Maximum Likelihood Estimation with Poisson Statistics

PubMed Central

Li, Dongming; Sun, Changming; Yang, Jinhua; Liu, Huan; Peng, Jiaqi; Zhang, Lijuan

2017-01-01

An adaptive optics (AO) system provides real-time compensation for atmospheric turbulence. However, an AO image is usually of poor contrast because of the nature of the imaging process, meaning that the image contains information coming from both out-of-focus and in-focus planes of the object, which also brings about a loss in quality. In this paper, we present a robust multi-frame adaptive optics image restoration algorithm via maximum likelihood estimation. Our proposed algorithm uses a maximum likelihood method with image regularization as the basic principle, and constructs the joint log likelihood function for multi-frame AO images based on a Poisson distribution model. To begin with, a frame selection method based on image variance is applied to the observed multi-frame AO images to select images with better quality to improve the convergence of a blind deconvolution algorithm. Then, by combining the imaging conditions and the AO system properties, a point spread function estimation model is built. Finally, we develop our iterative solutions for AO image restoration addressing the joint deconvolution issue. We conduct a number of experiments to evaluate the performances of our proposed algorithm. Experimental results show that our algorithm produces accurate AO image restoration results and outperforms the current state-of-the-art blind deconvolution methods. PMID:28383503
GRID-BASED EXPLORATION OF COSMOLOGICAL PARAMETER SPACE WITH SNAKE

DOE Office of Scientific and Technical Information (OSTI.GOV)

Mikkelsen, K.; Næss, S. K.; Eriksen, H. K., E-mail: kristin.mikkelsen@astro.uio.no

2013-11-10

We present a fully parallelized grid-based parameter estimation algorithm for investigating multidimensional likelihoods called Snake, and apply it to cosmological parameter estimation. The basic idea is to map out the likelihood grid-cell by grid-cell according to decreasing likelihood, and stop when a certain threshold has been reached. This approach improves vastly on the 'curse of dimensionality' problem plaguing standard grid-based parameter estimation simply by disregarding grid cells with negligible likelihood. The main advantages of this method compared to standard Metropolis-Hastings Markov Chain Monte Carlo methods include (1) trivial extraction of arbitrary conditional distributions; (2) direct access to Bayesian evidences; (3) better sampling of the tails of the distribution; and (4) nearly perfect parallelization scaling. The main disadvantage is, as in the case of brute-force grid-based evaluation, a dependency on the number of parameters, N_par. One of the main goals of the present paper is to determine how large N_par can be, while still maintaining reasonable computational efficiency; we find that N_par = 12 is well within the capabilities of the method. The performance of the code is tested by comparing cosmological parameters estimated using Snake and the WMAP-7 data with those obtained using CosmoMC, the current standard code in the field. We find fully consistent results, with similar computational expenses, but shorter wall time due to the perfect parallelization scheme.
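The basic idea of visiting grid cells in order of decreasing likelihood and stopping at a threshold can be sketched with a priority queue. This is a serial toy version written for illustration, not the Snake code. It assumes the starting cell sits at the likelihood peak and that the likelihood is unimodal. The function name, the grid, and the Gaussian test likelihood are hypothetical.

```python
import heapq

import numpy as np


def explore_grid(log_like, start, shape, threshold):
    """Visit cells in order of decreasing log-likelihood, starting at the peak."""
    start = tuple(start)
    evaluated = {start: log_like(start)}  # every cell whose likelihood was computed
    heap = [(-evaluated[start], start)]   # min-heap on the negative log-likelihood
    best = evaluated[start]
    accepted = {}
    while heap:
        neg_ll, cell = heapq.heappop(heap)
        ll = -neg_ll
        best = max(best, ll)
        if ll < best - threshold:
            break                         # all remaining candidates are lower still
        accepted[cell] = ll
        for axis in range(len(shape)):    # evaluate face-sharing neighbours
            for step in (-1, 1):
                nb = list(cell)
                nb[axis] += step
                nb = tuple(nb)
                if 0 <= nb[axis] < shape[axis] and nb not in evaluated:
                    evaluated[nb] = log_like(nb)
                    heapq.heappush(heap, (-evaluated[nb], nb))
    return accepted


# Toy two-parameter Gaussian log-likelihood on a 41 x 41 grid.
x = np.linspace(-4.0, 4.0, 41)


def log_like(cell):
    i, j = cell
    return -0.5 * (x[i] ** 2 + x[j] ** 2)


accepted = explore_grid(log_like, start=(20, 20), shape=(41, 41), threshold=4.5)
```

In this toy case fewer than half of the 1681 cells are accepted, and the saving grows quickly with the number of parameters because the negligible-likelihood cells are never expanded.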
WEIGHTED LIKELIHOOD ESTIMATION UNDER TWO-PHASE SAMPLING

PubMed Central

Saegusa, Takumi; Wellner, Jon A.

2013-01-01

We develop asymptotic theory for weighted likelihood estimators (WLE) under two-phase stratified sampling without replacement. We also consider several variants of WLEs involving estimated weights and calibration. A set of empirical process tools are developed including a Glivenko–Cantelli theorem, a theorem for rates of convergence of M-estimators, and a Donsker theorem for the inverse probability weighted empirical processes under two-phase sampling and sampling without replacement at the second phase. Using these general results, we derive asymptotic distributions of the WLE of a finite-dimensional parameter in a general semiparametric model where an estimator of a nuisance parameter is estimable either at regular or nonregular rates. We illustrate these results and methods in the Cox model with right censoring and interval censoring. We compare the methods via their asymptotic variances under both sampling without replacement and the more usual (and easier to analyze) assumption of Bernoulli sampling at the second phase. PMID:24563559
Maximum Likelihood Estimation of Nonlinear Structural Equation Models with Ignorable Missing Data

ERIC Educational Resources Information Center

Lee, Sik-Yum; Song, Xin-Yuan; Lee, John C. K.

2003-01-01

The existing maximum likelihood theory and its computer software in structural equation modeling are established on the basis of linear relationships among latent variables with fully observed data. However, in social and behavioral sciences, nonlinear relationships among the latent variables are important for establishing more meaningful models…
A Composite Likelihood Inference in Latent Variable Models for Ordinal Longitudinal Responses

ERIC Educational Resources Information Center

Vasdekis, Vassilis G. S.; Cagnone, Silvia; Moustaki, Irini

2012-01-01

The paper proposes a composite likelihood estimation approach that uses bivariate instead of multivariate marginal probabilities for ordinal longitudinal responses using a latent variable model. The model considers time-dependent latent variables and item-specific random effects to be accountable for the interdependencies of the multivariate…

Multilevel and Latent Variable Modeling with Composite Links and Exploded Likelihoods

ERIC Educational Resources Information Center

Rabe-Hesketh, Sophia; Skrondal, Anders

2007-01-01

Composite links and exploded likelihoods are powerful yet simple tools for specifying a wide range of latent variable models. Applications considered include survival or duration models, models for rankings, small area estimation with census information, models for ordinal responses, item response models with guessing, randomized response models,…
Estimating residual fault hitting rates by recapture sampling

NASA Technical Reports Server (NTRS)

Lee, Larry; Gupta, Rajan

1988-01-01

For the recapture debugging design introduced by Nayak (1988) the problem of estimating the hitting rates of the faults remaining in the system is considered. In the context of a conditional likelihood, moment estimators are derived and are shown to be asymptotically normal and fully efficient. Fixed sample properties of the moment estimators are compared, through simulation, with those of the conditional maximum likelihood estimators. Properties of the conditional model are investigated such as the asymptotic distribution of linear functions of the fault hitting frequencies and a representation of the full data vector in terms of a sequence of independent random vectors. It is assumed that the residual hitting rates follow a log linear rate model and that the testing process is truncated when the gaps between the detection of new errors exceed a fixed amount of time.
Framework for adaptive multiscale analysis of nonhomogeneous point processes.

PubMed

Helgason, Hannes; Bartroff, Jay; Abry, Patrice

2011-01-01

We develop the methodology for hypothesis testing and model selection in nonhomogeneous Poisson processes, with an eye toward the application of modeling and variability detection in heart beat data. Modeling the process' non-constant rate function using templates of simple basis functions, we develop the generalized likelihood ratio statistic for a given template and a multiple testing scheme to model-select from a family of templates. A dynamic programming algorithm inspired by network flows is used to compute the maximum likelihood template in a multiscale manner. In a numerical example, the proposed procedure is nearly as powerful as the super-optimal procedures that know the true template size and true partition, respectively. Extensions to general history-dependent point processes are discussed.
Simple, Efficient Estimators of Treatment Effects in Randomized Trials Using Generalized Linear Models to Leverage Baseline Variables

PubMed Central

Rosenblum, Michael; van der Laan, Mark J.

2010-01-01

Models, such as logistic regression and Poisson regression models, are often used to estimate treatment effects in randomized trials. These models leverage information in variables collected before randomization, in order to obtain more precise estimates of treatment effects. However, there is the danger that model misspecification will lead to bias. We show that certain easy to compute, model-based estimators are asymptotically unbiased even when the working model used is arbitrarily misspecified. Furthermore, these estimators are locally efficient. As a special case of our main result, we consider a simple Poisson working model containing only main terms; in this case, we prove the maximum likelihood estimate of the coefficient corresponding to the treatment variable is an asymptotically unbiased estimator of the marginal log rate ratio, even when the working model is arbitrarily misspecified. This is the log-linear analog of ANCOVA for linear models. Our results demonstrate one application of targeted maximum likelihood estimation. PMID:20628636
Inverse problems-based maximum likelihood estimation of ground reflectivity for selected regions of interest from stripmap SAR data [Regularized maximum likelihood estimation of ground reflectivity from stripmap SAR data

DOE Office of Scientific and Technical Information (OSTI.GOV)

West, R. Derek; Gunther, Jacob H.; Moon, Todd K.

In this study, we derive a comprehensive forward model for the data collected by stripmap synthetic aperture radar (SAR) that is linear in the ground reflectivity parameters. It is also shown that if the noise model is additive, then the forward model fits into the linear statistical model framework, and the ground reflectivity parameters can be estimated by statistical methods. We derive the maximum likelihood (ML) estimates for the ground reflectivity parameters in the case of additive white Gaussian noise. Furthermore, we show that obtaining the ML estimates of the ground reflectivity requires two steps. The first step amounts to a cross-correlation of the data with a model of the data acquisition parameters, and it is shown that this step has essentially the same processing as the so-called convolution back-projection algorithm. The second step is a complete system inversion that is capable of mitigating the sidelobes of the spatially variant impulse responses remaining after the correlation processing. We also state the Cramer-Rao lower bound (CRLB) for the ML ground reflectivity estimates. We show that the CRLB is linked to the SAR system parameters, the flight path of the SAR sensor, and the image reconstruction grid. We demonstrate the ML image formation and the CRLB for synthetically generated data.
Inverse problems-based maximum likelihood estimation of ground reflectivity for selected regions of interest from stripmap SAR data [Regularized maximum likelihood estimation of ground reflectivity from stripmap SAR data]

DOE PAGES

West, R. Derek; Gunther, Jacob H.; Moon, Todd K.

2016-12-01

In this study, we derive a comprehensive forward model for the data collected by stripmap synthetic aperture radar (SAR) that is linear in the ground reflectivity parameters. It is also shown that if the noise model is additive, then the forward model fits into the linear statistical model framework, and the ground reflectivity parameters can be estimated by statistical methods. We derive the maximum likelihood (ML) estimates for the ground reflectivity parameters in the case of additive white Gaussian noise. Furthermore, we show that obtaining the ML estimates of the ground reflectivity requires two steps. The first step amounts to a cross-correlation of the data with a model of the data acquisition parameters, and it is shown that this step has essentially the same processing as the so-called convolution back-projection algorithm. The second step is a complete system inversion that is capable of mitigating the sidelobes of the spatially variant impulse responses remaining after the correlation processing. We also state the Cramer-Rao lower bound (CRLB) for the ML ground reflectivity estimates. We show that the CRLB is linked to the SAR system parameters, the flight path of the SAR sensor, and the image reconstruction grid. We demonstrate the ML image formation and the CRLB for synthetically generated data.
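The two-step structure described in this abstract matches the textbook form of ML estimation in a linear model with white Gaussian noise, sketched below. The notation is assumed for illustration and is not taken from the paper: y is the collected data vector, A the linear forward operator, x the ground reflectivity vector, n circularly symmetric complex white Gaussian noise with variance \sigma^2, and A is taken to have full column rank.

```latex
% Linear forward model with additive white Gaussian noise
y = A x + n, \qquad n \sim \mathcal{CN}(0, \sigma^2 I)

% Step 1: cross-correlation of the data with the acquisition model
z = A^{H} y

% Step 2: complete system inversion
\hat{x}_{\mathrm{ML}} = (A^{H} A)^{-1} z

% Cramer-Rao lower bound for unbiased estimators of x
\operatorname{Cov}(\hat{x}) \succeq \sigma^2 (A^{H} A)^{-1}
```

Under these assumptions the bound depends on the data only through A, which is consistent with the abstract's remark that the CRLB is tied to the system parameters, the flight path, and the reconstruction grid.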
Fast Component Pursuit for Large-Scale Inverse Covariance Estimation.

PubMed

Han, Lei; Zhang, Yu; Zhang, Tong

2016-08-01

The maximum likelihood estimation (MLE) for the Gaussian graphical model, which is also known as the inverse covariance estimation problem, has gained increasing interest recently. Most existing works assume that inverse covariance estimators contain sparse structure and then construct models with the ℓ1 regularization. In this paper, different from existing works, we study the inverse covariance estimation problem from another perspective by efficiently modeling the low-rank structure in the inverse covariance, which is assumed to be a combination of a low-rank part and a diagonal matrix. One motivation for this assumption is that the low-rank structure is common in many applications including climate and financial analysis, and another one is that such an assumption can reduce the computational complexity when computing its inverse. Specifically, we propose an efficient COmponent Pursuit (COP) method to obtain the low-rank part, where each component can be sparse. For optimization, the COP method greedily learns a rank-one component in each iteration by maximizing the log-likelihood. Moreover, the COP algorithm enjoys several appealing properties including the existence of an efficient solution in each iteration and the theoretical guarantee on the convergence of this greedy approach. Experiments on large-scale synthetic and real-world datasets including thousands of millions of variables show that the COP method is faster than the state-of-the-art techniques for the inverse covariance estimation problem when achieving comparable log-likelihood on test data.
GPS Spoofing Attack Characterization and Detection in Smart Grids

DOE Office of Scientific and Technical Information (OSTI.GOV)

Blum, Rick S.; Pradhan, Parth; Nagananda, Kyatsandra

The problem of global positioning system (GPS) spoofing attacks on smart grids endowed with phasor measurement units (PMUs) is addressed, taking into account the dynamical behavior of the states of the system. First, it is shown how GPS spoofing introduces a timing synchronization error in the phasor readings recorded by the PMUs and alters the measurement matrix of the dynamical model. Then, a generalized likelihood ratio-based hypotheses testing procedure is devised to detect changes in the measurement matrix when the system is subjected to a spoofing attack. Monte Carlo simulations are performed on the 9-bus, 3-machine test grid to demonstrate the implication of the spoofing attack on dynamic state estimation and to analyze the performance of the proposed hypotheses test.
Program for Weibull Analysis of Fatigue Data

NASA Technical Reports Server (NTRS)

Krantz, Timothy L.

2005-01-01

A Fortran computer program has been written for performing statistical analyses of fatigue-test data that are assumed to be adequately represented by a two-parameter Weibull distribution. This program calculates the following: (1) Maximum-likelihood estimates of the Weibull distribution; (2) Data for contour plots of relative likelihood for two parameters; (3) Data for contour plots of joint confidence regions; (4) Data for the profile likelihood of the Weibull-distribution parameters; (5) Data for the profile likelihood of any percentile of the distribution; and (6) Likelihood-based confidence intervals for parameters and/or percentiles of the distribution. The program can account for tests that are suspended without failure (the statistical term for such suspension of tests is "censoring"). The analytical approach followed in this program is valid for type-I censoring, which is the removal of unfailed units at pre-specified times. Confidence regions and intervals are calculated by use of the likelihood-ratio method.
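Item (1) above can be illustrated with a minimal Python sketch of two-parameter Weibull maximum likelihood with suspended (right-censored) tests. This is not the Fortran program described in the record; the function names, the bisection bracket, and the sample data are hypothetical choices made for illustration. The sketch relies on the standard result that, for a fixed shape, the ML scale has a closed form, which leaves a one-dimensional equation in the shape.

```python
import math


def weibull_loglik(shape, scale, times, failed):
    """Log-likelihood of a two-parameter Weibull with right-censored units."""
    total = 0.0
    for t, f in zip(times, failed):
        z = (t / scale) ** shape
        if f:  # observed failure: add the log density
            total += math.log(shape / scale) + (shape - 1.0) * math.log(t / scale) - z
        else:  # suspended test: add the log survival probability
            total -= z
    return total


def weibull_mle_censored(times, failed, lo=1e-3, hi=50.0, tol=1e-10):
    """Return (shape, scale) ML estimates; `failed[i]` is False for suspended tests.

    The bracket [lo, hi] for the shape is an assumed default, not a universal one.
    """
    r = sum(1 for f in failed if f)
    if r == 0:
        raise ValueError("at least one observed failure is required")
    t_max = max(times)
    s = [t / t_max for t in times]  # rescale to (0, 1] to avoid overflow in s**k
    mean_log_fail = sum(math.log(x) for x, f in zip(s, failed) if f) / r

    def score(k):
        # Profile score in the shape; it is increasing in k, so bisection applies.
        w = [x ** k for x in s]
        weighted = sum(wi * math.log(x) for wi, x in zip(w, s)) / sum(w)
        return weighted - 1.0 / k - mean_log_fail

    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if score(mid) > 0.0:
            hi = mid
        else:
            lo = mid
        if hi - lo < tol:
            break
    shape = 0.5 * (lo + hi)
    # Closed-form ML scale for the fitted shape, mapped back to original units.
    scale = t_max * (sum(x ** shape for x in s) / r) ** (1.0 / shape)
    return shape, scale


if __name__ == "__main__":
    # Hypothetical data: five failures and two tests suspended at 200 hours.
    times = [64.0, 83.0, 105.0, 123.0, 149.0, 200.0, 200.0]
    failed = [True, True, True, True, True, False, False]
    print(weibull_mle_censored(times, failed))
```

Likelihood-ratio confidence regions of the kind listed in items (2) to (6) could be built on top of `weibull_loglik` by comparing its value on a grid of parameters against the maximized value, though that step is not shown here.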
PREDICTING CHRONIC LETHALITY OF CHEMICALS TO FISHES FROM ACUTE TOXICITY TEST DATA: THEORY OF ACCELERATED LIFE TESTING

EPA Science Inventory

A method for modeling aquatic toxicity data based on the theory of accelerated life testing and a procedure for maximum likelihood fitting the proposed model is presented. The procedure is computerized as software, which can predict chronic lethality of chemicals using data from a...
Chronic tinnitus resulting from cerumen removal procedures.

PubMed

Folmer, Robert L; Shi, Baker Yongbing

2004-01-01

This study was undertaken to determine how many cases of chronic tinnitus in a clinic population resulted from cerumen removal procedures and to summarize cerumen management methodologies and recommendations that will reduce the likelihood of such serious complications. Detailed questionnaires were mailed to 2400 consecutive patients (1704 male, 696 female; mean age, 53.3 +/- 11.8 years; age range, 7-87 years) prior to their initial appointment at the Oregon Health & Science University Tinnitus Clinic between 1986 and 2000. These questionnaires requested information about patients' medical, hearing, and tinnitus histories. Records were analyzed to determine how many patients reported that their chronic tinnitus began as a result of cerumen removal procedures. Of 2400 patients, 11 (0.46%) reported that their tinnitus began as a result of cerumen removal procedures performed by clinicians. Three additional patients reported that chronic tinnitus began as a result of their own attempts to clean their ear canals. Chronic and debilitating conditions, such as hearing loss and tinnitus, can occur as results of attempts to remove cerumen. By following the recommendations of experts in cerumen management techniques, clinicians can reduce the likelihood of catastrophic complications and subsequent litigation.
Likelihood-based gene annotations for gap filling and quality assessment in genome-scale metabolic models

DOE PAGES

Benedict, Matthew N.; Mundy, Michael B.; Henry, Christopher S.; ...

2014-10-16

Genome-scale metabolic models provide a powerful means to harness information from genomes to deepen biological insights. With exponentially increasing sequencing capacity, there is an enormous need for automated reconstruction techniques that can provide more accurate models in a short time frame. Current methods for automated metabolic network reconstruction rely on gene and reaction annotations to build draft metabolic networks and algorithms to fill gaps in these networks. However, automated reconstruction is hampered by database inconsistencies, incorrect annotations, and gap filling largely without considering genomic information. Here we develop an approach for applying genomic information to predict alternative functions for genes and estimate their likelihoods from sequence homology. We show that computed likelihood values were significantly higher for annotations found in manually curated metabolic networks than those that were not. We then apply these alternative functional predictions to estimate reaction likelihoods, which are used in a new gap filling approach called likelihood-based gap filling to predict more genomically consistent solutions. To validate the likelihood-based gap filling approach, we applied it to models where essential pathways were removed, finding that likelihood-based gap filling identified more biologically relevant solutions than parsimony-based gap filling approaches. We also demonstrate that models gap filled using likelihood-based gap filling provide greater coverage and genomic consistency with metabolic gene functions compared to parsimony-based approaches. Interestingly, despite these findings, we found that likelihoods did not significantly affect consistency of gap filled models with Biolog and knockout lethality data.
This indicates that the phenotype data alone cannot necessarily be used to discriminate between alternative solutions for gap filling and therefore, that the use of other information is necessary to obtain a more accurate network. All described workflows are implemented as part of the DOE Systems Biology Knowledgebase (KBase) and are publicly available via API or command-line web interface.
Likelihood-Based Gene Annotations for Gap Filling and Quality Assessment in Genome-Scale Metabolic Models

PubMed Central

Benedict, Matthew N.; Mundy, Michael B.; Henry, Christopher S.; Chia, Nicholas; Price, Nathan D.

2014-01-01

Genome-scale metabolic models provide a powerful means to harness information from genomes to deepen biological insights. With exponentially increasing sequencing capacity, there is an enormous need for automated reconstruction techniques that can provide more accurate models in a short time frame. Current methods for automated metabolic network reconstruction rely on gene and reaction annotations to build draft metabolic networks and algorithms to fill gaps in these networks. However, automated reconstruction is hampered by database inconsistencies, incorrect annotations, and gap filling largely without considering genomic information. Here we develop an approach for applying genomic information to predict alternative functions for genes and estimate their likelihoods from sequence homology. We show that computed likelihood values were significantly higher for annotations found in manually curated metabolic networks than those that were not. We then apply these alternative functional predictions to estimate reaction likelihoods, which are used in a new gap filling approach called likelihood-based gap filling to predict more genomically consistent solutions. To validate the likelihood-based gap filling approach, we applied it to models where essential pathways were removed, finding that likelihood-based gap filling identified more biologically relevant solutions than parsimony-based gap filling approaches. We also demonstrate that models gap filled using likelihood-based gap filling provide greater coverage and genomic consistency with metabolic gene functions compared to parsimony-based approaches. Interestingly, despite these findings, we found that likelihoods did not significantly affect consistency of gap filled models with Biolog and knockout lethality data. 
This indicates that the phenotype data alone cannot necessarily be used to discriminate between alternative solutions for gap filling and therefore, that the use of other information is necessary to obtain a more accurate network. All described workflows are implemented as part of the DOE Systems Biology Knowledgebase (KBase) and are publicly available via API or command-line web interface. PMID:25329157
Optimal methods for fitting probability distributions to propagule retention time in studies of zoochorous dispersal.

PubMed

Viana, Duarte S; Santamaría, Luis; Figuerola, Jordi

2016-02-01

Propagule retention time is a key factor in determining propagule dispersal distance and the shape of "seed shadows". Propagules dispersed by animal vectors are either ingested and retained in the gut until defecation or attached externally to the body until detachment. Retention time is a continuous variable, but it is commonly measured at discrete time points, according to pre-established sampling time-intervals. Although parametric continuous distributions have been widely fitted to these interval-censored data, the performance of different fitting methods has not been evaluated. To investigate the performance of five different fitting methods, we fitted parametric probability distributions to typical discretized retention-time data with known distribution using as data-points either the lower, mid or upper bounds of sampling intervals, as well as the cumulative distribution of observed values (using either maximum likelihood or non-linear least squares for parameter estimation); then compared the estimated and original distributions to assess the accuracy of each method. We also assessed the robustness of these methods to variations in the sampling procedure (sample size and length of sampling time-intervals). Fittings to the cumulative distribution performed better for all types of parametric distributions (lognormal, gamma and Weibull distributions) and were more robust to variations in sample size and sampling time-intervals. These estimated distributions had negligible deviations of up to 0.045 in cumulative probability of retention times (according to the Kolmogorov-Smirnov statistic) in relation to original distributions from which propagule retention time was simulated, supporting the overall accuracy of this fitting method. In contrast, fitting the sampling-interval bounds resulted in greater deviations that ranged from 0.058 to 0.273 in cumulative probability of retention times, which may introduce considerable biases in parameter estimates. 
We recommend the use of cumulative probability to fit parametric probability distributions to propagule retention time, specifically using maximum likelihood for parameter estimation. Furthermore, the experimental design for an optimal characterization of unimodal propagule retention time should contemplate at least 500 recovered propagules and sampling time-intervals not larger than the time peak of propagule retrieval, except in the tail of the distribution where broader sampling time-intervals may also produce accurate fits.
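One common way to fit a parametric distribution to interval-censored counts through its cumulative distribution function by maximum likelihood is sketched below. This form is an assumption offered for orientation and may differ in detail from the authors' implementation. The symbols are introduced here: F(\cdot;\theta) is the candidate cumulative distribution (lognormal, gamma, or Weibull), (l_j, u_j] is the j-th sampling interval, and n_j is the number of propagules recovered in it.

```latex
% Interval-censored log-likelihood built from differences of the CDF
\ell(\theta) = \sum_{j=1}^{J} n_j \,
  \log\!\bigl[ F(u_j;\theta) - F(l_j;\theta) \bigr],
\qquad
\hat{\theta} = \arg\max_{\theta} \, \ell(\theta)
```

In contrast, assigning every observation to the lower bound, midpoint, or upper bound of its interval treats the censored times as exact, which is the kind of fit the abstract reports as less accurate.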
Script-theory virtual case: A novel tool for education and research.

PubMed

Hayward, Jake; Cheung, Amandy; Velji, Alkarim; Altarejos, Jenny; Gill, Peter; Scarfe, Andrew; Lewis, Melanie

2016-11-01

Context/Setting: The script theory of diagnostic reasoning proposes that clinicians evaluate cases in the context of an "illness script," iteratively testing internal hypotheses against new information eventually reaching a diagnosis. We present a novel tool for teaching diagnostic reasoning to undergraduate medical students based on an adaptation of script theory. We developed a virtual patient case that used clinically authentic audio and video, interactive three-dimensional (3D) body images, and a simulated electronic medical record. Next, we used interactive slide bars to record respondents' likelihood estimates of diagnostic possibilities at various stages of the case. Responses were dynamically compared to data from expert clinicians and peers. Comparative frequency distributions were presented to the learner and final diagnostic likelihood estimates were analyzed. Detailed student feedback was collected. Over two academic years, 322 students participated. Student diagnostic likelihood estimates were similar year to year, but were consistently different from expert clinician estimates. Student feedback was overwhelmingly positive: students found the case was novel, innovative, clinically authentic, and a valuable learning experience. We demonstrate the successful implementation of a novel approach to teaching diagnostic reasoning. Future study may delineate reasoning processes associated with differences between novice and expert responses.
Semiparametric time-to-event modeling in the presence of a latent progression event.

PubMed

Rice, John D; Tsodikov, Alex

2017-06-01

In cancer research, interest frequently centers on factors influencing a latent event that must precede a terminal event. In practice it is often impossible to observe the latent event precisely, making inference about this process difficult. To address this problem, we propose a joint model for the unobserved time to the latent and terminal events, with the two events linked by the baseline hazard. Covariates enter the model parametrically as linear combinations that multiply, respectively, the hazard for the latent event and the hazard for the terminal event conditional on the latent one. We derive the partial likelihood estimators for this problem assuming the latent event is observed, and propose a profile likelihood-based method for estimation when the latent event is unobserved. The baseline hazard in this case is estimated nonparametrically using the EM algorithm, which allows for closed-form Breslow-type estimators at each iteration, bringing improved computational efficiency and stability compared with maximizing the marginal likelihood directly. We present simulation studies to illustrate the finite-sample properties of the method; its use in practice is demonstrated in the analysis of a prostate cancer data set. © 2016, The International Biometric Society.
Wald Sequential Probability Ratio Test for Analysis of Orbital Conjunction Data

NASA Technical Reports Server (NTRS)

Carpenter, J. Russell; Markley, F. Landis; Gold, Dara

2013-01-01

We propose a Wald Sequential Probability Ratio Test for analysis of commonly available predictions associated with spacecraft conjunctions. Such predictions generally consist of a relative state and relative state error covariance at the time of closest approach, under the assumption that prediction errors are Gaussian. We show that under these circumstances, the likelihood ratio of the Wald test reduces to an especially simple form, involving the current best estimate of collision probability, and a similar estimate of collision probability that is based on prior assumptions about the likelihood of collision.
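The general mechanics of a Wald sequential probability ratio test can be sketched as below. This is a generic illustration, not the conjunction-specific likelihood ratio derived in the report: the Gaussian mean-shift example, the function names, and the default error rates are all assumptions. The test accumulates log-likelihood ratios and stops once the running sum crosses one of Wald's two approximate thresholds.

```python
import math


def sprt(samples, log_lr, alpha=0.05, beta=0.05):
    """Wald SPRT: return (decision, number of samples used).

    alpha is the target false-alarm rate and beta the target missed-detection
    rate; both defaults are illustrative assumptions.
    """
    upper = math.log((1.0 - beta) / alpha)   # cross above: accept H1
    lower = math.log(beta / (1.0 - alpha))   # cross below: accept H0
    total = 0.0
    n = 0
    for x in samples:
        n += 1
        total += log_lr(x)
        if total >= upper:
            return "accept H1", n
        if total <= lower:
            return "accept H0", n
    return "continue", n  # data exhausted before either threshold was crossed


def gaussian_log_lr(mu0, mu1, sigma):
    """Per-sample log-likelihood ratio for H1: mean mu1 versus H0: mean mu0."""
    def log_lr(x):
        return ((x - mu0) ** 2 - (x - mu1) ** 2) / (2.0 * sigma ** 2)
    return log_lr


if __name__ == "__main__":
    llr = gaussian_log_lr(mu0=0.0, mu1=1.0, sigma=1.0)
    print(sprt([0.9, 1.2, 0.7, 1.1, 1.4, 0.8, 1.0], llr))
```

Passing the log-likelihood ratio as a function keeps the stopping rule separate from the model, so a different ratio, such as one built from collision probability estimates, could be substituted without changing `sprt`.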
A black box optimization approach to parameter estimation in a model for long/short term variations dynamics of commodity prices

NASA Astrophysics Data System (ADS)

De Santis, Alberto; Dellepiane, Umberto; Lucidi, Stefano

2012-11-01

In this paper we investigate the estimation problem for a model of commodity prices. This model is a stochastic state space dynamical model, and the problem unknowns are the state variables and the system parameters. Data are represented by the commodity spot prices; time series of futures contracts are very seldom available for free. Both the system joint likelihood function (state variables and parameters) and the system marginal likelihood function (the state variables are eliminated) are addressed.
Proportion estimation using prior cluster purities

NASA Technical Reports Server (NTRS)

Terrell, G. R. (Principal Investigator)

1980-01-01

The prior distribution of CLASSY component purities is studied, and this information incorporated into maximum likelihood crop proportion estimators. The method is tested on Transition Year spring small grain segments.
Computing and software

USGS Publications Warehouse

White, Gary C.; Hines, J.E.

2004-01-01

The reality is that the statistical methods used for analysis of data depend upon the availability of software. Analysis of marked animal data is no different than the rest of the statistical field. The methods used for analysis are those that are available in reliable software packages. Thus, the critical importance of having reliable, up–to–date software available to biologists is obvious. Statisticians have continued to develop more robust models, ever expanding the suite of potential analysis methods available. But without software to implement these newer methods, they will languish in the abstract, and not be applied to the problems deserving them. In the Computers and Software Session, two new software packages are described, a comparison of implementation of methods for the estimation of nest survival is provided, and a more speculative paper about how the next generation of software might be structured is presented. Rotella et al. (2004) compare nest survival estimation with different software packages: SAS logistic regression, SAS non–linear mixed models, and Program MARK. Nests are assumed to be visited at various, possibly infrequent, intervals. All of the approaches described compute nest survival with the same likelihood, and require that the age of the nest is known to account for nests that eventually hatch. However, each approach offers advantages and disadvantages, explored by Rotella et al. (2004). Efford et al. (2004) present a new software package called DENSITY. The package computes population abundance and density from trapping arrays and other detection methods with a new and unique approach. DENSITY represents the first major addition to the analysis of trapping arrays in 20 years. Barker & White (2004) discuss how existing software such as Program MARK requires that each new model’s likelihood be programmed specifically for that model.
They wishfully think that future software might allow the user to combine pieces of likelihood functions together to generate estimates. The idea is interesting, and maybe some bright young statistician can work out the specifics to implement the procedure. Choquet et al. (2004) describe MSURGE, a software package that implements the multistate capture–recapture models. The unique feature of MSURGE is that the design matrix is constructed with an interpreted language called GEMACO. Because MSURGE is limited to just multistate models, the special requirements of these likelihoods can be provided. The software and methods presented in these papers give biologists and wildlife managers an expanding range of possibilities for data analysis. Although ease–of–use is generally getting better, it does not replace the need for understanding of the requirements and structure of the models being computed. The internet provides access to many free software packages as well as user–discussion groups to share knowledge and ideas. (A starting point for wildlife–related applications is http://www.phidot.org.)
