Robust Methods for Moderation Analysis with a Two-Level Regression Model.
Yang, Miao; Yuan, Ke-Hai
2016-01-01
Moderation analysis has many applications in social sciences. Most widely used estimation methods for moderation analysis assume that errors are normally distributed and homoscedastic. When these assumptions are not met, the results from a classical moderation analysis can be misleading. For more reliable moderation analysis, this article proposes two robust methods with a two-level regression model when the predictors do not contain measurement error. One method is based on maximum likelihood with Student's t distribution and the other is based on M-estimators with Huber-type weights. An algorithm for obtaining the robust estimators is developed. Consistent estimates of standard errors of the robust estimators are provided. The robust approaches are compared against normal-distribution-based maximum likelihood (NML) with respect to power and accuracy of parameter estimates through a simulation study. Results show that the robust approaches outperform NML under various distributional conditions. Application of the robust methods is illustrated through a real data example. An R program is developed and documented to facilitate the application of the robust methods.
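The Huber-type weighting mentioned in this abstract can be illustrated with a minimal sketch. It assumes a single-level regression with a product term (y = b0 + b1*x + b2*z + b3*x*z), a tuning constant of 1.345, and a MAD-based residual scale; it is not the article's two-level model, its algorithm, or its R program, and all function names are hypothetical.

```python
import statistics

def huber_weight(r, c=1.345):
    """Huber-type weight: 1 inside [-c, c], c/|r| outside (c is an assumed tuning constant)."""
    a = abs(r)
    return 1.0 if a <= c else c / a

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def weighted_ls(X, y, w):
    """Weighted least squares via the normal equations (X'WX) beta = X'Wy."""
    p = len(X[0])
    A = [[sum(wi * xi[j] * xi[k] for wi, xi in zip(w, X)) for k in range(p)]
         for j in range(p)]
    b = [sum(wi * xi[j] * yi for wi, xi, yi in zip(w, X, y)) for j in range(p)]
    return solve(A, b)

def huber_moderation(x, z, y, c=1.345, iters=50):
    """Iteratively reweighted least squares for y ~ 1 + x + z + x*z with Huber-type weights."""
    X = [[1.0, xi, zi, xi * zi] for xi, zi in zip(x, z)]
    w = [1.0] * len(y)
    beta = None
    for _ in range(iters):
        beta = weighted_ls(X, y, w)
        res = [yi - sum(b * v for b, v in zip(beta, row)) for row, yi in zip(X, y)]
        med = statistics.median(res)
        # MAD-based scale; fall back to 1.0 if the residuals are degenerate
        scale = 1.4826 * statistics.median([abs(r - med) for r in res]) or 1.0
        w = [huber_weight(r / scale, c) for r in res]
    return beta
```

Calling `huber_moderation(x, z, y)` returns the four coefficients in the order intercept, x, z, x*z; the last entry is the moderation (interaction) effect. Huber-type weights bound the influence of large residuals but do not by themselves protect against high-leverage predictor values.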
Robustness of location estimators under t-distributions: a literature review
NASA Astrophysics Data System (ADS)
Sumarni, C.; Sadik, K.; Notodiputro, K. A.; Sartono, B.
2017-03-01
The assumption of normality is commonly used in estimation of parameters in statistical modelling, but this assumption is very sensitive to outliers. The t-distribution is more robust than the normal distribution since the t-distributions have longer tails. The robustness measures of location estimators under t-distributions are reviewed and discussed in this paper. For the purpose of illustration we use the onion yield data, which include outliers, as a case study and show that the t model produces a better fit than the normal model.
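Why a t model resists outliers can be seen from a minimal sketch of location estimation under a Student's t likelihood. It assumes a fixed, user-chosen degrees-of-freedom value (here 4) and uses the standard EM-type reweighting for the t distribution; the function name and the data are hypothetical and are not taken from the reviewed paper.

```python
def t_location(x, df=4.0, iters=200):
    """EM-type estimate of location and squared scale under a t model with fixed df."""
    n = len(x)
    mu = sum(x) / n
    sigma2 = sum((xi - mu) ** 2 for xi in x) / n
    for _ in range(iters):
        if sigma2 == 0.0:
            break  # all observations identical; nothing left to estimate
        # Observations far from mu (relative to the scale) receive small weights
        w = [(df + 1.0) / (df + (xi - mu) ** 2 / sigma2) for xi in x]
        mu = sum(wi * xi for wi, xi in zip(w, x)) / sum(w)
        sigma2 = sum(wi * (xi - mu) ** 2 for wi, xi in zip(w, x)) / n
    return mu, sigma2
```

For a sample such as `[9.8, 10.1, 10.0, 9.9, 10.2, 30.0]`, the arithmetic mean is pulled toward the outlying value, while the t-based location stays near the bulk of the data because the outlier's weight shrinks at each iteration.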
On the robustness of a Bayes estimate. [in reliability theory]
NASA Technical Reports Server (NTRS)
Canavos, G. C.
1974-01-01
This paper examines the robustness of a Bayes estimator with respect to the assigned prior distribution. A Bayesian analysis for a stochastic scale parameter of a Weibull failure model is summarized in which the natural conjugate is assigned as the prior distribution of the random parameter. The sensitivity analysis is carried out by the Monte Carlo method in which, although an inverted gamma is the assigned prior, realizations are generated using distribution functions of varying shape. For several distributional forms and even for some fixed values of the parameter, simulated mean squared errors of Bayes and minimum variance unbiased estimators are determined and compared. Results indicate that the Bayes estimator remains squared-error superior and appears to be largely robust to the form of the assigned prior distribution.
Robust inference in the negative binomial regression model with an application to falls data.
Aeberhard, William H; Cantoni, Eva; Heritier, Stephane
2014-12-01
A popular way to model overdispersed count data, such as the number of falls reported during intervention studies, is by means of the negative binomial (NB) distribution. Classical estimating methods are well known to be sensitive to model misspecifications, taking the form of patients falling much more than expected in such intervention studies where the NB regression model is used. We extend in this article two approaches for building robust M-estimators of the regression parameters in the class of generalized linear models to the NB distribution. The first approach achieves robustness in the response by applying a bounded function on the Pearson residuals arising in the maximum likelihood estimating equations, while the second approach achieves robustness by bounding the unscaled deviance components. For both approaches, we explore different choices for the bounding functions. Through a unified notation, we show how close these approaches may actually be as long as the bounding functions are chosen and tuned appropriately, and provide the asymptotic distributions of the resulting estimators. Moreover, we introduce a robust weighted maximum likelihood estimator for the overdispersion parameter, specific to the NB distribution. Simulations under various settings show that redescending bounding functions yield estimates with smaller biases under contamination while keeping high efficiency at the assumed model, and this for both approaches. We present an application to a recent randomized controlled trial measuring the effectiveness of an exercise program at reducing the number of falls among people suffering from Parkinson's disease to illustrate the diagnostic use of such robust procedures and their need for reliable inference. © 2014, The International Biometric Society.
Generating Multivariate Ordinal Data via Entropy Principles.
Lee, Yen; Kaplan, David
2018-03-01
When conducting robustness research where the focus of attention is on the impact of non-normality, the marginal skewness and kurtosis are often used to set the degree of non-normality. Monte Carlo methods are commonly applied to conduct this type of research by simulating data from distributions with skewness and kurtosis constrained to pre-specified values. Although several procedures have been proposed to simulate data from distributions with these constraints, no corresponding procedures have been applied for discrete distributions. In this paper, we present two procedures based on the principles of maximum entropy and minimum cross-entropy to estimate the multivariate observed ordinal distributions with constraints on skewness and kurtosis. For these procedures, the correlation matrix of the observed variables is not specified but depends on the relationships between the latent response variables. With the estimated distributions, researchers can study robustness not only focusing on the levels of non-normality but also on the variations in the distribution shapes. A simulation study demonstrates that these procedures yield excellent agreement between specified parameters and those of estimated distributions. A robustness study concerning the effect of distribution shape in the context of confirmatory factor analysis shows that shape can affect the robust [Formula: see text] and robust fit indices, especially when the sample size is small, the data are severely non-normal, and the fitted model is complex.
Diagnostics of Robust Growth Curve Modeling Using Student's "t" Distribution
ERIC Educational Resources Information Center
Tong, Xin; Zhang, Zhiyong
2012-01-01
Growth curve models with different types of distributions of random effects and of intraindividual measurement errors for robust analysis are compared. After demonstrating the influence of distribution specification on parameter estimation, 3 methods for diagnosing the distributions for both random effects and intraindividual measurement errors…
NASA Astrophysics Data System (ADS)
Shariff, Nurul Sima Mohamad; Ferdaos, Nur Aqilah
2017-08-01
Multicollinearity often leads to inconsistent and unreliable parameter estimates in regression analysis. The situation is more severe in the presence of outliers, which cause the error distribution to have fatter tails than the normal distribution. The well-known procedure that is robust to the multicollinearity problem is the ridge regression method. This method, however, is expected to be affected by the presence of outliers due to some assumptions imposed in the modeling procedure. Thus, a robust version of the existing ridge method, with some modification in the inverse matrix and the estimated response value, is introduced. The performance of the proposed method is discussed and comparisons are made with several existing estimators, namely Ordinary Least Squares (OLS), ridge regression and robust ridge regression based on GM-estimates. The findings of this study indicate that the proposed method is able to produce reliable parameter estimates in the presence of both multicollinearity and outliers in the data.
Robust Magnetotelluric Impedance Estimation
NASA Astrophysics Data System (ADS)
Sutarno, D.
2010-12-01
Robust magnetotelluric (MT) response function estimators are now in standard use by the induction community. Properly devised and applied, these have the ability to reduce the influence of unusual data (outliers). The estimators always yield impedance estimates which are better than those from conventional least squares (LS) estimation, because 'real' MT data almost never satisfy the statistical assumptions of Gaussian distribution and stationarity upon which normal spectral analysis is based. This paper discusses the development and application of robust estimation procedures, which can be classified as M-estimators, to MT data. Starting with the description of the estimators, special attention is given to the recent development of a bounded-influence robust estimation, including utilization of the Hilbert Transform (HT) operation on causal MT impedance functions. The resulting robust performances are illustrated using synthetic as well as real MT data.
NASA Astrophysics Data System (ADS)
Tugores, M. Pilar; Iglesias, Magdalena; Oñate, Dolores; Miquel, Joan
2016-02-01
In the Mediterranean Sea, the European anchovy (Engraulis encrasicolus) plays a key role in ecological and economic terms. Ensuring stock sustainability requires the provision of crucial information, such as species spatial distribution or unbiased abundance and precision estimates, so that management strategies can be defined (e.g. fishing quotas, temporal closure areas or marine protected areas, MPAs). Furthermore, the estimation of the precision of global abundance at different sampling intensities can be used for survey design optimisation. Geostatistics provide a priori unbiased estimations of the spatial structure, global abundance and precision for autocorrelated data. However, their application to non-Gaussian data introduces difficulties in the analysis in conjunction with low robustness or unbiasedness. The present study applied intrinsic geostatistics in two dimensions in order to (i) analyse the spatial distribution of anchovy in Spanish Western Mediterranean waters during the species' recruitment season, (ii) produce distribution maps, (iii) estimate global abundance and its precision, (iv) analyse the effect of changing the sampling intensity on the precision of global abundance estimates, and (v) evaluate the effects of several methodological options on the robustness of all the analysed parameters. The results suggested that while the spatial structure was usually non-robust to the tested methodological options when working with the original dataset, it became more robust for the transformed datasets (especially for the log-backtransformed dataset). The global abundance was always highly robust and the global precision was highly or moderately robust to most of the methodological options, except for data transformation.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhou, Ping; Lv, Youbin; Wang, Hong
Optimal operation of a practical blast furnace (BF) ironmaking process depends largely on a good measurement of molten iron quality (MIQ) indices. However, measuring the MIQ online is not feasible using the available techniques. In this paper, a novel data-driven robust modeling is proposed for online estimation of MIQ using improved random vector functional-link networks (RVFLNs). Since the output weights of traditional RVFLNs are obtained by the least squares approach, a robustness problem may occur when the training dataset is contaminated with outliers. This affects the modeling accuracy of RVFLNs. To solve this problem, a Cauchy distribution weighted M-estimation based robust RVFLNs is proposed. Since the weights of different outlier data are properly determined by the Cauchy distribution, their corresponding contributions to modeling can be properly distinguished. Thus robust and better modeling results can be achieved. Moreover, given that the BF is a complex nonlinear system with numerous coupling variables, the data-driven canonical correlation analysis is employed to identify the most influential components from multitudinous factors that affect the MIQ indices to reduce the model dimension. Finally, experiments using industrial data and comparative studies have demonstrated that the obtained model produces better modeling and estimation accuracy and stronger robustness than other modeling methods.
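The idea of replacing plain least squares with Cauchy-weighted M-estimation can be sketched on the simplest possible case, a straight-line fit. This is an illustration only: it assumes the common Cauchy weight w(r) = 1 / (1 + (r/c)^2) with tuning constant c = 2.385 and a MAD-based residual scale, and it does not reproduce the RVFLN architecture or the authors' exact weighting scheme. Function names are hypothetical.

```python
import statistics

def cauchy_weight(r, c=2.385):
    """Cauchy weight for a standardized residual r (c is an assumed tuning constant)."""
    return 1.0 / (1.0 + (r / c) ** 2)

def cauchy_line_fit(x, y, c=2.385, iters=50):
    """Fit y = a + b*x by iteratively reweighted least squares with Cauchy weights."""
    w = [1.0] * len(x)
    a = b = 0.0
    for _ in range(iters):
        sw = sum(w)
        mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
        my = sum(wi * yi for wi, yi in zip(w, y)) / sw
        sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
        sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
        b = sxy / sxx
        a = my - b * mx
        res = [yi - a - b * xi for xi, yi in zip(x, y)]
        med = statistics.median(res)
        # MAD-based scale; fall back to 1.0 when residuals are degenerate
        scale = 1.4826 * statistics.median([abs(r - med) for r in res]) or 1.0
        w = [cauchy_weight(r / scale, c) for r in res]
    return a, b
```

Unlike Huber-type weights, Cauchy weights keep decreasing toward zero as the residual grows, so a gross outlier ends up contributing almost nothing to the fit. The same reweighting loop could, under the same assumptions, be applied to the least squares step that computes a network's output weights.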
Bayesian Inference and Application of Robust Growth Curve Models Using Student's "t" Distribution
ERIC Educational Resources Information Center
Zhang, Zhiyong; Lai, Keke; Lu, Zhenqiu; Tong, Xin
2013-01-01
Despite the widespread popularity of growth curve analysis, few studies have investigated robust growth curve models. In this article, the "t" distribution is applied to model heavy-tailed data and contaminated normal data with outliers for growth curve analysis. The derived robust growth curve models are estimated through Bayesian…
Robust Alternatives to the Standard Deviation in Processing of Physics Experimental Data
NASA Astrophysics Data System (ADS)
Shulenin, V. P.
2016-10-01
Properties of robust estimations of the scale parameter are studied. It is noted that the median of absolute deviations and the modified estimation of the average Gini differences have asymptotically normal distributions and bounded influence functions, are B-robust estimations, and hence, unlike the estimation of the standard deviation, are protected from the presence of outliers in the sample. Results of comparison of estimations of the scale parameter are given for a Gaussian model with contamination. An adaptive variant of the modified estimation of the average Gini differences is considered.
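Two of the scale measures named in this abstract can be computed in a few lines. The sketch below assumes the usual Gaussian consistency factors (1.4826 for the median of absolute deviations, sqrt(pi)/2 for the Gini mean difference) and shows the ordinary Gini mean difference only; the "modified" and adaptive variants studied in the paper are not reproduced here. Function names are hypothetical.

```python
import math
import statistics

def mad_scale(x):
    """Median of absolute deviations from the median, scaled to estimate sigma under normality."""
    med = statistics.median(x)
    return 1.4826 * statistics.median([abs(v - med) for v in x])

def gini_scale(x):
    """Gini mean difference (mean of |xi - xj| over pairs), scaled to estimate sigma under normality."""
    n = len(x)
    xs = sorted(x)
    # Sum over pairs i < j of (xs[j] - xs[i]) written as a weighted sum of order statistics
    pair_sum = sum((2 * i - n + 1) * v for i, v in enumerate(xs))
    g = 2.0 * pair_sum / (n * (n - 1))
    return g * math.sqrt(math.pi) / 2.0
```

For the sample `[1, 2, 3, 4, 5]`, replacing 5 by 500 leaves `mad_scale` unchanged, whereas the sample standard deviation and the unmodified `gini_scale` both grow with the outlying value.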
NASA Astrophysics Data System (ADS)
Martinsson, J.
2013-03-01
We propose methods for robust Bayesian inference of the hypocentre in presence of poor, inconsistent and insufficient phase arrival times. The objectives are to increase the robustness, the accuracy and the precision by introducing heavy-tailed distributions and an informative prior distribution of the seismicity. The effects of the proposed distributions are studied under real measurement conditions in two underground mine networks and validated using 53 blasts with known hypocentres. To increase the robustness against poor, inconsistent or insufficient arrivals, a Gaussian Mixture Model is used as a hypocentre prior distribution to describe the seismically active areas, where the parameters are estimated based on previously located events in the region. The prior is truncated to constrain the solution to valid geometries, for example below the ground surface, excluding known cavities, voids and fractured zones. To reduce the sensitivity to outliers, different heavy-tailed distributions are evaluated to model the likelihood distribution of the arrivals given the hypocentre and the origin time. Among these distributions, the multivariate t-distribution is shown to produce the overall best performance, where the tail-mass adapts to the observed data. Hypocentre and uncertainty region estimates are based on simulations from the posterior distribution using Markov Chain Monte Carlo techniques. Velocity graphs (equivalent to traveltime graphs) are estimated using blasts from known locations, and applied to reduce the main uncertainties and thereby the final estimation error. To focus on the behaviour and the performance of the proposed distributions, a basic single-event Bayesian procedure is considered in this study for clarity. 
Estimation results are shown with different distributions, with and without prior distribution of seismicity, with wrong prior distribution, with and without error compensation, with and without error description, with insufficient arrival times and in presence of significant outliers. A particular focus is on visual results and comparisons to give a better understanding of the Bayesian advantage and to show the effects of heavy-tailed distributions and informative prior information on real data.
NASA Astrophysics Data System (ADS)
Rock, N. M. S.
ROBUST calculates 53 statistics, plus significance levels for 6 hypothesis tests, on each of up to 52 variables. These together allow the following properties of the data distribution for each variable to be examined in detail: (1) Location. Three means (arithmetic, geometric, harmonic) are calculated, together with the midrange and 19 high-performance robust L-, M-, and W-estimates of location (combined, adaptive, trimmed estimates, etc.). (2) Scale. The standard deviation is calculated along with the H-spread/2 (≈ semi-interquartile range), the mean and median absolute deviations from both mean and median, and a biweight scale estimator. The 23 location and 6 scale estimators programmed cover all possible degrees of robustness. (3) Normality. Distributions are tested against the null hypothesis that they are normal, using the 3rd (√b1) and 4th (b2) moments, Geary's ratio (mean deviation/standard deviation), Filliben's probability plot correlation coefficient, and a more robust test based on the biweight scale estimator. These statistics collectively are sensitive to most usual departures from normality. (4) Presence of outliers. The maximum and minimum values are assessed individually or jointly using Grubbs' maximum Studentized residuals, Harvey's and Dixon's criteria, and the Studentized range. For a single input variable, outliers can be either winsorized or eliminated and all estimates recalculated iteratively as desired. The following data transformations also can be applied: linear, log10, generalized Box-Cox power (including log, reciprocal, and square root), exponentiation, and standardization. For more than one variable, all results are tabulated in a single run of ROBUST. Further options are incorporated to assess ratios (of two variables) as well as discrete variables, and to handle missing data. Cumulative S-plots (for assessing normality graphically) also can be generated.
The mutual consistency or inconsistency of all these measures helps to detect errors in data as well as to assess data-distributions themselves.
Archambeau, Cédric; Verleysen, Michel
2007-01-01
A new variational Bayesian learning algorithm for Student-t mixture models is introduced. This algorithm leads to (i) robust density estimation, (ii) robust clustering and (iii) robust automatic model selection. Gaussian mixture models are learning machines which are based on a divide-and-conquer approach. They are commonly used for density estimation and clustering tasks, but are sensitive to outliers. The Student-t distribution has heavier tails than the Gaussian distribution and is therefore less sensitive to any departure of the empirical distribution from Gaussianity. As a consequence, the Student-t distribution is suitable for constructing robust mixture models. In this work, we formalize the Bayesian Student-t mixture model as a latent variable model in a different way from Svensén and Bishop [Svensén, M., & Bishop, C. M. (2005). Robust Bayesian mixture modelling. Neurocomputing, 64, 235-252]. The main difference resides in the fact that it is not necessary to assume a factorized approximation of the posterior distribution on the latent indicator variables and the latent scale variables in order to obtain a tractable solution. Not neglecting the correlations between these unobserved random variables leads to a Bayesian model having an increased robustness. Furthermore, it is expected that the lower bound on the log-evidence is tighter. Based on this bound, the model complexity, i.e. the number of components in the mixture, can be inferred with a higher confidence.
Robust and efficient estimation with weighted composite quantile regression
NASA Astrophysics Data System (ADS)
Jiang, Xuejun; Li, Jingzhi; Xia, Tian; Yan, Wanfeng
2016-09-01
In this paper we introduce a weighted composite quantile regression (CQR) estimation approach and study its application in nonlinear models such as exponential models and ARCH-type models. The weighted CQR is augmented by using a data-driven weighting scheme. With the error distribution unspecified, the proposed estimators share robustness from quantile regression and achieve nearly the same efficiency as the oracle maximum likelihood estimator (MLE) for a variety of error distributions including the normal, mixed-normal, Student's t, Cauchy distributions, etc. We also suggest an algorithm for the fast implementation of the proposed methodology. Simulations are carried out to compare the performance of different estimators, and the proposed approach is used to analyze the daily S&P 500 Composite index, which verifies the effectiveness and efficiency of our theoretical results.
Jiang, Xuejun; Guo, Xu; Zhang, Ning; Wang, Bo
2018-01-01
This article presents and investigates performance of a series of robust multivariate nonparametric tests for detection of location shift between two multivariate samples in randomized controlled trials. The tests are built upon robust estimators of distribution locations (medians, Hodges-Lehmann estimators, and an extended U statistic) with both unscaled and scaled versions. The nonparametric tests are robust to outliers and do not assume that the two samples are drawn from multivariate normal distributions. Bootstrap and permutation approaches are introduced for determining the p-values of the proposed test statistics. Simulation studies are conducted and numerical results are reported to examine performance of the proposed statistical tests. The numerical results demonstrate that the robust multivariate nonparametric tests constructed from the Hodges-Lehmann estimators are more efficient than those based on medians and the extended U statistic. The permutation approach can provide a more stringent control of Type I error and is generally more powerful than the bootstrap procedure. The proposed robust nonparametric tests are applied to detect multivariate distributional difference between the intervention and control groups in the Thai Healthy Choices study and examine the intervention effect of a four-session motivational interviewing-based intervention developed in the study to reduce risk behaviors among youth living with HIV. PMID:29672555
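The two building blocks named in this abstract, a Hodges-Lehmann location-shift estimate and a permutation p-value, can be sketched for the univariate two-sample case. This is a simplified illustration under stated assumptions: one variable rather than a multivariate outcome, an unscaled statistic, and Monte Carlo permutations with a fixed seed. It is not the authors' test statistic, and the function names are hypothetical.

```python
import random
import statistics

def hl_shift(a, b):
    """Two-sample Hodges-Lehmann estimate: median of all pairwise differences b_j - a_i."""
    return statistics.median([y - x for x in a for y in b])

def permutation_pvalue(a, b, n_perm=1000, seed=0):
    """Monte Carlo permutation p-value for the absolute Hodges-Lehmann shift."""
    rng = random.Random(seed)
    observed = abs(hl_shift(a, b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random reassignment of group labels
        perm_a, perm_b = pooled[:len(a)], pooled[len(a):]
        if abs(hl_shift(perm_a, perm_b)) >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive
    return (hits + 1) / (n_perm + 1)
```

With `a = [1, 2, 3, 4]` and `b = [11, 12, 13, 14]`, `hl_shift(a, b)` is the median of sixteen pairwise differences. Because the median of differences is used instead of a difference of means, a single extreme observation in either group has limited effect on the estimate.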
A Robust Approach to Risk Assessment Based on Species Sensitivity Distributions.
Monti, Gianna S; Filzmoser, Peter; Deutsch, Roland C
2018-05-03
The guidelines for setting environmental quality standards are increasingly based on probabilistic risk assessment due to a growing general awareness of the need for probabilistic procedures. One of the commonly used tools in probabilistic risk assessment is the species sensitivity distribution (SSD), which represents the proportion of species affected belonging to a biological assemblage as a function of exposure to a specific toxicant. Our focus is on the inverse use of the SSD curve with the aim of estimating the concentration, HCp, of a toxic compound that is hazardous to p% of the biological community under study. Toward this end, we propose the use of robust statistical methods in order to take into account the presence of outliers or apparent skew in the data, which may occur without any ecological basis. A robust approach exploits the full neighborhood of a parametric model, enabling the analyst to account for the typical real-world deviations from ideal models. We examine two classic HCp estimation approaches and consider robust versions of these estimators. In addition, we also use data transformations in conjunction with robust estimation methods in case of heteroscedasticity. Different scenarios using real data sets as well as simulated data are presented in order to illustrate and compare the proposed approaches. These scenarios illustrate that the use of robust estimation methods enhances HCp estimation. © 2018 Society for Risk Analysis.
NASA Astrophysics Data System (ADS)
Girinoto, Sadik, Kusman; Indahwati
2017-03-01
The National Socio-Economic Survey samples are designed to produce estimates of parameters of planned domains (provinces and districts). Direct estimation for unplanned domains (sub-districts and villages) is limited in its ability to yield reliable estimates. One possible solution to this problem is to employ small area estimation techniques. The popular choice for small area estimation is based on linear mixed models. However, such models need strong distributional assumptions and do not easily allow for outlier-robust estimation. An alternative for this purpose is the M-quantile regression approach to small area estimation, which is based on modeling specific M-quantile coefficients of the conditional distribution of the study variable given auxiliary covariates. It obtains outlier-robust estimation from an influence function of M-estimator type and does not need strong distributional assumptions. The aim of this study is to estimate the poverty indicator at sub-district level in Bogor District, West Java, using M-quantile models for small area estimation. Using data taken from the National Socioeconomic Survey and Villages Potential Statistics, the results provide a detailed description of the pattern of incidence and intensity of poverty within Bogor district. We also compare the results with direct estimates. The results show that the framework may be preferable when the direct estimate indicates no incidence of poverty at all in the small area.
ERIC Educational Resources Information Center
Rhemtulla, Mijke; Brosseau-Liard, Patricia E.; Savalei, Victoria
2012-01-01
A simulation study compared the performance of robust normal theory maximum likelihood (ML) and robust categorical least squares (cat-LS) methodology for estimating confirmatory factor analysis models with ordinal variables. Data were generated from 2 models with 2-7 categories, 4 sample sizes, 2 latent distributions, and 5 patterns of category…
Robustness of fit indices to outliers and leverage observations in structural equation modeling.
Yuan, Ke-Hai; Zhong, Xiaoling
2013-06-01
Normal-distribution-based maximum likelihood (NML) is the most widely used method in structural equation modeling (SEM), although practical data tend to be nonnormally distributed. The effect of nonnormally distributed data or data contamination on the normal-distribution-based likelihood ratio (LR) statistic is well understood due to many analytical and empirical studies. In SEM, fit indices are used as widely as the LR statistic. In addition to NML, robust procedures have been developed for more efficient and less biased parameter estimates with practical data. This article studies the effect of outliers and leverage observations on fit indices following NML and two robust methods. Analysis and empirical results indicate that good leverage observations following NML and one of the robust methods lead most fit indices to give more support to the substantive model. While outliers tend to make a good model superficially bad according to many fit indices following NML, they have little effect on those following the two robust procedures. Implications of the results to data analysis are discussed, and recommendations are provided regarding the use of estimation methods and interpretation of fit indices. (PsycINFO Database Record (c) 2013 APA, all rights reserved).
Model Uncertainty and Robustness: A Computational Framework for Multimodel Analysis
ERIC Educational Resources Information Center
Young, Cristobal; Holsteen, Katherine
2017-01-01
Model uncertainty is pervasive in social science. A key question is how robust empirical results are to sensible changes in model specification. We present a new approach and applied statistical software for computational multimodel analysis. Our approach proceeds in two steps: First, we estimate the modeling distribution of estimates across all…
Robust geostatistical analysis of spatial data
NASA Astrophysics Data System (ADS)
Papritz, A.; Künsch, H. R.; Schwierz, C.; Stahel, W. A.
2012-04-01
Most of the geostatistical software tools rely on non-robust algorithms. This is unfortunate, because outlying observations are rather the rule than the exception, in particular in environmental data sets. Outlying observations may result from errors (e.g. in data transcription) or from local perturbations in the processes that are responsible for a given pattern of spatial variation. As an example, the spatial distribution of some trace metal in the soils of a region may be distorted by emissions of local anthropogenic sources. Outliers affect the modelling of the large-scale spatial variation, the so-called external drift or trend, the estimation of the spatial dependence of the residual variation and the predictions by kriging. Identifying outliers manually is cumbersome and requires expertise because one needs parameter estimates to decide which observation is a potential outlier. Moreover, inference after the rejection of some observations is problematic. A better approach is to use robust algorithms that automatically prevent outlying observations from having undue influence. Former studies on robust geostatistics focused on robust estimation of the sample variogram and ordinary kriging without external drift. Furthermore, Richardson and Welsh (1995) [2] proposed a robustified version of (restricted) maximum likelihood ([RE]ML) estimation for the variance components of a linear mixed model, which was later used by Marchant and Lark (2007) [1] for robust REML estimation of the variogram. We propose here a novel method for robust REML estimation of the variogram of a Gaussian random field that is possibly contaminated by independent errors from a long-tailed distribution. It is based on robustification of estimating equations for the Gaussian REML estimation.
Besides robust estimates of the parameters of the external drift and of the variogram, the method also provides standard errors for the estimated parameters, robustified kriging predictions at both sampled and unsampled locations and kriging variances. The method has been implemented in an R package. Apart from presenting our modelling framework, we shall present selected simulation results by which we explored the properties of the new method. This will be complemented by an analysis of the Tarrawarra soil moisture data set [3].
Chou, C P; Bentler, P M; Satorra, A
1991-11-01
Research studying robustness of maximum likelihood (ML) statistics in covariance structure analysis has concluded that test statistics and standard errors are biased under severe non-normality. An estimation procedure known as asymptotic distribution free (ADF), making no distributional assumption, has been suggested to avoid these biases. Corrections to the normal theory statistics to yield more adequate performance have also been proposed. This study compares the performance of a scaled test statistic and robust standard errors for two models under several non-normal conditions and also compares these with the results from ML and ADF methods. Both ML and ADF test statistics performed rather well in one model and considerably worse in the other. In general, the scaled test statistic seemed to behave better than the ML test statistic and the ADF statistic performed the worst. The robust and ADF standard errors yielded more appropriate estimates of sampling variability than the ML standard errors, which were usually downward biased, in both models under most of the non-normal conditions. ML test statistics and standard errors were found to be quite robust to the violation of the normality assumption when data had either symmetric and platykurtic distributions, or non-symmetric and zero kurtotic distributions.
Tsiatis, Anastasios A.; Davidian, Marie; Cao, Weihua
2010-01-01
Summary: A routine challenge is that of making inference on parameters in a statistical model of interest from longitudinal data subject to drop out, which are a special case of the more general setting of monotonely coarsened data. Considerable recent attention has focused on doubly robust estimators, which in this context involve positing models for both the missingness (more generally, coarsening) mechanism and aspects of the distribution of the full data, that have the appealing property of yielding consistent inferences if only one of these models is correctly specified. Doubly robust estimators have been criticized for potentially disastrous performance when both of these models are even only mildly misspecified. We propose a doubly robust estimator applicable in general monotone coarsening problems that achieves comparable or improved performance relative to existing doubly robust methods, which we demonstrate via simulation studies and by application to data from an AIDS clinical trial. PMID:20731640
Garcia, Tanya P; Ma, Yanyuan
2017-10-01
We develop consistent and efficient estimation of parameters in general regression models with mismeasured covariates. We assume the model error and covariate distributions are unspecified, and the measurement error distribution is a general parametric distribution with unknown variance-covariance. We construct root-n consistent, asymptotically normal and locally efficient estimators using the semiparametric efficient score. We do not estimate any unknown distribution or model error heteroskedasticity. Instead, we form the estimator under possibly incorrect working distribution models for the model error, error-prone covariate, or both. Empirical results demonstrate robustness to different incorrect working models in homoscedastic and heteroskedastic models with error-prone covariates.
Chasing the peak: optimal statistics for weak shear analyses
NASA Astrophysics Data System (ADS)
Smit, Merijn; Kuijken, Konrad
2018-01-01
Context. Weak gravitational lensing analyses are fundamentally limited by the intrinsic distribution of galaxy shapes. It is well known that this distribution of galaxy ellipticity is non-Gaussian, and the traditional estimation methods, explicitly or implicitly assuming Gaussianity, are not necessarily optimal. Aims: We aim to explore alternative statistics for samples of ellipticity measurements. An optimal estimator needs to be asymptotically unbiased, efficient, and robust in retaining these properties for various possible sample distributions. We take the non-linear mapping of gravitational shear and the effect of noise into account. We then discuss how the distribution of individual galaxy shapes in the observed field of view can be modeled by fitting Fourier modes to the shear pattern directly. This allows scientific analyses using statistical information of the whole field of view, instead of locally sparse and poorly constrained estimates. Methods: We simulated samples of galaxy ellipticities, using both theoretical distributions and data for ellipticities and noise. We determined the possible bias Δe, the efficiency η and the robustness of the least absolute deviations, the biweight, and the convex hull peeling (CHP) estimators, compared to the canonical weighted mean. Using these statistics for regression, we have shown the applicability of direct Fourier mode fitting. Results: We find an improved performance of all estimators, when iteratively reducing the residuals after de-shearing the ellipticity samples by the estimated shear, which removes the asymmetry in the ellipticity distributions. We show that these estimators are then unbiased in the absence of noise, and decrease noise bias by more than 30%. Our results show that the CHP estimator distribution is skewed, but still centered around the underlying shear, and its bias least affected by noise. 
We find the least absolute deviations estimator to be the most efficient estimator in almost all cases, except in the Gaussian case, where it is still competitive (0.83 < η < 5.1) and therefore robust. These results hold when fitting Fourier modes, where amplitudes of variation in ellipticity are determined to the order of 10^-3. Conclusions: The peak of the ellipticity distribution is a direct tracer of the underlying shear and unaffected by noise, and we have shown that estimators that are sensitive to a central cusp perform more efficiently, potentially reducing uncertainties by more 0% and significantly decreasing noise bias. These results become increasingly important, as survey sizes increase and systematic issues in shape measurements decrease.
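The contrast between the weighted mean and the least absolute deviations estimator can be illustrated with a toy one-dimensional sketch. This is not the authors' pipeline: it assumes scalar ellipticity values, uses hypothetical data, and relies on the fact that in one dimension the least absolute deviations location estimate is the sample median.

```python
import statistics


def weighted_mean(values, weights):
    """Canonical weighted mean: sum(w * v) / sum(w)."""
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


def lad_location(values):
    """Least-absolute-deviations location; in one dimension this is the median."""
    return statistics.median(values)


# Hypothetical ellipticity component values; the last one is a gross outlier.
ellipticities = [0.02, -0.01, 0.03, 0.00, 0.01, 0.90]

mean_estimate = weighted_mean(ellipticities, [1.0] * len(ellipticities))
lad_estimate = lad_location(ellipticities)
```

With these made-up values the single outlier pulls the mean far from the bulk of the sample, while the median stays among the five small values.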
New spatial upscaling methods for multi-point measurements: From normal to p-normal
NASA Astrophysics Data System (ADS)
Liu, Feng; Li, Xin
2017-12-01
Careful attention must be given to determining whether the geophysical variables of interest are normally distributed, since the assumption of a normal distribution may not accurately reflect the probability distribution of some variables. As a generalization of the normal distribution, the p-normal distribution and its corresponding maximum likelihood estimation (the least power estimation, LPE) were introduced in upscaling methods for multi-point measurements. Six methods, including three normal-based methods, i.e., arithmetic average, least square estimation, block kriging, and three p-normal-based methods, i.e., LPE, geostatistics LPE and inverse distance weighted LPE are compared in two types of experiments: a synthetic experiment to evaluate the performance of the upscaling methods in terms of accuracy, stability and robustness, and a real-world experiment to produce real-world upscaling estimates using soil moisture data obtained from multi-scale observations. The results show that the p-normal-based methods produced lower mean absolute errors and outperformed the other techniques due to their universality and robustness. We conclude that introducing appropriate statistical parameters into an upscaling strategy can substantially improve the estimation, especially if the raw measurements are disorganized; however, further investigation is required to determine which parameter is the most effective among variance, spatial correlation information and parameter p.
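A minimal sketch of least power estimation for a single location parameter follows. It assumes that the estimate minimises the sum of |x - mu|^p, which is the maximum likelihood location under a p-normal density, and it restricts p to p >= 1 so that the objective is convex and a ternary search suffices. The function name and the data are hypothetical, and the spatial weighting used in the geostatistical variants is not reproduced.

```python
def least_power_estimate(values, p, iterations=200):
    """Minimise sum(|x - mu| ** p) over mu by ternary search (assumes p >= 1)."""
    if p < 1:
        raise ValueError("this sketch assumes p >= 1 so the objective is convex")

    def cost(mu):
        return sum(abs(x - mu) ** p for x in values)

    lo, hi = min(values), max(values)
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if cost(m1) < cost(m2):
            hi = m2  # a minimiser lies in [lo, m2]
        else:
            lo = m1  # a minimiser lies in [m1, hi]
    return (lo + hi) / 2.0


# Hypothetical multi-point measurements with one large value.
measurements = [1.0, 2.0, 3.0, 4.0, 10.0]
estimate_p2 = least_power_estimate(measurements, 2)  # close to the arithmetic mean
estimate_p1 = least_power_estimate(measurements, 1)  # close to the median
```

Setting p = 2 recovers the least squares (mean) estimate and p = 1 the least absolute deviations (median) estimate, so p acts as a tuning knob between the two.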
Robust geostatistical analysis of spatial data
NASA Astrophysics Data System (ADS)
Papritz, Andreas; Künsch, Hans Rudolf; Schwierz, Cornelia; Stahel, Werner A.
2013-04-01
Most geostatistical software tools rely on non-robust algorithms. This is unfortunate, because outlying observations are the rule rather than the exception, in particular in environmental data sets. Outliers affect the modelling of the large-scale spatial trend, the estimation of the spatial dependence of the residual variation and the predictions by kriging. Identifying outliers manually is cumbersome and requires expertise because one needs parameter estimates to decide which observation is a potential outlier. Moreover, inference after the rejection of some observations is problematic. A better approach is to use robust algorithms that automatically prevent outlying observations from having undue influence. Earlier studies on robust geostatistics focused on robust estimation of the sample variogram and ordinary kriging without external drift. Furthermore, Richardson and Welsh (1995) proposed a robustified version of (restricted) maximum likelihood ([RE]ML) estimation for the variance components of a linear mixed model, which was later used by Marchant and Lark (2007) for robust REML estimation of the variogram. We propose here a novel method for robust REML estimation of the variogram of a Gaussian random field that is possibly contaminated by independent errors from a long-tailed distribution. It is based on robustification of estimating equations for the Gaussian REML estimation (Welsh and Richardson, 1997). Besides robust estimates of the parameters of the external drift and of the variogram, the method also provides standard errors for the estimated parameters, robustified kriging predictions at both sampled and non-sampled locations and kriging variances. Apart from presenting our modelling framework, we shall present selected simulation results by which we explored the properties of the new method. This will be complemented by an analysis of a data set on heavy metal contamination of the soil in the vicinity of a metal smelter.
References: Marchant, B.P. and Lark, R.M. 2007. Robust estimation of the variogram by residual maximum likelihood. Geoderma 140: 62-72. Richardson, A.M. and Welsh, A.H. 1995. Robust restricted maximum likelihood in mixed linear models. Biometrics 51: 1429-1439. Welsh, A.H. and Richardson, A.M. 1997. Approaches to the robust estimation of mixed models. In: Handbook of Statistics Vol. 15, Elsevier, pp. 343-384.
Zhang, Zhiyong; Yuan, Ke-Hai
2016-06-01
Cronbach's coefficient alpha is a widely used reliability measure in social, behavioral, and education sciences. It is reported in nearly every study that involves measuring a construct through multiple items. With non-tau-equivalent items, McDonald's omega has been used as a popular alternative to alpha in the literature. Traditional estimation methods for alpha and omega often implicitly assume that data are complete and normally distributed. This study proposes robust procedures to estimate both alpha and omega as well as corresponding standard errors and confidence intervals from samples that may contain potential outlying observations and missing values. The influence of outlying observations and missing data on the estimates of alpha and omega is investigated through two simulation studies. Results show that the newly developed robust method yields substantially improved alpha and omega estimates as well as better coverage rates of confidence intervals than the conventional nonrobust method. An R package coefficientalpha is developed and demonstrated to obtain robust estimates of alpha and omega.
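For orientation, the conventional (non-robust) coefficient alpha that the robust procedure is compared against can be sketched in a few lines. This is the textbook formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), assuming complete data; it is not the coefficientalpha package, and the function name and scores are hypothetical.

```python
import statistics


def cronbach_alpha(items):
    """Conventional coefficient alpha.

    items: list of k lists, one per item, each holding one score per respondent.
    Assumes complete data (no missing values) and at least two items.
    """
    k = len(items)
    sum_item_variances = sum(statistics.variance(item) for item in items)
    total_scores = [sum(scores) for scores in zip(*items)]
    total_variance = statistics.variance(total_scores)
    return k / (k - 1) * (1.0 - sum_item_variances / total_variance)


# Hypothetical scores: three perfectly consistent items, four respondents.
scores = [[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]]
alpha = cronbach_alpha(scores)
```

Because both variances are ordinary sample variances, a single outlying respondent can shift them substantially, which is the sensitivity that motivates a robust estimate of the underlying covariance matrix.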
Zhang, Zhiyong; Yuan, Ke-Hai
2015-01-01
Cronbach’s coefficient alpha is a widely used reliability measure in social, behavioral, and education sciences. It is reported in nearly every study that involves measuring a construct through multiple items. With non-tau-equivalent items, McDonald’s omega has been used as a popular alternative to alpha in the literature. Traditional estimation methods for alpha and omega often implicitly assume that data are complete and normally distributed. This study proposes robust procedures to estimate both alpha and omega as well as corresponding standard errors and confidence intervals from samples that may contain potential outlying observations and missing values. The influence of outlying observations and missing data on the estimates of alpha and omega is investigated through two simulation studies. Results show that the newly developed robust method yields substantially improved alpha and omega estimates as well as better coverage rates of confidence intervals than the conventional nonrobust method. An R package coefficientalpha is developed and demonstrated to obtain robust estimates of alpha and omega. PMID:29795870
A robust nonlinear filter for image restoration.
Koivunen, V
1995-01-01
A class of nonlinear regression filters based on robust estimation theory is introduced. The goal of the filtering is to recover a high-quality image from degraded observations. Models for desired image structures and contaminating processes are employed, but deviations from strict assumptions are allowed since the assumptions on signal and noise are typically only approximately true. The robustness of filters is usually addressed only in a distributional sense, i.e., the actual error distribution deviates from the nominal one. In this paper, the robustness is considered in a broad sense since the outliers may also be due to inappropriate signal model, or there may be more than one statistical population present in the processing window, causing biased estimates. Two filtering algorithms minimizing a least trimmed squares criterion are provided. The design of the filters is simple since no scale parameters or context-dependent threshold values are required. Experimental results using both real and simulated data are presented. The filters effectively attenuate both impulsive and nonimpulsive noise while recovering the signal structure and preserving interesting details.
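The least trimmed squares criterion can be sketched for the simplest signal model, a constant intensity inside the processing window. This is an assumed simplification rather than the paper's regression filters: the estimate is the mean of the h observations whose squared residuals sum to the smallest value, and for one-dimensional location that optimal subset is contiguous in sorted order. The window values and the default h = n // 2 + 1 are illustrative choices.

```python
def lts_location(window, h=None):
    """Least-trimmed-squares location estimate for a window of pixel values.

    Scans every contiguous block of h sorted values and returns the mean of
    the block with the smallest sum of squared deviations from its own mean.
    """
    data = sorted(window)
    n = len(data)
    if h is None:
        h = n // 2 + 1  # keep just over half of the observations
    best_mean, best_cost = None, float("inf")
    for start in range(n - h + 1):
        subset = data[start:start + h]
        mean = sum(subset) / h
        cost = sum((x - mean) ** 2 for x in subset)
        if cost < best_cost:
            best_mean, best_cost = mean, cost
    return best_mean


# Hypothetical 3x3 window: mostly 10, with impulsive noise at 0 and 255.
window = [10, 10, 10, 10, 255, 10, 12, 0, 11]
restored_value = lts_location(window)
```

No scale parameter or threshold is needed here, which mirrors the design point made in the abstract: only the trimming count h has to be chosen.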
Pandiselvi, S; Raja, R; Cao, Jinde; Rajchakit, G; Ahmad, Bashir
2018-01-01
This work addresses the problem of approximating the state variables of discrete-time stochastic genetic regulatory networks with leakage, distributed, and probabilistic measurement delays. Here we design a linear estimator in such a way that the absorption of mRNA and protein can be approximated via known measurement outputs. By utilizing a Lyapunov-Krasovskii functional and stochastic analysis techniques, we obtain stability conditions for the estimation error system in the form of linear matrix inequalities (LMIs), under which the estimation error dynamics is robustly exponentially stable. Further, the obtained LMI conditions can be readily solved with available software packages. Moreover, the explicit expression of the desired estimator is also given in the main section. Finally, two illustrative mathematical examples are given to show the advantage of the proposed theoretical results.
Inference on the Ranks of the Canonical Correlation Matrices for Elliptically Symmetric Populations.
1985-05-01
robust estimates of the covariance matrix, the reader is referred to Devlin, Gnanadesikan and Kettenring (1975) and Maronna (1976). Muirhead and...contoured distributions. J. Multivariate Anal. 11, 368-385. 6. DEVLIN, S.J., GNANADESIKAN, R. and KETTENRING, J. (1975). Robust estimation and outlier
A robust bayesian estimate of the concordance correlation coefficient.
Feng, Dai; Baumgartner, Richard; Svetnik, Vladimir
2015-01-01
A need for assessment of agreement arises in many situations including statistical biomarker qualification or assay or method validation. Concordance correlation coefficient (CCC) is one of the most popular scaled indices reported in evaluation of agreement. Robust methods for CCC estimation currently present an important statistical challenge. Here, we propose a novel Bayesian method of robust estimation of CCC based on multivariate Student's t-distribution and compare it with its alternatives. Furthermore, we extend the method to practically relevant settings, enabling incorporation of confounding covariates and replications. The superiority of the new approach is demonstrated using simulation as well as real datasets from biomarker application in electroencephalography (EEG). This biomarker is relevant in neuroscience for development of treatments for insomnia.
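As a point of reference for the robust Bayesian approach, the classical moment-based CCC (Lin's coefficient) can be computed directly. The sketch below assumes two paired measurement series and uses 1/n moments; it is the non-robust baseline, not the proposed t-distribution method, and the data are hypothetical.

```python
def concordance_correlation(x, y):
    """Moment-based concordance correlation coefficient.

    CCC = 2 * s_xy / (s_x^2 + s_y^2 + (mean_x - mean_y)^2), with 1/n moments.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    return 2.0 * cov_xy / (var_x + var_y + (mean_x - mean_y) ** 2)


# Hypothetical paired measurements: the second method reads one unit higher.
method_a = [1.0, 2.0, 3.0, 4.0]
method_b = [2.0, 3.0, 4.0, 5.0]
ccc_shifted = concordance_correlation(method_a, method_b)
ccc_identical = concordance_correlation(method_a, method_a)
```

The shifted series are perfectly correlated yet have a CCC below 1, because the coefficient penalises the systematic offset as well as scatter.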
NASA Astrophysics Data System (ADS)
Kargoll, Boris; Omidalizarandi, Mohammad; Loth, Ina; Paffenholz, Jens-André; Alkhatib, Hamza
2018-03-01
In this paper, we investigate a linear regression time series model of possibly outlier-afflicted observations and autocorrelated random deviations. This colored noise is represented by a covariance-stationary autoregressive (AR) process, in which the independent error components follow a scaled (Student's) t-distribution. This error model allows for the stochastic modeling of multiple outliers and for an adaptive robust maximum likelihood (ML) estimation of the unknown regression and AR coefficients, the scale parameter, and the degree of freedom of the t-distribution. This approach is meant to be an extension of known estimators, which tend to focus only on the regression model, or on the AR error model, or on normally distributed errors. For the purpose of ML estimation, we derive an expectation conditional maximization either algorithm, which leads to an easy-to-implement version of iteratively reweighted least squares. The estimation performance of the algorithm is evaluated via Monte Carlo simulations for a Fourier as well as a spline model in connection with AR colored noise models of different orders and with three different sampling distributions generating the white noise components. We apply the algorithm to a vibration dataset recorded by a high-accuracy, single-axis accelerometer, focusing on the evaluation of the estimated AR colored noise model.
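The link between t-distributed errors and iteratively reweighted least squares can be sketched for the simplest case, a constant mean with independent errors. This is an assumed simplification: the degree of freedom is held fixed rather than estimated, there is no AR component, and the function name and data are hypothetical. Each pass computes the weights (nu + 1) / (nu + r^2 / sigma^2) and then updates the location and scale by weighted least squares.

```python
def t_location_irls(y, dof=4.0, iterations=200):
    """EM / IRLS estimate of location and squared scale under Student-t errors.

    dof is the fixed degree of freedom of the t-distribution (an assumption of
    this sketch; the paper estimates it as well).
    """
    n = len(y)
    mu = sum(y) / n
    sigma2 = sum((v - mu) ** 2 for v in y) / n
    if sigma2 == 0.0:
        return mu, 0.0  # all observations identical, nothing to reweight
    for _ in range(iterations):
        # E-step: large residuals receive small weights.
        weights = [(dof + 1.0) / (dof + (v - mu) ** 2 / sigma2) for v in y]
        # M-step: weighted least squares for location, then squared scale.
        mu = sum(w * v for w, v in zip(weights, y)) / sum(weights)
        sigma2 = sum(w * (v - mu) ** 2 for w, v in zip(weights, y)) / n
    return mu, sigma2


# Hypothetical observations near 10 with one gross outlier.
observations = [9.8, 10.1, 10.0, 9.9, 10.2, 50.0]
location, scale_squared = t_location_irls(observations)
```

In the full model the same weights would multiply the rows of the regression and AR normal equations instead of a single mean.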
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bender, Edward T.
Purpose: To develop a robust method for deriving dose-painting prescription functions using spatial information about the risk for disease recurrence. Methods: Spatial distributions of radiobiological model parameters are derived from distributions of recurrence risk after uniform irradiation. These model parameters are then used to derive optimal dose-painting prescription functions given a constant mean biologically effective dose. Results: An estimate for the optimal dose distribution can be derived based on spatial information about recurrence risk. Dose painting based on imaging markers that are moderately or poorly correlated with recurrence risk is predicted to potentially result in inferior disease control when compared to the same mean biologically effective dose delivered uniformly. A robust optimization approach may partially mitigate this issue. Conclusions: The methods described here can be used to derive an estimate for a robust, patient-specific prescription function for use in dose painting. Two approximate scaling relationships were observed: First, the optimal choice for the maximum dose differential when using either a linear or two-compartment prescription function is proportional to R, where R is the Pearson correlation coefficient between a given imaging marker and recurrence risk after uniform irradiation. Second, the predicted maximum possible gain in tumor control probability for any robust optimization technique is nearly proportional to the square of R.
Robust Modal Filtering and Control of the X-56A Model with Simulated Fiber Optic Sensor Failures
NASA Technical Reports Server (NTRS)
Suh, Peter M.; Chin, Alexander W.; Mavris, Dimitri N.
2014-01-01
The X-56A aircraft is a remotely-piloted aircraft with flutter modes intentionally designed into the flight envelope. The X-56A program must demonstrate flight control while suppressing all unstable modes. A previous X-56A model study demonstrated a distributed-sensing-based active shape and active flutter suppression controller. The controller relies on an estimator which is sensitive to bias. This estimator is improved herein, and a real-time robust estimator is derived and demonstrated on 1530 fiber optic sensors. It is shown in simulation that the estimator can simultaneously reject 230 worst-case fiber optic sensor failures automatically. These sensor failures include locations with high leverage (or importance). To reduce the impact of leverage outliers, concentration based on a Mahalanobis trim criterion is introduced. A redescending M-estimator with Tukey bisquare weights is used to improve location and dispersion estimates within each concentration step in the presence of asymmetry (or leverage). A dynamic simulation is used to compare the concentrated robust estimator to a state-of-the-art real-time robust multivariate estimator. The estimators support a previously-derived mu-optimal shape controller. It is found that during the failure scenario, the concentrated modal estimator keeps the system stable.
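A redescending M-estimator with Tukey bisquare weights can be sketched for a scalar location. This toy version is an assumption-laden illustration, not the concentrated modal estimator: it starts from the median, fixes the scale at 1.4826 times the median absolute deviation, uses the common tuning constant c = 4.685, and iterates a weighted mean. The readings are hypothetical.

```python
import statistics


def tukey_bisquare_location(values, c=4.685, iterations=50):
    """Location M-estimate with Tukey bisquare weights via reweighted means."""
    mu = statistics.median(values)
    mad = statistics.median([abs(v - mu) for v in values])
    scale = 1.4826 * mad
    if scale == 0.0:
        return mu  # more than half the values coincide; keep the median
    for _ in range(iterations):
        weights = []
        for v in values:
            u = (v - mu) / (c * scale)
            # Redescending: the weight falls to exactly zero beyond |u| >= 1.
            weights.append((1.0 - u * u) ** 2 if abs(u) < 1.0 else 0.0)
        total = sum(weights)
        if total == 0.0:
            break
        mu = sum(w * v for w, v in zip(weights, values)) / total
    return mu


# Hypothetical sensor readings with one failed sensor reporting 100.
readings = [1.0, 1.1, 0.9, 1.05, 0.95, 100.0]
robust_location = tukey_bisquare_location(readings)
```

Because the weight is exactly zero for far-away readings, a failed sensor is rejected outright rather than merely down-weighted.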
Robust Modal Filtering and Control of the X-56A Model with Simulated Fiber Optic Sensor Failures
NASA Technical Reports Server (NTRS)
Suh, Peter M.; Chin, Alexander W.; Mavris, Dimitri N.
2016-01-01
The X-56A aircraft is a remotely-piloted aircraft with flutter modes intentionally designed into the flight envelope. The X-56A program must demonstrate flight control while suppressing all unstable modes. A previous X-56A model study demonstrated a distributed-sensing-based active shape and active flutter suppression controller. The controller relies on an estimator which is sensitive to bias. This estimator is improved herein, and a real-time robust estimator is derived and demonstrated on 1530 fiber optic sensors. It is shown in simulation that the estimator can simultaneously reject 230 worst-case fiber optic sensor failures automatically. These sensor failures include locations with high leverage (or importance). To reduce the impact of leverage outliers, concentration based on a Mahalanobis trim criterion is introduced. A redescending M-estimator with Tukey bisquare weights is used to improve location and dispersion estimates within each concentration step in the presence of asymmetry (or leverage). A dynamic simulation is used to compare the concentrated robust estimator to a state-of-the-art real-time robust multivariate estimator. The estimators support a previously-derived mu-optimal shape controller. It is found that during the failure scenario, the concentrated modal estimator keeps the system stable.
Regression estimators for generic health-related quality of life and quality-adjusted life years.
Basu, Anirban; Manca, Andrea
2012-01-01
To develop regression models for outcomes with truncated supports, such as health-related quality of life (HRQoL) data, and account for features typical of such data such as a skewed distribution, spikes at 1 or 0, and heteroskedasticity. Regression estimators based on features of the Beta distribution. First, both a single equation and a 2-part model are presented, along with estimation algorithms based on maximum-likelihood, quasi-likelihood, and Bayesian Markov-chain Monte Carlo methods. A novel Bayesian quasi-likelihood estimator is proposed. Second, a simulation exercise is presented to assess the performance of the proposed estimators against ordinary least squares (OLS) regression for a variety of HRQoL distributions that are encountered in practice. Finally, the performance of the proposed estimators is assessed by using them to quantify the treatment effect on QALYs in the EVALUATE hysterectomy trial. Overall model fit is studied using several goodness-of-fit tests such as Pearson's correlation test, link and reset tests, and a modified Hosmer-Lemeshow test. The simulation results indicate that the proposed methods are more robust in estimating covariate effects than OLS, especially when the effects are large or the HRQoL distribution has a large spike at 1. Quasi-likelihood techniques are more robust than maximum likelihood estimators. When applied to the EVALUATE trial, all but the maximum likelihood estimators produce unbiased estimates of the treatment effect. One and 2-part Beta regression models provide flexible approaches to regress the outcomes with truncated supports, such as HRQoL, on covariates, after accounting for many idiosyncratic features of the outcomes distribution. This work will provide applied researchers with a practical set of tools to model outcomes in cost-effectiveness analysis.
Li, Haojie; Graham, Daniel J
2016-08-01
This paper estimates the causal effect of 20mph zones on road casualties in London. Potential confounders in the key relationship of interest are included within outcome regression and propensity score models, and the models are then combined to form a doubly robust estimator. A total of 234 treated zones and 2844 potential control zones are included in the data sample. The propensity score model is used to select a viable control group which has common support in the covariate distributions. We compare the doubly robust estimates with those obtained using three other methods: inverse probability weighting, regression adjustment, and propensity score matching. The results indicate that 20mph zones have had a significant causal impact on road casualty reduction in both absolute and proportional terms. Copyright © 2016 Elsevier Ltd. All rights reserved.
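The way an outcome regression and a propensity score model are combined can be sketched with the standard augmented inverse probability weighting (AIPW) form of a doubly robust estimator. Whether the paper uses exactly this form is an assumption; the inputs below (fitted outcome predictions and propensity scores) are hypothetical numbers rather than fitted models.

```python
def aipw_ate(treated, outcome, mu1, mu0, propensity):
    """Augmented IPW estimate of the average treatment effect.

    treated:    1 if the unit is treated, 0 otherwise
    outcome:    observed outcome for each unit
    mu1, mu0:   outcome-model predictions under treatment and under control
    propensity: estimated probability of treatment, strictly between 0 and 1
    """
    total = 0.0
    for t, y, m1, m0, e in zip(treated, outcome, mu1, mu0, propensity):
        total += m1 - m0 + t * (y - m1) / e - (1 - t) * (y - m0) / (1.0 - e)
    return total / len(outcome)


# Hypothetical data for four zones.
treated = [1, 0, 1, 0]
outcome = [3.0, 1.0, 4.0, 2.0]
propensity = [0.5, 0.5, 0.5, 0.5]

# Outcome model that reproduces every observed outcome.
ate_good_outcome_model = aipw_ate(
    treated, outcome, [3.0, 2.0, 4.0, 3.0], [2.0, 1.0, 3.0, 2.0], propensity
)
# Uninformative outcome model (all zeros): the estimate reduces to pure IPW.
ate_ipw_only = aipw_ate(treated, outcome, [0.0] * 4, [0.0] * 4, propensity)
```

The residual terms correct the outcome model wherever it misfits, so the estimate stays consistent if either the outcome model or the propensity model is correctly specified.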
Efficient robust doubly adaptive regularized regression with applications.
Karunamuni, Rohana J; Kong, Linglong; Tu, Wei
2018-01-01
We consider the problem of estimation and variable selection for general linear regression models. Regularized regression procedures have been widely used for variable selection, but most existing methods perform poorly in the presence of outliers. We construct a new penalized procedure that simultaneously attains full efficiency and maximum robustness. Furthermore, the proposed procedure satisfies the oracle properties. The new procedure is designed to achieve sparse and robust solutions by imposing adaptive weights on both the decision loss and the penalty function. The proposed method of estimation and variable selection attains full efficiency when the model is correct and, at the same time, achieves maximum robustness when outliers are present. We examine the robustness properties using the finite-sample breakdown point and an influence function. We show that the proposed estimator attains the maximum breakdown point. Furthermore, there is no loss in efficiency when there are no outliers or the error distribution is normal. For practical implementation of the proposed method, we present a computational algorithm. We examine the finite-sample and robustness properties using Monte Carlo studies. Two datasets are also analyzed.
Nakamura, Yoshihiro; Hasegawa, Osamu
2017-01-01
With the ongoing development and expansion of communication networks and sensors, massive amounts of data are continuously generated in real time from real environments. Beforehand, prediction of a distribution underlying such data is difficult; furthermore, the data include substantial amounts of noise. These factors make it difficult to estimate probability densities. To handle these issues and massive amounts of data, we propose a nonparametric density estimator that rapidly learns data online and has high robustness. Our approach is an extension of both kernel density estimation (KDE) and a self-organizing incremental neural network (SOINN); therefore, we call our approach KDESOINN. An SOINN provides a clustering method that learns about the given data as networks of prototype of data; more specifically, an SOINN can learn the distribution underlying the given data. Using this information, KDESOINN estimates the probability density function. The results of our experiments show that KDESOINN outperforms or achieves performance comparable to the current state-of-the-art approaches in terms of robustness, learning time, and accuracy.
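For context, the classical kernel density estimator that KDESOINN extends can be written in a few lines. This sketch assumes a one-dimensional sample and a fixed, user-chosen Gaussian bandwidth; it does not include the SOINN prototype network or any online learning, and the sample values are hypothetical.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a density function built from a fixed-bandwidth Gaussian kernel."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


# Hypothetical sensor readings forming two clusters.
readings = [0.9, 1.0, 1.1, 4.9, 5.0, 5.1]
density = gaussian_kde(readings, bandwidth=0.3)
density_at_cluster = density(1.0)
density_between = density(3.0)
```

Every evaluation touches all stored samples, which is one reason a plain KDE scales poorly to continuously arriving data and motivates summarising the data with a smaller set of prototypes.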
Robust learning for optimal treatment decision with NP-dimensionality
Shi, Chengchun; Song, Rui; Lu, Wenbin
2016-01-01
In order to identify important variables that are involved in making optimal treatment decision, Lu, Zhang and Zeng (2013) proposed a penalized least squared regression framework for a fixed number of predictors, which is robust against the misspecification of the conditional mean model. Two problems arise: (i) in a world of explosively big data, effective methods are needed to handle ultra-high dimensional data sets, for example, where the dimension of predictors is of the non-polynomial (NP) order of the sample size; (ii) both the propensity score and conditional mean models need to be estimated from data under NP dimensionality. In this paper, we propose a robust procedure for estimating the optimal treatment regime under NP dimensionality. In both steps, penalized regressions are employed with the non-concave penalty function, where the conditional mean model of the response given predictors may be misspecified. The asymptotic properties, such as weak oracle properties, selection consistency and oracle distributions, of the proposed estimators are investigated. In addition, we study the limiting distribution of the estimated value function for the obtained optimal treatment regime. The empirical performance of the proposed estimation method is evaluated by simulations and an application to a depression dataset from the STAR*D study. PMID:28781717
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kane, V.E.
1982-01-01
A class of goodness-of-fit estimators is found to provide a useful alternative in certain situations to the standard maximum likelihood method which has some undesirable estimation characteristics for estimation from the three-parameter lognormal distribution. The class of goodness-of-fit tests considered includes the Shapiro-Wilk and Filliben tests which reduce to a weighted linear combination of the order statistics that can be maximized in estimation problems. The weighted order statistic estimators are compared to the standard procedures in Monte Carlo simulations. Robustness of the procedures is examined and example data sets analyzed.
Artificial Intelligence (AI) Center of Excellence at the University of Pennsylvania
1995-07-01
that controls impact forces. Robust Location Estimation for MLR and Non-MLR Distributions (Dissertation Proposal) Gerda L. Kamberova MS-CIS-92-28...Bayesian Approach To Computer Vision Problems Gerda L. Kamberova MS-CIS-92-29 GRASP LAB 310 The object of our study is the Bayesian approach in...Estimation for MLR and Non-MLR Distributions (Dissertation) Gerda L. Kamberova MS-CIS-92-93 GRASP LAB 340 We study the problem of estimating an unknown
Evaluation of the robustness of estimating five components from a skin spectral image
NASA Astrophysics Data System (ADS)
Akaho, Rina; Hirose, Misa; Tsumura, Norimichi
2018-04-01
We evaluated the robustness of a method used to estimate five components (i.e., melanin, oxy-hemoglobin, deoxy-hemoglobin, shading, and surface reflectance) from the spectral reflectance of skin at five wavelengths against noise and a change in epidermis thickness. We also estimated the five components from recorded images of age spots and circles under the eyes using the method. We found that noise in the image must be no more than 0.1% to accurately estimate the five components and that the thickness of the epidermis affects the estimation. We acquired the distribution of major causes for age spots and circles under the eyes by applying the method to recorded spectral images.
A Robust Adaptive Unscented Kalman Filter for Nonlinear Estimation with Uncertain Noise Covariance.
Zheng, Binqi; Fu, Pengcheng; Li, Baoqing; Yuan, Xiaobing
2018-03-07
The Unscented Kalman filter (UKF) may suffer from performance degradation and even divergence when the noise distribution assumed a priori by users does not match the actual one in a real nonlinear system. To resolve this problem, this paper proposes a robust adaptive UKF (RAUKF) to improve the accuracy and robustness of state estimation with uncertain noise covariance. More specifically, at each timestep, a standard UKF will be implemented first to obtain the state estimations using the newly acquired measurement data. Then an online fault-detection mechanism is adopted to judge if it is necessary to update current noise covariance. If necessary, an innovation-based method and a residual-based method are used to calculate the estimations of current noise covariance of process and measurement, respectively. By utilizing a weighting factor, the filter will combine the last noise covariance matrices with the estimations as the new noise covariance matrices. Finally, the state estimations will be corrected according to the new noise covariance matrices and previous state estimations. Compared with the standard UKF and other adaptive UKF algorithms, RAUKF converges faster to the actual noise covariance and thus achieves a better performance in terms of robustness, accuracy, and computation for nonlinear estimation with uncertain noise covariance, which is demonstrated by the simulation results.
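The covariance update step can be sketched as a blend of the previous matrix and the newly estimated one. The convex-combination form, the name of the weighting factor, and the matrices below are assumptions made for illustration; the abstract only states that a weighting factor combines the two, and the fault-detection test and the innovation-based or residual-based estimates themselves are not reproduced.

```python
def blend_covariance(previous, estimated, weight):
    """Combine two covariance matrices as (1 - weight) * previous + weight * estimated.

    previous, estimated: square matrices given as lists of rows
    weight: assumed weighting factor in [0, 1]; 0 keeps the old matrix unchanged
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return [
        [(1.0 - weight) * p + weight * e for p, e in zip(prev_row, est_row)]
        for prev_row, est_row in zip(previous, estimated)
    ]


# Hypothetical 2x2 measurement noise covariance and a fresh estimate of it.
previous_r = [[1.0, 0.0], [0.0, 1.0]]
estimated_r = [[3.0, 0.0], [0.0, 5.0]]
updated_r = blend_covariance(previous_r, estimated_r, 0.25)
```

A convex combination of two symmetric positive semidefinite matrices is again symmetric positive semidefinite, so this form of update cannot produce an invalid covariance.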
A Robust Adaptive Unscented Kalman Filter for Nonlinear Estimation with Uncertain Noise Covariance
Zheng, Binqi; Yuan, Xiaobing
2018-01-01
The Unscented Kalman filter (UKF) may suffer from performance degradation and even divergence when the noise distribution assumed a priori by users does not match the actual one in a real nonlinear system. To resolve this problem, this paper proposes a robust adaptive UKF (RAUKF) to improve the accuracy and robustness of state estimation with uncertain noise covariance. More specifically, at each timestep, a standard UKF will be implemented first to obtain the state estimations using the newly acquired measurement data. Then an online fault-detection mechanism is adopted to judge if it is necessary to update current noise covariance. If necessary, an innovation-based method and a residual-based method are used to calculate the estimations of current noise covariance of process and measurement, respectively. By utilizing a weighting factor, the filter will combine the last noise covariance matrices with the estimations as the new noise covariance matrices. Finally, the state estimations will be corrected according to the new noise covariance matrices and previous state estimations. Compared with the standard UKF and other adaptive UKF algorithms, RAUKF converges faster to the actual noise covariance and thus achieves a better performance in terms of robustness, accuracy, and computation for nonlinear estimation with uncertain noise covariance, which is demonstrated by the simulation results. PMID:29518960
Robust versus consistent variance estimators in marginal structural Cox models.
Enders, Dirk; Engel, Susanne; Linder, Roland; Pigeot, Iris
2018-06-11
In survival analyses, inverse-probability-of-treatment (IPT) and inverse-probability-of-censoring (IPC) weighted estimators of parameters in marginal structural Cox models are often used to estimate treatment effects in the presence of time-dependent confounding and censoring. In most applications, a robust variance estimator of the IPT and IPC weighted estimator is calculated leading to conservative confidence intervals. This estimator assumes that the weights are known rather than estimated from the data. Although a consistent estimator of the asymptotic variance of the IPT and IPC weighted estimator is generally available, applications and thus information on the performance of the consistent estimator are lacking. Reasons might be a cumbersome implementation in statistical software, which is further complicated by missing details on the variance formula. In this paper, we therefore provide a detailed derivation of the variance of the asymptotic distribution of the IPT and IPC weighted estimator and explicitly state the necessary terms to calculate a consistent estimator of this variance. We compare the performance of the robust and consistent variance estimators in an application based on routine health care data and in a simulation study. The simulation reveals no substantial differences between the 2 estimators in medium and large data sets with no unmeasured confounding, but the consistent variance estimator performs poorly in small samples or under unmeasured confounding, if the number of confounders is large. We thus conclude that the robust estimator is more appropriate for all practical purposes. Copyright © 2018 John Wiley & Sons, Ltd.
An improved approximate-Bayesian model-choice method for estimating shared evolutionary history
2014-01-01
Background: To understand biological diversification, it is important to account for large-scale processes that affect the evolutionary history of groups of co-distributed populations of organisms. Such events predict temporally clustered divergence times, a pattern that can be estimated using genetic data from co-distributed species. I introduce a new approximate-Bayesian method for comparative phylogeographical model-choice that estimates the temporal distribution of divergences across taxa from multi-locus DNA sequence data. The model is an extension of that implemented in msBayes. Results: By reparameterizing the model, introducing more flexible priors on demographic and divergence-time parameters, and implementing a non-parametric Dirichlet-process prior over divergence models, I improved the robustness, accuracy, and power of the method for estimating shared evolutionary history across taxa. Conclusions: The results demonstrate that the improved performance of the new method is due to (1) more appropriate priors on divergence-time and demographic parameters that avoid prohibitively small marginal likelihoods for models with more divergence events, and (2) the Dirichlet-process providing a flexible prior on divergence histories that does not strongly disfavor models with intermediate numbers of divergence events. The new method yields more robust estimates of posterior uncertainty, and thus greatly reduces the tendency to incorrectly estimate models of shared evolutionary history with strong support. PMID:24992937
Bias and Efficiency in Structural Equation Modeling: Maximum Likelihood versus Robust Methods
ERIC Educational Resources Information Center
Zhong, Xiaoling; Yuan, Ke-Hai
2011-01-01
In the structural equation modeling literature, the normal-distribution-based maximum likelihood (ML) method is most widely used, partly because the resulting estimator is claimed to be asymptotically unbiased and most efficient. However, this may not hold when data deviate from normal distribution. Outlying cases or nonnormally distributed data,…
2dFLenS and KiDS: determining source redshift distributions with cross-correlations
NASA Astrophysics Data System (ADS)
Johnson, Andrew; Blake, Chris; Amon, Alexandra; Erben, Thomas; Glazebrook, Karl; Harnois-Deraps, Joachim; Heymans, Catherine; Hildebrandt, Hendrik; Joudaki, Shahab; Klaes, Dominik; Kuijken, Konrad; Lidman, Chris; Marin, Felipe A.; McFarland, John; Morrison, Christopher B.; Parkinson, David; Poole, Gregory B.; Radovich, Mario; Wolf, Christian
2017-03-01
We develop a statistical estimator to infer the redshift probability distribution of a photometric sample of galaxies from its angular cross-correlation in redshift bins with an overlapping spectroscopic sample. This estimator is a minimum-variance weighted quadratic function of the data: a quadratic estimator. This extends and modifies the methodology presented by McQuinn & White. The derived source redshift distribution is degenerate with the source galaxy bias, which must be constrained via additional assumptions. We apply this estimator to constrain source galaxy redshift distributions in the Kilo-Degree imaging survey through cross-correlation with the spectroscopic 2-degree Field Lensing Survey, presenting results first as a binned step-wise distribution in the range z < 0.8, and then building a continuous distribution using a Gaussian process model. We demonstrate the robustness of our methodology using mock catalogues constructed from N-body simulations, and comparisons with other techniques for inferring the redshift distribution.
Robustness of S1 statistic with Hodges-Lehmann for skewed distributions
NASA Astrophysics Data System (ADS)
Ahad, Nor Aishah; Yahaya, Sharipah Soaad Syed; Yin, Lee Ping
2016-10-01
Analysis of variance (ANOVA) is a commonly used parametric method for testing differences in means among more than two groups when the populations are normally distributed. ANOVA is highly inefficient under non-normal and heteroscedastic settings. When its assumptions are violated, researchers look for alternatives such as the nonparametric Kruskal-Wallis test or robust methods. This study focused on a flexible method, the S1 statistic, for comparing groups using the median as the location estimator. The S1 statistic was modified by substituting the median with the Hodges-Lehmann estimator and the default scale estimator with the variance of Hodges-Lehmann and MADn, producing two different test statistics for comparing groups. A bootstrap method was used for testing the hypotheses since the sampling distributions of these modified S1 statistics are unknown. The performance of the proposed statistics in terms of Type I error was measured and compared against the original S1 statistic, ANOVA and Kruskal-Wallis. The proposed procedures show improvement over the original statistic, especially under extremely skewed distributions.
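The two robust building blocks named in this abstract can be sketched as follows. The sketch assumes the one-sample Hodges-Lehmann estimator (the median of all pairwise averages, each observation also paired with itself) and the usual normal-consistency constant 1.4826 for MADn; the abstract does not say which variants the study used, and the sample data are hypothetical.

```python
from itertools import combinations_with_replacement
from statistics import median


def hodges_lehmann(xs):
    """Median of all Walsh averages (x_i + x_j) / 2 with i <= j."""
    return median((a + b) / 2.0 for a, b in combinations_with_replacement(xs, 2))


def madn(xs):
    """Median absolute deviation rescaled to be consistent for a normal sd."""
    center = median(xs)
    return 1.4826 * median(abs(x - center) for x in xs)


# Hypothetical sample with one gross outlier.
sample = [1, 2, 3, 4, 100]
hl = hodges_lehmann(sample)   # 3.0, while the mean is 22.0
scale = madn(sample)          # 1.4826
```

Both quantities barely react to the outlying value, which is the property the modified S1 statistics rely on.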
Heterogeneous Data Fusion Method to Estimate Travel Time Distributions in Congested Road Networks
Lam, William H. K.; Li, Qingquan
2017-01-01
Travel times in congested urban road networks are highly stochastic. Provision of travel time distribution information, including both mean and variance, can be very useful for travelers to make reliable path choice decisions to ensure higher probability of on-time arrival. To this end, a heterogeneous data fusion method is proposed to estimate travel time distributions by fusing heterogeneous data from point and interval detectors. In the proposed method, link travel time distributions are first estimated from point detector observations. The travel time distributions of links without point detectors are imputed based on their spatial correlations with links that have point detectors. The estimated link travel time distributions are then fused with path travel time distributions obtained from the interval detectors using Dempster-Shafer evidence theory. Based on fused path travel time distribution, an optimization technique is further introduced to update link travel time distributions and their spatial correlations. A case study was performed using real-world data from Hong Kong and showed that the proposed method obtained accurate and robust estimations of link and path travel time distributions in congested road networks. PMID:29210978
Heterogeneous Data Fusion Method to Estimate Travel Time Distributions in Congested Road Networks.
Shi, Chaoyang; Chen, Bi Yu; Lam, William H K; Li, Qingquan
2017-12-06
Travel times in congested urban road networks are highly stochastic. Provision of travel time distribution information, including both mean and variance, can be very useful for travelers to make reliable path choice decisions to ensure higher probability of on-time arrival. To this end, a heterogeneous data fusion method is proposed to estimate travel time distributions by fusing heterogeneous data from point and interval detectors. In the proposed method, link travel time distributions are first estimated from point detector observations. The travel time distributions of links without point detectors are imputed based on their spatial correlations with links that have point detectors. The estimated link travel time distributions are then fused with path travel time distributions obtained from the interval detectors using Dempster-Shafer evidence theory. Based on fused path travel time distribution, an optimization technique is further introduced to update link travel time distributions and their spatial correlations. A case study was performed using real-world data from Hong Kong and showed that the proposed method obtained accurate and robust estimations of link and path travel time distributions in congested road networks.
Hame, Yrjo; Angelini, Elsa D; Hoffman, Eric A; Barr, R Graham; Laine, Andrew F
2014-07-01
The extent of pulmonary emphysema is commonly estimated from CT scans by computing the proportional area of voxels below a predefined attenuation threshold. However, the reliability of this approach is limited by several factors that affect the CT intensity distributions in the lung. This work presents a novel method for emphysema quantification, based on parametric modeling of intensity distributions and a hidden Markov measure field model to segment emphysematous regions. The framework adapts to the characteristics of an image to ensure a robust quantification of emphysema under varying CT imaging protocols, and differences in parenchymal intensity distributions due to factors such as inspiration level. Compared to standard approaches, the presented model involves a larger number of parameters, most of which can be estimated from data, to handle the variability encountered in lung CT scans. The method was applied on a longitudinal data set with 87 subjects and a total of 365 scans acquired with varying imaging protocols. The resulting emphysema estimates had very high intra-subject correlation values. By reducing sensitivity to changes in imaging protocol, the method provides a more robust estimate than standard approaches. The generated emphysema delineations promise advantages for regional analysis of emphysema extent and progression.
NASA Astrophysics Data System (ADS)
Liu, Hongjian; Wang, Zidong; Shen, Bo; Alsaadi, Fuad E.
2016-07-01
This paper deals with the robust H∞ state estimation problem for a class of memristive recurrent neural networks with stochastic time-delays. The stochastic time-delays under consideration are governed by a Bernoulli-distributed stochastic sequence. The purpose of the addressed problem is to design the robust state estimator such that the dynamics of the estimation error is exponentially stable in the mean square, and the prescribed H∞ performance constraint is met. By utilizing the difference inclusion theory and choosing a proper Lyapunov-Krasovskii functional, the existence condition of the desired estimator is derived. Based on it, the explicit expression of the estimator gain is given in terms of the solution to a linear matrix inequality. Finally, a numerical example is employed to demonstrate the effectiveness and applicability of the proposed estimation approach.
Robust Rate Maximization for Heterogeneous Wireless Networks under Channel Uncertainties
Xu, Yongjun; Hu, Yuan; Li, Guoquan
2018-01-01
Heterogeneous wireless networks are a promising technology for next-generation wireless communication networks and have been shown to efficiently reduce the blind area of mobile communication and improve network coverage compared with traditional wireless communication networks. In this paper, a robust power allocation problem for a two-tier heterogeneous wireless network is formulated based on orthogonal frequency-division multiplexing technology. Under imperfect channel state information (CSI), the robust sum-rate maximization problem is built while avoiding severe cross-tier interference to the macrocell user and maintaining the minimum rate requirement of each femtocell user. To be practical, both the channel estimation errors from the femtocells to the macrocell and the link uncertainties of each femtocell user are simultaneously considered in terms of outage probabilities of users. The optimization problem is analyzed under no CSI feedback with some cumulative distribution function and partial CSI with Gaussian distribution of channel estimation error. The robust optimization problem is converted into a convex optimization problem, which is solved by using Lagrange dual theory and a subgradient algorithm. Simulation results demonstrate the effectiveness of the proposed algorithm by showing the impact of channel uncertainties on the system performance. PMID:29466315
Statistics based sampling for controller and estimator design
NASA Astrophysics Data System (ADS)
Tenne, Dirk
The purpose of this research is the development of statistical design tools for robust feed-forward/feedback controllers and nonlinear estimators. This dissertation has three parts, which address nonlinear estimation, target tracking, and robust control. To develop statistically robust controllers and nonlinear estimation algorithms, research has been performed to extend existing techniques, which propagate the statistics of the state, to achieve higher order accuracy. The so-called unscented transformation has been extended to capture higher order moments. Furthermore, higher order moment update algorithms based on a truncated power series have been developed. The proposed techniques are tested on various benchmark examples. Furthermore, the unscented transformation has been utilized to develop a three dimensional geometrically constrained target tracker. The proposed planar circular prediction algorithm has been developed in a local coordinate framework, which is amenable to extension of the tracking algorithm to three dimensional space. This tracker combines the predictions of a circular prediction algorithm and a constant velocity filter by utilizing Covariance Intersection. This combined prediction can be updated with the subsequent measurement using a linear estimator. The proposed technique is illustrated on a 3D benchmark trajectory, which includes coordinated turns and straight line maneuvers. The third part of this dissertation addresses the design of controllers which include knowledge of parametric uncertainties and their distributions. The parameter distributions are approximated by a finite set of points which are calculated by the unscented transformation. This set of points is used to design robust controllers which minimize a statistical performance measure of the plant over the domain of uncertainty, consisting of a combination of the mean and variance. The proposed technique is illustrated on three benchmark problems. 
The first relates to the design of prefilters for a linear and nonlinear spring-mass-dashpot system and the second applies a feedback controller to a hovering helicopter. Lastly, the statistical robust controller design is devoted to a concurrent feed-forward/feedback controller structure for a high-speed low tension tape drive.
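The standard (not the higher-order extended) unscented transformation that this dissertation builds on can be sketched in one dimension. The sketch assumes the classic three-point scheme with a scaling parameter `kappa`; the function names, the choice `kappa = 2`, and the test nonlinearity are illustrative and not taken from the dissertation.

```python
import math


def sigma_points_1d(mean, var, kappa=2.0):
    """Classic sigma points and weights for a scalar random variable."""
    n = 1  # state dimension
    spread = math.sqrt((n + kappa) * var)
    points = [mean, mean + spread, mean - spread]
    weights = [kappa / (n + kappa), 0.5 / (n + kappa), 0.5 / (n + kappa)]
    return points, weights


def unscented_transform_1d(f, mean, var, kappa=2.0):
    """Propagate mean and variance through f using the sigma points."""
    points, weights = sigma_points_1d(mean, var, kappa)
    ys = [f(x) for x in points]
    y_mean = sum(w * y for w, y in zip(weights, ys))
    y_var = sum(w * (y - y_mean) ** 2 for w, y in zip(weights, ys))
    return y_mean, y_var


# Hypothetical input statistics pushed through y = x**2.
m, P = 1.0, 0.04
y_mean, y_var = unscented_transform_1d(lambda x: x * x, m, P)
```

For a Gaussian input and this quadratic map, the choice `kappa = 2` reproduces the exact output mean `m**2 + P` and variance `4 * m**2 * P + 2 * P**2`, which makes it a convenient sanity check.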
Zhan, Tingting; Chevoneva, Inna; Iglewicz, Boris
2010-01-01
The family of weighted likelihood estimators largely overlaps with minimum divergence estimators. They are robust to data contaminations compared to MLE. We define the class of generalized weighted likelihood estimators (GWLE), provide its influence function and discuss the efficiency requirements. We introduce a new truncated cubic-inverse weight, which is both first and second order efficient and more robust than previously reported weights. We also discuss new ways of selecting the smoothing bandwidth and weighted starting values for the iterative algorithm. The advantage of the truncated cubic-inverse weight is illustrated in a simulation study of three-components normal mixtures model with large overlaps and heavy contaminations. A real data example is also provided. PMID:20835375
NASA Astrophysics Data System (ADS)
Hernandez, F.; Liang, X.
2017-12-01
Reliable real-time hydrological forecasting, to predict important phenomena such as floods, is invaluable to the society. However, modern high-resolution distributed models have faced challenges when dealing with uncertainties that are caused by the large number of parameters and initial state estimations involved. Therefore, to rely on these high-resolution models for critical real-time forecast applications, considerable improvements on the parameter and initial state estimation techniques must be made. In this work we present a unified data assimilation algorithm called Optimized PareTo Inverse Modeling through Inverse STochastic Search (OPTIMISTS) to deal with the challenge of having robust flood forecasting for high-resolution distributed models. This new algorithm combines the advantages of particle filters and variational methods in a unique way to overcome their individual weaknesses. The analysis of candidate particles compares model results with observations in a flexible time frame, and a multi-objective approach is proposed which attempts to simultaneously minimize differences with the observations and departures from the background states by using both Bayesian sampling and non-convex evolutionary optimization. Moreover, the resulting Pareto front is given a probabilistic interpretation through kernel density estimation to create a non-Gaussian distribution of the states. OPTIMISTS was tested on a low-resolution distributed land surface model using VIC (Variable Infiltration Capacity) and on a high-resolution distributed hydrological model using the DHSVM (Distributed Hydrology Soil Vegetation Model). In the tests streamflow observations are assimilated. OPTIMISTS was also compared with a traditional particle filter and a variational method. 
Results show that our method can reliably produce adequate forecasts and that it is able to outperform those resulting from assimilating the observations using a particle filter or an evolutionary 4D variational method alone. In addition, our method is shown to be efficient in tackling high-resolution applications with robust results.
Linear models: permutation methods
Cade, B.S.; Everitt, B.S.; Howell, D.C.
2005-01-01
Permutation tests (see Permutation Based Inference) for the linear model have applications in behavioral studies when traditional parametric assumptions about the error term in a linear model are not tenable. Improved validity of Type I error rates can be achieved with properly constructed permutation tests. Perhaps more importantly, increased statistical power, improved robustness to effects of outliers, and detection of alternative distributional differences can be achieved by coupling permutation inference with alternative linear model estimators. For example, it is well-known that estimates of the mean in a linear model are extremely sensitive to even a single outlying value of the dependent variable compared to estimates of the median [7, 19]. Traditionally, linear modeling focused on estimating changes in the center of distributions (means or medians). However, quantile regression allows distributional changes to be estimated in all or any selected part of a distribution of responses, providing a more complete statistical picture that has relevance to many biological questions [6]...
Filtering Based Adaptive Visual Odometry Sensor Framework Robust to Blurred Images
Zhao, Haiying; Liu, Yong; Xie, Xiaojia; Liao, Yiyi; Liu, Xixi
2016-01-01
Visual odometry (VO) estimation from blurred images is a challenging problem in practical robot applications, and blurred images severely reduce the estimation accuracy of VO. In this paper, we address the problem of visual odometry estimation from blurred images and present an adaptive visual odometry estimation framework that is robust to blurred images. Our approach employs an objective image measure, named small image gradient distribution (SIGD), to evaluate the degree of blurring of an image; an adaptive blurred image classification algorithm is then proposed to recognize blurred images; finally, we propose an anti-blurred key-frame selection algorithm to make the VO robust to blurred images. We also carried out a variety of comparative experiments to evaluate the performance of VO algorithms with our anti-blur framework under various blurred images. The experimental results show that our approach achieves superior performance compared with state-of-the-art methods on blurred images without adding much computational cost to the original VO algorithms. PMID:27399704
Statistical plant set estimation using Schroeder-phased multisinusoidal input design
NASA Technical Reports Server (NTRS)
Bayard, D. S.
1992-01-01
A frequency domain method is developed for plant set estimation. The estimation of a plant 'set' rather than a point estimate is required to support many methods of modern robust control design. The approach here is based on using a Schroeder-phased multisinusoid input design which has the special property of placing input energy only at the discrete frequency points used in the computation. A detailed analysis of the statistical properties of the frequency domain estimator is given, leading to exact expressions for the probability distribution of the estimation error, and many important properties. It is shown that, for any nominal parametric plant estimate, one can use these results to construct an overbound on the additive uncertainty to any prescribed statistical confidence. The 'soft' bound thus obtained can be used to replace 'hard' bounds presently used in many robust control analysis and synthesis methods.
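The input design named in this abstract can be illustrated with a short sketch. It assumes the common equal-amplitude Schroeder phase rule, phi_k = -pi * k * (k - 1) / N, and compares the crest factor of the resulting multisine with that of a zero-phase multisine; the harmonic count, sample count, and function names are hypothetical choices, not values from the report.

```python
import math


def schroeder_phases(n_harmonics):
    """Schroeder phases for a flat-amplitude multisine (assumed rule)."""
    return [-math.pi * k * (k - 1) / n_harmonics
            for k in range(1, n_harmonics + 1)]


def multisine(phases, n_samples):
    """One period of a sum of unit-amplitude cosines at harmonics 1..N."""
    return [
        sum(math.cos(2.0 * math.pi * k * t / n_samples + phi)
            for k, phi in enumerate(phases, start=1))
        for t in range(n_samples)
    ]


def crest_factor(signal):
    """Peak absolute value divided by the root-mean-square value."""
    rms = math.sqrt(sum(s * s for s in signal) / len(signal))
    return max(abs(s) for s in signal) / rms


N = 16
schroeder = multisine(schroeder_phases(N), 256)
zero_phase = multisine([0.0] * N, 256)
cf_schroeder = crest_factor(schroeder)
cf_zero = crest_factor(zero_phase)
```

Both signals put energy only at the chosen harmonics, but the Schroeder phasing spreads that energy over the period instead of concentrating it in one large peak, so its crest factor is noticeably lower.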
A mixture model for robust registration in Kinect sensor
NASA Astrophysics Data System (ADS)
Peng, Li; Zhou, Huabing; Zhu, Shengguo
2018-03-01
The Microsoft Kinect sensor has been widely used in many applications, but it suffers from the drawback of low registration precision between color image and depth image. In this paper, we present a robust method to improve the registration precision by a mixture model that can handle multiple images with a nonparametric model. We impose non-parametric geometrical constraints on the correspondence, as a prior distribution, in a reproducing kernel Hilbert space (RKHS). The estimation is performed by the EM algorithm, which, by also estimating the variance of the prior model, is able to obtain good estimates. We illustrate the proposed method on a publicly available dataset. The experimental results show that our approach outperforms the baseline methods.
Viana, Duarte S; Santamaría, Luis; Figuerola, Jordi
2016-02-01
Propagule retention time is a key factor in determining propagule dispersal distance and the shape of "seed shadows". Propagules dispersed by animal vectors are either ingested and retained in the gut until defecation or attached externally to the body until detachment. Retention time is a continuous variable, but it is commonly measured at discrete time points, according to pre-established sampling time-intervals. Although parametric continuous distributions have been widely fitted to these interval-censored data, the performance of different fitting methods has not been evaluated. To investigate the performance of five different fitting methods, we fitted parametric probability distributions to typical discretized retention-time data with known distribution using as data-points either the lower, mid or upper bounds of sampling intervals, as well as the cumulative distribution of observed values (using either maximum likelihood or non-linear least squares for parameter estimation); then compared the estimated and original distributions to assess the accuracy of each method. We also assessed the robustness of these methods to variations in the sampling procedure (sample size and length of sampling time-intervals). Fittings to the cumulative distribution performed better for all types of parametric distributions (lognormal, gamma and Weibull distributions) and were more robust to variations in sample size and sampling time-intervals. These estimated distributions had negligible deviations of up to 0.045 in cumulative probability of retention times (according to the Kolmogorov-Smirnov statistic) in relation to original distributions from which propagule retention time was simulated, supporting the overall accuracy of this fitting method. In contrast, fitting the sampling-interval bounds resulted in greater deviations that ranged from 0.058 to 0.273 in cumulative probability of retention times, which may introduce considerable biases in parameter estimates. 
We recommend the use of cumulative probability to fit parametric probability distributions to propagule retention time, specifically using maximum likelihood for parameter estimation. Furthermore, the experimental design for an optimal characterization of unimodal propagule retention time should contemplate at least 500 recovered propagules and sampling time-intervals not larger than the time peak of propagule retrieval, except in the tail of the distribution where broader sampling time-intervals may also produce accurate fits.
USDA-ARS's Scientific Manuscript database
This study demonstrated a new method for mapping high-resolution (spatial: 1 m, and temporal: 1 h) soil moisture by assimilating distributed temperature sensing (DTS) observed soil temperatures at intermediate scales. In order to provide robust soil moisture and property estimates, we first proposed...
Optimal designs based on the maximum quasi-likelihood estimator
Shen, Gang; Hyun, Seung Won; Wong, Weng Kee
2016-01-01
We use optimal design theory and construct locally optimal designs based on the maximum quasi-likelihood estimator (MqLE), which is derived under less stringent conditions than those required for the MLE method. We show that the proposed locally optimal designs are asymptotically as efficient as those based on the MLE when the error distribution is from an exponential family, and they perform just as well or better than optimal designs based on any other asymptotically linear unbiased estimators such as the least square estimator (LSE). In addition, we show current algorithms for finding optimal designs can be directly used to find optimal designs based on the MqLE. As an illustrative application, we construct a variety of locally optimal designs based on the MqLE for the 4-parameter logistic (4PL) model and study their robustness properties to misspecifications in the model using asymptotic relative efficiency. The results suggest that optimal designs based on the MqLE can be easily generated and they are quite robust to mis-specification in the probability distribution of the responses. PMID:28163359
Robust allocation of a defensive budget considering an attacker's private information.
Nikoofal, Mohammad E; Zhuang, Jun
2012-05-01
Attackers' private information is one of the main issues in defensive resource allocation games in homeland security. The outcome of a defense resource allocation decision critically depends on the accuracy of estimations about the attacker's attributes. However, terrorists' goals may be unknown to the defender, necessitating robust decisions by the defender. This article develops a robust-optimization game-theoretical model for identifying optimal defense resource allocation strategies for a rational defender facing a strategic attacker while the attacker's valuation of targets, being the most critical attribute of the attacker, is unknown but belongs to bounded distribution-free intervals. To our best knowledge, no previous research has applied robust optimization in homeland security resource allocation when uncertainty is defined in bounded distribution-free intervals. The key features of our model include (1) modeling uncertainty in attackers' attributes, where uncertainty is characterized by bounded intervals; (2) finding the robust-optimization equilibrium for the defender using concepts dealing with budget of uncertainty and price of robustness; and (3) applying the proposed model to real data. © 2011 Society for Risk Analysis.
NASA Astrophysics Data System (ADS)
Ishigaki, Tsukasa; Yamamoto, Yoshinobu; Nakamura, Yoshiyuki; Akamatsu, Motoyuki
Patients who visit a doctor have to wait a long time at many hospitals. According to patient questionnaires, the long waiting time is the leading cause of patient dissatisfaction with hospital service. The present paper describes a method for estimating the waiting time of each patient without an electronic medical chart system. The method uses a portable RFID system for data acquisition and robust estimation of the probability distribution of the doctor's health service and test times, enabling highly accurate waiting time estimation. We carried out data acquisition at a real hospital and verified the efficiency of the proposed method. The proposed system can be widely used as a data acquisition system in various fields such as marketing services, entertainment, or human behavior measurement.
A Comparative Study of Distribution System Parameter Estimation Methods
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sun, Yannan; Williams, Tess L.; Gourisetti, Sri Nikhil Gup
2016-07-17
In this paper, we compare two parameter estimation methods for distribution systems: residual sensitivity analysis and state-vector augmentation with a Kalman filter. These two methods were originally proposed for transmission systems, and are still the most commonly used methods for parameter estimation. Distribution systems have much lower measurement redundancy than transmission systems. Therefore, estimating parameters is much more difficult. To increase the robustness of parameter estimation, the two methods are applied with combined measurement snapshots (measurement sets taken at different points in time), so that the redundancy for computing the parameter values is increased. The advantages and disadvantages of both methods are discussed. The results of this paper show that state-vector augmentation is a better approach for parameter estimation in distribution systems. Simulation studies are done on a modified version of IEEE 13-Node Test Feeder with varying levels of measurement noise and non-zero error in the other system model parameters.
Using Biweight M-Estimates in the Two-Sample Problem. 1. Symmetric Populations
1982-01-01
to a Student’s t distribution, across a broad range of α-levels. To be conservative, we might wish to approximate "t" by a Student’s t on nine-tenths...n-i0). While the robustness of classical procedures for extreme α-levels has not been investigated, a comparison with the values in Lee and...D’Agostino (1976) indicates that this procedure is highly robust of validity at α = .05; presumably this robustness extends to the extreme α-levels as well
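The biweight M-estimate named in the title of this record can be sketched as an iteratively reweighted mean. This is a generic textbook form, assuming Tukey's biweight weights, a scale fixed at a tuning constant times the raw median absolute deviation, and a median starting value; the report's own tuning constant and iteration scheme are not visible in the excerpt, so `c = 6.0` and the sample data are hypothetical.

```python
from statistics import median


def biweight_location(xs, c=6.0, n_iter=50):
    """Tukey biweight location estimate via iterative reweighting.

    The scale c * MAD is computed once from the median and held fixed.
    """
    center = median(xs)
    mad = median(abs(x - center) for x in xs)
    if mad == 0:
        return center  # degenerate sample: fall back to the median
    scale = c * mad
    for _ in range(n_iter):
        weights = []
        for x in xs:
            u = (x - center) / scale
            # Points farther than `scale` from the center get zero weight.
            weights.append((1.0 - u * u) ** 2 if abs(u) < 1.0 else 0.0)
        total = sum(weights)
        if total == 0:
            break
        center = sum(w * x for w, x in zip(weights, xs)) / total
    return center


# Hypothetical sample with one gross outlier.
sample = [1, 2, 3, 4, 100]
estimate = biweight_location(sample)
```

The outlying value receives zero weight, so the estimate settles near the middle of the four well-behaved observations instead of being pulled toward 100.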
Robust functional regression model for marginal mean and subject-specific inferences.
Cao, Chunzheng; Shi, Jian Qing; Lee, Youngjo
2017-01-01
We introduce flexible robust functional regression models, using various heavy-tailed processes, including a Student t-process. We propose efficient algorithms in estimating parameters for the marginal mean inferences and in predicting conditional means as well as interpolation and extrapolation for the subject-specific inferences. We develop bootstrap prediction intervals (PIs) for conditional mean curves. Numerical studies show that the proposed model provides a robust approach against data contamination or distribution misspecification, and the proposed PIs maintain the nominal confidence levels. A real data application is presented as an illustrative example.
Singer, Donald A.; Menzie, W.D.; Cheng, Qiuming; Bonham-Carter, G. F.
2005-01-01
Estimating numbers of undiscovered mineral deposits is a fundamental part of assessing mineral resources. Some statistical tools can act as guides to low variance, unbiased estimates of the number of deposits. The primary guide is that the estimates must be consistent with the grade and tonnage models. Another statistical guide is the deposit density (i.e., the number of deposits per unit area of permissive rock in well-explored control areas). Preliminary estimates and confidence limits of the number of undiscovered deposits in a tract of given area may be calculated using linear regression and refined using frequency distributions with appropriate parameters. A Poisson distribution leads to estimates having lower relative variances than the regression estimates and implies a random distribution of deposits. Coefficients of variation are used to compare uncertainties of negative binomial, Poisson, or MARK3 empirical distributions that have the same expected number of deposits as the deposit density. Statistical guides presented here allow simple yet robust estimation of the number of undiscovered deposits in permissive terranes.
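The comparison of uncertainties through coefficients of variation can be illustrated with a short calculation. The sketch below contrasts a Poisson and a negative binomial distribution that share the same expected number of deposits. The expected value of 4 deposits and the dispersion parameter k = 2 are hypothetical values chosen for illustration, and the MARK3 empirical distribution is not covered.

```python
import math


def cv_poisson(mean):
    """Coefficient of variation of a Poisson distribution: sd / mean = 1 / sqrt(mean)."""
    return 1.0 / math.sqrt(mean)


def cv_negative_binomial(mean, k):
    """Coefficient of variation of a negative binomial with variance mean + mean**2 / k."""
    variance = mean + mean ** 2 / k
    return math.sqrt(variance) / mean


expected_deposits = 4.0  # hypothetical expected number of deposits in a tract
k = 2.0                  # hypothetical negative binomial dispersion parameter

print("Poisson CV:", cv_poisson(expected_deposits))
print("Negative binomial CV:", cv_negative_binomial(expected_deposits, k))
```

For any finite k the negative binomial variance exceeds the mean, so its coefficient of variation is larger than the Poisson value at the same expected number of deposits.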
Scheduling policies of intelligent sensors and sensor/actuators in flexible structures
NASA Astrophysics Data System (ADS)
Demetriou, Michael A.; Potami, Raffaele
2006-03-01
In this note, we revisit the problem of actuator/sensor placement in large civil infrastructures and flexible space structures within the context of spatial robustness. The positioning of these devices becomes more important in systems employing wireless sensor and actuator networks (WSAN) for improved control performance and for rapid failure detection. The ability of the sensing and actuating devices to possess the property of spatial robustness results in reduced control energy and therefore the spatial distribution of disturbances is integrated into the location optimization measures. In our studies, the structure under consideration is a flexible plate clamped at all sides. First, we consider the case of sensor placement and the optimization scheme attempts to produce those locations that minimize the effects of the spatial distribution of disturbances on the state estimation error; thus the sensor locations produce state estimators with minimized disturbance-to-error transfer function norms. A two-stage optimization procedure is employed whereby one first considers the open loop system and the spatial distribution of disturbances is found that produces the maximal effects on the entire open loop state. Once this "worst" spatial distribution of disturbances is found, the optimization scheme subsequently finds the locations that produce state estimators with minimum transfer function norms. In the second part, we consider the collocated actuator/sensor pairs and the optimization scheme produces those locations that result in compensators with the smallest norms of the disturbance-to-state transfer functions. Going a step further, an intelligent control scheme is presented which, at each time interval, activates a subset of the actuator/sensor pairs in order to provide robustness against spatiotemporally moving disturbances and minimize power consumption by keeping some sensor/actuators in sleep mode.
Radar modulation classification using time-frequency representation and nonlinear regression
NASA Astrophysics Data System (ADS)
De Luigi, Christophe; Arques, Pierre-Yves; Lopez, Jean-Marc; Moreau, Eric
1999-09-01
In a naval electronic environment, pulses emitted by radars are collected by ESM receivers. For most of them the intrapulse signal is modulated by a particular law. To help the classical identification process, classification and estimation of this modulation law are applied to the intrapulse signal measurements. To estimate with good accuracy the time-varying frequency of a signal corrupted by additive noise, one method has been chosen. This method consists of calculating the Wigner distribution; the instantaneous frequency is then estimated by the peak location of the distribution. Bias and variance of the estimator are evaluated by computer simulations. In an estimated sequence of frequencies, we assume the presence of both false and good estimates, and the errors are assumed to follow a Gaussian distribution. A robust nonlinear regression method, based on the Levenberg-Marquardt algorithm, is thus applied to these estimated frequencies using a Maximum Likelihood Estimator. The performance of the method is tested using varied modulation laws and different signal-to-noise ratios.
Bayesian model selection: Evidence estimation based on DREAM simulation and bridge sampling
NASA Astrophysics Data System (ADS)
Volpi, Elena; Schoups, Gerrit; Firmani, Giovanni; Vrugt, Jasper A.
2017-04-01
Bayesian inference has found widespread application in Earth and Environmental Systems Modeling, providing an effective tool for prediction, data assimilation, parameter estimation, uncertainty analysis and hypothesis testing. Under multiple competing hypotheses, the Bayesian approach also provides an attractive alternative to traditional information criteria (e.g. AIC, BIC) for model selection. The key variable for Bayesian model selection is the evidence (or marginal likelihood) that is the normalizing constant in the denominator of Bayes' theorem; while it is fundamental for model selection, the evidence is not required for Bayesian inference. It is computed for each hypothesis (model) by averaging the likelihood function over the prior parameter distribution, rather than maximizing it as by information criteria; the larger a model evidence the more support it receives among a collection of hypotheses as the simulated values assign relatively high probability density to the observed data. Hence, the evidence naturally acts as an Occam's razor, preferring simpler and more constrained models against the selection of over-fitted ones by information criteria that incorporate only the likelihood maximum. Since it is not particularly easy to estimate the evidence in practice, Bayesian model selection via the marginal likelihood has not yet found mainstream use. We illustrate here the properties of a new estimator of the Bayesian model evidence, which provides robust and unbiased estimates of the marginal likelihood; the method is coined Gaussian Mixture Importance Sampling (GMIS). GMIS uses multidimensional numerical integration of the posterior parameter distribution via bridge sampling (a generalization of importance sampling) of a mixture distribution fitted to samples of the posterior distribution derived from the DREAM algorithm (Vrugt et al., 2008; 2009). Some illustrative examples are presented to show the robustness and superiority of the GMIS estimator with respect to other commonly used approaches in the literature.
NASA Astrophysics Data System (ADS)
Liu, Sha; Liu, Shi; Tong, Guowei
2017-11-01
In industrial areas, temperature distribution information provides powerful data support for improving system efficiency, reducing pollutant emission, ensuring safe operation, etc. As a noninvasive measurement technology, acoustic tomography (AT) has been widely used to measure temperature distribution, where the efficiency of the reconstruction algorithm is crucial for the reliability of the measurement results. Different from traditional reconstruction techniques, in this paper a two-phase reconstruction method is proposed to improve the reconstruction accuracy (RA). In the first phase, the measurement domain is discretized by a coarse square grid to reduce the number of unknown variables and so mitigate the ill-posed nature of the AT inverse problem. By taking into consideration the inaccuracy of the measured time-of-flight data, a new cost function is constructed to improve the robustness of the estimation, and a grey wolf optimizer is used to solve the proposed cost function to obtain the temperature distribution on the coarse grid. In the second phase, the Adaboost.RT based BP neural network algorithm is developed for predicting the temperature distribution on the refined grid in accordance with the temperature distribution data estimated in the first phase. Numerical simulations and experimental measurement results validate the superiority of the proposed reconstruction algorithm in improving the robustness and RA.
On the inequivalence of the CH and CHSH inequalities due to finite statistics
NASA Astrophysics Data System (ADS)
Renou, M. O.; Rosset, D.; Martin, A.; Gisin, N.
2017-06-01
Different variants of a Bell inequality, such as CHSH and CH, are known to be equivalent when evaluated on nonsignaling outcome probability distributions. However, in experimental setups, the outcome probability distributions are estimated using a finite number of samples. Therefore the nonsignaling conditions are only approximately satisfied and the robustness of the violation depends on the chosen inequality variant. We explain that phenomenon using the decomposition of the space of outcome probability distributions under the action of the symmetry group of the scenario, and propose a method to optimize the statistical robustness of a Bell inequality. In the process, we describe the finite group composed of relabeling of parties, measurement settings and outcomes, and identify correspondences between the irreducible representations of this group and properties of outcome probability distributions such as normalization, signaling or having uniform marginals.
A Robust Statistics Approach to Minimum Variance Portfolio Optimization
NASA Astrophysics Data System (ADS)
Yang, Liusha; Couillet, Romain; McKay, Matthew R.
2015-12-01
We study the design of portfolios under a minimum risk criterion. The performance of the optimized portfolio relies on the accuracy of the estimated covariance matrix of the portfolio asset returns. For large portfolios, the number of available market returns is often of similar order to the number of assets, so that the sample covariance matrix performs poorly as a covariance estimator. Additionally, financial market data often contain outliers which, if not correctly handled, may further corrupt the covariance estimation. We address these shortcomings by studying the performance of a hybrid covariance matrix estimator based on Tyler's robust M-estimator and on Ledoit-Wolf's shrinkage estimator while assuming samples with heavy-tailed distribution. Employing recent results from random matrix theory, we develop a consistent estimator of (a scaled version of) the realized portfolio risk, which is minimized by optimizing online the shrinkage intensity. Our portfolio optimization method is shown via simulations to outperform existing methods both for synthetic and real market data.
Bio-inspired sensing and control for disturbance rejection and stabilization
NASA Astrophysics Data System (ADS)
Gremillion, Gregory; Humbert, James S.
2015-05-01
The successful operation of small unmanned aircraft systems (sUAS) in dynamic environments demands robust stability in the presence of exogenous disturbances. Flying insects are sensor-rich platforms, with highly redundant arrays of sensors distributed across the insect body that are integrated to extract rich information with diminished noise. This work presents a novel sensing framework in which measurements from an array of accelerometers distributed across a simulated flight vehicle are linearly combined to directly estimate the applied forces and torques with improvements in SNR. In simulation, the estimation performance is quantified as a function of sensor noise level, position estimate error, and sensor quantity.
Estimation of reference intervals from small samples: an example using canine plasma creatinine.
Geffré, A; Braun, J P; Trumel, C; Concordet, D
2009-12-01
According to international recommendations, reference intervals should be determined from at least 120 reference individuals, which is often impossible to achieve in veterinary clinical pathology, especially for wild animals. When only a small number of reference subjects is available, the possible bias cannot be known and the normality of the distribution cannot be evaluated. A comparison of reference intervals estimated by different methods could be helpful. The purpose of this study was to compare reference limits determined from a large set of canine plasma creatinine reference values, and large subsets of these data, with estimates obtained from small samples selected randomly. Twenty sets each of 120 and 27 samples were randomly selected from a set of 1439 plasma creatinine results obtained from healthy dogs in another study. Reference intervals for the whole sample and for the large samples were determined by a nonparametric method. The estimated reference limits for the small samples were minimum and maximum, mean +/- 2 SD of native and Box-Cox-transformed values, 2.5th and 97.5th percentiles by a robust method on native and Box-Cox-transformed values, and estimates from diagrams of cumulative distribution functions. The whole sample had a heavily skewed distribution, which approached Gaussian after Box-Cox transformation. The reference limits estimated from small samples were highly variable. The closest estimates to the 1439-result reference interval for 27-result subsamples were obtained by both parametric and robust methods after Box-Cox transformation but were grossly erroneous in some cases. For small samples, it is recommended that all values be reported graphically in a dot plot or histogram and that estimates of the reference limits be compared using different methods.
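One of the small-sample estimators named above, mean +/- 2 SD on native and transformed values, can be sketched as follows. This is a minimal illustration under stated assumptions: the natural logarithm stands in for the Box-Cox transformation (it is the lambda = 0 member of that family, whereas the study would have estimated the transformation from the data), and the sample values and function names are hypothetical.

```python
import math
import statistics


def mean_2sd_limits(values):
    """Parametric reference limits: (mean - 2 SD, mean + 2 SD)."""
    m = statistics.mean(values)
    s = statistics.stdev(values)  # sample SD, n - 1 denominator
    return m - 2 * s, m + 2 * s


def log_mean_2sd_limits(values):
    """Mean +/- 2 SD on log-transformed values, back-transformed to native units.

    Assumption: the log transform replaces a fitted Box-Cox transformation.
    """
    low, high = mean_2sd_limits([math.log(v) for v in values])
    return math.exp(low), math.exp(high)


# Hypothetical right-skewed sample in arbitrary units (not data from the study).
sample = [60, 65, 70, 72, 75, 78, 80, 85, 90, 95, 100, 110, 125, 150]

native = mean_2sd_limits(sample)
logged = log_mean_2sd_limits(sample)
print("native limits:", native)
print("log-transformed limits:", logged)
```

Limits computed on the log scale are always positive after back-transformation, which is one practical reason to transform skewed, strictly positive measurements before applying a Gaussian rule.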
NASA Astrophysics Data System (ADS)
Kwon, Ki-Won; Cho, Yongsoo
This letter presents a simple joint estimation method for residual frequency offset (RFO) and sampling frequency offset (STO) in OFDM-based digital video broadcasting (DVB) systems. The proposed method selects a continual pilot (CP) subset from an unsymmetrically and non-uniformly distributed CP set to obtain an unbiased estimator. Simulation results show that the proposed method using a properly selected CP subset is unbiased and performs robustly.
Robust, Adaptive Functional Regression in Functional Mixed Model Framework.
Zhu, Hongxiao; Brown, Philip J; Morris, Jeffrey S
2011-09-01
Functional data are increasingly encountered in scientific studies, and their high dimensionality and complexity lead to many analytical challenges. Various methods for functional data analysis have been developed, including functional response regression methods that involve regression of a functional response on univariate/multivariate predictors with nonparametrically represented functional coefficients. In existing methods, however, the functional regression can be sensitive to outlying curves and outlying regions of curves, so is not robust. In this paper, we introduce a new Bayesian method, robust functional mixed models (R-FMM), for performing robust functional regression within the general functional mixed model framework, which includes multiple continuous or categorical predictors and random effect functions accommodating potential between-function correlation induced by the experimental design. The underlying model involves a hierarchical scale mixture model for the fixed effects, random effect and residual error functions. These modeling assumptions across curves result in robust nonparametric estimators of the fixed and random effect functions which down-weight outlying curves and regions of curves, and produce statistics that can be used to flag global and local outliers. These assumptions also lead to distributions across wavelet coefficients that have outstanding sparsity and adaptive shrinkage properties, with great flexibility for the data to determine the sparsity and the heaviness of the tails. Together with the down-weighting of outliers, these within-curve properties lead to fixed and random effect function estimates that appear in our simulations to be remarkably adaptive in their ability to remove spurious features yet retain true features of the functions. We have developed general code to implement this fully Bayesian method that is automatic, requiring the user to only provide the functional data and design matrices. 
It is efficient enough to handle large data sets, and yields posterior samples of all model parameters that can be used to perform desired Bayesian estimation and inference. Although we present details for a specific implementation of the R-FMM using specific distributional choices in the hierarchical model, 1D functions, and wavelet transforms, the method can be applied more generally using other heavy-tailed distributions, higher dimensional functions (e.g. images), and using other invertible transformations as alternatives to wavelets.
Robust, Adaptive Functional Regression in Functional Mixed Model Framework
Zhu, Hongxiao; Brown, Philip J.; Morris, Jeffrey S.
2012-01-01
Functional data are increasingly encountered in scientific studies, and their high dimensionality and complexity lead to many analytical challenges. Various methods for functional data analysis have been developed, including functional response regression methods that involve regression of a functional response on univariate/multivariate predictors with nonparametrically represented functional coefficients. In existing methods, however, the functional regression can be sensitive to outlying curves and outlying regions of curves, so is not robust. In this paper, we introduce a new Bayesian method, robust functional mixed models (R-FMM), for performing robust functional regression within the general functional mixed model framework, which includes multiple continuous or categorical predictors and random effect functions accommodating potential between-function correlation induced by the experimental design. The underlying model involves a hierarchical scale mixture model for the fixed effects, random effect and residual error functions. These modeling assumptions across curves result in robust nonparametric estimators of the fixed and random effect functions which down-weight outlying curves and regions of curves, and produce statistics that can be used to flag global and local outliers. These assumptions also lead to distributions across wavelet coefficients that have outstanding sparsity and adaptive shrinkage properties, with great flexibility for the data to determine the sparsity and the heaviness of the tails. Together with the down-weighting of outliers, these within-curve properties lead to fixed and random effect function estimates that appear in our simulations to be remarkably adaptive in their ability to remove spurious features yet retain true features of the functions. We have developed general code to implement this fully Bayesian method that is automatic, requiring the user to only provide the functional data and design matrices. 
It is efficient enough to handle large data sets, and yields posterior samples of all model parameters that can be used to perform desired Bayesian estimation and inference. Although we present details for a specific implementation of the R-FMM using specific distributional choices in the hierarchical model, 1D functions, and wavelet transforms, the method can be applied more generally using other heavy-tailed distributions, higher dimensional functions (e.g. images), and using other invertible transformations as alternatives to wavelets. PMID:22308015
Estimation of rates-across-sites distributions in phylogenetic substitution models.
Susko, Edward; Field, Chris; Blouin, Christian; Roger, Andrew J
2003-10-01
Previous work has shown that it is often essential to account for the variation in rates at different sites in phylogenetic models in order to avoid phylogenetic artifacts such as long branch attraction. In most current models, the gamma distribution is used for the rates-across-sites distributions and is implemented as an equal-probability discrete gamma. In this article, we introduce discrete distribution estimates with large numbers of equally spaced rate categories allowing us to investigate the appropriateness of the gamma model. With large numbers of rate categories, these discrete estimates are flexible enough to approximate the shape of almost any distribution. Likelihood ratio statistical tests and a nonparametric bootstrap confidence-bound estimation procedure based on the discrete estimates are presented that can be used to test the fit of a parametric family. We applied the methodology to several different protein data sets, and found that although the gamma model often provides a good parametric model for this type of data, rate estimates from an equal-probability discrete gamma model with a small number of categories will tend to underestimate the largest rates. In cases when the gamma model assumption is in doubt, rate estimates coming from the discrete rate distribution estimate with a large number of rate categories provide a robust alternative to gamma estimates. An alternative implementation of the gamma distribution is proposed that, for equal numbers of rate categories, is computationally more efficient during optimization than the standard gamma implementation and can provide more accurate estimates of site rates.
Robust estimation of microbial diversity in theory and in practice
Haegeman, Bart; Hamelin, Jérôme; Moriarty, John; Neal, Peter; Dushoff, Jonathan; Weitz, Joshua S
2013-01-01
Quantifying diversity is of central importance for the study of structure, function and evolution of microbial communities. The estimation of microbial diversity has received renewed attention with the advent of large-scale metagenomic studies. Here, we consider what the diversity observed in a sample tells us about the diversity of the community being sampled. First, we argue that one cannot reliably estimate the absolute and relative number of microbial species present in a community without making unsupported assumptions about species abundance distributions. The reason for this is that sample data do not contain information about the number of rare species in the tail of species abundance distributions. We illustrate the difficulty in comparing species richness estimates by applying Chao's estimator of species richness to a set of in silico communities: they are ranked incorrectly in the presence of large numbers of rare species. Next, we extend our analysis to a general family of diversity metrics (‘Hill diversities'), and construct lower and upper estimates of diversity values consistent with the sample data. The theory generalizes Chao's estimator, which we retrieve as the lower estimate of species richness. We show that Shannon and Simpson diversity can be robustly estimated for the in silico communities. We analyze nine metagenomic data sets from a wide range of environments, and show that our findings are relevant for empirically-sampled communities. Hence, we recommend the use of Shannon and Simpson diversity rather than species richness in efforts to quantify and compare microbial diversity. PMID:23407313
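The diversity measures discussed above can be computed directly from a vector of species counts. The sketch below uses the standard textbook definitions of Hill diversity of order q and of Chao's richness estimator; it does not reproduce the lower and upper diversity estimates developed in the paper, and the example counts are hypothetical.

```python
import math


def hill_diversity(counts, q):
    """Hill diversity of order q computed from species abundance counts.

    q = 0 gives species richness, q = 1 the exponential of Shannon entropy,
    and q = 2 the inverse Simpson index.
    """
    total = sum(counts)
    p = [c / total for c in counts if c > 0]
    if q == 1:
        # The q -> 1 limit of the general formula.
        return math.exp(-sum(pi * math.log(pi) for pi in p))
    return sum(pi ** q for pi in p) ** (1.0 / (1.0 - q))


def chao1(counts):
    """Chao's lower estimate of species richness.

    f1 and f2 are the numbers of species seen exactly once and twice.
    The bias-corrected form is used when no doubletons are observed.
    """
    observed = sum(1 for c in counts if c > 0)
    f1 = sum(1 for c in counts if c == 1)
    f2 = sum(1 for c in counts if c == 2)
    if f2 == 0:
        return observed + f1 * (f1 - 1) / 2.0
    return observed + f1 * f1 / (2.0 * f2)


# Hypothetical sample: a few abundant species and several rare ones.
counts = [50, 30, 10, 5, 2, 2, 1, 1, 1]

for q in (0, 1, 2):
    print("Hill diversity, q =", q, ":", hill_diversity(counts, q))
print("Chao estimate of richness:", chao1(counts))
```

The richness estimate depends only on the rarest species (f1 and f2), while the q = 1 and q = 2 diversities are dominated by the abundant ones, which is consistent with the abstract's recommendation to prefer Shannon and Simpson diversity.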
Abanto-Valle, C. A.; Bandyopadhyay, D.; Lachos, V. H.; Enriquez, I.
2009-01-01
A Bayesian analysis of stochastic volatility (SV) models using the class of symmetric scale mixtures of normal (SMN) distributions is considered. In the face of non-normality, this provides an appealing robust alternative to the routine use of the normal distribution. Specific distributions examined include the normal, student-t, slash and the variance gamma distributions. Using a Bayesian paradigm, an efficient Markov chain Monte Carlo (MCMC) algorithm is introduced for parameter estimation. Moreover, the mixing parameters obtained as a by-product of the scale mixture representation can be used to identify outliers. The methods developed are applied to analyze daily stock returns data on S&P500 index. Bayesian model selection criteria as well as out-of- sample forecasting results reveal that the SV models based on heavy-tailed SMN distributions provide significant improvement in model fit as well as prediction to the S&P500 index data over the usual normal model. PMID:20730043
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kane, V.E.
1979-10-01
The standard maximum likelihood and moment estimation procedures are shown to have some undesirable characteristics for estimating the parameters in a three-parameter lognormal distribution. A class of goodness-of-fit estimators is found which provides a useful alternative to the standard methods. The class of goodness-of-fit tests considered includes the Shapiro-Wilk and Shapiro-Francia tests, which reduce to a weighted linear combination of the order statistics that can be maximized in estimation problems. The weighted-order statistic estimators are compared to the standard procedures in Monte Carlo simulations. Bias and robustness of the procedures are examined and example data sets analyzed including geochemical data from the National Uranium Resource Evaluation Program.
On the contributions of topological features to transcriptional regulatory network robustness
2012-01-01
Background: Because biological networks exhibit a high degree of robustness, a systemic understanding of their architecture and function requires an appraisal of the network design principles that confer robustness. In this project, we conduct a computational study of the contribution of three degree-based topological properties (transcription factor-target ratio, degree distribution, cross-talk suppression) and their combinations on the robustness of transcriptional regulatory networks. We seek to quantify the relative degree of robustness conferred by each property (and combination) and also to determine the extent to which these properties alone can explain the robustness observed in transcriptional networks. Results: To study individual properties and their combinations, we generated synthetic, random networks that retained one or more of the three properties with values derived from either the yeast or E. coli gene regulatory networks. Robustness of these networks was estimated through simulation. Our results indicate that the combination of the three properties we considered explains the majority of the structural robustness observed in the real transcriptional networks. Surprisingly, scale-free degree distribution is, overall, a minor contributor to robustness. Instead, most robustness is gained through topological features that limit the complexity of the overall network and increase the transcription factor subnetwork sparsity. Conclusions: Our work demonstrates that (i) different types of robustness are implemented by different topological aspects of the network and (ii) size and sparsity of the transcription factor subnetwork play an important role for robustness induction. Our results are conserved across yeast and E. coli, which suggests that the design principles examined are present within an array of living systems. PMID:23194062
Shoari, Niloofar; Dubé, Jean-Sébastien; Chenouri, Shoja'eddin
2015-11-01
In environmental studies, concentration measurements frequently fall below detection limits of measuring instruments, resulting in left-censored data. Some studies employ parametric methods such as the maximum likelihood estimator (MLE), robust regression on order statistic (rROS), and gamma regression on order statistic (GROS), while others suggest a non-parametric approach, the Kaplan-Meier method (KM). Using examples of real data from a soil characterization study in Montreal, we highlight the need for additional investigations that aim at unifying the existing literature. A number of studies have examined this issue; however, those considering data skewness and model misspecification are rare. These aspects are investigated in this paper through simulations. Among other findings, results show that for low skewed data, the performance of different statistical methods is comparable, regardless of the censoring percentage and sample size. For highly skewed data, the performance of the MLE method under lognormal and Weibull distributions is questionable; particularly, when the sample size is small or censoring percentage is high. In such conditions, MLE under gamma distribution, rROS, GROS, and KM are less sensitive to skewness. Related to model misspecification, MLE based on lognormal and Weibull distributions provides poor estimates when the true distribution of data is misspecified. However, the methods of rROS, GROS, and MLE under gamma distribution are generally robust to model misspecifications regardless of skewness, sample size, and censoring percentage. Since the characteristics of environmental data (e.g., type of distribution and skewness) are unknown a priori, we suggest using MLE based on gamma distribution, rROS and GROS. Copyright © 2015 Elsevier Ltd. All rights reserved.
Fault Detection of Rotating Machinery using the Spectral Distribution Function
NASA Technical Reports Server (NTRS)
Davis, Sanford S.
1997-01-01
The spectral distribution function is introduced to characterize the process leading to faults in rotating machinery. It is shown to be a more robust indicator than conventional power spectral density estimates, but requires only slightly more computational effort. The method is illustrated with examples from seeded gearbox transmission faults and an analytical model of a defective bearing. Procedures are suggested for implementation in realistic environments.
Leveraging AMI data for distribution system model calibration and situational awareness
DOE Office of Scientific and Technical Information (OSTI.GOV)
Peppanen, Jouni; Reno, Matthew J.; Thakkar, Mohini
The many new distributed energy resources being installed at the distribution system level require increased visibility into system operations that will be enabled by distribution system state estimation (DSSE) and situational awareness applications. Reliable and accurate DSSE requires both robust methods for managing the big data provided by smart meters and quality distribution system models. This paper presents intelligent methods for detecting and dealing with missing or inaccurate smart meter data, as well as the ways to process the data for different applications. It also presents an efficient and flexible parameter estimation method based on the voltage drop equation and regression analysis to enhance distribution system model accuracy. Finally, it presents a 3-D graphical user interface for advanced visualization of the system state and events. Moreover, we demonstrate these methods on a university distribution network with state-of-the-art real-time and historical smart meter data infrastructure.
Leveraging AMI data for distribution system model calibration and situational awareness
Peppanen, Jouni; Reno, Matthew J.; Thakkar, Mohini; ...
2015-01-15
The many new distributed energy resources being installed at the distribution system level require increased visibility into system operations that will be enabled by distribution system state estimation (DSSE) and situational awareness applications. Reliable and accurate DSSE requires both robust methods for managing the big data provided by smart meters and quality distribution system models. This paper presents intelligent methods for detecting and dealing with missing or inaccurate smart meter data, as well as the ways to process the data for different applications. It also presents an efficient and flexible parameter estimation method based on the voltage drop equation and regression analysis to enhance distribution system model accuracy. Finally, it presents a 3-D graphical user interface for advanced visualization of the system state and events. Moreover, we demonstrate these methods on a university distribution network with state-of-the-art real-time and historical smart meter data infrastructure.
Gilliom, Robert J.; Helsel, Dennis R.
1986-01-01
A recurring difficulty encountered in investigations of many metals and organic contaminants in ambient waters is that a substantial portion of water sample concentrations are below limits of detection established by analytical laboratories. Several methods were evaluated for estimating distributional parameters for such censored data sets using only uncensored observations. Their reliabilities were evaluated by a Monte Carlo experiment in which small samples were generated from a wide range of parent distributions and censored at varying levels. Eight methods were used to estimate the mean, standard deviation, median, and interquartile range. Criteria were developed, based on the distribution of uncensored observations, for determining the best performing parameter estimation method for any particular data set. The most robust method for minimizing error in censored-sample estimates of the four distributional parameters over all simulation conditions was the log-probability regression method. With this method, censored observations are assumed to follow the zero-to-censoring level portion of a lognormal distribution obtained by a least squares regression between logarithms of uncensored concentration observations and their z scores. When method performance was separately evaluated for each distributional parameter over all simulation conditions, the log-probability regression method still had the smallest errors for the mean and standard deviation, but the lognormal maximum likelihood method had the smallest errors for the median and interquartile range. When data sets were classified prior to parameter estimation into groups reflecting their probable parent distributions, the ranking of estimation methods was similar, but the accuracy of error estimates was markedly improved over those without classification.
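The log-probability regression idea described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: it assumes a single censoring level, uses Blom plotting positions (an assumed choice the abstract does not specify), and the function name `log_probability_regression` is hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev


def log_probability_regression(uncensored, n_censored):
    """Impute censored values from a lognormal line fitted to uncensored data.

    Assumes every censored value lies below every uncensored value
    (single censoring level), so censored data take the lowest ranks.
    """
    values = sorted(uncensored)
    n = len(values) + n_censored
    # Normal scores for ranks 1..n; Blom plotting positions are an assumption.
    z = [NormalDist().inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]
    z_cens, z_unc = z[:n_censored], z[n_censored:]
    logs = [math.log(v) for v in values]
    # Ordinary least squares of log concentration on normal score.
    z_bar, log_bar = mean(z_unc), mean(logs)
    slope = (sum((a - z_bar) * (b - log_bar) for a, b in zip(z_unc, logs))
             / sum((a - z_bar) ** 2 for a in z_unc))
    intercept = log_bar - slope * z_bar
    # Extrapolate the fitted line to the ranks held by censored observations.
    imputed = [math.exp(intercept + slope * a) for a in z_cens]
    combined = imputed + values
    return imputed, mean(combined), stdev(combined)


imputed, est_mean, est_sd = log_probability_regression([1.2, 2.0, 3.5, 5.1, 8.0], 3)
```

The imputed values are used only to compute summary statistics for the whole sample; they should not be read as estimates of individual concentrations.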
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gilliom, R.J.; Helsel, D.R.
1986-02-01
A recurring difficulty encountered in investigations of many metals and organic contaminants in ambient waters is that a substantial portion of water sample concentrations are below limits of detection established by analytical laboratories. Several methods were evaluated for estimating distributional parameters for such censored data sets using only uncensored observations. Their reliabilities were evaluated by a Monte Carlo experiment in which small samples were generated from a wide range of parent distributions and censored at varying levels. Eight methods were used to estimate the mean, standard deviation, median, and interquartile range. Criteria were developed, based on the distribution of uncensored observations, for determining the best performing parameter estimation method for any particular data set. The most robust method for minimizing error in censored-sample estimates of the four distributional parameters over all simulation conditions was the log-probability regression method. With this method, censored observations are assumed to follow the zero-to-censoring level portion of a lognormal distribution obtained by a least squares regression between logarithms of uncensored concentration observations and their z scores. When method performance was separately evaluated for each distributional parameter over all simulation conditions, the log-probability regression method still had the smallest errors for the mean and standard deviation, but the lognormal maximum likelihood method had the smallest errors for the median and interquartile range. When data sets were classified prior to parameter estimation into groups reflecting their probable parent distributions, the ranking of estimation methods was similar, but the accuracy of error estimates was markedly improved over those without classification.
Robust adaptive multichannel SAR processing based on covariance matrix reconstruction
NASA Astrophysics Data System (ADS)
Tan, Zhen-ya; He, Feng
2018-04-01
Combined with digital beamforming (DBF) processing, azimuth multichannel synthetic aperture radar (SAR) systems show promise for high-resolution, wide-swath imaging, but conventional processing methods do not take the nonuniformity of the scattering coefficient into consideration. This paper proposes a robust adaptive multichannel SAR processing method that first uses the Capon spatial spectrum estimator to obtain the spatial spectrum distribution over all ambiguous directions, and then reconstructs the interference-plus-noise covariance matrix from its definition to obtain the multichannel SAR processing filter. The method improves processing performance under a nonuniform scattering coefficient and is robust against array errors. Experiments with real measured data demonstrate the effectiveness and robustness of the proposed method.
Häme, Yrjö; Angelini, Elsa D.; Hoffman, Eric A.; Barr, R. Graham; Laine, Andrew F.
2014-01-01
The extent of pulmonary emphysema is commonly estimated from CT images by computing the proportional area of voxels below a predefined attenuation threshold. However, the reliability of this approach is limited by several factors that affect the CT intensity distributions in the lung. This work presents a novel method for emphysema quantification, based on parametric modeling of intensity distributions in the lung and a hidden Markov measure field model to segment emphysematous regions. The framework adapts to the characteristics of an image to ensure a robust quantification of emphysema under varying CT imaging protocols and differences in parenchymal intensity distributions due to factors such as inspiration level. Compared to standard approaches, the present model involves a larger number of parameters, most of which can be estimated from data, to handle the variability encountered in lung CT scans. The method was used to quantify emphysema on a cohort of 87 subjects, with repeated CT scans acquired over a time period of 8 years using different imaging protocols. The scans were acquired approximately annually, and the data set included a total of 365 scans. The results show that the emphysema estimates produced by the proposed method have very high intra-subject correlation values. By reducing sensitivity to changes in imaging protocol, the method provides a more robust estimate than standard approaches. In addition, the generated emphysema delineations promise great advantages for regional analysis of emphysema extent and progression, possibly advancing disease subtyping. PMID:24759984
Robust Multiple Linear Regression.
1982-12-01
difficulty, but it might have more solutions corresponding to local minima. Influence Function of M-Estimates The influence function describes the effect...distributionn n function. In case of M-Estimates the influence function was found to be proportional to and given as T(X F)) " C(xpF,T) = .(X.T(F) F(dx...where the inverse of any distribution function F is defined in the usual way as F^{-1}(s) = inf{x | F(x) > s}, 0 < s < 1. Influence Function of L-Estimates In a
Whitmore, Roy W; Chen, Wenlin
2013-12-04
The ability to infer human exposure to substances from drinking water using monitoring data helps determine and/or refine potential risks associated with drinking water consumption. We describe a survey sampling approach and its application to an atrazine groundwater monitoring study to adequately characterize upper exposure centiles and associated confidence intervals with predetermined precision. Study design and data analysis included sampling frame definition, sample stratification, sample size determination, allocation to strata, analysis weights, and weighted population estimates. The sampling frame encompassed 15,840 groundwater community water systems (CWS) in 21 states throughout the U.S. Median and 95th percentile atrazine concentrations were 0.0022 and 0.024 ppb, respectively, for all CWS. Statistical estimates agreed with historical monitoring results, suggesting that the study design was adequate and robust. This methodology makes no assumptions regarding the occurrence distribution (e.g., lognormality); thus analyses based on the design-induced distribution provide the most robust basis for making inferences from the sample to target population.
Recovering Galaxy Properties Using Gaussian Process SED Fitting
NASA Astrophysics Data System (ADS)
Iyer, Kartheik; Awan, Humna
2018-01-01
Information about physical quantities like the stellar mass, star formation rates, and ages for distant galaxies is contained in their spectral energy distributions (SEDs), obtained through photometric surveys like SDSS, CANDELS, LSST etc. However, noise in the photometric observations is often a problem, and using naive machine learning methods to estimate physical quantities can result in overfitting the noise, or converging on solutions that lie outside the physical regime of parameter space. We use Gaussian Process regression trained on a sample of SEDs corresponding to galaxies from a Semi-Analytic model (Somerville+15a) to estimate their stellar masses, and compare its performance to a variety of different methods, including simple linear regression, Random Forests, and k-Nearest Neighbours. We find that the Gaussian Process method is robust to noise and predicts not only stellar masses but also their uncertainties. The method is also robust in the cases where the distribution of the training data is not identical to the target data, which can be extremely useful when generalized to more subtle galaxy properties.
Estimation of distributional parameters for censored trace-level water-quality data
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gilliom, R.J.; Helsel, D.R.
1984-01-01
A recurring difficulty encountered in investigations of many metals and organic contaminants in ambient waters is that a substantial portion of water-sample concentrations are below limits of detection established by analytical laboratories. Several methods were evaluated for estimating distributional parameters for such censored data sets using only uncensored observations. Their reliabilities were evaluated by a Monte Carlo experiment in which small samples were generated from a wide range of parent distributions and censored at varying levels. Eight methods were used to estimate the mean, standard deviation, median, and interquartile range. Criteria were developed, based on the distribution of uncensored observations, for determining the best-performing parameter estimation method for any particular data set. The most robust method for minimizing error in censored-sample estimates of the four distributional parameters over all simulation conditions was the log-probability regression method. With this method, censored observations are assumed to follow the zero-to-censoring level portion of a lognormal distribution obtained by a least-squares regression between logarithms of uncensored concentration observations and their z scores. When method performance was separately evaluated for each distributional parameter over all simulation conditions, the log-probability regression method still had the smallest errors for the mean and standard deviation, but the lognormal maximum likelihood method had the smallest errors for the median and interquartile range. When data sets were classified prior to parameter estimation into groups reflecting their probable parent distributions, the ranking of estimation methods was similar, but the accuracy of error estimates was markedly improved over those without classification. 6 figs., 6 tabs.
Estimating restricted mean treatment effects with stacked survival models
Wey, Andrew; Vock, David M.; Connett, John; Rudser, Kyle
2016-01-01
The difference in restricted mean survival times between two groups is a clinically relevant summary measure. With observational data, there may be imbalances in confounding variables between the two groups. One approach to account for such imbalances is estimating a covariate-adjusted restricted mean difference by modeling the covariate-adjusted survival distribution, and then marginalizing over the covariate distribution. Since the estimator for the restricted mean difference is defined by the estimator for the covariate-adjusted survival distribution, it is natural to expect that a better estimator of the covariate-adjusted survival distribution is associated with a better estimator of the restricted mean difference. We therefore propose estimating restricted mean differences with stacked survival models. Stacked survival models estimate a weighted average of several survival models by minimizing predicted error. By including a range of parametric, semi-parametric, and non-parametric models, stacked survival models can robustly estimate a covariate-adjusted survival distribution and, therefore, the restricted mean treatment effect in a wide range of scenarios. We demonstrate through a simulation study that better performance of the covariate-adjusted survival distribution often leads to better mean-squared error of the restricted mean difference although there are notable exceptions. In addition, we demonstrate that the proposed estimator can perform nearly as well as Cox regression when the proportional hazards assumption is satisfied and significantly better when proportional hazards is violated. Finally, the proposed estimator is illustrated with data from the United Network for Organ Sharing to evaluate post-lung transplant survival between large and small-volume centers. PMID:26934835
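To make the summary measure concrete: the restricted mean survival time up to a horizon tau is the area under the survival curve from 0 to tau. The sketch below integrates a step-function survival curve and takes the difference between two groups. It does not implement covariate adjustment or stacking; the function name `restricted_mean` and the example curves are hypothetical.

```python
def restricted_mean(curve, tau):
    """Area under a right-continuous step survival curve on [0, tau].

    `curve` is a list of (time, survival) pairs sorted by time, giving the
    survival probability from that time onward; survival is 1.0 before
    the first listed time.
    """
    area, prev_t, prev_s = 0.0, 0.0, 1.0
    for t, s in curve:
        if t >= tau:
            break
        area += prev_s * (t - prev_t)  # rectangle up to the next drop
        prev_t, prev_s = t, s
    area += prev_s * (tau - prev_t)    # final rectangle up to the horizon
    return area


# Hypothetical survival curves for two groups.
group_a = [(1.0, 0.75), (3.0, 0.375)]
group_b = [(2.0, 0.5)]
difference = restricted_mean(group_a, 4.0) - restricted_mean(group_b, 4.0)
```

Any estimator of the survival curve (parametric, Cox-based, or a weighted combination) could supply the `curve` argument, which is why a better survival estimate is expected to yield a better restricted mean difference.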
L-moments and TL-moments of the generalized lambda distribution
Asquith, W.H.
2007-01-01
The 4-parameter generalized lambda distribution (GLD) is a flexible distribution capable of mimicking the shapes of many distributions and data samples including those with heavy tails. The method of L-moments and the recently developed method of trimmed L-moments (TL-moments) are attractive techniques for parameter estimation for heavy-tailed distributions for which the L- and TL-moments have been defined. Analytical solutions for the first five L- and TL-moments in terms of GLD parameters are derived. Unfortunately, numerical methods are needed to compute the parameters from the L- or TL-moments. Algorithms are suggested for parameter estimation. Application of the GLD using both L- and TL-moment parameter estimates from example data is demonstrated, and comparison of the L-moment fit of the 4-parameter kappa distribution is made. A small simulation study of the 98th percentile (far-right tail) is conducted for a heavy-tail GLD with high-outlier contamination. The simulations show, with respect to estimation of the 98th-percent quantile, that TL-moments are less biased (more robust) in the presence of high-outlier contamination. However, the robustness comes at the expense of considerably more sampling variability. © 2006 Elsevier B.V. All rights reserved.
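For readers unfamiliar with L-moments, the sample versions can be computed from the ordered data through unbiased probability-weighted moments. The sketch below covers only the first four ordinary sample L-moments; it does not compute TL-moments or fit GLD parameters, and the function name `sample_l_moments` is hypothetical.

```python
def sample_l_moments(data):
    """Return the first four sample L-moments (l1, l2, l3, l4).

    Uses the unbiased probability-weighted moments b0..b3 of the
    ordered sample; requires at least four observations.
    """
    x = sorted(data)
    n = len(x)
    if n < 4:
        raise ValueError("need at least four observations")
    b = [0.0, 0.0, 0.0, 0.0]
    for i, value in enumerate(x, start=1):
        weight = 1.0
        b[0] += value
        for r in range(1, 4):
            # weight becomes (i-1)(i-2)...(i-r) / ((n-1)(n-2)...(n-r))
            weight *= (i - r) / (n - r)
            b[r] += weight * value
    b = [total / n for total in b]
    l1 = b[0]
    l2 = 2 * b[1] - b[0]
    l3 = 6 * b[2] - 6 * b[1] + b[0]
    l4 = 20 * b[3] - 30 * b[2] + 12 * b[1] - b[0]
    return l1, l2, l3, l4


l1, l2, l3, l4 = sample_l_moments([1.0, 2.0, 3.0, 4.0, 5.0])
```

Because each L-moment is a linear combination of order statistics, a single extreme value shifts it less than it shifts a conventional moment of the same order.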
Maximum likelihood phase-retrieval algorithm: applications.
Nahrstedt, D A; Southwell, W H
1984-12-01
The maximum likelihood estimator approach is shown to be effective in determining the wave front aberration in systems involving laser and flow field diagnostics and optical testing. The robustness of the algorithm enables convergence even in cases of severe wave front error and real, nonsymmetrical, obscured amplitude distributions.
Observability and Estimation of Distributed Space Systems via Local Information-Exchange Networks
NASA Technical Reports Server (NTRS)
Rahmani, Amirreza; Mesbahi, Mehran; Fathpour, Nanaz; Hadaegh, Fred Y.
2008-01-01
In this work, we develop an approach to formation estimation by explicitly characterizing the formation's system-theoretic attributes in terms of the underlying inter-spacecraft information-exchange network. In particular, we approach the formation observer/estimator design by relaxing the accessibility to the global state information by a centralized observer/estimator and, in turn, providing an analysis and synthesis framework for formation observers/estimators that rely on local measurements. The novelty of our approach hinges upon the explicit examination of the underlying distributed spacecraft network in the realm of guidance, navigation, and control algorithmic analysis and design. The overarching goal of our general research program, some of whose results are reported in this paper, is the development of distributed spacecraft estimation algorithms that are scalable, modular, and robust to variations in the topology and link characteristics of the formation information exchange network. In this work, we consider the observability of a spacecraft formation from a single observation node and utilize the agreement protocol as a mechanism for observing formation states from local measurements. Specifically, we show how the symmetry structure of the network, characterized in terms of its automorphism group, directly relates to the observability of the corresponding multi-agent system. The ramification of this notion of observability over networks is then explored in the context of distributed formation estimation.
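The link between network symmetry and observability can be checked numerically on a small example. The sketch below assumes the standard agreement protocol dx/dt = -L x with graph Laplacian L and a single observed node, then applies the Kalman rank test. The three-node path graph and the function name `observable_from_node` are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def observable_from_node(laplacian, node):
    """Kalman rank test for dx/dt = -L x observed at a single node."""
    n = laplacian.shape[0]
    a = -laplacian
    c = np.zeros((1, n))
    c[0, node] = 1.0
    # Stack C, CA, CA^2, ..., CA^(n-1) into the observability matrix.
    rows = [c]
    for _ in range(n - 1):
        rows.append(rows[-1] @ a)
    return bool(np.linalg.matrix_rank(np.vstack(rows)) == n)


# Path graph 0 - 1 - 2: swapping the end nodes is a symmetry that fixes node 1.
path_laplacian = np.array([[1.0, -1.0, 0.0],
                           [-1.0, 2.0, -1.0],
                           [0.0, -1.0, 1.0]])
from_end = observable_from_node(path_laplacian, 0)
from_middle = observable_from_node(path_laplacian, 1)
```

Observing from the middle node fails the rank test because the two end nodes are interchangeable from its point of view, while observing from an end node breaks that symmetry.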
Distributed robust finite-time nonlinear consensus protocols for multi-agent systems
NASA Astrophysics Data System (ADS)
Zuo, Zongyu; Tie, Lin
2016-04-01
This paper investigates the robust finite-time consensus problem of multi-agent systems in networks with undirected topology. Global nonlinear consensus protocols augmented with a variable structure are constructed with the aid of Lyapunov functions for each single-integrator agent dynamics in the presence of external disturbances. In particular, it is shown that the finite settling time of the proposed general framework for robust consensus design is upper bounded for any initial condition. This makes it possible for network consensus problems to design and estimate the convergence time offline for a multi-agent team with a given undirected information flow. Finally, simulation results are presented to demonstrate the performance and effectiveness of our finite-time protocols.
NASA Astrophysics Data System (ADS)
Langousis, Andreas; Kaleris, Vassilios; Xeygeni, Vagia; Magkou, Foteini
2017-04-01
Assessing the availability of groundwater reserves at a regional level requires accurate and robust hydraulic head estimation at multiple locations of an aquifer. To that end, one needs groundwater observation networks that can provide sufficient information to estimate the hydraulic head at unobserved locations. The density of such networks is largely influenced by the spatial distribution of the hydraulic conductivity in the aquifer, and it is usually determined through trial-and-error, by solving the groundwater flow based on a properly selected set of alternative but physically plausible geologic structures. In this work, we use: a) dimensional analysis, and b) a pulse-based stochastic model for simulation of synthetic aquifer structures, to calculate the distribution of the absolute error in hydraulic head estimation as a function of the standardized distance from the nearest measuring locations. The resulting distributions are proved to encompass all possible small-scale structural dependencies, exhibiting characteristics (bounds, multi-modal features etc.) that can be explained using simple geometric arguments. The obtained results are promising, pointing towards the direction of establishing design criteria based on large-scale geologic maps.
Local Influence and Robust Procedures for Mediation Analysis
ERIC Educational Resources Information Center
Zu, Jiyun; Yuan, Ke-Hai
2010-01-01
Existing studies of mediation models have been limited to normal-theory maximum likelihood (ML). Because real data in the social and behavioral sciences are seldom normally distributed and often contain outliers, classical methods generally lead to inefficient or biased parameter estimates. Consequently, the conclusions from a mediation analysis…
Distributed Multisensor Data Fusion under Unknown Correlation and Data Inconsistency
Abu Bakr, Muhammad; Lee, Sukhan
2017-01-01
The paradigm of multisensor data fusion has evolved from a centralized architecture to a decentralized or distributed architecture along with the advancement in sensor and communication technologies. These days, distributed state estimation and data fusion have been widely explored in diverse fields of engineering and control due to their superior performance over the centralized approach in terms of flexibility, robustness to failure and cost effectiveness in infrastructure and communication. However, distributed multisensor data fusion is not without technical challenges to overcome: namely, dealing with cross-correlation and inconsistency among state estimates and sensor data. In this paper, we review the key theories and methodologies of distributed multisensor data fusion available to date with a specific focus on handling unknown correlation and data inconsistency. We aim at providing readers with a unifying view out of individual theories and methodologies by presenting a formal analysis of their implications. Finally, several directions of future research are highlighted. PMID:29077035
NASA Astrophysics Data System (ADS)
Jenkins, Colleen; Jordan, Jay; Carlson, Jeff
2007-02-01
This paper presents parameter estimation techniques useful for detecting background changes in a video sequence with extreme foreground activity. A specific application of interest is automated detection of the covert placement of threats (e.g., a briefcase bomb) inside crowded public facilities. We propose that a histogram of pixel intensity acquired from a fixed mounted camera over time for a series of images will be a mixture of two Gaussian functions: the foreground probability distribution function and background probability distribution function. We will use Pearson's Method of Moments to separate the two probability distribution functions. The background function can then be "remembered" and changes in the background can be detected. Subsequent comparisons of background estimates are used to detect changes. Changes are flagged to alert security forces to the presence and location of potential threats. Results are presented that indicate the significant potential for robust parameter estimation techniques as applied to video surveillance.
Optimal design of stimulus experiments for robust discrimination of biochemical reaction networks.
Flassig, R J; Sundmacher, K
2012-12-01
Biochemical reaction networks in the form of coupled ordinary differential equations (ODEs) provide a powerful modeling tool for understanding the dynamics of biochemical processes. During the early phase of modeling, scientists have to deal with a large pool of competing nonlinear models. At this point, discrimination experiments can be designed and conducted to obtain optimal data for selecting the most plausible model. Since biological ODE models have widely distributed parameters due to, e.g. biologic variability or experimental variations, model responses become distributed. Therefore, a robust optimal experimental design (OED) for model discrimination can be used to discriminate models based on their response probability distribution functions (PDFs). In this work, we present an optimal control-based methodology for designing optimal stimulus experiments aimed at robust model discrimination. For estimating the time-varying model response PDF, which results from the nonlinear propagation of the parameter PDF under the ODE dynamics, we suggest using the sigma-point approach. Using the model overlap (expected likelihood) as a robust discrimination criterion to measure dissimilarities between expected model response PDFs, we benchmark the proposed nonlinear design approach against linearization with respect to prediction accuracy and design quality for two nonlinear biological reaction networks. As shown, the sigma-point approach outperforms the linearization approach in the case of widely distributed parameter sets and/or existing multiple steady states. Since the sigma-point approach scales linearly with the number of model parameters, it can be applied to large systems for robust experimental planning. An implementation of the method in MATLAB/AMPL is available at http://www.uni-magdeburg.de/ivt/svt/person/rf/roed.html. Contact: flassig@mpi-magdeburg.mpg.de. Supplementary data are available at Bioinformatics online.
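The sigma-point idea mentioned above can be illustrated with the basic unscented transform: a parameter distribution with given mean and covariance is represented by 2n + 1 deterministically chosen points, each point is pushed through the model, and the response mean and covariance are recovered from weighted sums. The sketch below is a generic version, not the authors' MATLAB/AMPL implementation; the function name, the `kappa` default, and the toy response function are assumptions.

```python
import numpy as np


def sigma_point_transform(f, mean, cov, kappa=1.0):
    """Approximate the mean and covariance of f(x) for x with given mean/cov.

    Uses 2n + 1 sigma points, so the cost grows linearly with the
    number of parameters n.
    """
    n = mean.size
    root = np.linalg.cholesky((n + kappa) * cov)  # columns are the offsets
    points = [mean]
    points += [mean + root[:, i] for i in range(n)]
    points += [mean - root[:, i] for i in range(n)]
    weights = np.array([kappa / (n + kappa)] + [0.5 / (n + kappa)] * (2 * n))
    responses = np.array([f(p) for p in points])
    response_mean = weights @ responses
    deviations = responses - response_mean
    response_cov = (weights[:, None] * deviations).T @ deviations
    return response_mean, response_cov


# Toy nonlinear response of two parameters (purely illustrative).
def toy_response(p):
    return np.array([p[0] * p[1], np.exp(-p[0])])


m, c = sigma_point_transform(toy_response,
                             np.array([1.0, 2.0]),
                             np.array([[0.04, 0.01], [0.01, 0.09]]))
```

In the setting of the abstract, `f` would be the ODE solution at a given time as a function of the model parameters, evaluated once per sigma point.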
NASA Astrophysics Data System (ADS)
Faure, Guilhem; Koonin, Eugene V.
2015-05-01
Robustness to destabilizing effects of mutations is thought of as a key factor of protein evolution. The connections between two measures of robustness, the relative core size and the computationally estimated effect of mutations on protein stability (ΔΔG), protein abundance and the selection pressure on protein-coding genes (dN/dS) were analyzed for the organisms with a large number of available protein structures including four eukaryotes, two bacteria and one archaeon. The distribution of the effects of mutations in the core on protein stability is universal and indistinguishable in eukaryotes and bacteria, centered at slightly destabilizing amino acid replacements, and with a heavy tail of more strongly destabilizing replacements. The distribution of mutational effects in the hyperthermophilic archaeon Thermococcus gammatolerans is significantly shifted toward strongly destabilizing replacements which is indicative of stronger constraints that are imposed on proteins in hyperthermophiles. The median effect of mutations is strongly, positively correlated with the relative core size, in evidence of the congruence between the two measures of protein robustness. However, both measures show only limited correlations to the expression level and selection pressure on protein-coding genes. Thus, the degree of robustness reflected in the universal distribution of mutational effects appears to be a fundamental, ancient feature of globular protein folds whereas the observed variations are largely neutral and uncoupled from short term protein evolution. A weak anticorrelation between protein core size and selection pressure is observed only for surface residues in prokaryotes but a stronger anticorrelation is observed for all residues in eukaryotic proteins. This substantial difference between proteins of prokaryotes and eukaryotes is likely to stem from the demonstrable higher compactness of prokaryotic proteins.
On the inverse problem of blade design for centrifugal pumps and fans
NASA Astrophysics Data System (ADS)
Kruyt, N. P.; Westra, R. W.
2014-06-01
The inverse problem of blade design for centrifugal pumps and fans has been studied. The solution to this problem provides the geometry of rotor blades that realize specified performance characteristics, together with the corresponding flow field. Here a three-dimensional solution method is described in which the so-called meridional geometry is fixed and the distribution of the azimuthal angle at the three-dimensional blade surface is determined for blades of infinitesimal thickness. The developed formulation is based on potential-flow theory. Besides the blade impermeability condition at the pressure and suction side of the blades, an additional boundary condition at the blade surface is required in order to fix the unknown blade geometry. For this purpose the mean-swirl distribution is employed. The iterative numerical method is based on a three-dimensional finite element method approach in which the flow equations are solved on the domain determined by the latest estimate of the blade geometry, with the mean-swirl distribution boundary condition at the blade surface being enforced. The blade impermeability boundary condition is then used to find an improved estimate of the blade geometry. The robustness of the method is increased by specific techniques, such as spanwise-coupled solution of the discretized impermeability condition and the use of under-relaxation in adjusting the estimates of the blade geometry. Various examples are shown that demonstrate the effectiveness and robustness of the method in finding a solution for the blade geometry of different types of centrifugal pumps and fans. The influence of the employed mean-swirl distribution on the performance characteristics is also investigated.
DOE Office of Scientific and Technical Information (OSTI.GOV)
La Russa, D
Purpose: The purpose of this project is to develop a robust method of parameter estimation for a Poisson-based TCP model using Bayesian inference. Methods: Bayesian inference was performed using the PyMC3 probabilistic programming framework written in Python. A Poisson-based TCP regression model that accounts for clonogen proliferation was fit to observed rates of local relapse as a function of equivalent dose in 2 Gy fractions for a population of 623 stage-I non-small-cell lung cancer patients. The Slice Markov Chain Monte Carlo sampling algorithm was used to sample the posterior distributions, and was initiated using the maximum of the posterior distributions found by optimization. The calculation of TCP with each sample step required integration over the free parameter α, which was performed using an adaptive 24-point Gauss-Legendre quadrature. Convergence was verified via inspection of the trace plot and posterior distribution for each of the fit parameters, as well as with comparisons of the most probable parameter values with their respective maximum likelihood estimates. Results: Posterior distributions for α, the standard deviation of α (σ), the average tumour cell-doubling time (Td), and the repopulation delay time (Tk), were generated assuming α/β = 10 Gy, and a fixed clonogen density of 10^7 cm^-3. Posterior predictive plots generated from samples from these posterior distributions are in excellent agreement with the observed rates of local relapse used in the Bayesian inference. The most probable values of the model parameters also agree well with maximum likelihood estimates. Conclusion: A robust method of performing Bayesian inference of TCP data using a complex TCP model has been established.
An estimating equation approach to dimension reduction for longitudinal data
Xu, Kelin; Guo, Wensheng; Xiong, Momiao; Zhu, Liping; Jin, Li
2016-01-01
Sufficient dimension reduction has been extensively explored in the context of independent and identically distributed data. In this article we generalize sufficient dimension reduction to longitudinal data and propose an estimating equation approach to estimating the central mean subspace. The proposed method accounts for the covariance structure within each subject and improves estimation efficiency when the covariance structure is correctly specified. Even if the covariance structure is misspecified, our estimator remains consistent. In addition, our method relaxes distributional assumptions on the covariates and is doubly robust. To determine the structural dimension of the central mean subspace, we propose a Bayesian-type information criterion. We show that the estimated structural dimension is consistent and that the estimated basis directions are root-n consistent, asymptotically normal and locally efficient. Simulations and an analysis of the Framingham Heart Study data confirm the effectiveness of our approach. PMID:27017956
Robustness of survival estimates for radio-marked animals
Bunck, C.M.; Chen, C.-L.
1992-01-01
Telemetry techniques are often used to study the survival of birds and mammals, particularly when mark-recapture approaches are unsuitable. Both parametric and nonparametric methods to estimate survival have been developed or modified from other applications. An implicit assumption in these approaches is that the probability of re-locating an animal with a functioning transmitter is one. A Monte Carlo study was conducted to determine the bias and variance of the Kaplan-Meier estimator and an estimator based also on the assumption of constant hazard and to evaluate the performance of the two-sample tests associated with each. Modifications of each estimator which allow a re-location probability of less than one are described and evaluated. Generally the unmodified estimators were biased but had lower variance. At low sample sizes all estimators performed poorly. Under the null hypothesis, the distribution of all test statistics reasonably approximated the null distribution when survival was low but not when it was high. The power of the two-sample tests was similar.
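As a reference point for the unmodified estimator, the Kaplan-Meier product-limit calculation can be sketched as follows. This version assumes every animal is re-located with probability one, which is exactly the implicit assumption the abstract questions; it does not implement the modified estimators. The function name `kaplan_meier` and the example data are hypothetical.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    `times` are follow-up times; `events` are 1 for an observed death
    and 0 for censoring (e.g. transmitter failure). Returns a list of
    (time, survival) pairs at each distinct death time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all animals sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Four animals: deaths at days 1, 3 and 4, one censored at day 2.
curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
```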
Accurate motion parameter estimation for colonoscopy tracking using a regression method
NASA Astrophysics Data System (ADS)
Liu, Jianfei; Subramanian, Kalpathi R.; Yoo, Terry S.
2010-03-01
Co-located optical and virtual colonoscopy images have the potential to provide important clinical information during routine colonoscopy procedures. In our earlier work, we presented an optical flow based algorithm to compute egomotion from live colonoscopy video, permitting navigation and visualization of the corresponding patient anatomy. In the original algorithm, motion parameters were estimated using the traditional Least Sum of Squares (LS) procedure, which can be unstable in the context of optical flow vectors with large errors. In the improved algorithm, we use the Least Median of Squares (LMS) method, a robust regression method for motion parameter estimation. Using the LMS method, we iteratively analyze and converge toward the main distribution of the flow vectors, while disregarding outliers. We show through three experiments the improvement in tracking results obtained using the LMS method, in comparison to the LS estimator. The first experiment demonstrates better spatial accuracy in positioning the virtual camera in the sigmoid colon. The second and third experiments demonstrate the robustness of this estimator, resulting in longer tracked sequences: from 300 to 1310 in the ascending colon, and 410 to 1316 in the transverse colon.
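Illustrative sketch (not the authors' implementation): the Least Median of Squares idea can be shown on a straight-line fit. The version below is a simplification chosen here: it tries the line through every pair of points and keeps the one with the smallest median squared residual, whereas practical LMS solvers usually sample subsets at random and the paper fits motion parameters rather than a line. The name lms_line is hypothetical.

```python
from itertools import combinations
from statistics import median


def lms_line(xs, ys):
    """Approximate LMS line fit: returns (slope, intercept, criterion)."""
    best = None
    for i, j in combinations(range(len(xs)), 2):
        if xs[i] == xs[j]:
            continue  # a vertical candidate line has no finite slope
        slope = (ys[j] - ys[i]) / (xs[j] - xs[i])
        intercept = ys[i] - slope * xs[i]
        # LMS criterion: median (not sum) of the squared residuals.
        squared = [(y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys)]
        criterion = median(squared)
        if best is None or criterion < best[0]:
            best = (criterion, slope, intercept)
    if best is None:
        raise ValueError("need at least two points with distinct x values")
    return best[1], best[2], best[0]
```

Because the criterion is a median, up to roughly half of the points can be grossly wrong without moving the selected line, which is the property the abstract relies on for flow vectors with large errors.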
McGee, Monnie; Chen, Zhongxue
2006-01-01
There are many methods of correcting microarray data for non-biological sources of error. Authors routinely supply software or code so that interested analysts can implement their methods. Even with a thorough reading of associated references, it is not always clear how requisite parts of the method are calculated in the software packages. However, it is important to have an understanding of such details, as this understanding is necessary for proper use of the output, or for implementing extensions to the model. In this paper, the calculation of parameter estimates used in Robust Multichip Average (RMA), a popular preprocessing algorithm for Affymetrix GeneChip brand microarrays, is elucidated. The background correction method for RMA assumes that the perfect match (PM) intensities observed result from a convolution of the true signal, assumed to be exponentially distributed, and a background noise component, assumed to have a normal distribution. A conditional expectation is calculated to estimate signal. Estimates of the mean and variance of the normal distribution and the rate parameter of the exponential distribution are needed to calculate this expectation. Simulation studies show that the current estimates are flawed; therefore, new ones are suggested. We examine the performance of preprocessing under the exponential-normal convolution model using several different methods to estimate the parameters.
New Approaches to Robust Confidence Intervals for Location: A Simulation Study.
1984-06-01
obtain a denominator for the test statistic. Those statistics based on location estimates derived from Hampel’s redescending influence function or v...defined an influence function for a test in terms of the behavior of its P-values when the data are sampled from a model distribution modified by point...proposal could be used for interval estimation as well as hypothesis testing, the extension is immediate. Once an influence function has been defined
A distributed automatic target recognition system using multiple low resolution sensors
NASA Astrophysics Data System (ADS)
Yue, Zhanfeng; Lakshmi Narasimha, Pramod; Topiwala, Pankaj
2008-04-01
In this paper, we propose a multi-agent system which uses swarming techniques to perform high accuracy Automatic Target Recognition (ATR) in a distributed manner. The proposed system can co-operatively share the information from low-resolution images of different looks and use this information to perform high accuracy ATR. An advanced, multiple-agent Unmanned Aerial Vehicle (UAV) systems-based approach is proposed which integrates the processing capabilities and combines detection reporting with live video exchange and swarm behavior modalities that dramatically surpass individual sensor system performance levels. We employ a real-time block-based motion analysis and compensation scheme for efficient estimation and correction of camera jitter, global motion of the camera/scene and the effects of atmospheric turbulence. Our optimized Partition Weighted Sum (PWS) approach requires only bitshifts and additions, yet achieves a stunning 16X pixel resolution enhancement, and is moreover parallelizable. We develop advanced, adaptive particle-filtering based algorithms to robustly track multiple mobile targets by adaptively changing the appearance model of the selected targets. The collaborative ATR system utilizes the homographies between the sensors induced by the ground plane to overlap the local observation with the received images from other UAVs. The motion of the UAVs distorts the estimated homography from frame to frame. A robust dynamic homography estimation algorithm is proposed to address this, by using the homography decomposition and the ground plane surface estimation.
An evaluation of sex-age-kill (SAK) model performance
Millspaugh, Joshua J.; Skalski, John R.; Townsend, Richard L.; Diefenbach, Duane R.; Boyce, Mark S.; Hansen, Lonnie P.; Kammermeyer, Kent
2009-01-01
The sex-age-kill (SAK) model is widely used to estimate abundance of harvested large mammals, including white-tailed deer (Odocoileus virginianus). Despite a long history of use, few formal evaluations of SAK performance exist. We investigated how violations of the stable age distribution and stationary population assumption, changes to male or female harvest, stochastic effects (i.e., random fluctuations in recruitment and survival), and sampling efforts influenced SAK estimation. When the simulated population had a stable age distribution and λ > 1, the SAK model underestimated abundance. Conversely, when λ < 1, the SAK overestimated abundance. When changes to male harvest were introduced, SAK estimates were opposite the true population trend. In contrast, SAK estimates were robust to changes in female harvest rates. Stochastic effects caused SAK estimates to fluctuate about their equilibrium abundance, but the effect dampened as the size of the surveyed population increased. When we considered both stochastic effects and sampling error at a deer management unit scale the resultant abundance estimates were within ±121.9% of the true population level 95% of the time. These combined results demonstrate extreme sensitivity to model violations and scale of analysis. Without changes to model formulation, the SAK model will be biased when λ ≠ 1. Furthermore, any factor that alters the male harvest rate, such as changes to regulations or changes in hunter attitudes, will bias population estimates. Sex-age-kill estimates may be precise at large spatial scales, such as the state level, but less so at the individual management unit level. Alternative models, such as statistical age-at-harvest models, which require similar data types, might allow for more robust, broad-scale demographic assessments.
Plant Distribution Data Show Broader Climatic Limits than Expert-Based Climatic Tolerance Estimates
Curtis, Caroline A.; Bradley, Bethany A.
2016-01-01
Background Although increasingly sophisticated environmental measures are being applied to species distributions models, the focus remains on using climatic data to provide estimates of habitat suitability. Climatic tolerance estimates based on expert knowledge are available for a wide range of plants via the USDA PLANTS database. We aim to test how climatic tolerance inferred from plant distribution records relates to tolerance estimated by experts. Further, we use this information to identify circumstances when species distributions are more likely to approximate climatic tolerance. Methods We compiled expert knowledge estimates of minimum and maximum precipitation and minimum temperature tolerance for over 1800 conservation plant species from the ‘plant characteristics’ information in the USDA PLANTS database. We derived climatic tolerance from distribution data downloaded from the Global Biodiversity and Information Facility (GBIF) and corresponding climate from WorldClim. We compared expert-derived climatic tolerance to empirical estimates to find the difference between their inferred climate niches (ΔCN), and tested whether ΔCN was influenced by growth form or range size. Results Climate niches calculated from distribution data were significantly broader than expert-based tolerance estimates (Mann-Whitney p values << 0.001). The average plant could tolerate 24 mm lower minimum precipitation, 14 mm higher maximum precipitation, and 7° C lower minimum temperatures based on distribution data relative to expert-based tolerance estimates. Species with larger ranges had greater ΔCN for minimum precipitation and minimum temperature. For maximum precipitation and minimum temperature, forbs and grasses tended to have larger ΔCN while grasses and trees had larger ΔCN for minimum precipitation. 
Conclusion Our results show that distribution data are consistently broader than USDA PLANTS experts’ knowledge and likely provide more robust estimates of climatic tolerance, especially for widespread forbs and grasses. These findings suggest that widely available expert-based climatic tolerance estimates underrepresent species’ fundamental niche and likely fail to capture the realized niche. PMID:27870859
Estimation of descriptive statistics for multiply censored water quality data
Helsel, Dennis R.; Cohn, Timothy A.
1988-01-01
This paper extends the work of Gilliom and Helsel (1986) on procedures for estimating descriptive statistics of water quality data that contain “less than” observations. Previously, procedures were evaluated when only one detection limit was present. Here we investigate the performance of estimators for data that have multiple detection limits. Probability plotting and maximum likelihood methods perform substantially better than simple substitution procedures now commonly in use. Therefore simple substitution procedures (e.g., substitution of the detection limit) should be avoided. Probability plotting methods are more robust than maximum likelihood methods to misspecification of the parent distribution and their use should be encouraged in the typical situation where the parent distribution is unknown. When utilized correctly, less than values frequently contain nearly as much information for estimating population moments and quantiles as would the same observations had the detection limit been below them.
Robust, automatic GPS station velocities and velocity time series
NASA Astrophysics Data System (ADS)
Blewitt, G.; Kreemer, C.; Hammond, W. C.
2014-12-01
Automation in GPS coordinate time series analysis makes results more objective and reproducible, but not necessarily as robust as the human eye to detect problems. Moreover, it is not a realistic option to manually scan our current load of >20,000 time series per day. This motivates us to find an automatic way to estimate station velocities that is robust to outliers, discontinuities, seasonality, and noise characteristics (e.g., heteroscedasticity). Here we present a non-parametric method based on the Theil-Sen estimator, defined as the median of velocities vij=(xj-xi)/(tj-ti) computed between all pairs (i, j). Theil-Sen estimators produce statistically identical solutions to ordinary least squares for normally distributed data, but they can tolerate up to 29% of data being problematic. To mitigate seasonality, our proposed estimator only uses pairs approximately separated by an integer number of years (N-δt)<(tj-ti )<(N+δt), where δt is chosen to be small enough to capture seasonality, yet large enough to reduce random error. We fix N=1 to maximally protect against discontinuities. In addition to estimating an overall velocity, we also use these pairs to estimate velocity time series. To test our methods, we process real data sets that have already been used with velocities published in the NA12 reference frame. Accuracy can be tested by the scatter of horizontal velocities in the North American plate interior, which is known to be stable to ~0.3 mm/yr. This presents new opportunities for time series interpretation. For example, the pattern of velocity variations at the interannual scale can help separate tectonic from hydrological processes. Without any step detection, velocity estimates prove to be robust for stations affected by the Mw7.2 2010 El Mayor-Cucapah earthquake, and velocity time series show a clear change after the earthquake, without any of the usual parametric constraints, such as relaxation of postseismic velocities to their preseismic values.
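Illustrative sketch (not the authors' code): the pair-restricted Theil-Sen velocity described above is the median of slopes over pairs whose time separation is close to N years. The function name, the default tolerance dt = 0.1 yr, and the assumption that times are in decimal years and sorted in increasing order are choices made here for illustration only.

```python
from statistics import median


def theil_sen_velocity(t, x, n_years=1.0, dt=0.1):
    """Median slope over pairs (i, j) with n_years - dt < t[j] - t[i] < n_years + dt.

    t : observation epochs in decimal years, sorted in increasing order
    x : coordinate values at those epochs
    """
    slopes = []
    for i in range(len(t)):
        for j in range(i + 1, len(t)):
            separation = t[j] - t[i]
            if n_years - dt < separation < n_years + dt:
                slopes.append((x[j] - x[i]) / separation)
    if not slopes:
        raise ValueError("no pairs fall inside the requested separation window")
    return median(slopes)
```

Pairing points about one year apart cancels a repeating annual signal in each slope, and taking the median keeps a small number of outlying epochs from shifting the result.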
Comparison of mode estimation methods and application in molecular clock analysis
NASA Technical Reports Server (NTRS)
Hedges, S. Blair; Shah, Prachi
2003-01-01
BACKGROUND: Distributions of time estimates in molecular clock studies are sometimes skewed or contain outliers. In those cases, the mode is a better estimator of the overall time of divergence than the mean or median. However, different methods are available for estimating the mode. We compared these methods in simulations to determine their strengths and weaknesses and further assessed their performance when applied to real data sets from a molecular clock study. RESULTS: We found that the half-range mode and robust parametric mode methods have a lower bias than other mode methods under a diversity of conditions. However, the half-range mode suffers from a relatively high variance and the robust parametric mode is more susceptible to bias by outliers. We determined that bootstrapping reduces the variance of both mode estimators. Application of the different methods to real data sets yielded results that were concordant with the simulations. CONCLUSION: Because the half-range mode is a simple and fast method, and produced less bias overall in our simulations, we recommend the bootstrapped version of it as a general-purpose mode estimator and suggest a bootstrap method for obtaining the standard error and 95% confidence interval of the mode.
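Illustrative sketch (an assumption-laden simplification, not the authors' procedure): a half-range mode can be computed by repeatedly keeping the densest window whose width is half the current data range, until at most two values remain. Tie-breaking (first densest window wins), the stopping rule, and the bootstrap summary (mean and standard deviation of resampled modes) are choices made here, and the function names are hypothetical.

```python
import random
from statistics import mean, stdev


def half_range_mode(values):
    """Iteratively shrink to the densest window of half the current range."""
    data = sorted(values)
    while len(data) > 2:
        width = (data[-1] - data[0]) / 2.0
        if width == 0:
            break  # all remaining values are identical
        best_lo, best_hi = 0, 0
        hi = 0
        for lo in range(len(data)):
            if hi < lo:
                hi = lo
            # Extend the window [data[lo], data[lo] + width] as far as possible.
            while hi + 1 < len(data) and data[hi + 1] - data[lo] <= width:
                hi += 1
            if hi - lo > best_hi - best_lo:
                best_lo, best_hi = lo, hi
        data = data[best_lo:best_hi + 1]
    return sum(data) / len(data)


def bootstrap_half_range_mode(values, n_boot=200, seed=0):
    """Bootstrap the mode: returns (mean of resampled modes, their std. dev.)."""
    rng = random.Random(seed)
    modes = [
        half_range_mode(rng.choices(values, k=len(values)))
        for _ in range(n_boot)
    ]
    return mean(modes), stdev(modes)
```

The standard deviation of the resampled modes can serve as a rough standard error, in the spirit of the bootstrap recommendation in the abstract.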
Han, Fang; Liu, Han
2017-02-01
The correlation matrix plays a key role in many multivariate methods (e.g., graphical model estimation and factor analysis). The current state-of-the-art in estimating large correlation matrices focuses on the use of Pearson's sample correlation matrix. Although Pearson's sample correlation matrix enjoys various good properties under Gaussian models, it is not an effective estimator when facing heavy-tail distributions with possible outliers. As a robust alternative, Han and Liu (2013b) advocated the use of a transformed version of the Kendall's tau sample correlation matrix in estimating high dimensional latent generalized correlation matrix under the transelliptical distribution family (or elliptical copula). The transelliptical family assumes that after unspecified marginal monotone transformations, the data follow an elliptical distribution. In this paper, we study the theoretical properties of the Kendall's tau sample correlation matrix and its transformed version proposed in Han and Liu (2013b) for estimating the population Kendall's tau correlation matrix and the latent Pearson's correlation matrix under both spectral and restricted spectral norms. With regard to the spectral norm, we highlight the role of "effective rank" in quantifying the rate of convergence. With regard to the restricted spectral norm, we for the first time present a "sign subgaussian condition" which is sufficient to guarantee that the rank-based correlation matrix estimator attains the optimal rate of convergence. In both cases, we do not need any moment condition.
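Illustrative sketch (not the authors' code): for a single pair of variables, a rank-based correlation estimate can be formed from Kendall's tau. The abstract does not spell out the transformation; the sketch assumes the sine transform sin(pi/2 * tau), which is the standard link between Kendall's tau and the Pearson correlation for elliptical distributions. The function names are hypothetical, and ties contribute zero to the tau sum.

```python
import math


def kendall_tau(x, y):
    """Kendall's tau: average sign agreement over all pairs of observations."""
    n = len(x)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            product = (x[i] - x[j]) * (y[i] - y[j])
            score += (product > 0) - (product < 0)  # +1 concordant, -1 discordant
    return 2.0 * score / (n * (n - 1))


def latent_correlation(x, y):
    """Sine-transformed Kendall's tau (assumed form of the transformed estimator)."""
    return math.sin(math.pi / 2.0 * kendall_tau(x, y))
```

Because tau depends only on the ordering of the data, the estimate is unchanged by strictly increasing transformations of either variable, which matches the transelliptical setting described above.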
M-estimation for robust sparse unmixing of hyperspectral images
NASA Astrophysics Data System (ADS)
Toomik, Maria; Lu, Shijian; Nelson, James D. B.
2016-10-01
Hyperspectral unmixing methods often use a conventional least squares based lasso which assumes that the data follows the Gaussian distribution. The normality assumption is an approximation which is generally invalid for real imagery data. We consider a robust (non-Gaussian) approach to sparse spectral unmixing of remotely sensed imagery which reduces the sensitivity of the estimator to outliers and relaxes the linearity assumption. The method consists of several appropriate penalties. We propose to use an lp norm with 0 < p < 1 in the sparse regression problem, which induces more sparsity in the results, but makes the problem non-convex. On the other hand, the problem, though non-convex, can be solved quite straightforwardly with an extensible algorithm based on iteratively reweighted least squares. To deal with the huge size of modern spectral libraries we introduce a library reduction step, similar to the multiple signal classification (MUSIC) array processing algorithm, which not only speeds up unmixing but also yields superior results. In the hyperspectral setting we extend the traditional least squares method to the robust heavy-tailed case and propose a generalised M-lasso solution. M-estimation replaces the Gaussian likelihood with a fixed function ρ(e) that restrains outliers. The M-estimate function reduces the effect of errors with large amplitudes or even assigns the outliers zero weights. Our experimental results on real hyperspectral data show that noise with large amplitudes (outliers) often exists in the data. This ability to mitigate the influence of such outliers can therefore offer greater robustness. Qualitative hyperspectral unmixing results on real hyperspectral image data corroborate the efficacy of the proposed method.
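Illustrative sketch (not the proposed M-lasso): the core of M-estimation by iteratively reweighted least squares can be shown on a one-predictor line fit with Huber weights. The sparsity penalty, the lp norm and the library reduction step from the abstract are all omitted; the tuning constant 1.345, the MAD-based scale and the function names are assumptions made here.

```python
from statistics import median


def huber_weight(r, c=1.345):
    """Huber weight: 1 for small standardized residuals, c/|r| for large ones."""
    a = abs(r)
    return 1.0 if a <= c else c / a


def irls_line(xs, ys, c=1.345, n_iter=50):
    """Fit y = slope * x + intercept by IRLS with Huber weights."""
    w = [1.0] * len(xs)  # the first pass is ordinary least squares
    slope = intercept = 0.0
    for _ in range(n_iter):
        sw = sum(w)
        mx = sum(wi * x for wi, x in zip(w, xs)) / sw
        my = sum(wi * y for wi, y in zip(w, ys)) / sw
        sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
        sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
        slope = sxy / sxx
        intercept = my - slope * mx
        residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
        # Robust scale from the median absolute residual; guard against zero.
        scale = median([abs(r) for r in residuals]) / 0.6745
        if scale == 0:
            scale = 1e-12
        w = [huber_weight(r / scale, c) for r in residuals]
    return slope, intercept
```

Calling irls_line with n_iter=1 returns the plain least squares fit, which gives a convenient baseline for seeing how much the reweighting reduces the pull of large-amplitude errors.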
Nonparametric estimation of plant density by the distance method
Patil, S.A.; Burnham, K.P.; Kovner, J.L.
1979-01-01
A relation between the plant density and the probability density function of the nearest neighbor distance (squared) from a random point is established under fairly broad conditions. Based upon this relationship, a nonparametric estimator for the plant density is developed and presented in terms of order statistics. Consistency and asymptotic normality of the estimator are discussed. An interval estimator for the density is obtained. The modifications of this estimator and its variance are given when the distribution is truncated. Simulation results are presented for regular, random and aggregated populations to illustrate the nonparametric estimator and its variance. A numerical example from field data is given. Merits and deficiencies of the estimator are discussed with regard to its robustness and variance.
A spatially explicit capture-recapture estimator for single-catch traps.
Distiller, Greg; Borchers, David L
2015-11-01
Single-catch traps are frequently used in live-trapping studies of small mammals. Thus far, a likelihood for single-catch traps has proven elusive and usually the likelihood for multicatch traps is used for spatially explicit capture-recapture (SECR) analyses of such data. Previous work found the multicatch likelihood to provide a robust estimator of average density. We build on a recently developed continuous-time model for SECR to derive a likelihood for single-catch traps. We use this to develop an estimator based on observed capture times and compare its performance by simulation to that of the multicatch estimator for various scenarios with nonconstant density surfaces. While the multicatch estimator is found to be a surprisingly robust estimator of average density, its performance deteriorates with high trap saturation and increasing density gradients. Moreover, it is found to be a poor estimator of the height of the detection function. By contrast, the single-catch estimators of density, distribution, and detection function parameters are found to be unbiased or nearly unbiased in all scenarios considered. This gain comes at the cost of higher variance. If there is no interest in interpreting the detection function parameters themselves, and if density is expected to be fairly constant over the survey region, then the multicatch estimator performs well with single-catch traps. However if accurate estimation of the detection function is of interest, or if density is expected to vary substantially in space, then there is merit in using the single-catch estimator when trap saturation is above about 60%. The estimator's performance is improved if care is taken to place traps so as to span the range of variables that affect animal distribution. As a single-catch likelihood with unknown capture times remains intractable for now, researchers using single-catch traps should aim to incorporate timing devices with their traps.
Meng, Qing-Hao; Yang, Wei-Xing; Wang, Yang; Zeng, Ming
2011-01-01
This paper addresses the collective odor source localization (OSL) problem in a time-varying airflow environment using mobile robots. A novel OSL methodology which combines odor-source probability estimation and multiple robots’ search is proposed. The estimation phase consists of two steps: firstly, the separate probability-distribution map of odor source is estimated via Bayesian rules and fuzzy inference based on a single robot’s detection events; secondly, the separate maps estimated by different robots at different times are fused into a combined map by way of distance based superposition. The multi-robot search behaviors are coordinated via a particle swarm optimization algorithm, where the estimated odor-source probability distribution is used to express the fitness functions. In the process of OSL, the estimation phase provides the prior knowledge for the searching while the searching verifies the estimation results, and both phases are implemented iteratively. The results of simulations for large-scale advection–diffusion plume environments and experiments using real robots in an indoor airflow environment validate the feasibility and robustness of the proposed OSL method. PMID:22346650
Robust estimators of palaeosecular variation
NASA Astrophysics Data System (ADS)
Suttie, Neil; Biggin, Andrew; Holme, Richard
2015-02-01
The Fisher distribution is central to palaeomagnetism but presents several problems when used to characterize geomagnetic field directions as observed in sequences of volcanic rocks. First, it introduces a shallowing effect when used to define the mean of any group of directional unit vectors. This is problematic because it can suggest the presence of persistent non-axial dipole components when none are present. More importantly, it fails to capture the observed 'long tail' in distributions of both directions and associated virtual geomagnetic poles in terms of angular distance from a central direction. To achieve a good fit to data, it therefore requires the introduction of a second distribution (and therefore the estimation of additional parameters) or the arbitrary removal of data. Here we present a new distribution to describe palaeomagnetic directions and demonstrate that it overcomes both of these problems, generating robust indicators of both the central direction (or pole position) and the spread of palaeomagnetic data as defined by unit vectors. Starting from the assumption that poles (or directions) have an expected colatitude, rather than a mean location, we derive the spherical exponential distribution. We demonstrate that this new distribution provides a good fit to palaeomagnetic data sets from seven large igneous provinces between 15 and 65 Ma and also those produced by numerical dynamo models. We also use it to derive a new shape parameter which may be used as a diagnostic tool for testing goodness of fit of models to data and use this to argue for a shift in geomagnetic behaviour between 5 and 15 Ma. Furthermore, we point out that this new statistic can be used to determine the most appropriate distribution to be used when constructing confidence limits for poles.
Data-Adaptive Bias-Reduced Doubly Robust Estimation.
Vermeulen, Karel; Vansteelandt, Stijn
2016-05-01
Doubly robust estimators have now been proposed for a variety of target parameters in the causal inference and missing data literature. These consistently estimate the parameter of interest under a semiparametric model when one of two nuisance working models is correctly specified, regardless of which. The recently proposed bias-reduced doubly robust estimation procedure aims to partially retain this robustness in more realistic settings where both working models are misspecified. These so-called bias-reduced doubly robust estimators make use of special (finite-dimensional) nuisance parameter estimators that are designed to locally minimize the squared asymptotic bias of the doubly robust estimator in certain directions of these finite-dimensional nuisance parameters under misspecification of both parametric working models. In this article, we extend this idea to incorporate the use of data-adaptive estimators (infinite-dimensional nuisance parameters), by exploiting the bias reduction estimation principle in the direction of only one nuisance parameter. We additionally provide an asymptotic linearity theorem which gives the influence function of the proposed doubly robust estimator under correct specification of a parametric nuisance working model for the missingness mechanism/propensity score but a possibly misspecified (finite- or infinite-dimensional) outcome working model. Simulation studies confirm the desirable finite-sample performance of the proposed estimators relative to a variety of other doubly robust estimators.
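Illustrative sketch (the standard augmented inverse-probability-weighted form, not the bias-reduced procedure of the article): a doubly robust estimate of a mean outcome under missing data combines a propensity (missingness) model with an outcome model. The function name and argument layout are assumptions; the fitted propensities and outcome predictions are taken as given inputs rather than estimated here.

```python
def aipw_mean(y, observed, propensity, outcome_pred):
    """Augmented IPW estimate of the mean of y.

    y            : outcomes (any placeholder, e.g. None, where missing)
    observed     : 1 if y[i] was observed, 0 if missing
    propensity   : modelled probability that y[i] is observed
    outcome_pred : outcome-model prediction of y[i] from covariates
    """
    total = 0.0
    for yi, ri, pi, mi in zip(y, observed, propensity, outcome_pred):
        ipw_term = yi / pi if ri else 0.0
        # The augmentation term corrects the IPW term using the outcome model.
        total += ipw_term - (ri - pi) / pi * mi
    return total / len(y)
```

The estimate is consistent if either the propensity model or the outcome model is correctly specified, which is the double robustness property the abstract refers to; the article's concern is what happens when both are wrong.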
Nagata, Motoki; Hirata, Yoshito; Fujiwara, Naoya; Tanaka, Gouhei; Suzuki, Hideyuki; Aihara, Kazuyuki
2017-03-01
In this paper, we show that spatial correlation of renewable energy outputs greatly influences the robustness of the power grids against large fluctuations of the effective power. First, we evaluate the spatial correlation among renewable energy outputs. We find that the spatial correlation of renewable energy outputs depends on the locations, while the influence of the spatial correlation of renewable energy outputs on power grids is not well known. Thus, second, by employing the topology of the power grid in eastern Japan, we analyze the robustness of the power grid with spatial correlation of renewable energy outputs. The analysis is performed by using a realistic differential-algebraic equations model. The results show that the spatial correlation of the energy resources strongly degrades the robustness of the power grid. Our results suggest that we should consider the spatial correlation of the renewable energy outputs when estimating the stability of power grids.
A random effects meta-analysis model with Box-Cox transformation.
Yamaguchi, Yusuke; Maruo, Kazushi; Partlett, Christopher; Riley, Richard D
2017-07-19
In a random effects meta-analysis model, true treatment effects for each study are routinely assumed to follow a normal distribution. However, normality is a restrictive assumption and the misspecification of the random effects distribution may result in a misleading estimate of overall mean for the treatment effect, an inappropriate quantification of heterogeneity across studies and a wrongly symmetric prediction interval. We focus on problems caused by an inappropriate normality assumption of the random effects distribution, and propose a novel random effects meta-analysis model where a Box-Cox transformation is applied to the observed treatment effect estimates. The proposed model aims to normalise an overall distribution of observed treatment effect estimates, which is the sum of the within-study sampling distributions and the random effects distribution. When sampling distributions are approximately normal, non-normality in the overall distribution will be mainly due to the random effects distribution, especially when the between-study variation is large relative to the within-study variation. The Box-Cox transformation addresses this flexibly according to the observed departure from normality. We use a Bayesian approach for estimating parameters in the proposed model, and suggest summarising the meta-analysis results by an overall median, an interquartile range and a prediction interval. The model can be applied for any kind of variables once the treatment effect estimate is defined from the variable. A simulation study suggested that when the overall distribution of treatment effect estimates is skewed, the overall mean and conventional I^2 from the normal random effects model could be inappropriate summaries, and the proposed model helped reduce this issue. We illustrated the proposed model using two examples, which revealed some important differences in summary results, heterogeneity measures and prediction intervals from the normal random effects model. 
The random effects meta-analysis with the Box-Cox transformation may be an important tool for examining robustness of traditional meta-analysis results against skewness on the observed treatment effect estimates. Further critical evaluation of the method is needed.
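Illustrative sketch (the textbook one-parameter Box-Cox transformation, which may differ from the exact variant used in the article): the transformation and its inverse are shown for strictly positive inputs. Treatment effect estimates can be negative, so some shift or alternative parameterisation would be needed in practice; that detail is not given in the abstract and is not guessed at here. The function names are hypothetical.

```python
import math


def box_cox(y, lam):
    """Box-Cox transform of a strictly positive value y with parameter lam."""
    if y <= 0:
        raise ValueError("the one-parameter Box-Cox transform needs y > 0")
    if lam == 0:
        return math.log(y)
    return (y ** lam - 1.0) / lam


def inverse_box_cox(z, lam):
    """Map a transformed value back to the original scale."""
    if lam == 0:
        return math.exp(z)
    return (lam * z + 1.0) ** (1.0 / lam)
```

Summaries such as the overall median and prediction interval limits are quantiles, so they can be computed on the transformed scale and carried back through the inverse, which is one reason quantile-based summaries fit naturally with a transformation model.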
Uehara, Takashi; Sartori, Matteo; Tanaka, Toshihisa; Fiori, Simone
2017-06-01
The estimation of covariance matrices is of prime importance to analyze the distribution of multivariate signals. In motor imagery-based brain-computer interfaces (MI-BCI), covariance matrices play a central role in the extraction of features from recorded electroencephalograms (EEGs); therefore, correctly estimating covariance is crucial for EEG classification. This letter discusses algorithms to average sample covariance matrices (SCMs) for the selection of the reference matrix in tangent space mapping (TSM)-based MI-BCI. Tangent space mapping is a powerful method of feature extraction and strongly depends on the selection of a reference covariance matrix. In general, the observed signals may include outliers; therefore, taking the geometric mean of SCMs as the reference matrix may not be the best choice. In order to deal with the effects of outliers, robust estimators have to be used. In particular, we discuss and test the use of geometric medians and trimmed averages (defined on the basis of several metrics) as robust estimators. The main idea behind trimmed averages is to eliminate data that exhibit the largest distance from the average covariance calculated on the basis of all available data. The results of the experiments show that while the geometric medians show little differences from conventional methods in terms of classification accuracy in the classification of electroencephalographic recordings, the trimmed averages show significant improvement for all subjects.
Robust Gaussian Graphical Modeling via l1 Penalization
Sun, Hokeun; Li, Hongzhe
2012-01-01
Gaussian graphical models have been widely used as an effective method for studying the conditional independency structure among genes and for constructing genetic networks. However, gene expression data typically have heavier tails or more outlying observations than the standard Gaussian distribution. Such outliers in gene expression data can lead to wrong inference on the dependency structure among the genes. We propose an l1 penalized estimation procedure for the sparse Gaussian graphical models that is robustified against possible outliers. The likelihood function is weighted according to how far each observation deviates, where the deviation of the observation is measured based on its own likelihood. An efficient computational algorithm based on the coordinate gradient descent method is developed to obtain the minimizer of the negative penalized robustified-likelihood, where nonzero elements of the concentration matrix represent the graphical links among the genes. After the graphical structure is obtained, we re-estimate the positive definite concentration matrix using an iterative proportional fitting algorithm. Through simulations, we demonstrate that the proposed robust method performs much better than the graphical Lasso for the Gaussian graphical models in terms of both graph structure selection and estimation when outliers are present. We apply the robust estimation procedure to an analysis of yeast gene expression data and show that the resulting graph has better biological interpretation than that obtained from the graphical Lasso. PMID:23020775
Jiang, Shenghang; Park, Seongjin; Challapalli, Sai Divya; Fei, Jingyi; Wang, Yong
2017-01-01
We report a robust nonparametric descriptor, J′(r), for quantifying the density of clustering molecules in single-molecule localization microscopy. J′(r), based on nearest neighbor distribution functions, does not require any parameter as an input for analyzing point patterns. We show that J′(r) displays a valley shape in the presence of clusters of molecules, and the characteristics of the valley reliably report the clustering features in the data. Most importantly, the position of the J′(r) valley (rJm′) depends exclusively on the density of clustering molecules (ρc). Therefore, it is ideal for direct estimation of the clustering density of molecules in single-molecule localization microscopy. As an example, this descriptor was applied to estimate the clustering density of ptsG mRNA in E. coli bacteria. PMID:28636661
Zheng, Wenjing; van der Laan, Mark
2017-01-01
In this paper, we study the effect of a time-varying exposure mediated by a time-varying intermediate variable. We consider general longitudinal settings, including survival outcomes. At a given time point, the exposure and mediator of interest are influenced by past covariates, mediators and exposures, and affect future covariates, mediators and exposures. Right censoring, if present, occurs in response to past history. To address the challenges in mediation analysis that are unique to these settings, we propose a formulation in terms of random interventions based on conditional distributions for the mediator. This formulation, in particular, allows for well-defined natural direct and indirect effects in the survival setting, and natural decomposition of the standard total effect. Upon establishing identifiability and the corresponding statistical estimands, we derive the efficient influence curves and establish their robustness properties. Applying Targeted Maximum Likelihood Estimation, we use these efficient influence curves to construct multiply robust and efficient estimators. We also present an inverse probability weighted estimator and a nested non-targeted substitution estimator for these parameters. PMID:29387520
Aerial Surveys Give New Estimates for Orangutans in Sabah, Malaysia
Gimenez, Olivier; Ambu, Laurentius; Ancrenaz, Karine; Andau, Patrick; Goossens, Benoît; Payne, John; Sawang, Azri; Tuuga, Augustine; Lackman-Ancrenaz, Isabelle
2005-01-01
Great apes are threatened with extinction, but precise information about the distribution and size of most populations is currently lacking. We conducted orangutan nest counts in the Malaysian state of Sabah (North Borneo), using a combination of ground and helicopter surveys, and provided a way to estimate the current distribution and size of the populations living throughout the entire state. We show that the number of nests detected during aerial surveys is directly related to the estimated true animal density and that a helicopter is an efficient tool to provide robust estimates of orangutan numbers. Our results reveal that with a total estimated population size of about 11,000 individuals, Sabah is one of the main strongholds for orangutans in North Borneo. More than 60% of orangutans living in the state occur outside protected areas, in production forests that have been through several rounds of logging extraction and are still exploited for timber. The role of exploited forests clearly merits further investigation for orangutan conservation in Sabah. PMID:15630475
A game theory approach to target tracking in sensor networks.
Gu, Dongbing
2011-02-01
In this paper, we investigate a moving-target tracking problem with sensor networks. Each sensor node has a sensor to observe the target and a processor to estimate the target position. It also has wireless communication capability but with limited range and can only communicate with neighbors. The moving target is assumed to be an intelligent agent, which is "smart" enough to escape detection by maximizing the estimation error. This adversarial behavior makes the target tracking problem more difficult. We formulate this target estimation problem as a zero-sum game in this paper and use a minimax filter to estimate the target position. The minimax filter is a robust filter that minimizes the estimation error by considering the worst case noise. Furthermore, we develop a distributed version of the minimax filter for multiple sensor nodes. The distributed computation is implemented by modeling the information received from neighbors as measurements in the minimax filter. The simulation results show that the target tracking algorithm proposed in this paper provides a satisfactory result.
EVALUATION OF A NEW MEAN SCALED AND MOMENT ADJUSTED TEST STATISTIC FOR SEM.
Tong, Xiaoxiao; Bentler, Peter M
2013-01-01
Recently a new mean scaled and skewness adjusted test statistic was developed for evaluating structural equation models in small samples and with potentially nonnormal data, but this statistic has received only limited evaluation. The performance of this statistic is compared to normal theory maximum likelihood and two well-known robust test statistics. A modification to the Satorra-Bentler scaled statistic is developed for the condition that sample size is smaller than degrees of freedom. The behavior of the four test statistics is evaluated with a Monte Carlo confirmatory factor analysis study that varies seven sample sizes and three distributional conditions obtained using Headrick's fifth-order transformation to nonnormality. The new statistic performs badly in most conditions except under the normal distribution. The goodness-of-fit χ² test based on maximum-likelihood estimation performed well under normal distributions as well as under a condition of asymptotic robustness. The Satorra-Bentler scaled test statistic performed best overall, while the mean scaled and variance adjusted test statistic outperformed the others at small and moderate sample sizes under certain distributional conditions.
Distribution path robust optimization of electric vehicle with multiple distribution centers
Hao, Wei; He, Ruichun; Jia, Xiaoyan; Pan, Fuquan; Fan, Jing; Xiong, Ruiqi
2018-01-01
To identify electric vehicle (EV) distribution paths with high robustness, insensitivity to uncertainty factors, and detailed road-by-road schemes, optimization of the distribution path problem for EVs with multiple distribution centers and considering the charging facilities is necessary. With the minimum transport time as the goal, a robust optimization model of EV distribution path with adjustable robustness is established based on Bertsimas’ theory of robust discrete optimization. An enhanced three-segment genetic algorithm is also developed to solve the model, such that the optimal distribution scheme initially contains all road-by-road path data using the three-segment mixed coding and decoding method. During genetic manipulation, different interlacing and mutation operations are carried out on different chromosomes, while, during population evolution, the infeasible solution is naturally avoided. A part of the road network of Xifeng District in Qingyang City is taken as an example to test the model and the algorithm in this study, and the concrete transportation paths are utilized in the final distribution scheme. Therefore, more robust EV distribution paths with multiple distribution centers can be obtained using the robust optimization model. PMID:29518169
Polynomial chaos representation of databases on manifolds
DOE Office of Scientific and Technical Information (OSTI.GOV)
Soize, C., E-mail: christian.soize@univ-paris-est.fr; Ghanem, R., E-mail: ghanem@usc.edu
2017-04-15
Characterizing the polynomial chaos expansion (PCE) of a vector-valued random variable with probability distribution concentrated on a manifold is a relevant problem in data-driven settings. The probability distribution of such random vectors is multimodal in general, leading to potentially very slow convergence of the PCE. In this paper, we build on a recent development for estimating and sampling from probabilities concentrated on a diffusion manifold. The proposed methodology constructs a PCE of the random vector together with an associated generator that samples from the target probability distribution which is estimated from data concentrated in the neighborhood of the manifold. The method is robust and remains efficient for high dimension and large datasets. The resulting polynomial chaos construction on manifolds permits the adaptation of many uncertainty quantification and statistical tools to emerging questions motivated by data-driven queries.
Concordance measure and discriminatory accuracy in transformation cure models.
Zhang, Yilong; Shao, Yongzhao
2018-01-01
Many populations of early-stage cancer patients have non-negligible latent cure fractions that can be modeled using transformation cure models. However, there is a lack of statistical metrics to evaluate prognostic utility of biomarkers in this context due to the challenges associated with unknown cure status and heavy censorship. In this article, we develop general concordance measures as evaluation metrics for the discriminatory accuracy of transformation cure models including the so-called promotion time cure models and mixture cure models. We introduce explicit formulas for the consistent estimates of the concordance measures, and show that their asymptotically normal distributions do not depend on the unknown censoring distribution. The estimates work for both parametric and semiparametric transformation models as well as transformation cure models. Numerical feasibility of the estimates and their robustness to the censoring distributions are illustrated via simulation studies and demonstrated using a melanoma data set. © The Author 2017. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Magis, David
2014-11-01
In item response theory, the classical estimators of ability are highly sensitive to response disturbances and can return strongly biased estimates of the true underlying ability level. Robust methods were introduced to lessen the impact of such aberrant responses on the estimation process. The computation of asymptotic (i.e., large-sample) standard errors (ASE) for these robust estimators, however, has not yet been fully considered. This paper focuses on a broad class of robust ability estimators, defined by an appropriate selection of the weight function and the residual measure, for which the ASE is derived from the theory of estimating equations. The maximum likelihood (ML) and the robust estimators, together with their estimated ASEs, are then compared in a simulation study by generating random guessing disturbances. It is concluded that both the estimators and their ASE perform similarly in the absence of random guessing, while the robust estimator and its estimated ASE are less biased and outperform their ML counterparts in the presence of random guessing with large impact on the item response process. © 2013 The British Psychological Society.
Zhu, Zhengfei; Liu, Wei; Gillin, Michael; Gomez, Daniel R; Komaki, Ritsuko; Cox, James D; Mohan, Radhe; Chang, Joe Y
2014-05-06
We assessed the robustness of passive scattering proton therapy (PSPT) plans for patients in a phase II trial of PSPT for stage III non-small cell lung cancer (NSCLC) by using the worst-case scenario method, and compared the worst-case dose distributions with the appearance of locally recurrent lesions. Worst-case dose distributions were generated for each of 9 patients who experienced recurrence after concurrent chemotherapy and PSPT to 74 Gy(RBE) for stage III NSCLC by simulating and incorporating uncertainties associated with set-up, respiration-induced organ motion, and proton range in the planning process. The worst-case CT scans were then fused with the positron emission tomography (PET) scans to locate the recurrence. Although the volumes enclosed by the prescription isodose lines in the worst-case dose distributions were consistently smaller than enclosed volumes in the nominal plans, the target dose coverage was not significantly affected: only one patient had a recurrence outside the prescription isodose lines in the worst-case plan. PSPT is a relatively robust technique. Local recurrence was not associated with target underdosage resulting from estimated uncertainties in 8 of 9 cases.
Sworn testimony of the model evidence: Gaussian Mixture Importance (GAME) sampling
NASA Astrophysics Data System (ADS)
Volpi, Elena; Schoups, Gerrit; Firmani, Giovanni; Vrugt, Jasper A.
2017-07-01
What is the "best" model? The answer to this question lies in part in the eyes of the beholder, nevertheless a good model must blend rigorous theory with redeeming qualities such as parsimony and quality of fit. Model selection is used to make inferences, via weighted averaging, from a set of K candidate models, $M_k$; $k = (1,\ldots,K)$, and help identify which model is most supported by the observed data, $\tilde{Y} = (\tilde{y}_1,\ldots,\tilde{y}_n)$. Here, we introduce a new and robust estimator of the model evidence, $p(\tilde{Y}|M_k)$, which acts as normalizing constant in the denominator of Bayes' theorem and provides a single quantitative measure of relative support for each hypothesis that integrates model accuracy, uncertainty, and complexity. However, $p(\tilde{Y}|M_k)$ is analytically intractable for most practical modeling problems. Our method, coined GAussian Mixture importancE (GAME) sampling, uses bridge sampling of a mixture distribution fitted to samples of the posterior model parameter distribution derived from MCMC simulation. We benchmark the accuracy and reliability of GAME sampling by application to a diverse set of multivariate target distributions (up to 100 dimensions) with known values of $p(\tilde{Y}|M_k)$ and to hypothesis testing using numerical modeling of the rainfall-runoff transformation of the Leaf River watershed in Mississippi, USA. These case studies demonstrate that GAME sampling provides robust and unbiased estimates of the evidence at a relatively small computational cost outperforming commonly used estimators. The GAME sampler is implemented in the MATLAB package of DREAM and simplifies considerably scientific inquiry through hypothesis testing and model selection.
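To make the notion of estimating the model evidence from posterior samples concrete, here is a deliberately simplified sketch. It is not GAME sampling: it uses plain importance sampling with a single Gaussian proposal rather than bridge sampling with a fitted mixture, and the toy model (prior N(0, 1), one observation y with likelihood N(y | θ, 1)) is an assumption chosen because its evidence is known in closed form. All function names are hypothetical.

```python
import math
import random
import statistics


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def exact_evidence(y):
    """Closed-form evidence for the toy model: marginally, y ~ N(0, sqrt(2))."""
    return normal_pdf(y, 0.0, math.sqrt(2.0))


def evidence_by_importance_sampling(y, n_draws=20000, inflate=1.2, seed=0):
    """Estimate the evidence using a Gaussian proposal fitted to posterior draws."""
    rng = random.Random(seed)
    # Stand-in for MCMC output: the exact posterior of the toy model is N(y/2, sqrt(1/2)).
    posterior_draws = [rng.gauss(y / 2.0, math.sqrt(0.5)) for _ in range(n_draws)]
    # Fit the proposal to the draws; slightly inflate its spread to cover the tails.
    q_mean = statistics.fmean(posterior_draws)
    q_sd = inflate * statistics.stdev(posterior_draws)
    total = 0.0
    for _ in range(n_draws):
        theta = rng.gauss(q_mean, q_sd)
        prior = normal_pdf(theta, 0.0, 1.0)
        likelihood = normal_pdf(y, theta, 1.0)
        total += prior * likelihood / normal_pdf(theta, q_mean, q_sd)
    return total / n_draws
```

Comparing `evidence_by_importance_sampling(1.0)` with `exact_evidence(1.0)` gives a quick sanity check of the estimator on a case where the answer is known, which mirrors the benchmarking strategy mentioned in the abstract.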
Schwacke, Lori H; Hall, Ailsa J; Townsend, Forrest I; Wells, Randall S; Hansen, Larry J; Hohn, Aleta A; Bossart, Gregory D; Fair, Patricia A; Rowles, Teresa K
2009-08-01
To develop robust reference intervals for hematologic and serum biochemical variables by use of data derived from free-ranging bottlenose dolphins (Tursiops truncatus) and examine potential variation in distributions of clinicopathologic values related to sampling sites' geographic locations. 255 free-ranging bottlenose dolphins. Data from samples collected during multiple bottlenose dolphin capture-release projects conducted at 4 southeastern US coastal locations in 2000 through 2006 were combined to determine reference intervals for 52 clinicopathologic variables. A nonparametric bootstrap approach was applied to estimate 95th percentiles and associated 90% confidence intervals; the need for partitioning by length and sex classes was determined by testing for differences in estimated thresholds with a bootstrap method. When appropriate, quantile regression was used to determine continuous functions for 95th percentiles dependent on length. The proportion of out-of-range samples for all clinicopathologic measurements was examined for each geographic site, and multivariate ANOVA was applied to further explore variation in leukocyte subgroups. A need for partitioning by length and sex classes was indicated for many clinicopathologic variables. For each geographic site, few significant deviations from expected number of out-of-range samples were detected. Although mean leukocyte counts did not vary among sites, differences in the mean counts for leukocyte subgroups were identified. Although differences in the centrality of distributions for some variables were detected, the 95th percentiles estimated from the pooled data were robust and applicable across geographic sites. The derived reference intervals provide critical information for conducting bottlenose dolphin population health studies.
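The nonparametric bootstrap step described above (a 95th percentile with a 90% confidence interval) can be sketched as below. This is a generic percentile-bootstrap illustration rather than the authors' procedure; the linear-interpolation percentile rule, the number of resamples, and the names `percentile` and `bootstrap_percentile_ci` are assumptions.

```python
import random


def percentile(values, p):
    """Percentile with linear interpolation between order statistics (0 <= p <= 1)."""
    ordered = sorted(values)
    pos = p * (len(ordered) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    frac = pos - lower
    return ordered[lower] * (1.0 - frac) + ordered[upper] * frac


def bootstrap_percentile_ci(data, p=0.95, level=0.90, n_boot=2000, seed=0):
    """Return (point estimate, CI lower bound, CI upper bound) for the p-th percentile."""
    rng = random.Random(seed)
    n = len(data)
    # Resample with replacement and recompute the percentile each time.
    replicates = [percentile([rng.choice(data) for _ in range(n)], p)
                  for _ in range(n_boot)]
    alpha = (1.0 - level) / 2.0
    return (percentile(data, p),
            percentile(replicates, alpha),
            percentile(replicates, 1.0 - alpha))
```

Running the same function separately on subgroups (for example, by sex or length class) and comparing the resulting intervals is one simple way to judge whether partitioning is needed, although the abstract does not give the exact test used.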
Research reactor loading pattern optimization using estimation of distribution algorithms
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jiang, S.; Ziver, K.; AMCG Group, RM Consultants, Abingdon
2006-07-01
A new evolutionary search based approach for solving the nuclear reactor loading pattern optimization problems is presented based on the Estimation of Distribution Algorithms. The optimization technique developed is then applied to the maximization of the effective multiplication factor (K_eff) of the Imperial College CONSORT research reactor (the last remaining civilian research reactor in the United Kingdom). A new elitism-guided searching strategy has been developed and applied to improve the local convergence together with some problem-dependent information based on the 'stand-alone K_eff with fuel coupling calculations. A comparison study between the EDAs and a Genetic Algorithm with Heuristic Tie Breaking Crossover operator has shown that the new algorithm is efficient and robust. (authors)
Han, Fang; Liu, Han
2016-01-01
The correlation matrix plays a key role in many multivariate methods (e.g., graphical model estimation and factor analysis). The current state-of-the-art in estimating large correlation matrices focuses on the use of Pearson’s sample correlation matrix. Although Pearson’s sample correlation matrix enjoys various good properties under Gaussian models, it is not an effective estimator when facing heavy-tailed distributions with possible outliers. As a robust alternative, Han and Liu (2013b) advocated the use of a transformed version of the Kendall’s tau sample correlation matrix in estimating the high dimensional latent generalized correlation matrix under the transelliptical distribution family (or elliptical copula). The transelliptical family assumes that after unspecified marginal monotone transformations, the data follow an elliptical distribution. In this paper, we study the theoretical properties of the Kendall’s tau sample correlation matrix and its transformed version proposed in Han and Liu (2013b) for estimating the population Kendall’s tau correlation matrix and the latent Pearson’s correlation matrix under both spectral and restricted spectral norms. With regard to the spectral norm, we highlight the role of “effective rank” in quantifying the rate of convergence. With regard to the restricted spectral norm, we for the first time present a “sign subgaussian condition” which is sufficient to guarantee that the rank-based correlation matrix estimator attains the optimal rate of convergence. In both cases, we do not need any moment condition. PMID:28337068
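The rank-based estimator discussed above can be illustrated for a single pair of variables. The sketch assumes the transformed version is the sine transform sin(π/2 · τ) commonly associated with elliptical and transelliptical models; the abstract itself does not spell out the formula. It also assumes no ties in the data, and the names `kendall_tau` and `latent_correlation` are hypothetical.

```python
import math


def kendall_tau(x, y):
    """Kendall's tau (tau-a) from pairwise concordance; assumes no ties."""
    n = len(x)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            score += (prod > 0) - (prod < 0)   # +1 concordant, -1 discordant
    return 2.0 * score / (n * (n - 1))


def latent_correlation(x, y):
    """Sine-transformed Kendall's tau as an estimate of the latent correlation."""
    return math.sin(math.pi / 2.0 * kendall_tau(x, y))
```

Because the estimator depends only on ranks, applying a strictly increasing transformation to either variable (for example, `math.exp`) leaves the result unchanged, which is the property that makes it suitable for the unspecified marginal transformations of the transelliptical family.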
Secure Fusion Estimation for Bandwidth Constrained Cyber-Physical Systems Under Replay Attacks.
Chen, Bo; Ho, Daniel W C; Hu, Guoqiang; Yu, Li
2018-06-01
State estimation plays an essential role in the monitoring and supervision of cyber-physical systems (CPSs), and its importance has made the security and estimation performance a major concern. In this case, multisensor information fusion estimation (MIFE) provides an attractive alternative to study secure estimation problems because MIFE can potentially improve estimation accuracy and enhance reliability and robustness against attacks. From the perspective of the defender, the secure distributed Kalman fusion estimation problem is investigated in this paper for a class of CPSs under replay attacks, where each local estimate obtained by the sink node is transmitted to a remote fusion center through bandwidth constrained communication channels. A new mathematical model with compensation strategy is proposed to characterize the replay attacks and bandwidth constraints, and then a recursive distributed Kalman fusion estimator (DKFE) is designed in the linear minimum variance sense. According to different communication frameworks, two classes of data compression and compensation algorithms are developed such that the DKFEs can achieve the desired performance. Several attack-dependent and bandwidth-dependent conditions are derived such that the DKFEs are secure under replay attacks. An illustrative example is given to demonstrate the effectiveness of the proposed methods.
Off-line tracking of series parameters in distribution systems using AMI data
DOE Office of Scientific and Technical Information (OSTI.GOV)
Williams, Tess L.; Sun, Yannan; Schneider, Kevin
2016-05-01
Electric distribution systems have historically lacked measurement points, and equipment is often operated to its failure point, resulting in customer outages. The widespread deployment of sensors at the distribution level is enabling observability. This paper presents an off-line parameter value tracking procedure that takes advantage of the increasing number of measurement devices being deployed at the distribution level to estimate changes in series impedance parameter values over time. The tracking of parameter values enables non-diurnal and non-seasonal change to be flagged for investigation. The presented method uses an unbalanced Distribution System State Estimation (DSSE) and a measurement residual-based parameter estimation procedure. Measurement residuals from multiple measurement snapshots are combined in order to increase the effective local redundancy and improve the robustness of the calculations in the presence of measurement noise. Data from devices on the primary distribution system and from customer meters, via an AMI system, form the input data set. Results of simulations on the IEEE 13-Node Test Feeder are presented to illustrate the proposed approach applied to changes in series impedance parameters. A 5% change in series resistance elements can be detected in the presence of 2% measurement error when combining less than 1 day of measurement snapshots into a single estimate.
Enhancing Data Assimilation by Evolutionary Particle Filter and Markov Chain Monte Carlo
NASA Astrophysics Data System (ADS)
Moradkhani, H.; Abbaszadeh, P.; Yan, H.
2016-12-01
Particle Filters (PFs) have received increasing attention from researchers in different disciplines of the hydro-geosciences as an effective method to improve model predictions in nonlinear and non-Gaussian dynamical systems. The implication of dual state and parameter estimation by means of data assimilation in hydrology and geoscience has evolved since 2005 from SIR-PF to PF-MCMC and now to the most effective and robust framework through an evolutionary PF approach based on Genetic Algorithm (GA) and Markov Chain Monte Carlo (MCMC), the so-called EPF-MCMC. In this framework, the posterior distribution undergoes an evolutionary process to update an ensemble of prior states that more closely resemble a realistic posterior probability distribution. The premise of this approach is that the particles move to optimal positions using the GA optimization coupled with MCMC, increasing the number of effective particles; hence particle degeneracy is avoided while particle diversity is improved. The proposed algorithm is applied to a conceptual and highly nonlinear hydrologic model, and the effectiveness, robustness and reliability of the method in jointly estimating the states and parameters and also reducing the uncertainty are demonstrated for a few river basins across the United States.
Temporal rainfall estimation using input data reduction and model inversion
NASA Astrophysics Data System (ADS)
Wright, A. J.; Vrugt, J. A.; Walker, J. P.; Pauwels, V. R. N.
2016-12-01
Floods are devastating natural hazards. To provide accurate, precise and timely flood forecasts there is a need to understand the uncertainties associated with temporal rainfall and model parameters. The estimation of temporal rainfall and model parameter distributions from streamflow observations in complex dynamic catchments adds skill to current areal rainfall estimation methods, allows for the uncertainty of rainfall input to be considered when estimating model parameters and provides the ability to estimate rainfall from poorly gauged catchments. Current methods to estimate temporal rainfall distributions from streamflow are unable to adequately explain and invert complex non-linear hydrologic systems. This study uses the Discrete Wavelet Transform (DWT) to reduce rainfall dimensionality for the catchment of Warwick, Queensland, Australia. The reduction of rainfall to DWT coefficients allows the input rainfall time series to be simultaneously estimated along with model parameters. The estimation process is conducted using multi-chain Markov chain Monte Carlo simulation with the DREAMZS algorithm. The use of a likelihood function that considers both rainfall and streamflow error allows for model parameter and temporal rainfall distributions to be estimated. Estimation of the wavelet approximation coefficients of lower order decomposition structures was able to estimate the most realistic temporal rainfall distributions. These rainfall estimates were all able to simulate streamflow that was superior to the results of a traditional calibration approach. It is shown that the choice of wavelet has a considerable impact on the robustness of the inversion. The results demonstrate that streamflow data contains sufficient information to estimate temporal rainfall and model parameter distributions. 
The extent and variance of rainfall time series that are able to simulate streamflow that is superior to that simulated by a traditional calibration approach is a demonstration of equifinality. The use of a likelihood function that considers both rainfall and streamflow error combined with the use of the DWT as a model data reduction technique allows the joint inference of hydrologic model parameters along with rainfall.
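The dimensionality reduction described above relies on representing the rainfall series by wavelet approximation coefficients. The following sketch shows one level of the Haar transform as the simplest possible case; the choice of the Haar wavelet is an assumption (the abstract reports that the choice of wavelet matters but does not name it), and the function names are hypothetical.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar DWT: returns (approximation, detail)."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def inverse_haar_step(approx, detail):
    """Invert haar_step; passing zero details yields a smoothed reconstruction."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / s, (a - d) / s])
    return out
```

Estimating only the approximation coefficients halves the number of unknowns per level: reconstructing with the detail coefficients set to zero replaces each pair of rainfall values by its mean, which is the kind of reduced representation that can be sampled jointly with model parameters.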
Monitoring TASCC Injections Using A Field-Ready Wet Chemistry Nutrient Autoanalyzer
NASA Astrophysics Data System (ADS)
Snyder, L. E.; Herstand, M. R.; Bowden, W. B.
2011-12-01
Quantification of nutrient cycling and transport (spiraling) in stream systems is a fundamental component of stream ecology. Additions of isotopic tracer and bulk inorganic nutrient to streams have been frequently used to evaluate nutrient transfer between ecosystem compartments and nutrient uptake estimation, respectively. The Tracer Addition for Spiraling Curve Characterization (TASCC) methodology of Covino et al. (2010) instantaneously and simultaneously adds conservative and biologically active tracers to a stream system to quantify nutrient uptake metrics. In this method, comparing the ratio of mass of nutrient and conservative solute recovered in each sample throughout a breakthrough curve to that of the injectate, a distribution of spiraling metrics is calculated across a range of nutrient concentrations. This distribution across concentrations allows for both a robust estimation of ambient spiraling parameters by regression techniques, and comparison with uptake kinetic models. We tested a unique sampling strategy for TASCC injections in which samples were taken manually throughout the nutrient breakthrough curves while, simultaneously, continuously monitoring with a field-ready wet chemistry autoanalyzer. The autoanalyzer was programmed to measure concentrations of nitrate, phosphate and ammonium at the rate of one measurement per second throughout each experiment. Utilization of an autoanalyzer in the field during the experiment results in the return of several thousand additional nutrient data points when compared with manual sampling. This technique, then, allows for a deeper understanding and more statistically robust estimation of stream nutrient spiraling parameters.
Benjamini, Dan; Basser, Peter J
2014-12-07
In this work, we present an experimental design and analytical framework to measure the nonparametric joint radius-length (R-L) distribution of an ensemble of parallel, finite cylindrical pores, and more generally, the eccentricity distribution of anisotropic pores. Employing a novel 3D double pulsed-field gradient acquisition scheme, we first obtain both the marginal radius and length distributions of a population of cylindrical pores and then use these to constrain and stabilize the estimate of the joint radius-length distribution. Using the marginal distributions as constraints allows the joint R-L distribution to be reconstructed from an underdetermined system (i.e., more variables than equations), which requires a relatively small and feasible number of MR acquisitions. Three simulated representative joint R-L distribution phantoms corrupted by different noise levels were reconstructed to demonstrate the process, using this new framework. As expected, the broader the peaks in the joint distribution, the less stable and more sensitive to noise the estimation of the marginal distributions. Nevertheless, the reconstruction of the joint distribution is remarkably robust to increases in noise level; we attribute this characteristic to the use of the marginal distributions as constraints. Axons are known to exhibit local compartment eccentricity variations upon injury; the extent of the variations depends on the severity of the injury. Nonparametric estimation of the eccentricity distribution of injured axonal tissue is of particular interest since generally one cannot assume a parametric distribution a priori. Reconstructing the eccentricity distribution may provide vital information about changes resulting from injury or that occurred during development.
Robustness of Reconstructed Ancestral Protein Functions to Statistical Uncertainty.
Eick, Geeta N; Bridgham, Jamie T; Anderson, Douglas P; Harms, Michael J; Thornton, Joseph W
2017-02-01
Hypotheses about the functions of ancient proteins and the effects of historical mutations on them are often tested using ancestral protein reconstruction (APR): phylogenetic inference of ancestral sequences followed by synthesis and experimental characterization. Usually, some sequence sites are ambiguously reconstructed, with two or more statistically plausible states. The extent to which the inferred functions and mutational effects are robust to uncertainty about the ancestral sequence has not been studied systematically. To address this issue, we reconstructed ancestral proteins in three domain families that have different functions, architectures, and degrees of uncertainty; we then experimentally characterized the functional robustness of these proteins when uncertainty was incorporated using several approaches, including sampling amino acid states from the posterior distribution at each site and incorporating the alternative amino acid state at every ambiguous site in the sequence into a single "worst plausible case" protein. In every case, qualitative conclusions about the ancestral proteins' functions and the effects of key historical mutations were robust to sequence uncertainty, with similar functions observed even when scores of alternate amino acids were incorporated. There was some variation in quantitative descriptors of function among plausible sequences, suggesting that experimentally characterizing robustness is particularly important when quantitative estimates of ancient biochemical parameters are desired. The worst plausible case method appears to provide an efficient strategy for characterizing the functional robustness of ancestral proteins to large amounts of sequence uncertainty. Sampling from the posterior distribution sometimes produced artifactually nonfunctional proteins for sequences reconstructed with substantial ambiguity. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Lo, Kenneth
2011-01-01
Cluster analysis is the automated search for groups of homogeneous observations in a data set. A popular modeling approach for clustering is based on finite normal mixture models, which assume that each cluster is modeled as a multivariate normal distribution. However, the normality assumption that each component is symmetric is often unrealistic. Furthermore, normal mixture models are not robust against outliers; they often require extra components for modeling outliers and/or give a poor representation of the data. To address these issues, we propose a new class of distributions, multivariate t distributions with the Box-Cox transformation, for mixture modeling. This class of distributions generalizes the normal distribution with the more heavy-tailed t distribution, and introduces skewness via the Box-Cox transformation. As a result, this provides a unified framework to simultaneously handle outlier identification and data transformation, two interrelated issues. We describe an Expectation-Maximization algorithm for parameter estimation along with transformation selection. We demonstrate the proposed methodology with three real data sets and simulation studies. Compared with a wealth of approaches including the skew-t mixture model, the proposed t mixture model with the Box-Cox transformation performs favorably in terms of accuracy in the assignment of observations, robustness against model misspecification, and selection of the number of components. PMID:22125375
Lo, Kenneth; Gottardo, Raphael
2012-01-01
Cluster analysis is the automated search for groups of homogeneous observations in a data set. A popular modeling approach for clustering is based on finite normal mixture models, which assume that each cluster is modeled as a multivariate normal distribution. However, the normality assumption that each component is symmetric is often unrealistic. Furthermore, normal mixture models are not robust against outliers; they often require extra components for modeling outliers and/or give a poor representation of the data. To address these issues, we propose a new class of distributions, multivariate t distributions with the Box-Cox transformation, for mixture modeling. This class of distributions generalizes the normal distribution with the more heavy-tailed t distribution, and introduces skewness via the Box-Cox transformation. As a result, this provides a unified framework to simultaneously handle outlier identification and data transformation, two interrelated issues. We describe an Expectation-Maximization algorithm for parameter estimation along with transformation selection. We demonstrate the proposed methodology with three real data sets and simulation studies. Compared with a wealth of approaches including the skew-t mixture model, the proposed t mixture model with the Box-Cox transformation performs favorably in terms of accuracy in the assignment of observations, robustness against model misspecification, and selection of the number of components.
Robust estimation for partially linear models with large-dimensional covariates
Zhu, LiPing; Li, RunZe; Cui, HengJian
2014-01-01
We are concerned with robust estimation procedures to estimate the parameters in partially linear models with large-dimensional covariates. To enhance the interpretability, we suggest implementing a nonconcave regularization method in the robust estimation procedure to select important covariates from the linear component. We establish the consistency for both the linear and the nonlinear components when the covariate dimension diverges at the rate of o(n), where n is the sample size. We show that the robust estimate of linear component performs asymptotically as well as its oracle counterpart which assumes the baseline function and the unimportant covariates were known a priori. With a consistent estimator of the linear component, we estimate the nonparametric component by a robust local linear regression. It is proved that the robust estimate of nonlinear component performs asymptotically as well as if the linear component were known in advance. Comprehensive simulation studies are carried out and an application is presented to examine the finite-sample performance of the proposed procedures. PMID:24955087
Robust estimation for partially linear models with large-dimensional covariates.
Zhu, LiPing; Li, RunZe; Cui, HengJian
2013-10-01
We are concerned with robust estimation procedures to estimate the parameters in partially linear models with large-dimensional covariates. To enhance the interpretability, we suggest implementing a nonconcave regularization method in the robust estimation procedure to select important covariates from the linear component. We establish the consistency for both the linear and the nonlinear components when the covariate dimension diverges at the rate of o(n), where n is the sample size. We show that the robust estimate of linear component performs asymptotically as well as its oracle counterpart which assumes the baseline function and the unimportant covariates were known a priori. With a consistent estimator of the linear component, we estimate the nonparametric component by a robust local linear regression. It is proved that the robust estimate of nonlinear component performs asymptotically as well as if the linear component were known in advance. Comprehensive simulation studies are carried out and an application is presented to examine the finite-sample performance of the proposed procedures.
Leão, William L.; Chen, Ming-Hui
2017-01-01
A stochastic volatility-in-mean model with correlated errors using the generalized hyperbolic skew Student-t (GHST) distribution provides a robust alternative to the parameter estimation for daily stock returns in the absence of normality. An efficient Markov chain Monte Carlo (MCMC) sampling algorithm is developed for parameter estimation. The deviance information, the Bayesian predictive information and the log-predictive score criterion are used to assess the fit of the proposed model. The proposed method is applied to an analysis of the daily stock return data from the Standard & Poor’s 500 index (S&P 500). The empirical results reveal that the stochastic volatility-in-mean model with correlated errors and GH-ST distribution leads to a significant improvement in the goodness-of-fit for the S&P 500 index returns dataset over the usual normal model. PMID:29333210
Leão, William L; Abanto-Valle, Carlos A; Chen, Ming-Hui
2017-01-01
A stochastic volatility-in-mean model with correlated errors using the generalized hyperbolic skew Student-t (GHST) distribution provides a robust alternative to the parameter estimation for daily stock returns in the absence of normality. An efficient Markov chain Monte Carlo (MCMC) sampling algorithm is developed for parameter estimation. The deviance information, the Bayesian predictive information and the log-predictive score criterion are used to assess the fit of the proposed model. The proposed method is applied to an analysis of the daily stock return data from the Standard & Poor's 500 index (S&P 500). The empirical results reveal that the stochastic volatility-in-mean model with correlated errors and GH-ST distribution leads to a significant improvement in the goodness-of-fit for the S&P 500 index returns dataset over the usual normal model.
Working covariance model selection for generalized estimating equations.
Carey, Vincent J; Wang, You-Gan
2011-11-20
We investigate methods for data-based selection of working covariance models in the analysis of correlated data with generalized estimating equations. We study two selection criteria: Gaussian pseudolikelihood and a geodesic distance based on discrepancy between model-sensitive and model-robust regression parameter covariance estimators. The Gaussian pseudolikelihood is found in simulation to be reasonably sensitive for several response distributions and noncanonical mean-variance relations for longitudinal data. Application is also made to a clinical dataset. Assessment of adequacy of both correlation and variance models for longitudinal data should be routine in applications, and we describe open-source software supporting this practice. Copyright © 2011 John Wiley & Sons, Ltd.
Robust time and frequency domain estimation methods in adaptive control
NASA Technical Reports Server (NTRS)
Lamaire, Richard Orville
1987-01-01
A robust identification method was developed for use in an adaptive control system. The type of estimator is called the robust estimator, since it is robust to the effects of both unmodeled dynamics and an unmeasurable disturbance. The development of the robust estimator was motivated by a need to provide guarantees in the identification part of an adaptive controller. To enable the design of a robust control system, a nominal model as well as a frequency-domain bounding function on the modeling uncertainty associated with this nominal model must be provided. Two estimation methods are presented for finding parameter estimates, and, hence, a nominal model. One of these methods is based on the well developed field of time-domain parameter estimation. In a second method of finding parameter estimates, a type of weighted least-squares fitting to a frequency-domain estimated model is used. The frequency-domain estimator is shown to perform better, in general, than the time-domain parameter estimator. In addition, a methodology for finding a frequency-domain bounding function on the disturbance is used to compute a frequency-domain bounding function on the additive modeling error due to the effects of the disturbance and the use of finite-length data. The performance of the robust estimator in both open-loop and closed-loop situations is examined through the use of simulations.
Exploring super-Gaussianity toward robust information-theoretical time delay estimation.
Petsatodis, Theodoros; Talantzis, Fotios; Boukis, Christos; Tan, Zheng-Hua; Prasad, Ramjee
2013-03-01
Time delay estimation (TDE) is a fundamental component of speaker localization and tracking algorithms. Most of the existing systems are based on the generalized cross-correlation method assuming gaussianity of the source. It has been shown that the distribution of speech, captured with far-field microphones, is highly varying, depending on the noise and reverberation conditions. Thus the performance of TDE is expected to fluctuate depending on the underlying assumption for the speech distribution, being also subject to multi-path reflections and competitive background noise. This paper investigates the effect upon TDE when modeling the source signal with different speech-based distributions. An information theoretical TDE method indirectly encapsulating higher order statistics (HOS) formed the basis of this work. The underlying assumption of Gaussian distributed source has been replaced by that of generalized Gaussian distribution that allows evaluating the problem under a larger set of speech-shaped distributions, ranging from Gaussian to Laplacian and Gamma. Closed forms of the univariate and multivariate entropy expressions of the generalized Gaussian distribution are derived to evaluate the TDE. The results indicate that TDE based on the specific criterion is independent of the underlying assumption for the distribution of the source, for the same covariance matrix.
Inferring the distribution of mutational effects on fitness in Drosophila.
Loewe, Laurence; Charlesworth, Brian
2006-09-22
The properties of the distribution of deleterious mutational effects on fitness (DDME) are of fundamental importance for evolutionary genetics. Since it is extremely difficult to determine the nature of this distribution, several methods using various assumptions about the DDME have been developed, for the purpose of parameter estimation. We apply a newly developed method to DNA sequence polymorphism data from two Drosophila species and compare estimates of the parameters of the distribution of the heterozygous fitness effects of amino acid mutations for several different distribution functions. The results exclude normal and gamma distributions, since these predict too few effectively lethal mutations, and power-law distributions, since these predict too many lethals. Only the lognormal distribution appears to fit both the diversity data and the frequency of lethals. This DDME arises naturally in complex systems when independent factors contribute multiplicatively to an increase in fitness-reducing damage. Several important parameters, such as the fraction of effectively neutral non-synonymous mutations and the harmonic mean of non-neutral selection coefficients, are robust to the form of the DDME. Our results suggest that the majority of non-synonymous mutations in Drosophila are under effective purifying selection.
Robust inference in discrete hazard models for randomized clinical trials.
Nguyen, Vinh Q; Gillen, Daniel L
2012-10-01
Time-to-event data in which failures are only assessed at discrete time points are common in many clinical trials. Examples include oncology studies where events are observed through periodic screenings such as radiographic scans. When the survival endpoint is acknowledged to be discrete, common methods for the analysis of observed failure times include the discrete hazard models (e.g., the discrete-time proportional hazards and the continuation ratio model) and the proportional odds model. In this manuscript, we consider estimation of a marginal treatment effect in discrete hazard models where the constant treatment effect assumption is violated. We demonstrate that the estimator resulting from these discrete hazard models is consistent for a parameter that depends on the underlying censoring distribution. An estimator that removes the dependence on the censoring mechanism is proposed and its asymptotic distribution is derived. Basing inference on the proposed estimator allows for statistical inference that is scientifically meaningful and reproducible. Simulation is used to assess the performance of the presented methodology in finite samples.
3D beam shape estimation based on distributed coaxial cable interferometric sensor
NASA Astrophysics Data System (ADS)
Cheng, Baokai; Zhu, Wenge; Liu, Jie; Yuan, Lei; Xiao, Hai
2017-03-01
We present a coaxial cable interferometer based distributed sensing system for 3D beam shape estimation. By making a series of reflectors on a coaxial cable, multiple Fabry-Perot cavities are created on it. Two cables are mounted on the beam at proper locations, and a vector network analyzer (VNA) is connected to them to obtain the complex reflection signal, which is used to calculate the strain distribution of the beam in horizontal and vertical planes. With 6 GHz swept bandwidth on the VNA, the spatial resolution for distributed strain measurement is 0.1 m, and the sensitivity is 3.768 MHz mε⁻¹ at the interferogram dip near 3.3 GHz. Using displacement-strain transformation, the shape of the beam is reconstructed. With only two modified cables and a VNA, this system is easy to implement and manage. Compared to optical fiber based sensor systems, the coaxial cable sensors have the advantage of large strain and robustness, making this system suitable for structural health monitoring applications.
NASA Astrophysics Data System (ADS)
Zhang, Langwen; Xie, Wei; Wang, Jingcheng
2017-11-01
In this work, synthesis of robust distributed model predictive control (MPC) is presented for a class of linear systems subject to structured time-varying uncertainties. By decomposing a global system into smaller dimensional subsystems, a set of distributed MPC controllers, instead of a centralised controller, are designed. To ensure the robust stability of the closed-loop system with respect to model uncertainties, distributed state feedback laws are obtained by solving a min-max optimisation problem. The design of robust distributed MPC is then transformed into solving a minimisation optimisation problem with linear matrix inequality constraints. An iterative online algorithm with adjustable maximum iteration is proposed to coordinate the distributed controllers to achieve a global performance. The simulation results show the effectiveness of the proposed robust distributed MPC algorithm.
Doubly robust nonparametric inference on the average treatment effect.
Benkeser, D; Carone, M; Laan, M J Van Der; Gilbert, P B
2017-12-01
Doubly robust estimators are widely used to draw inference about the average effect of a treatment. Such estimators are consistent for the effect of interest if either one of two nuisance parameters is consistently estimated. However, if flexible, data-adaptive estimators of these nuisance parameters are used, double robustness does not readily extend to inference. We present a general theoretical study of the behaviour of doubly robust estimators of an average treatment effect when one of the nuisance parameters is inconsistently estimated. We contrast different methods for constructing such estimators and investigate the extent to which they may be modified to also allow doubly robust inference. We find that while targeted minimum loss-based estimation can be used to solve this problem very naturally, common alternative frameworks appear to be inappropriate for this purpose. We provide a theoretical study and a numerical evaluation of the alternatives considered. Our simulations highlight the need for and usefulness of these approaches in practice, while our theoretical developments have broad implications for the construction of estimators that permit doubly robust inference in other problems.
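As a rough illustration of the double robustness property described in the abstract above, the following sketch computes the standard augmented inverse-probability-weighted (AIPW) estimate of an average treatment effect. It is not the targeted minimum loss-based procedure studied by the authors. The function name `aipw_ate`, the simulated data, and the deliberately misspecified nuisance estimates are all assumptions made for illustration.

```python
import numpy as np

def aipw_ate(y, a, propensity, mu1, mu0):
    """AIPW estimate of E[Y(1)] - E[Y(0)].

    y: outcomes; a: 0/1 treatment indicator;
    propensity: estimated P(A=1 | X);
    mu1, mu0: estimated E[Y | A=1, X] and E[Y | A=0, X].
    """
    y = np.asarray(y, dtype=float)
    a = np.asarray(a, dtype=float)
    # Outcome-model prediction plus an inverse-probability-weighted residual correction
    term1 = mu1 + a * (y - mu1) / propensity
    term0 = mu0 + (1.0 - a) * (y - mu0) / (1.0 - propensity)
    return float(np.mean(term1 - term0))

# Simulated data (hypothetical): the true average treatment effect is 2
rng = np.random.default_rng(0)
n = 20000
x = rng.normal(size=n)
true_ps = 1.0 / (1.0 + np.exp(-x))
a = rng.binomial(1, true_ps)
y = 2.0 * a + x + rng.normal(size=n)

# Case 1: correct outcome model, wrong (constant) propensity score
est_outcome_ok = aipw_ate(y, a, np.full(n, 0.5), mu1=2.0 + x, mu0=x)
# Case 2: wrong (all-zero) outcome model, correct propensity score
est_ps_ok = aipw_ate(y, a, true_ps, mu1=np.zeros(n), mu0=np.zeros(n))
print(est_outcome_ok, est_ps_ok)
```

Both estimates should land near the true effect because only one of the two nuisance estimates needs to be right for consistency. The abstract's point is that valid confidence intervals are harder to obtain in that situation, and this sketch does not address that.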
NASA Astrophysics Data System (ADS)
Olsen, S.; Zaliapin, I.
2008-12-01
We establish positive correlation between the local spatio-temporal fluctuations of the earthquake magnitude distribution and the occurrence of regional earthquakes. In order to accomplish this goal, we develop a sequential Bayesian statistical estimation framework for the b-value (slope of the Gutenberg-Richter's exponential approximation to the observed magnitude distribution) and for the ratio a(t) between the earthquake intensities in two non-overlapping magnitude intervals. The time-dependent dynamics of these parameters is analyzed using Markov Chain Models (MCM). The main advantage of this approach over the traditional window-based estimation is its "soft" parameterization, which allows one to obtain stable results with realistically small samples. We furthermore discuss a statistical methodology for establishing lagged correlations between continuous and point processes. The developed methods are applied to the observed seismicity of California, Nevada, and Japan on different temporal and spatial scales. We report an oscillatory dynamics of the estimated parameters, and find that the detected oscillations are positively correlated with the occurrence of large regional earthquakes, as well as with small events with magnitudes as low as 2.5. The reported results have important implications for further development of earthquake prediction and seismic hazard assessment methods.
A generalized gamma mixture model for ultrasonic tissue characterization.
Vegas-Sanchez-Ferrero, Gonzalo; Aja-Fernandez, Santiago; Palencia, Cesar; Martin-Fernandez, Marcos
2012-01-01
Several statistical models have been proposed in the literature to describe the behavior of speckles. Among them, the Nakagami distribution has proven to very accurately characterize the speckle behavior in tissues. However, it fails when describing the heavier tails caused by the impulsive response of a speckle. The Generalized Gamma (GG) distribution (which also generalizes the Nakagami distribution) was proposed to overcome these limitations. Despite the advantages of the distribution in terms of goodness of fit, its main drawback is the lack of closed-form maximum likelihood (ML) estimates. Thus, the calculation of its parameters becomes difficult and not attractive. In this work, we propose (1) a simple but robust methodology to estimate the ML parameters of GG distributions and (2) a Generalized Gamma Mixture Model (GGMM). These mixture models are of great value in ultrasound imaging when the received signal is characterized by a different nature of tissues. We show that a better speckle characterization is achieved when using GG and GGMM rather than other state-of-the-art distributions and mixture models. Results showed the better performance of the GG distribution in characterizing the speckle of blood and myocardial tissue in ultrasonic images.
A Generalized Gamma Mixture Model for Ultrasonic Tissue Characterization
Palencia, Cesar; Martin-Fernandez, Marcos
2012-01-01
Several statistical models have been proposed in the literature to describe the behavior of speckles. Among them, the Nakagami distribution has proven to very accurately characterize the speckle behavior in tissues. However, it fails when describing the heavier tails caused by the impulsive response of a speckle. The Generalized Gamma (GG) distribution (which also generalizes the Nakagami distribution) was proposed to overcome these limitations. Despite the advantages of the distribution in terms of goodness of fit, its main drawback is the lack of closed-form maximum likelihood (ML) estimates. Thus, the calculation of its parameters becomes difficult and not attractive. In this work, we propose (1) a simple but robust methodology to estimate the ML parameters of GG distributions and (2) a Generalized Gamma Mixture Model (GGMM). These mixture models are of great value in ultrasound imaging when the received signal is characterized by a different nature of tissues. We show that a better speckle characterization is achieved when using GG and GGMM rather than other state-of-the-art distributions and mixture models. Results showed the better performance of the GG distribution in characterizing the speckle of blood and myocardial tissue in ultrasonic images. PMID:23424602
Davies, John R; Chang, Yu-mei; Bishop, D Timothy; Armstrong, Bruce K; Bataille, Veronique; Bergman, Wilma; Berwick, Marianne; Bracci, Paige M; Elwood, J Mark; Ernstoff, Marc S; Green, Adele; Gruis, Nelleke A; Holly, Elizabeth A; Ingvar, Christian; Kanetsky, Peter A; Karagas, Margaret R; Lee, Tim K; Le Marchand, Loïc; Mackie, Rona M; Olsson, Håkan; Østerlind, Anne; Rebbeck, Timothy R; Reich, Kristian; Sasieni, Peter; Siskind, Victor; Swerdlow, Anthony J; Titus, Linda; Zens, Michael S; Ziegler, Andreas; Gallagher, Richard P.; Barrett, Jennifer H; Newton-Bishop, Julia
2015-01-01
Background: We report the development of a cutaneous melanoma risk algorithm based upon 7 factors (hair colour, skin type, family history, freckling, nevus count, number of large nevi and history of sunburn), intended to form the basis of a self-assessment webtool for the general public. Methods: Predicted odds of melanoma were estimated by analysing a pooled dataset from 16 case-control studies using logistic random coefficients models. Risk categories were defined based on the distribution of the predicted odds in the controls from these studies. Imputation was used to estimate missing data in the pooled datasets. The 30th, 60th and 90th centiles were used to distribute individuals into four risk groups for their age, sex and geographic location. Cross-validation was used to test the robustness of the thresholds for each group by leaving out each study one by one. Performance of the model was assessed in an independent UK case-control study dataset. Results: Cross-validation confirmed the robustness of the threshold estimates. Cases and controls were well discriminated in the independent dataset (area under the curve 0.75, 95% CI 0.73-0.78). 29% of cases were in the highest risk group compared with 7% of controls, and 43% of controls were in the lowest risk group compared with 13% of cases. Conclusion: We have identified a composite score representing an estimate of relative risk and successfully validated this score in an independent dataset. Impact: This score may be a useful tool to inform members of the public about their melanoma risk. PMID:25713022
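The centile-based grouping described in the Methods of the abstract above can be sketched as follows. This is a minimal illustration, assuming the cut points are simply the 30th, 60th and 90th percentiles of the predicted odds among controls. The function name `risk_groups` and the toy data are hypothetical, and the stratification by age, sex and geographic location used in the study is omitted.

```python
import numpy as np

def risk_groups(predicted_odds, control_odds, centiles=(30, 60, 90)):
    """Assign each individual to a risk group 0..3 (lowest to highest).

    The cut points are centiles of the predicted odds among controls only.
    """
    cuts = np.percentile(control_odds, centiles)
    # The group index is the number of cut points at or below the value
    return np.searchsorted(cuts, predicted_odds, side="right")

# Toy controls with predicted odds 1, 2, ..., 100 (hypothetical values)
control_odds = np.arange(1, 101, dtype=float)
groups = risk_groups(control_odds, control_odds)
counts = np.bincount(groups, minlength=4)
print(counts)
```

With evenly spread toy controls, roughly 30%, 30%, 30% and 10% of them fall into the four groups, which mirrors the intended split at the 30th, 60th and 90th centiles.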
NASA Astrophysics Data System (ADS)
Christen, Alejandra; Escarate, Pedro; Curé, Michel; Rial, Diego F.; Cassetti, Julia
2016-10-01
Aims: Knowing the distribution of stellar rotational velocities is essential for understanding stellar evolution. Because we measure the projected rotational speed v sin i, we need to solve an ill-posed problem given by a Fredholm integral of the first kind to recover the "true" rotational velocity distribution. Methods: After discretization of the Fredholm integral we apply the Tikhonov regularization method to obtain directly the probability distribution function for stellar rotational velocities. We propose a simple and straightforward procedure to determine the Tikhonov parameter. We applied Monte Carlo simulations to prove that the Tikhonov method is a consistent estimator and asymptotically unbiased. Results: This method is applied to a sample of cluster stars. We obtain confidence intervals using a bootstrap method. Our results are in close agreement with those obtained using the Lucy method for recovering the probability density distribution of rotational velocities. Furthermore, Lucy estimation lies inside our confidence interval. Conclusions: Tikhonov regularization is a highly robust method that deconvolves the rotational velocity probability density function from a sample of v sin i data directly without the need for any convergence criteria.
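The general idea of Tikhonov regularization for a discretized Fredholm integral of the first kind can be sketched as below. This is a generic illustration, not the authors' procedure. The Gaussian blurring kernel stands in for the actual v sin i projection kernel, the regularization parameter is fixed by hand rather than chosen by the rule proposed in the paper, and the name `tikhonov_solve` is hypothetical.

```python
import numpy as np

def tikhonov_solve(A, b, lam):
    """Minimise ||A x - b||^2 + lam^2 ||x||^2 via the regularised normal equations."""
    n = A.shape[1]
    return np.linalg.solve(A.T @ A + lam**2 * np.eye(n), A.T @ b)

# Toy ill-posed problem: a smoothing kernel discretised on a uniform grid
grid = np.linspace(0.0, 1.0, 80)
h = grid[1] - grid[0]
A = h * np.exp(-((grid[:, None] - grid[None, :]) ** 2) / (2 * 0.05**2))

# Assumed "true" density-like function and noisy observations of its smoothed version
x_true = np.exp(-((grid - 0.4) ** 2) / (2 * 0.1**2))
rng = np.random.default_rng(1)
b = A @ x_true + 1e-3 * rng.normal(size=grid.size)

x_reg = tikhonov_solve(A, b, lam=1e-2)
rel_err = np.linalg.norm(x_reg - x_true) / np.linalg.norm(x_true)
print(rel_err)
```

A larger `lam` gives a smoother, more biased solution and a smaller one amplifies the noise, which is why the choice of the Tikhonov parameter matters in practice.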
Estimation of blade airloads from rotor blade bending moments
NASA Technical Reports Server (NTRS)
Bousman, William G.
1987-01-01
This paper presents a method for the estimation of blade airloads, based on the measurements of flap bending moments. In this procedure, the blade rotation in vacuum modes is calculated, and the airloads are expressed as an algebraic sum of the mode shapes, modal amplitudes, mass distribution, and frequency properties. The method was validated by comparing the calculated airload distribution with the original wind tunnel measurements which were made using ten modes and twenty measurement stations. Good agreement between the predicted and the measured airloads was found up to 0.90 R, but the agreement degraded towards the blade tip. The method is shown to be quite robust to the type of experimental problems that could be expected to occur in the testing of full-scale and model-scale rotors.
A lower limit on the age of the universe
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chaboyer, B.; Demarque, P.; Kernan, P.J.
1996-02-16
A detailed numerical study was designed and conducted to estimate the absolute age and the uncertainty in age (with confidence limits) of the oldest globular clusters in our galaxy, and hence to put a robust lower bound on the age of the universe. Estimates of the uncertainty range and distribution in the input parameters of stellar evolution codes were used to produce 1000 Monte Carlo realizations of stellar isochrones, which were then used to derive ages for the 17 oldest globular clusters. A probability distribution for the mean age of these systems was derived by incorporating the observational uncertainties in [...] isochrones. The dominant contribution to the width of the distribution (approximately ±5) magnitudes. Subdominant contributions came from the choice of the color table used to translate theoretical luminosities and temperatures to observed magnitudes and colors, as well as from theoretical uncertainties in heavy element abundances and mixing length. The one-sided 95 percent confidence limit lower bound for this distribution occurs at an age of 12.07 × 10^9 years, and the median age for the distribution is 14.56 × 10^9 years. These age limits, when compared with the Hubble age estimate, put powerful constraints on cosmology. 41 refs., 2 figs.
Tanner-Smith, Emily E; Tipton, Elizabeth
2014-03-01
Methodologists have recently proposed robust variance estimation as one way to handle dependent effect sizes in meta-analysis. Software macros for robust variance estimation in meta-analysis are currently available for Stata (StataCorp LP, College Station, TX, USA) and SPSS (IBM, Armonk, NY, USA), yet there is little guidance for authors regarding the practical application and implementation of those macros. This paper provides a brief tutorial on the implementation of the Stata and SPSS macros and discusses practical issues meta-analysts should consider when estimating meta-regression models with robust variance estimates. Two example databases are used in the tutorial to illustrate the use of meta-analysis with robust variance estimates. Copyright © 2013 John Wiley & Sons, Ltd.
Ye, Xin; Garikapati, Venu M.; You, Daehyun; ...
2017-11-08
Most multinomial choice models (e.g., the multinomial logit model) adopted in practice assume an extreme-value Gumbel distribution for the random components (error terms) of utility functions. This distributional assumption offers a closed-form likelihood expression when the utility maximization principle is applied to model choice behaviors. As a result, model coefficients can be easily estimated using the standard maximum likelihood estimation method. However, maximum likelihood estimators are consistent and efficient only if distributional assumptions on the random error terms are valid. It is therefore critical to test the validity of underlying distributional assumptions on the error terms that form the basis of parameter estimation and policy evaluation. In this paper, a practical yet statistically rigorous method is proposed to test the validity of the distributional assumption on the random components of utility functions in both the multinomial logit (MNL) model and multiple discrete-continuous extreme value (MDCEV) model. Based on a semi-nonparametric approach, a closed-form likelihood function that nests the MNL or MDCEV model being tested is derived. The proposed method allows traditional likelihood ratio tests to be used to test violations of the standard Gumbel distribution assumption. Simulation experiments are conducted to demonstrate that the proposed test yields acceptable Type-I and Type-II error probabilities at commonly available sample sizes. The test is then applied to three real-world discrete and discrete-continuous choice models. For all three models, the proposed test rejects the validity of the standard Gumbel distribution in most utility functions, calling for the development of robust choice models that overcome adverse effects of violations of distributional assumptions on the error terms in random utility functions.
Scalable Robust Principal Component Analysis Using Grassmann Averages.
Hauberg, Søren; Feragen, Aasa; Enficiaud, Raffi; Black, Michael J
2016-11-01
In large datasets, manual data verification is impossible, and we must expect the number of outliers to increase with data size. While principal component analysis (PCA) can reduce data size, and scalable solutions exist, it is well-known that outliers can arbitrarily corrupt the results. Unfortunately, state-of-the-art approaches for robust PCA are not scalable. We note that in a zero-mean dataset, each observation spans a one-dimensional subspace, giving a point on the Grassmann manifold. We show that the average subspace corresponds to the leading principal component for Gaussian data. We provide a simple algorithm for computing this Grassmann Average (GA), and show that the subspace estimate is less sensitive to outliers than PCA for general distributions. Because averages can be efficiently computed, we immediately gain scalability. We exploit robust averaging to formulate the Robust Grassmann Average (RGA) as a form of robust PCA. The resulting Trimmed Grassmann Average (TGA) is appropriate for computer vision because it is robust to pixel outliers. The algorithm has linear computational complexity and minimal memory requirements. We demonstrate TGA for background modeling, video restoration, and shadow removal. We show scalability by performing robust PCA on the entire Star Wars IV movie; a task beyond any current method. Source code is available online.
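The averaging idea in this abstract can be sketched as follows. This is a minimal illustration, assuming zero-mean data, an initialisation from the first observation, and norm-weighted averaging of sign-aligned unit directions; it is not the authors' released code, and the trimmed (TGA) variant is not shown.

```python
import math


def _normalize(v):
    """Scale a vector to unit length (the vector is assumed to be nonzero)."""
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def grassmann_average(data, max_iter=100):
    """Sign-aligned averaging of zero-mean observations (a sketch of the GA idea)."""
    q = _normalize(data[0])  # assumed initialisation: direction of the first observation
    for _ in range(max_iter):
        total = [0.0] * len(q)
        for x in data:
            dot = sum(a * b for a, b in zip(x, q))
            sign = 1.0 if dot >= 0 else -1.0
            # x equals its norm times its unit direction, so adding sign * x
            # weights each sign-aligned unit direction by the observation's norm.
            for i, c in enumerate(x):
                total[i] += sign * c
        q_new = _normalize(total)
        change = max(abs(a - b) for a, b in zip(q, q_new))
        q = q_new
        if change < 1e-12:
            break
    return q
```

Because each pass is a single sweep over the data, the cost per iteration is linear in the number of observations, which matches the scalability argument made in the abstract.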
Estimating Soil and Root Parameters of Biofuel Crops using a Hydrogeophysical Inversion
NASA Astrophysics Data System (ADS)
Kuhl, A.; Kendall, A. D.; Van Dam, R. L.; Hyndman, D. W.
2017-12-01
Transpiration is the dominant pathway for continental water exchange to the atmosphere, and therefore a crucial aspect of modeling water balances at many scales. The root water uptake dynamics that control transpiration are dependent on soil water availability, as well as the root distribution. However, the root distribution is determined by many factors beyond the plant species alone, including climate conditions and soil texture. Despite the significant contribution of transpiration to global water fluxes, modelling the complex critical zone processes that drive root water uptake remains a challenge. Geophysical tools such as electrical resistivity (ER) have been shown to be highly sensitive to water dynamics in the unsaturated zone. ER data can be temporally and spatially robust, covering large areas or long time periods non-invasively, which is an advantage over in-situ methods. Previous studies have shown the value of using hydrogeophysical inversions to estimate soil properties. Others have used hydrological inversions to estimate both soil properties and root distribution parameters. In this study, we combine these two approaches to create a coupled hydrogeophysical inversion that estimates root and retention curve parameters for a HYDRUS model. To test the feasibility of this new approach, we estimated daily water fluxes and root growth for several biofuel crops at a long-term ecological research site in Southwest Michigan, using monthly ER data from 2009 through 2011. Time domain reflectometry data at seven depths were used to validate modeled soil moisture estimates throughout the model period. This hydrogeophysical inversion method shows promise for improving root distribution and transpiration estimates across a wide variety of settings.
Robust estimation approach for blind denoising.
Rabie, Tamer
2005-11-01
This work develops a new robust statistical framework for blind image denoising. Robust statistics addresses the problem of estimation when the idealized assumptions about a system are occasionally violated. The contaminating noise in an image is considered as a violation of the assumption of spatial coherence of the image intensities and is treated as an outlier random variable. A denoised image is estimated by fitting a spatially coherent stationary image model to the available noisy data using a robust estimator-based regression method within an optimal-size adaptive window. The robust formulation aims at eliminating the noise outliers while preserving the edge structures in the restored image. Several examples demonstrating the effectiveness of this robust denoising technique are reported and a comparison with other standard denoising filters is presented.
Robust group-wise rigid registration of point sets using t-mixture model
NASA Astrophysics Data System (ADS)
Ravikumar, Nishant; Gooya, Ali; Frangi, Alejandro F.; Taylor, Zeike A.
2016-03-01
A probabilistic framework is proposed for robust, group-wise rigid alignment of point-sets using a mixture of Student's t-distributions, especially when the point sets are of varying lengths, are corrupted by an unknown degree of outliers, or have missing data. Medical images (in particular magnetic resonance (MR) images), their segmentations and consequently point-sets generated from these are highly susceptible to corruption by outliers. This poses a problem for robust correspondence estimation and accurate alignment of shapes, necessary for training statistical shape models (SSMs). To address these issues, this study proposes to use a t-mixture model (TMM) to approximate the underlying joint probability density of a group of similar shapes and align them to a common reference frame. The heavy-tailed nature of t-distributions provides a more robust registration framework in comparison to state-of-the-art algorithms. Significant reduction in alignment errors is achieved in the presence of outliers, using the proposed TMM-based group-wise rigid registration method, in comparison to its Gaussian mixture model (GMM) counterparts. The proposed TMM-framework is compared with a group-wise variant of the well-known Coherent Point Drift (CPD) algorithm and two other group-wise methods using GMMs, using both synthetic and real data sets. Rigid alignment errors for groups of shapes are quantified using the Hausdorff distance (HD) and quadratic surface distance (QSD) metrics.
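Why heavy tails help can be seen in a much simpler setting than the registration problem above. The sketch below is a one-dimensional, single-component illustration with an assumed fixed scale and assumed degrees of freedom `nu`; it uses the standard EM weights for a Student's t location estimate and is not the TMM registration algorithm itself.

```python
def t_location(xs, nu=3.0, scale=1.0, n_iter=50):
    """EM estimate of the location of a univariate Student's t with fixed scale and nu."""
    mu = sorted(xs)[len(xs) // 2]  # start from a middle order statistic
    for _ in range(n_iter):
        # E-step: points far from mu (relative to scale) receive small weights.
        weights = [(nu + 1.0) / (nu + ((x - mu) / scale) ** 2) for x in xs]
        # M-step: weighted mean with those weights.
        mu = sum(w * x for w, x in zip(weights, xs)) / sum(weights)
    return mu
```

With a Gaussian model every point has the same weight, so a single gross outlier shifts the mean; here its weight shrinks roughly with the inverse squared distance, which is the mechanism behind the robustness claimed for t-mixtures.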
A Weak Value Based QKD Protocol Robust Against Detector Attacks
NASA Astrophysics Data System (ADS)
Troupe, James
2015-03-01
We propose a variation of the BB84 quantum key distribution protocol that utilizes the properties of weak values to ensure the validity of the quantum bit error rate estimates used to detect an eavesdropper. The protocol is shown theoretically to be secure against recently demonstrated attacks utilizing detector blinding and control and should also be robust against all detector-based hacking. Importantly, the new protocol promises to achieve this additional security without negatively impacting the secure key generation rate as compared to that originally promised by the standard BB84 scheme. Implementation of the weak measurements needed by the protocol should be very feasible using standard quantum optical techniques.
Robust analysis of trends in noisy tokamak confinement data using geodesic least squares regression
DOE Office of Scientific and Technical Information (OSTI.GOV)
Verdoolaege, G., E-mail: geert.verdoolaege@ugent.be; Laboratory for Plasma Physics, Royal Military Academy, B-1000 Brussels; Shabbir, A.
Regression analysis is a very common activity in fusion science for unveiling trends and parametric dependencies, but it can be a difficult matter. We have recently developed the method of geodesic least squares (GLS) regression that is able to handle errors in all variables, is robust against data outliers and uncertainty in the regression model, and can be used with arbitrary distribution models and regression functions. We here report on first results of application of GLS to estimation of the multi-machine scaling law for the energy confinement time in tokamaks, demonstrating improved consistency of the GLS results compared to standard least squares.
Robustness of methods for blinded sample size re-estimation with overdispersed count data.
Schneider, Simon; Schmidli, Heinz; Friede, Tim
2013-09-20
Counts of events are increasingly common as primary endpoints in randomized clinical trials. With between-patient heterogeneity leading to variances in excess of the mean (referred to as overdispersion), statistical models reflecting this heterogeneity by mixtures of Poisson distributions are frequently employed. Sample size calculation in the planning of such trials requires knowledge of the nuisance parameters, that is, the control (or overall) event rate and the overdispersion parameter. Usually, there is little prior knowledge regarding these parameters in the design phase, resulting in considerable uncertainty regarding the sample size. In this situation internal pilot studies have been found very useful, and very recently several blinded procedures for sample size re-estimation have been proposed for overdispersed count data, one of which is based on an EM-algorithm. In this paper we investigate the EM-algorithm based procedure with respect to aspects of its implementation by studying the algorithm's dependence on the choice of convergence criterion and find that the procedure is sensitive to the choice of the stopping criterion in scenarios relevant to clinical practice. We also compare the EM-based procedure to other competing procedures regarding their operating characteristics such as sample size distribution and power. Furthermore, the robustness of these procedures to deviations from the model assumptions is explored. We find that some of the procedures are robust to at least moderate deviations. The results are illustrated using data from the US National Heart, Lung and Blood Institute sponsored Asymptomatic Cardiac Ischemia Pilot study. Copyright © 2013 John Wiley & Sons, Ltd.
Nearest neighbor density ratio estimation for large-scale applications in astronomy
NASA Astrophysics Data System (ADS)
Kremer, J.; Gieseke, F.; Steenstrup Pedersen, K.; Igel, C.
2015-09-01
In astronomical applications of machine learning, the distribution of objects used for building a model is often different from the distribution of the objects the model is later applied to. This is known as sample selection bias, which is a major challenge for statistical inference as one can no longer assume that the labeled training data are representative. To address this issue, one can re-weight the labeled training patterns to match the distribution of unlabeled data that are available already in the training phase. There are many examples in practice where this strategy yielded good results, but estimating the weights reliably from a finite sample is challenging. We consider an efficient nearest neighbor density ratio estimator that can exploit large samples to increase the accuracy of the weight estimates. To solve the problem of choosing the right neighborhood size, we propose to use cross-validation on a model selection criterion that is unbiased under covariate shift. The resulting algorithm is our method of choice for density ratio estimation when the feature space dimensionality is small and sample sizes are large. The approach is simple and, because of the model selection, robust. We empirically find that it is on a par with established kernel-based methods on relatively small regression benchmark datasets. However, when applied to large-scale photometric redshift estimation, our approach outperforms the state-of-the-art.
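The re-weighting idea can be sketched for one-dimensional data. This is a minimal illustration under stated assumptions: the weight of a training point is taken as (n_train / n_test) * (M / K), where M counts the test points inside the ball reaching the point's K-th nearest training neighbour, and the point itself is excluded from its own neighbours. The function name and these conventions are hypothetical, and the cross-validated choice of K described in the abstract is not shown.

```python
def nn_density_ratio_weights(train, test, k=2):
    """Nearest-neighbour density ratio weights for one-dimensional samples."""
    n_train, n_test = len(train), len(test)
    weights = []
    for i, x in enumerate(train):
        # Distance from x to its k-th nearest training neighbour (self excluded).
        dists = sorted(abs(x - t) for j, t in enumerate(train) if j != i)
        radius = dists[k - 1]
        # Number of test points that fall inside the same neighbourhood.
        m = sum(1 for t in test if abs(x - t) <= radius)
        weights.append((n_train / n_test) * (m / k))
    return weights
```

Training points lying where test data are dense receive large weights, and points in regions the test set rarely visits receive weights near zero. A brute-force search is used here for clarity; large samples would call for a spatial index.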
Similarity of Symbol Frequency Distributions with Heavy Tails
NASA Astrophysics Data System (ADS)
Gerlach, Martin; Font-Clos, Francesc; Altmann, Eduardo G.
2016-04-01
Quantifying the similarity between symbolic sequences is a traditional problem in information theory which requires comparing the frequencies of symbols in different sequences. In numerous modern applications, ranging from DNA over music to texts, the distribution of symbol frequencies is characterized by heavy-tailed distributions (e.g., Zipf's law). The large number of low-frequency symbols in these distributions poses major difficulties to the estimation of the similarity between sequences; e.g., they hinder an accurate finite-size estimation of entropies. Here, we show analytically how the systematic (bias) and statistical (fluctuations) errors in these estimations depend on the sample size N and on the exponent γ of the heavy-tailed distribution. Our results are valid for the Shannon entropy (α = 1), its corresponding similarity measures (e.g., the Jensen-Shannon divergence), and also for measures based on the generalized entropy of order α. For small α's, including α = 1, the errors decay slower than the 1/N decay observed in short-tailed distributions. For α larger than a critical value α* = 1 + 1/γ ≤ 2, the 1/N decay is recovered. We show the practical significance of our results by quantifying the evolution of the English language over the last two centuries using a complete α spectrum of measures. We find that frequent words change more slowly than less frequent words and that α = 2 provides the most robust measure to quantify language change.
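The measures named in this abstract can be written down in a few lines. The sketch below assumes one common form of the generalized entropy of order α, H_α(p) = (sum_i p_i^α - 1) / (1 - α), with the Shannon entropy as the α = 1 case, and a Jensen-Shannon-type divergence built from it; the abstract itself does not state the formula, so treat the definition as an assumption.

```python
import math


def generalized_entropy(p, alpha):
    """Generalized entropy of order alpha (natural logarithm in the alpha = 1 case)."""
    if alpha == 1:
        return -sum(x * math.log(x) for x in p if x > 0)
    return (sum(x ** alpha for x in p if x > 0) - 1.0) / (1.0 - alpha)


def generalized_jsd(p, q, alpha):
    """Entropy of the equal-weight mixture minus the mean entropy of p and q."""
    m = [(a + b) / 2.0 for a, b in zip(p, q)]
    return generalized_entropy(m, alpha) - 0.5 * (
        generalized_entropy(p, alpha) + generalized_entropy(q, alpha)
    )
```

For α = 2 the sum of squared frequencies is dominated by the frequent symbols, which gives some intuition for why that choice is less affected by the many rare symbols of a heavy-tailed distribution.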
Observability and Estimation of Distributed Space Systems via Local Information-Exchange Networks
NASA Technical Reports Server (NTRS)
Fathpour, Nanaz; Hadaegh, Fred Y.; Mesbahi, Mehran; Rahmani, Amirreza
2011-01-01
Spacecraft formation flying involves the coordination of states among multiple spacecraft through relative sensing, inter-spacecraft communication, and control. Most existing formation-flying estimation algorithms can only be supported via highly centralized, all-to-all, static relative sensing. New algorithms are proposed that are scalable, modular, and robust to variations in the topology and link characteristics of the formation exchange network. These distributed algorithms rely on a local information exchange network, relaxing the assumptions on existing algorithms. Distributed space systems rely on a signal transmission network among multiple spacecraft for their operation. Control and coordination among multiple spacecraft in a formation is facilitated via a network of relative sensing and inter-spacecraft communications. Guidance, navigation, and control rely on the sensing network. This network becomes more complex as more spacecraft are added, or as mission requirements become more complex. The observability of a formation state was assessed from a set of local observations made at a particular node in the formation. Formation observability can be parameterized in terms of the matrices appearing in the formation dynamics and observation matrices. An agreement protocol was used as a mechanism for observing formation states from local measurements. An agreement protocol is essentially an unforced dynamic system whose trajectory is governed by the interconnection geometry and initial condition of each node, with a goal of reaching a common value of interest. The observability of the interconnected system depends on the geometry of the network, as well as the position of the observer relative to the topology. For the first time, critical GN&C (guidance, navigation, and control estimation) subsystems are synthesized by bringing the contribution of the spacecraft information-exchange network to the forefront of algorithmic analysis and design.
The result is a formation estimation algorithm that is modular and robust to variations in the topology and link properties of the underlying formation network.
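The agreement protocol described above can be sketched in discrete time. The example assumes an undirected, connected network given as a hypothetical adjacency dictionary and a step size below the reciprocal of the largest node degree; it illustrates the generic consensus update, not the formation estimator of the report.

```python
def agreement_protocol(values, neighbors, step=0.3, n_steps=500):
    """Iterate x_i <- x_i + step * sum_j (x_j - x_i) over each node's neighbours."""
    x = list(values)
    for _ in range(n_steps):
        # All nodes update simultaneously from the previous state; no external input.
        x = [
            xi + step * sum(x[j] - xi for j in neighbors[i])
            for i, xi in enumerate(x)
        ]
    return x
```

On a four-node path graph with initial values 1, 2, 3 and 10, every node approaches the average 4, and the trajectory depends only on the interconnection pattern and the initial conditions, as the abstract states.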
Statistical Analysis of Big Data on Pharmacogenomics
Fan, Jianqing; Liu, Han
2013-01-01
This paper discusses statistical methods for estimating complex correlation structure from large pharmacogenomic datasets. We selectively review several prominent statistical methods for estimating large covariance matrices for understanding correlation structure, inverse covariance matrices for network modeling, large-scale simultaneous tests for selecting significantly differently expressed genes and proteins and genetic markers for complex diseases, and high dimensional variable selection for identifying important molecules for understanding molecular mechanisms in pharmacogenomics. Their applications to gene network estimation and biomarker selection are used to illustrate the methodological power. Several new challenges of Big data analysis, including complex data distribution, missing data, measurement error, spurious correlation, endogeneity, and the need for robust statistical methods, are also discussed. PMID:23602905
May, Peter; Garrido, Melissa M; Cassel, J Brian; Morrison, R Sean; Normand, Charles
2016-10-01
To evaluate the sensitivity of treatment effect estimates when length of stay (LOS) is used to control for unobserved heterogeneity when estimating treatment effect on cost of hospital admission with observational data. We used data from a prospective cohort study on the impact of palliative care consultation teams (PCCTs) on direct cost of hospital care. Adult patients with an advanced cancer diagnosis admitted to five large medical and cancer centers in the United States between 2007 and 2011 were eligible for this study. Costs were modeled using generalized linear models with a gamma distribution and a log link. We compared variability in estimates of PCCT impact on hospitalization costs when LOS was used as a covariate, as a sample parameter, and as an outcome denominator. We used propensity scores to account for patient characteristics associated with both PCCT use and total direct hospitalization costs. We analyzed data from hospital cost databases, medical records, and questionnaires. Our propensity score weighted sample included 969 patients who were discharged alive. In analyses of hospitalization costs, treatment effect estimates are highly sensitive to methods that control for LOS, complicating interpretation. Both the magnitude and significance of results varied widely with the method of controlling for LOS. When we incorporated intervention timing into our analyses, results were robust to LOS-controls. Treatment effect estimates using LOS-controls are not only suboptimal in terms of reliability (given concerns over endogeneity and bias) and usefulness (given the need to validate the cost-effectiveness of an intervention using overall resource use for a sample defined at baseline) but also in terms of robustness (results depend on the approach taken, and there is little evidence to guide this choice). 
To derive results that minimize endogeneity concerns and maximize external validity, investigators should match and analyze treatment and comparison arms on baseline factors only. Incorporating intervention timing may deliver results that are more reliable, more robust, and more useful than those derived using LOS-controls. © Health Research and Educational Trust.
Keshavan, J; Gremillion, G; Escobar-Alvarez, H; Humbert, J S
2014-06-01
Safe, autonomous navigation by aerial microsystems in less-structured environments is a difficult challenge to overcome with current technology. This paper presents a novel visual-navigation approach that combines bioinspired wide-field processing of optic flow information with control-theoretic tools for synthesis of closed loop systems, resulting in robustness and performance guarantees. Structured singular value analysis is used to synthesize a dynamic controller that provides good tracking performance in uncertain environments without resorting to explicit pose estimation or extraction of a detailed environmental depth map. Experimental results with a quadrotor demonstrate the vehicle's robust obstacle-avoidance behaviour in a straight line corridor, an S-shaped corridor and a corridor with obstacles distributed in the vehicle's path. The computational efficiency and simplicity of the current approach offers a promising alternative to satisfying the payload, power and bandwidth constraints imposed by aerial microsystems.
NASA Astrophysics Data System (ADS)
Zahari, Siti Meriam; Ramli, Norazan Mohamed; Moktar, Balkiah; Zainol, Mohammad Said
2014-09-01
In the presence of multicollinearity and multiple outliers, statistical inference for a linear regression model using ordinary least squares (OLS) estimators would be severely affected and produce misleading results. To overcome this, many approaches have been investigated. These include robust methods, which were reported to be less sensitive to the presence of outliers. In addition, the ridge regression technique has been employed to tackle the multicollinearity problem. In order to mitigate both problems, a combination of ridge regression and robust methods was discussed in this study. The superiority of this approach was examined when multicollinearity and multiple outliers were simultaneously present in multiple linear regression. This study aimed to look at the performance of several well-known robust estimators: M, MM, RIDGE and robust ridge regression estimators, namely Weighted Ridge M-estimator (WRM), Weighted Ridge MM (WRMM), and Ridge MM (RMM), in such a situation. Results of the study showed that in the presence of simultaneous multicollinearity and multiple outliers (in both x and y-direction), the RMM and RIDGE are more or less similar in terms of superiority over the other estimators, regardless of the number of observations, level of collinearity and percentage of outliers used. However, when outliers occurred in only a single direction (y-direction), the WRMM estimator is the most superior among the robust ridge regression estimators, producing the least variance. In conclusion, robust ridge regression is the best alternative as compared to robust and conventional least squares estimators when dealing with the simultaneous presence of multicollinearity and outliers.
NASA Astrophysics Data System (ADS)
Bandyopadhyay, Saptarshi
Multi-agent systems are widely used for constructing a desired formation shape, exploring an area, surveillance, coverage, and other cooperative tasks. This dissertation introduces novel algorithms in the three main areas of shape formation, distributed estimation, and attitude control of large-scale multi-agent systems. In the first part of this dissertation, we address the problem of shape formation for thousands to millions of agents. Here, we present two novel algorithms for guiding a large-scale swarm of robotic systems into a desired formation shape in a distributed and scalable manner. These probabilistic swarm guidance algorithms adopt an Eulerian framework, where the physical space is partitioned into bins and the swarm's density distribution over each bin is controlled using tunable Markov chains. In the first algorithm - Probabilistic Swarm Guidance using Inhomogeneous Markov Chains (PSG-IMC) - each agent determines its bin transition probabilities using a time-inhomogeneous Markov chain that is constructed in real-time using feedback from the current swarm distribution. This PSG-IMC algorithm minimizes the expected cost of the transitions required to achieve and maintain the desired formation shape, even when agents are added to or removed from the swarm. The algorithm scales well with a large number of agents and complex formation shapes, and can also be adapted for area exploration applications. In the second algorithm - Probabilistic Swarm Guidance using Optimal Transport (PSG-OT) - each agent determines its bin transition probabilities by solving an optimal transport problem, which is recast as a linear program. In the presence of perfect feedback of the current swarm distribution, this algorithm minimizes the given cost function, guarantees faster convergence, reduces the number of transitions for achieving the desired formation, and is robust to disturbances or damages to the formation. 
We demonstrate the effectiveness of these two proposed swarm guidance algorithms using results from numerical simulations and closed-loop hardware experiments on multiple quadrotors. In the second part of this dissertation, we present two novel discrete-time algorithms for distributed estimation, which track a single target using a network of heterogeneous sensing agents. In the Distributed Bayesian Filtering (DBF) algorithm, the sensing agents combine their normalized likelihood functions using the logarithmic opinion pool and the discrete-time dynamic average consensus algorithm. Each agent's estimated likelihood function converges to an error ball centered on the joint likelihood function of the centralized multi-sensor Bayesian filtering algorithm. Using a new proof technique, the convergence, stability, and robustness properties of the DBF algorithm are rigorously characterized. The explicit bounds on the time step of the robust DBF algorithm are shown to depend on the time-scale of the target dynamics. Furthermore, the DBF algorithm for linear-Gaussian models can be cast into a modified form of the Kalman information filter. In the Bayesian Consensus Filtering (BCF) algorithm, the agents combine their estimated posterior pdfs multiple times within each time step using the logarithmic opinion pool scheme. Thus, each agent's consensual pdf minimizes the sum of Kullback-Leibler divergences with the local posterior pdfs. The performance and robust properties of these algorithms are validated using numerical simulations. In the third part of this dissertation, we present an attitude control strategy and a new nonlinear tracking controller for a spacecraft carrying a large object, such as an asteroid or a boulder.
If the captured object is larger or comparable in size to the spacecraft and has significant modeling uncertainties, conventional nonlinear control laws that use exact feed-forward cancellation are not suitable because they exhibit a large resultant disturbance torque. The proposed nonlinear tracking control law guarantees global exponential convergence of tracking errors with finite-gain Lp stability in the presence of modeling uncertainties and disturbances, and reduces the resultant disturbance torque. Further, this control law permits the use of any attitude representation and its integral control formulation eliminates any constant disturbance. Under small uncertainties, the best strategy for stabilizing the combined system is to track a fuel-optimal reference trajectory using this nonlinear control law, because it consumes the least amount of fuel. In the presence of large uncertainties, the most effective strategy is to track the derivative plus proportional-derivative based reference trajectory, because it reduces the resultant disturbance torque. The effectiveness of the proposed attitude control law is demonstrated by using results of numerical simulation based on an Asteroid Redirect Mission concept. The new algorithms proposed in this dissertation will facilitate the development of versatile autonomous multi-agent systems that are capable of performing a variety of complex tasks in a robust and scalable manner.
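The logarithmic opinion pool used in the distributed estimation part of this dissertation can be illustrated on discrete distributions. The sketch below assumes probability mass functions over a shared finite set of outcomes and nonnegative weights summing to one; the function name is hypothetical and the consensus iterations of the DBF and BCF algorithms are not reproduced.

```python
import math


def logarithmic_opinion_pool(pmfs, weights):
    """Pooled pmf proportional to the weighted geometric mean of the agents' pmfs."""
    n_outcomes = len(pmfs[0])
    log_pooled = []
    for k in range(n_outcomes):
        if any(p[k] == 0.0 for p in pmfs):
            # An outcome ruled out by any agent is ruled out by the pool.
            log_pooled.append(float("-inf"))
        else:
            log_pooled.append(sum(w * math.log(p[k]) for p, w in zip(pmfs, weights)))
    # Work in the log domain and subtract the peak before exponentiating.
    # Assumes at least one outcome has positive probability under every agent.
    peak = max(log_pooled)
    unnormalized = [math.exp(v - peak) for v in log_pooled]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]
```

Pooling the pmfs (0.5, 0.5) and (0.9, 0.1) with equal weights gives (0.75, 0.25). This weighted geometric mean is the distribution that minimizes the weighted sum of Kullback-Leibler divergences to the individual pmfs, which is the property the abstract refers to.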
NASA Astrophysics Data System (ADS)
Wright, Ashley J.; Walker, Jeffrey P.; Pauwels, Valentijn R. N.
2017-08-01
Floods are devastating natural hazards. To provide accurate, precise, and timely flood forecasts, there is a need to understand the uncertainties associated within an entire rainfall time series, even when rainfall was not observed. The estimation of an entire rainfall time series and model parameter distributions from streamflow observations in complex dynamic catchments adds skill to current areal rainfall estimation methods, allows for the uncertainty of entire rainfall input time series to be considered when estimating model parameters, and provides the ability to improve rainfall estimates from poorly gauged catchments. Current methods to estimate entire rainfall time series from streamflow records are unable to adequately invert complex nonlinear hydrologic systems. This study aims to explore the use of wavelets in the estimation of rainfall time series from streamflow records. Using the Discrete Wavelet Transform (DWT) to reduce rainfall dimensionality for the catchment of Warwick, Queensland, Australia, it is shown that model parameter distributions and an entire rainfall time series can be estimated. Including rainfall in the estimation process improves streamflow simulations by a factor of up to 1.78. This is achieved while estimating an entire rainfall time series, inclusive of days when none was observed. It is shown that the choice of wavelet can have a considerable impact on the robustness of the inversion. Combining the use of a likelihood function that considers rainfall and streamflow errors with the use of the DWT as a model data reduction technique allows the joint inference of hydrologic model parameters along with rainfall.
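How a discrete wavelet transform reduces the dimension of a rainfall series can be sketched with the Haar wavelet. The abstract does not say which wavelet was used and notes that the choice matters, so Haar is only an assumption made for simplicity; the series length is assumed to be divisible by two at every level.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar transform: pairwise sums and differences."""
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_inverse_step(approx, detail):
    """Invert haar_step exactly, recovering the original samples."""
    s = math.sqrt(2.0)
    signal = []
    for a, d in zip(approx, detail):
        signal.extend([(a + d) / s, (a - d) / s])
    return signal


def haar_approximation(signal, levels):
    """Keep only the approximation coefficients after the given number of levels."""
    approx = list(signal)
    for _ in range(levels):
        approx, _detail = haar_step(approx)
    return approx
```

In an inversion of the kind described, the coefficients rather than the daily rainfall values would be the unknowns; dropping fine-scale detail coefficients halves the number of unknowns at each level.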
1982-06-01
observation in our framework is the pair (y,x) with x considered given. The influence function for 52 at the Gaussian distribution with mean xB and variance...3/2 - (1+22)o2 2) 1+2x\\/2 x’) 2(3-9) (1+2X) This influence function is bounded in the residual y-xS, and redescends to an asymptote greater than...version of the influence function for B at the Gaussian distribution, given the x. and x, is defined as the normalized differenceJ (see Barnett and
Neustifter, Benjamin; Rathbun, Stephen L; Shiffman, Saul
2012-01-01
Ecological Momentary Assessment is an emerging method of data collection in behavioral research that may be used to capture the times of repeated behavioral events on electronic devices, and information on subjects' psychological states through the electronic administration of questionnaires at times selected from a probability-based design as well as the event times. A method for fitting a mixed Poisson point process model is proposed for the impact of partially-observed, time-varying covariates on the timing of repeated behavioral events. A random frailty is included in the point-process intensity to describe variation among subjects in baseline rates of event occurrence. Covariate coefficients are estimated using estimating equations constructed by replacing the integrated intensity in the Poisson score equations with a design-unbiased estimator. An estimator is also proposed for the variance of the random frailties. Our estimators are robust in the sense that no model assumptions are made regarding the distribution of the time-varying covariates or the distribution of the random effects. However, subject effects are estimated under gamma frailties using an approximate hierarchical likelihood. The proposed approach is illustrated using smoking data.
A frequency-domain estimator for use in adaptive control systems
NASA Technical Reports Server (NTRS)
Lamaire, Richard O.; Valavani, Lena; Athans, Michael; Stein, Gunter
1991-01-01
This paper presents a frequency-domain estimator that can identify both a parametrized nominal model of a plant as well as a frequency-domain bounding function on the modeling error associated with this nominal model. This estimator, which we call a robust estimator, can be used in conjunction with a robust control-law redesign algorithm to form a robust adaptive controller.
Bilateral Trade Flows and Income Distribution Similarity.
Martínez-Zarzoso, Inmaculada; Vollmer, Sebastian
2016-01-01
Current models of bilateral trade neglect the effects of income distribution. This paper addresses the issue by accounting for non-homothetic consumer preferences and hence investigating the role of income distribution in the context of the gravity model of trade. A theoretically justified gravity model is estimated for disaggregated trade data (Dollar volume is used as dependent variable) using a sample of 104 exporters and 108 importers for 1980-2003 to achieve two main goals. We define and calculate new measures of income distribution similarity and empirically confirm that greater similarity of income distribution between countries implies more trade. Using distribution-based measures as a proxy for demand similarities in gravity models, we find consistent and robust support for the hypothesis that countries with more similar income-distributions trade more with each other. The hypothesis is also confirmed at disaggregated level for differentiated product categories. PMID:27137462
Robust Estimation of Mahalanobis Distances in Hyperspectral Images
2006-12-01
each method used to fit the MD distribution from the DFC ROI. Notice how the F-mixture is affected by the last two data points (points most unlike...bottom spectra are the minimum and maximum in magnitude. Notice the decrease in variability compared to DFC and MCFC. For this ROI, the variability is...performance for DFC MD Data (ROI = 11,557 pixels)... 6.5. Summary of performance
A statistical evaluation of non-ergodic variogram estimators
Curriero, F.C.; Hohn, M.E.; Liebhold, A.M.; Lele, S.R.
2002-01-01
Geostatistics is a set of statistical techniques that is increasingly used to characterize spatial dependence in spatially referenced ecological data. A common feature of geostatistics is predicting values at unsampled locations from nearby samples using the kriging algorithm. Modeling spatial dependence in sampled data is necessary before kriging and is usually accomplished with the variogram and its traditional estimator. Other types of estimators, known as non-ergodic estimators, have been used in ecological applications. Non-ergodic estimators were originally suggested as a method of choice when sampled data are preferentially located and exhibit a skewed frequency distribution. Preferentially located samples can occur, for example, when areas with high values are sampled more intensely than other areas. In earlier studies the visual appearance of variograms from traditional and non-ergodic estimators were compared. Here we evaluate the estimators' relative performance in prediction. We also show algebraically that a non-ergodic version of the variogram is equivalent to the traditional variogram estimator. Simulations, designed to investigate the effects of data skewness and preferential sampling on variogram estimation and kriging, showed the traditional variogram estimator outperforms the non-ergodic estimators under these conditions. We also analyzed data on carabid beetle abundance, which exhibited large-scale spatial variability (trend) and a skewed frequency distribution. Detrending data followed by robust estimation of the residual variogram is demonstrated to be a successful alternative to the non-ergodic approach.
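As a point of reference for the "traditional" estimator mentioned above, the following is a minimal sketch of the classical (Matheron) empirical variogram, gamma(h) = sum of squared differences at lag h divided by 2N(h). It assumes a regularly spaced 1-D transect with integer lags; the function name `empirical_variogram` and the toy values are illustrative and do not come from the study.

```python
def empirical_variogram(values, max_lag):
    """Classical empirical variogram on a regularly spaced 1-D transect.

    gamma(h) = sum((z[i+h] - z[i])**2) / (2 * N(h)), where N(h) is the
    number of pairs separated by lag h. Requires max_lag < len(values).
    """
    n = len(values)
    gamma = {}
    for h in range(1, max_lag + 1):
        sq_diffs = [(values[i + h] - values[i]) ** 2 for i in range(n - h)]
        gamma[h] = sum(sq_diffs) / (2 * len(sq_diffs))
    return gamma


# Hypothetical toy transect (not data from the paper).
z = [1.0, 2.0, 4.0, 7.0, 11.0]
g = empirical_variogram(z, max_lag=2)
```

Because the estimator averages squared differences, a few extreme values in skewed data can dominate it, which is one reason robust variants are applied to detrended residuals.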
Robust distributed control of spacecraft formation flying with adaptive network topology
NASA Astrophysics Data System (ADS)
Shasti, Behrouz; Alasty, Aria; Assadian, Nima
2017-07-01
In this study, the distributed six degree-of-freedom (6-DOF) coordinated control of spacecraft formation flying in low earth orbit (LEO) has been investigated. For this purpose, an accurate coupled translational and attitude relative dynamics model of the spacecraft with respect to the reference orbit (virtual leader) is presented by considering the most effective perturbation acceleration forces on LEO satellites, i.e. the second zonal harmonic and the atmospheric drag. Subsequently, the 6-DOF coordinated control of spacecraft in formation is studied. During the mission, the spacecraft communicate with each other through a switching network topology in which the weights of its graph Laplacian matrix change adaptively based on a distance-based connectivity function between neighboring agents. Because some of the dynamical system parameters such as spacecraft masses and moments of inertia may vary with time, an adaptive law is developed to estimate the parameter values during the mission. Furthermore, for the case that there is no knowledge of the unknown and time-varying parameters of the system, a robust controller has been developed. It is proved that the stability of the closed-loop system coupled with adaptation in network topology structure and optimality and robustness in control is guaranteed by the robust contraction analysis as an incremental stability method for multiple synchronized systems. The simulation results show the effectiveness of each control method in the presence of uncertainties and parameter variations. The adaptive and robust controllers show their superiority in reducing the state error integral as well as decreasing the control effort and settling time.
How social information can improve estimation accuracy in human groups.
Jayles, Bertrand; Kim, Hye-Rin; Escobedo, Ramón; Cezera, Stéphane; Blanchet, Adrien; Kameda, Tatsuya; Sire, Clément; Theraulaz, Guy
2017-11-21
In our digital and connected societies, the development of social networks, online shopping, and reputation systems raises the questions of how individuals use social information and how it affects their decisions. We report experiments performed in France and Japan, in which subjects could update their estimates after having received information from other subjects. We measure and model the impact of this social information at individual and collective scales. We observe and justify that, when individuals have little prior knowledge about a quantity, the distribution of the logarithm of their estimates is close to a Cauchy distribution. We find that social influence helps the group improve its properly defined collective accuracy. We quantify the improvement of the group estimation when additional controlled and reliable information is provided, unbeknownst to the subjects. We show that subjects' sensitivity to social influence permits us to define five robust behavioral traits and increases with the difference between personal and group estimates. We then use our data to build and calibrate a model of collective estimation to analyze the impact on the group performance of the quantity and quality of information received by individuals. The model quantitatively reproduces the distributions of estimates and the improvement of collective performance and accuracy observed in our experiments. Finally, our model predicts that providing a moderate amount of incorrect information to individuals can counterbalance the human cognitive bias to systematically underestimate quantities and thereby improve collective performance. Copyright © 2017 the Author(s). Published by PNAS.
How social information can improve estimation accuracy in human groups
Jayles, Bertrand; Kim, Hye-rin; Cezera, Stéphane; Blanchet, Adrien; Kameda, Tatsuya; Sire, Clément; Theraulaz, Guy
2017-01-01
In our digital and connected societies, the development of social networks, online shopping, and reputation systems raises the questions of how individuals use social information and how it affects their decisions. We report experiments performed in France and Japan, in which subjects could update their estimates after having received information from other subjects. We measure and model the impact of this social information at individual and collective scales. We observe and justify that, when individuals have little prior knowledge about a quantity, the distribution of the logarithm of their estimates is close to a Cauchy distribution. We find that social influence helps the group improve its properly defined collective accuracy. We quantify the improvement of the group estimation when additional controlled and reliable information is provided, unbeknownst to the subjects. We show that subjects’ sensitivity to social influence permits us to define five robust behavioral traits and increases with the difference between personal and group estimates. We then use our data to build and calibrate a model of collective estimation to analyze the impact on the group performance of the quantity and quality of information received by individuals. The model quantitatively reproduces the distributions of estimates and the improvement of collective performance and accuracy observed in our experiments. Finally, our model predicts that providing a moderate amount of incorrect information to individuals can counterbalance the human cognitive bias to systematically underestimate quantities and thereby improve collective performance. PMID:29118142
Robust linear discriminant models to solve financial crisis in banking sectors
NASA Astrophysics Data System (ADS)
Lim, Yai-Fung; Yahaya, Sharipah Soaad Syed; Idris, Faoziah; Ali, Hazlina; Omar, Zurni
2014-12-01
Linear discriminant analysis (LDA) is a widely used technique in pattern classification via an equation that minimizes the probability of misclassifying cases into their respective categories. However, the performance of classical estimators in LDA depends heavily on the assumptions of normality and homoscedasticity. Several robust estimators in LDA, such as the Minimum Covariance Determinant (MCD), S-estimators and the Minimum Volume Ellipsoid (MVE), have been proposed by many authors to alleviate the non-robustness of the classical estimates. In this paper, we investigate the financial crisis of Malaysian banking institutions using robust LDA and classical LDA methods. Our objective is to distinguish the "distress" and "non-distress" banks in Malaysia using the LDA models. The hit ratio is used to validate the predictive accuracy of the LDA models. The performance of LDA is evaluated by estimating the misclassification rate via the apparent error rate. The results and comparisons show that the robust estimators perform better than the classical estimators for LDA.
Robust Fault Detection Using Robust Z1 Estimation and Fuzzy Logic
NASA Technical Reports Server (NTRS)
Curry, Tramone; Collins, Emmanuel G., Jr.; Selekwa, Majura; Guo, Ten-Huei (Technical Monitor)
2001-01-01
This research considers the application of robust Z(sub 1) estimation in conjunction with fuzzy logic to robust fault detection for an aircraft flight control system. It begins with the development of robust Z(sub 1) estimators based on multiplier theory and then develops a fixed threshold approach to fault detection (FD). It then considers the use of fuzzy logic for robust residual evaluation and FD. Due to modeling errors and unmeasurable disturbances, it is difficult to distinguish between the effects of an actual fault and those caused by uncertainty and disturbance. Hence, it is the aim of a robust FD system to be sensitive to faults while remaining insensitive to uncertainty and disturbances. While fixed thresholds only allow a decision on whether a fault has or has not occurred, it is more valuable to have the residual evaluation lead to a conclusion related to the degree of, or probability of, a fault. Fuzzy logic is a viable means of determining the degree of a fault and allows the introduction of human observations that may not be incorporated in the rigorous threshold theory. Hence, fuzzy logic can provide a more reliable and informative fault detection process. Using an aircraft flight control system, the results of FD using robust Z(sub 1) estimation with a fixed threshold are demonstrated. FD that combines robust Z(sub 1) estimation and fuzzy logic is also demonstrated. It is seen that combining the robust estimator with fuzzy logic proves to be advantageous in increasing the sensitivity to smaller faults while remaining insensitive to uncertainty and disturbances.
Estimating cost ratio distribution between fatal and non-fatal road accidents in Malaysia
NASA Astrophysics Data System (ADS)
Hamdan, Nurhidayah; Daud, Noorizam
2014-07-01
Road traffic crashes are a major global problem and should be treated as a shared responsibility. In Malaysia, road accidents killed 6,917 people and injured or disabled 17,522 people in 2012, and the government spent about RM9.3 billion in 2009, a cost to the nation reported at approximately 1 to 2 percent of gross domestic product (GDP) annually. The current cost ratio for fatal and non-fatal accidents used by the Ministry of Works Malaysia is simply based on an arbitrary value of 6:4, or equivalently 1.5:1, reflecting the fact that six factors are involved in calculating the cost of a fatal accident and four factors for a non-fatal accident. This simple rule is questionable because there is a lack of mathematical and conceptual evidence to explain how the ratio is determined. The main aim of this study is to determine a new accident cost ratio for fatal and non-fatal accidents in Malaysia based on a quantitative statistical approach. The cost ratio distributions are estimated based on the Weibull distribution. Because official accident cost data are unavailable, insurance claim data for both fatal and non-fatal accidents are used as a proxy for the actual accident cost. Two types of parameter estimates are used in this study: maximum likelihood estimation (MLE) and robust estimation. The findings reveal that the accident cost ratio for fatal to non-fatal claims is 1.33 when using MLE, while for robust estimates the cost ratio is slightly higher, at 1.51. This study will help the authority determine a more accurate cost ratio between fatal and non-fatal accidents than the official ratio set by the government, since the cost ratio is an important element used as a weight in modeling road accident data. The study therefore provides some guidance for revising the insurance claim set by the Malaysian road authority, so that an appropriate method suitable for implementation in Malaysia can be analyzed.
Segmentation of the Speaker's Face Region with Audiovisual Correlation
NASA Astrophysics Data System (ADS)
Liu, Yuyu; Sato, Yoichi
The ability to find the speaker's face region in a video is useful for various applications. In this work, we develop a novel technique to find this region within different time windows, which is robust against the changes of view, scale, and background. The main thrust of our technique is to integrate audiovisual correlation analysis into a video segmentation framework. We analyze the audiovisual correlation locally by computing quadratic mutual information between our audiovisual features. The computation of quadratic mutual information is based on the probability density functions estimated by kernel density estimation with adaptive kernel bandwidth. The results of this audiovisual correlation analysis are incorporated into graph cut-based video segmentation to resolve a globally optimum extraction of the speaker's face region. The setting of any heuristic threshold in this segmentation is avoided by learning the correlation distributions of speaker and background by expectation maximization. Experimental results demonstrate that our method can detect the speaker's face region accurately and robustly for different views, scales, and backgrounds.
Causal Methods for Observational Research: A Primer.
Almasi-Hashiani, Amir; Nedjat, Saharnaz; Mansournia, Mohammad Ali
2018-04-01
The goal of many observational studies is to estimate the causal effect of an exposure on an outcome after adjustment for confounders, but there are still some serious errors in adjusting confounders in clinical journals. Standard regression modeling (e.g., ordinary logistic regression) fails to estimate the average effect of exposure in total population in the presence of interaction between exposure and covariates, and also cannot adjust for time-varying confounding appropriately. Moreover, stepwise algorithms of the selection of confounders based on P values may miss important confounders and lead to bias in effect estimates. Causal methods overcome these limitations. We illustrate three causal methods including inverse-probability-of-treatment-weighting (IPTW) and parametric g-formula, with an emphasis on a clever combination of these 2 methods: targeted maximum likelihood estimation (TMLE) which enjoys a double-robust property against bias. © 2018 The Author(s). This is an open-access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
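To make the IPTW idea above concrete, here is a minimal sketch on simulated data. It assumes a single binary confounder and a known propensity score, which is a simplification: in practice the propensity would be estimated, for example by logistic regression. The function name `iptw_ate` and all numbers are illustrative and are not taken from the primer.

```python
import random
import statistics


def iptw_ate(treatment, outcome, propensity):
    """Normalized IPTW estimate of the average treatment effect.

    Treated units are weighted by 1/p and untreated units by 1/(1-p),
    where p is the probability of treatment given the confounders.
    """
    sw1 = swy1 = sw0 = swy0 = 0.0
    for t, y, p in zip(treatment, outcome, propensity):
        if t == 1:
            w = 1.0 / p
            sw1 += w
            swy1 += w * y
        else:
            w = 1.0 / (1.0 - p)
            sw0 += w
            swy0 += w * y
    return swy1 / sw1 - swy0 / sw0


# Simulated cohort (assumption): confounder x raises both the chance of
# treatment and the outcome; the true treatment effect is 1.0.
rng = random.Random(0)
treatment, outcome, propensity = [], [], []
for _ in range(20000):
    x = 1 if rng.random() < 0.5 else 0
    p = 0.8 if x == 1 else 0.2
    t = 1 if rng.random() < p else 0
    y = 1.0 * t + 2.0 * x + rng.gauss(0.0, 1.0)
    treatment.append(t)
    outcome.append(y)
    propensity.append(p)

treated = [y for t, y in zip(treatment, outcome) if t == 1]
control = [y for t, y in zip(treatment, outcome) if t == 0]
naive = statistics.fmean(treated) - statistics.fmean(control)  # confounded
ate = iptw_ate(treatment, outcome, propensity)                 # adjusted
```

In this setup the unadjusted difference in means is pulled well above the true effect by the confounder, while the weighted contrast should land close to 1.0.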
Technical note: Bayesian calibration of dynamic ruminant nutrition models.
Reed, K F; Arhonditsis, G B; France, J; Kebreab, E
2016-08-01
Mechanistic models of ruminant digestion and metabolism have advanced our understanding of the processes underlying ruminant animal physiology. Deterministic modeling practices ignore the inherent variation within and among individual animals and thus have no way to assess how sources of error influence model outputs. We introduce Bayesian calibration of mathematical models to address the need for robust mechanistic modeling tools that can accommodate error analysis by remaining within the bounds of data-based parameter estimation. For the purpose of prediction, the Bayesian approach generates a posterior predictive distribution that represents the current estimate of the value of the response variable, taking into account both the uncertainty about the parameters and model residual variability. Predictions are expressed as probability distributions, thereby conveying significantly more information than point estimates in regard to uncertainty. Our study illustrates some of the technical advantages of Bayesian calibration and discusses the future perspectives in the context of animal nutrition modeling. Copyright © 2016 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
Thompson, Robert S.; Anderson, Katherine H.; Pelltier, Richard T.; Strickland, Laura E.; Shafer, Sarah L.; Bartlein, Patrick J.
2012-01-01
Vegetation inventories (plant taxa present in a vegetation assemblage at a given site) can be used to estimate climatic parameters based on the identification of the range of a given parameter where all taxa in an assemblage overlap ("Mutual Climatic Range"). For the reconstruction of past climates from fossil or subfossil plant assemblages, we assembled the data necessary for such analyses for 530 woody plant taxa and eight climatic parameters in North America. Here we present examples of how these data can be used to obtain paleoclimatic estimates from botanical data in a straightforward, simple, and robust fashion. We also include matrices of climate parameter versus occurrence or nonoccurrence of the individual taxa. These relations are depicted graphically as histograms of the population distributions of the occurrences of a given taxon plotted against a given climatic parameter. This provides a new method for quantification of paleoclimatic parameters from fossil plant assemblages.
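The Mutual Climatic Range idea described above reduces to intersecting intervals: the estimate for a parameter is the range over which every taxon in the assemblage can occur. The sketch below assumes each taxon's tolerance is summarized by a simple (min, max) pair; the taxon names and values are hypothetical placeholders, not entries from the published matrices.

```python
def mutual_climatic_range(taxon_ranges, assemblage):
    """Return the (low, high) interval shared by all taxa, or None if disjoint.

    taxon_ranges maps a taxon name to its (min, max) tolerance for one
    climatic parameter; assemblage lists the taxa found at a site.
    """
    lows = [taxon_ranges[taxon][0] for taxon in assemblage]
    highs = [taxon_ranges[taxon][1] for taxon in assemblage]
    low, high = max(lows), min(highs)
    return (low, high) if low <= high else None


# Hypothetical tolerances for mean annual temperature, in degrees C.
ranges = {
    "taxon_a": (-5.0, 12.0),
    "taxon_b": (2.0, 18.0),
    "taxon_c": (0.0, 9.0),
    "taxon_d": (15.0, 25.0),
}
mcr = mutual_climatic_range(ranges, ["taxon_a", "taxon_b", "taxon_c"])
no_overlap = mutual_climatic_range(ranges, ["taxon_c", "taxon_d"])
```

Here `mcr` is the interval from 2.0 to 9.0, and `no_overlap` is `None` because the two tolerances do not intersect.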
NASA Astrophysics Data System (ADS)
Iskandar, Ismed; Satria Gondokaryono, Yudi
2016-02-01
In reliability theory, the most important problem is to determine the reliability of a complex system from the reliability of its components. The weakness of most reliability theories is that systems are described as simply functioning or failed. In many real situations, failures may arise from many causes, depending on the age and the environment of the system and its components. Another problem in reliability theory is estimating the parameters of the assumed failure models. The estimation may be based on data collected from censored or uncensored life tests. In many reliability problems, the failure data are quantitatively inadequate, especially in engineering design and maintenance systems. Bayesian analyses are more beneficial than classical ones in such cases. Bayesian estimation allows us to combine past knowledge or experience, in the form of an a priori distribution, with life test data to make inferences about the parameter of interest. In this paper, we investigate the application of Bayesian estimation to competing risk systems. The cases are limited to models with independent causes of failure, using the Weibull distribution as our model. A simulation is conducted for this distribution with the objectives of verifying the models and the estimators and investigating the performance of the estimators for varying sample sizes. The simulation data are analyzed using Bayesian and maximum likelihood analyses. The simulation results show that changing the true value of one parameter relative to another changes the value of the standard deviation in the opposite direction. With perfect information on the prior distribution, the Bayesian estimation methods are better than maximum likelihood. The sensitivity analyses show some sensitivity to shifts of the prior locations. They also show the robustness of the Bayesian analysis within the range between the true value and the maximum likelihood estimated value lines.
2017-01-01
This work investigates the design of alternative monitoring tools based on state estimators for industrial crystallization systems with nucleation, growth, and agglomeration kinetics. The estimation problem is regarded as a structure design problem where the estimation model and the set of innovated states have to be chosen; the estimator is driven by the available measurements of secondary variables. On the basis of Robust Exponential estimability arguments, it is found that the concentration is distinguishable with temperature and solid fraction measurements while the crystal size distribution (CSD) is not. Accordingly, a state estimator structure is selected such that (i) the concentration (and other distinguishable states) are innovated by means of the secondary measurements processed with the geometric estimator (GE), and (ii) the CSD is estimated by means of a rigorous model in open loop mode. The proposed estimator has been tested through simulations showing good performance in the case of mismatch in the initial conditions, parametric plant-model mismatch, and noisy measurements. PMID:28890604
Porru, Marcella; Özkan, Leyla
2017-08-30
This work investigates the design of alternative monitoring tools based on state estimators for industrial crystallization systems with nucleation, growth, and agglomeration kinetics. The estimation problem is regarded as a structure design problem where the estimation model and the set of innovated states have to be chosen; the estimator is driven by the available measurements of secondary variables. On the basis of Robust Exponential estimability arguments, it is found that the concentration is distinguishable with temperature and solid fraction measurements while the crystal size distribution (CSD) is not. Accordingly, a state estimator structure is selected such that (i) the concentration (and other distinguishable states) are innovated by means of the secondary measurements processed with the geometric estimator (GE), and (ii) the CSD is estimated by means of a rigorous model in open loop mode. The proposed estimator has been tested through simulations showing good performance in the case of mismatch in the initial conditions, parametric plant-model mismatch, and noisy measurements.
New robust statistical procedures for the polytomous logistic regression models.
Castilla, Elena; Ghosh, Abhik; Martin, Nirian; Pardo, Leandro
2018-05-17
This article derives a new family of estimators, namely the minimum density power divergence estimators, as a robust generalization of the maximum likelihood estimator for the polytomous logistic regression model. Based on these estimators, a family of Wald-type test statistics for linear hypotheses is introduced. Robustness properties of both the proposed estimators and the test statistics are theoretically studied through the classical influence function analysis. Appropriate real life examples are presented to justify the requirement of suitable robust statistical procedures in place of the likelihood based inference for the polytomous logistic regression model. The validity of the theoretical results established in the article is further confirmed empirically through suitable simulation studies. Finally, an approach for the data-driven selection of the robustness tuning parameter is proposed with empirical justifications. © 2018, The International Biometric Society.
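To show how a minimum density power divergence estimator downweights outliers, the sketch below uses a much simpler model than the polytomous logistic regression treated in the article: a normal mean with known standard deviation. For tuning parameter alpha > 0, the estimator minimizes the integral of f^(1+alpha) minus (1 + 1/alpha) times the sample average of f(x_i)^alpha. In this location model the first term does not depend on the mean, so the stationarity condition becomes a weighted mean with weights exp(-alpha (x_i - mu)^2 / (2 sigma^2)). The function name, the fixed-point iteration, and the contaminated sample are all assumptions made for illustration.

```python
import math
import random
import statistics


def mdpde_location(data, alpha=0.5, sigma=1.0, tol=1e-10, max_iter=500):
    """Minimum density power divergence estimate of a normal mean.

    Solves the estimating equation by fixed-point iteration: each step is
    a weighted mean in which points far from the current estimate receive
    exponentially small weight. As alpha approaches 0 the weights approach
    1 and the estimate approaches the sample mean (the MLE).
    """
    mu = statistics.median(data)  # robust starting value
    for _ in range(max_iter):
        weights = [
            math.exp(-alpha * (x - mu) ** 2 / (2.0 * sigma ** 2)) for x in data
        ]
        new_mu = sum(w * x for w, x in zip(weights, data)) / sum(weights)
        if abs(new_mu - mu) < tol:
            return new_mu
        mu = new_mu
    return mu


# Hypothetical contaminated sample: 95 standard normal draws plus 5 outliers.
rng = random.Random(1)
data = [rng.gauss(0.0, 1.0) for _ in range(95)] + [20.0] * 5
sample_mean = statistics.fmean(data)           # pulled toward the outliers
robust_mean = mdpde_location(data, alpha=0.5)  # stays near the bulk of the data
```

Larger values of alpha give more protection against outliers at some cost in efficiency under the clean model, which is why a data-driven choice of the tuning parameter matters.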
Hutson, Alan D
2018-01-01
In this note, we develop a new and novel semi-parametric estimator of the survival curve that is comparable to the product-limit estimator under very relaxed assumptions. The estimator is based on a beta parametrization that warps the empirical distribution of the observed censored and uncensored data. The parameters are obtained using a pseudo-maximum likelihood approach adjusting the survival curve accounting for the censored observations. In the univariate setting, the new estimator tends to better extend the range of the survival estimation given a high degree of censoring. However, the key feature of this paper is that we develop a new two-group semi-parametric exact permutation test for comparing survival curves that is generally superior to the classic log-rank and Wilcoxon tests and provides the best global power across a variety of alternatives. The new test is readily extended to the k group setting. PMID:26988931
Fournier, Auriel M. V.; Sullivan, Alexis R.; Bump, Joseph K.; Perkins, Marie; Shieldcastle, Mark C.; King, Sammy L.
2017-01-01
Stable hydrogen isotope (δD) methods for tracking animal movement are widely used yet often produce low resolution assignments. Incorporating prior knowledge of abundance, distribution or movement patterns can ameliorate this limitation, but data are lacking for most species. We demonstrate how observations reported by citizen scientists can be used to develop robust estimates of species distributions and to constrain δD assignments. We developed a Bayesian framework to refine isotopic estimates of migrant animal origins conditional on species distribution models constructed from citizen scientist observations. To illustrate this approach, we analysed the migratory connectivity of the Virginia rail Rallus limicola, a secretive and declining migratory game bird in North America. Citizen science observations enabled both estimation of sampling bias and construction of bias-corrected species distribution models. Conditioning δD assignments on these species distribution models yielded comparably high-resolution assignments. Most Virginia rails wintering across five Gulf Coast sites spent the previous summer near the Great Lakes, although a considerable minority originated from the Chesapeake Bay watershed or Prairie Pothole region of North Dakota. Conversely, the majority of migrating Virginia rails from a site in the Great Lakes most likely spent the previous winter on the Gulf Coast between Texas and Louisiana. Synthesis and applications. In this analysis, Virginia rail migratory connectivity does not fully correspond to the administrative flyways used to manage migratory birds. This example demonstrates that with the increasing availability of citizen science data to create species distribution models, our framework can produce high-resolution estimates of migratory connectivity for many animals, including cryptic species.
Empirical evidence of links between seasonal habitats will help enable effective habitat management, hunting quotas and population monitoring and also highlight critical knowledge gaps.
Maximum likelihood solution for inclination-only data in paleomagnetism
NASA Astrophysics Data System (ADS)
Arason, P.; Levi, S.
2010-08-01
We have developed a new robust maximum likelihood method for estimating the unbiased mean inclination from inclination-only data. In paleomagnetic analysis, the arithmetic mean of inclination-only data is known to introduce a shallowing bias. Several methods have been introduced to estimate the unbiased mean inclination of inclination-only data together with measures of the dispersion. Some inclination-only methods were designed to maximize the likelihood function of the marginal Fisher distribution. However, the exact analytical form of the maximum likelihood function is fairly complicated, and all the methods require various assumptions and approximations that are often inappropriate. For some steep and dispersed data sets, these methods provide estimates that are significantly displaced from the peak of the likelihood function to systematically shallower inclination. The problem locating the maximum of the likelihood function is partly due to difficulties in accurately evaluating the function for all values of interest, because some elements of the likelihood function increase exponentially as precision parameters increase, leading to numerical instabilities. In this study, we succeeded in analytically cancelling exponential elements from the log-likelihood function, and we are now able to calculate its value anywhere in the parameter space and for any inclination-only data set. Furthermore, we can now calculate the partial derivatives of the log-likelihood function with desired accuracy, and locate the maximum likelihood without the assumptions required by previous methods. To assess the reliability and accuracy of our method, we generated large numbers of random Fisher-distributed data sets, for which we calculated mean inclinations and precision parameters. The comparisons show that our new robust Arason-Levi maximum likelihood method is the most reliable, and the mean inclination estimates are the least biased towards shallow values.
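The numerical issue described above, terms that grow exponentially with the precision parameter, can be illustrated with the normalizing constant of the ordinary Fisher distribution, kappa / (4 pi sinh(kappa)). This is only a sketch of the general cancellation trick, not the authors' marginal likelihood for inclination-only data; the function names are hypothetical.

```python
import math


def log_sinh(kappa):
    """Return log(sinh(kappa)) for kappa > 0 without overflow.

    Uses sinh(k) = exp(k) * (1 - exp(-2k)) / 2, so the exponentially
    large factor is carried as the additive term kappa.
    """
    return kappa + math.log(-math.expm1(-2.0 * kappa)) - math.log(2.0)


def fisher_log_norm(kappa):
    """Log of the Fisher normalizing constant kappa / (4*pi*sinh(kappa))."""
    return math.log(kappa) - math.log(4.0 * math.pi) - log_sinh(kappa)


small = log_sinh(1.0)            # agrees with the direct formula
large = log_sinh(1000.0)         # direct math.sinh(1000.0) would overflow
norm = fisher_log_norm(1.0e4)    # remains finite for very concentrated data
```

Working in log space with the exponential factor removed analytically keeps the function and its derivatives evaluable across the whole parameter range, which is what a maximizer needs.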
Bobb, Jennifer F; Dominici, Francesca; Peng, Roger D
2011-12-01
Estimating the risks heat waves pose to human health is a critical part of assessing the future impact of climate change. In this article, we propose a flexible class of time series models to estimate the relative risk of mortality associated with heat waves and conduct Bayesian model averaging (BMA) to account for the multiplicity of potential models. Applying these methods to data from 105 U.S. cities for the period 1987-2005, we identify those cities having a high posterior probability of increased mortality risk during heat waves, examine the heterogeneity of the posterior distributions of mortality risk across cities, assess sensitivity of the results to the selection of prior distributions, and compare our BMA results to a model selection approach. Our results show that no single model best predicts risk across the majority of cities, and that for some cities heat-wave risk estimation is sensitive to model choice. Although model averaging leads to posterior distributions with increased variance as compared to statistical inference conditional on a model obtained through model selection, we find that the posterior mean of heat wave mortality risk is robust to accounting for model uncertainty over a broad class of models. © 2011, The International Biometric Society.
Probabilistic Damage Characterization Using the Computationally-Efficient Bayesian Approach
NASA Technical Reports Server (NTRS)
Warner, James E.; Hochhalter, Jacob D.
2016-01-01
This work presents a computationally-efficient approach for damage determination that quantifies uncertainty in the provided diagnosis. Given strain sensor data that are polluted with measurement errors, Bayesian inference is used to estimate the location, size, and orientation of damage. This approach uses Bayes' Theorem to combine any prior knowledge an analyst may have about the nature of the damage with information provided implicitly by the strain sensor data to form a posterior probability distribution over possible damage states. The unknown damage parameters are then estimated based on samples drawn numerically from this distribution using a Markov Chain Monte Carlo (MCMC) sampling algorithm. Several modifications are made to the traditional Bayesian inference approach to provide significant computational speedup. First, an efficient surrogate model is constructed using sparse grid interpolation to replace a costly finite element model that must otherwise be evaluated for each sample drawn with MCMC. Next, the standard Bayesian posterior distribution is modified using a weighted likelihood formulation, which is shown to improve the convergence of the sampling process. Finally, a robust MCMC algorithm, Delayed Rejection Adaptive Metropolis (DRAM), is adopted to sample the probability distribution more efficiently. Numerical examples demonstrate that the proposed framework effectively provides damage estimates with uncertainty quantification and can yield orders of magnitude speedup over standard Bayesian approaches.
Robust location and spread measures for nonparametric probability density function estimation.
López-Rubio, Ezequiel
2009-10-01
Robustness against outliers is a desirable property of any unsupervised learning scheme. In particular, probability density estimators benefit from incorporating this feature. A possible strategy to achieve this goal is to substitute the sample mean and the sample covariance matrix by more robust location and spread estimators. Here we use the L1-median to develop a nonparametric probability density function (PDF) estimator. We prove its most relevant properties, and we show its performance in density estimation and classification applications.
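As a rough illustration of the L1-median mentioned above, the sketch below uses Weiszfeld iterations, a standard way to approximate it; the abstract does not say which algorithm the author uses, so this choice is an assumption. The function name `l1_median` and the sample points are hypothetical, and skipping data points that coincide with the current iterate is a simplification.

```python
import math

def l1_median(points, tol=1e-9, max_iter=1000):
    """Approximate the L1-median (geometric median) by Weiszfeld iterations."""
    dim = len(points[0])
    # Start from the coordinate-wise sample mean.
    current = [sum(p[k] for p in points) / len(points) for k in range(dim)]
    for _ in range(max_iter):
        weighted = [0.0] * dim
        weight_sum = 0.0
        for p in points:
            dist = math.dist(p, current)
            if dist < 1e-12:
                continue  # simplification: skip a point sitting on the iterate
            w = 1.0 / dist
            weight_sum += w
            for k in range(dim):
                weighted[k] += w * p[k]
        if weight_sum == 0.0:
            break
        updated = [v / weight_sum for v in weighted]
        converged = math.dist(updated, current) < tol
        current = updated
        if converged:
            break
    return current

# Four clustered points plus one gross outlier (hypothetical data).
cloud = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (100.0, 100.0)]
median = l1_median(cloud)
mean = [sum(p[k] for p in cloud) / len(cloud) for k in range(2)]
```

Here the sample mean is dragged to about (20.4, 20.4) by the outlier, while the L1-median stays inside the unit square.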
NASA Astrophysics Data System (ADS)
Cannaday, Ashley E.; Draham, Robert; Berger, Andrew J.
2016-04-01
The goal of this project is to estimate non-nuclear organelle size distributions in single cells by measuring angular scattering patterns and fitting them with Mie theory. Simulations have indicated that the large relative size distribution of organelles (mean:width≈2) leads to unstable Mie fits unless scattering is collected at polar angles less than 20 degrees. Our optical system has therefore been modified to collect angles down to 10 degrees. Initial validations will be performed on polystyrene bead populations whose size distributions resemble those of cell organelles. Unlike with the narrow bead distributions that are often used for calibration, we expect to see an order-of-magnitude improvement in the stability of the size estimates as the minimum angle decreases from 20 to 10 degrees. Scattering patterns will then be acquired and analyzed from single cells (EMT6 mouse cancer cells), both fixed and live, at multiple time points. Fixed cells, with no changes in organelle sizes over time, will be measured to determine the fluctuation level in estimated size distribution due to measurement imperfections alone. Subsequent measurements on live cells will determine whether there is a higher level of fluctuation that could be attributed to dynamic changes in organelle size. Studies on unperturbed cells are precursors to ones in which the effects of exogenous agents are monitored over time.
NASA Astrophysics Data System (ADS)
Larson, B. I.; Houghton, J. L.; Lowell, R. P.; Farough, A.; Meile, C. D.
2015-08-01
Chemical gradients in the subsurface of mid-ocean ridge hydrothermal systems create an environment where minerals precipitate and dissolve and where chemosynthetic organisms thrive. However, owing to the lack of easy access to the subsurface, robust knowledge of the nature and extent of chemical transformations remains elusive. Here, we combine measurements of vent fluid chemistry with geochemical and transport modeling to give new insights into the under-sampled subsurface. Temperature-composition relationships from a geochemical mixing model are superimposed on the subsurface temperature distribution determined using a heat flow model to estimate the spatial distribution of fluid composition. We then estimate the distribution of Gibbs free energies of reaction beneath mid-ocean ridges and, by combining flow simulations with speciation calculations, estimate anhydrite deposition rates. Applied to vent endmembers observed at the fast spreading ridge at the East Pacific Rise, our results suggest that sealing times due to anhydrite formation are longer than the typical time between tectonic and magmatic events. The chemical composition of the neighboring low temperature flow indicates relatively uniform energetically favorable conditions for commonly inferred microbial processes such as methanogenesis, sulfate reduction and numerous oxidation reactions, suggesting that factors other than energy availability may control subsurface microbial biomass distribution. Thus, these model simulations complement fluid-sample datasets from surface venting and help infer the chemical distribution and transformations in subsurface flow.
FracFit: A Robust Parameter Estimation Tool for Anomalous Transport Problems
NASA Astrophysics Data System (ADS)
Kelly, J. F.; Bolster, D.; Meerschaert, M. M.; Drummond, J. D.; Packman, A. I.
2016-12-01
Anomalous transport cannot be adequately described with classical Fickian advection-dispersion equations (ADE). Rather, fractional calculus models may be used, which capture non-Fickian behavior (e.g. skewness and power-law tails). FracFit is a robust parameter estimation tool based on space- and time-fractional models used to model anomalous transport. Currently, four fractional models are supported: 1) space fractional advection-dispersion equation (sFADE), 2) time-fractional dispersion equation with drift (TFDE), 3) fractional mobile-immobile equation (FMIE), and 4) tempered fractional mobile-immobile equation (TFMIE); additional models may be added in the future. Model solutions using pulse initial conditions and continuous injections are evaluated using stable distribution PDFs and CDFs or subordination integrals. Parameter estimates are extracted from measured breakthrough curves (BTCs) using a weighted nonlinear least squares (WNLS) algorithm. Optimal weights for BTCs for pulse initial conditions and continuous injections are presented, facilitating the estimation of power-law tails. Two sample applications are analyzed: 1) continuous injection laboratory experiments using natural organic matter and 2) pulse injection BTCs in the Selke river. Model parameters are compared across models and goodness-of-fit metrics are presented, assisting model evaluation. The sFADE and time-fractional models are compared using space-time duality (Baeumer et al., 2009), which links the two paradigms.
Histogram equalization with Bayesian estimation for noise robust speech recognition.
Suh, Youngjoo; Kim, Hoirin
2018-02-01
The histogram equalization approach is an efficient feature normalization technique for noise robust automatic speech recognition. However, it suffers from performance degradation when some fundamental conditions are not satisfied in the test environment. To remedy these limitations of the original histogram equalization methods, a class-based histogram equalization approach has been proposed. Although this approach showed substantial performance improvement under noise environments, it still suffers from performance degradation due to the overfitting problem when test data are insufficient. To address this issue, the proposed histogram equalization technique employs the Bayesian estimation method in the test cumulative distribution function estimation. It was reported in a previous study conducted on the Aurora-4 task that the proposed approach provided substantial performance gains in speech recognition systems based on the acoustic modeling of the Gaussian mixture model-hidden Markov model. In this work, the proposed approach was examined in speech recognition systems with deep neural network-hidden Markov model (DNN-HMM), the current mainstream speech recognition approach, where it also showed meaningful performance improvement over the conventional maximum likelihood estimation-based method. The fusion of the proposed features with the mel-frequency cepstral coefficients provided additional performance gains in DNN-HMM systems, which otherwise suffer from performance degradation in the clean test condition.
NASA Astrophysics Data System (ADS)
Wübbeler, Gerd; Bodnar, Olha; Elster, Clemens
2018-02-01
Weighted least-squares estimation is commonly applied in metrology to fit models to measurements that are accompanied with quoted uncertainties. The weights are chosen in dependence on the quoted uncertainties. However, when data and model are inconsistent in view of the quoted uncertainties, this procedure does not yield adequate results. When it can be assumed that all uncertainties ought to be rescaled by a common factor, weighted least-squares estimation may still be used, provided that a simple correction of the uncertainty obtained for the estimated model is applied. We show that these uncertainties and credible intervals are robust, as they do not rely on the assumption of a Gaussian distribution of the data. Hence, common software for weighted least-squares estimation may still safely be employed in such a case, followed by a simple modification of the uncertainties obtained by that software. We also provide means of checking the assumptions of such an approach. The Bayesian regression procedure is applied to analyze the CODATA values for the Planck constant published over the past decades in terms of three different models: a constant model, a straight line model and a spline model. Our results indicate that the CODATA values may not have yet stabilized.
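For the simplest of the three models named above, the constant model, weighted least squares reduces to a weighted mean. The sketch below shows that estimate together with the classical Birge-ratio rescaling of its uncertainty, which is one familiar form of a common-factor correction; the exact Bayesian correction derived in the paper may differ, and the function name and data are hypothetical.

```python
import math

def weighted_mean_with_rescaling(values, uncertainties):
    """Weighted mean with quoted and Birge-ratio-rescaled uncertainty."""
    weights = [1.0 / u ** 2 for u in uncertainties]
    total = sum(weights)
    estimate = sum(w * y for w, y in zip(weights, values)) / total
    u_quoted = 1.0 / math.sqrt(total)  # uncertainty if quoted values are trusted
    chi2 = sum(w * (y - estimate) ** 2 for w, y in zip(weights, values))
    dof = len(values) - 1              # one fitted parameter
    birge = math.sqrt(chi2 / dof)      # > 1 signals data/model inconsistency
    return estimate, u_quoted, birge, u_quoted * birge

# Hypothetical measurements whose scatter exceeds their quoted uncertainties.
values = [10.0, 10.4, 9.6, 10.8]
quoted = [0.1, 0.1, 0.1, 0.1]
estimate, u_quoted, birge, u_rescaled = weighted_mean_with_rescaling(values, quoted)
```

With these numbers the Birge ratio is well above 1, so the rescaled uncertainty is several times larger than the one computed from the quoted values alone.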
Uncertainty Estimation in Tsunami Initial Condition From Rapid Bayesian Finite Fault Modeling
NASA Astrophysics Data System (ADS)
Benavente, R. F.; Dettmer, J.; Cummins, P. R.; Urrutia, A.; Cienfuegos, R.
2017-12-01
It is well known that kinematic rupture models for a given earthquake can present discrepancies even when similar datasets are employed in the inversion process. While quantifying this variability can be critical when making early estimates of the earthquake and triggered tsunami impact, "most likely models" are normally used for this purpose. In this work, we quantify the uncertainty of the tsunami initial condition for the great Illapel earthquake (Mw = 8.3, 2015, Chile). We focus on utilizing data and inversion methods that are suitable to rapid source characterization yet provide meaningful and robust results. Rupture models from teleseismic body and surface waves as well as W-phase are derived and accompanied by Bayesian uncertainty estimates from linearized inversion under positivity constraints. We show that robust and consistent features about the rupture kinematics appear when working within this probabilistic framework. Moreover, by using static dislocation theory, we translate the probabilistic slip distributions into seafloor deformation which we interpret as a tsunami initial condition. After considering uncertainty, our probabilistic seafloor deformation models obtained from different data types appear consistent with each other providing meaningful results. We also show that selecting just a single "representative" solution from the ensemble of initial conditions for tsunami propagation may lead to overestimating information content in the data. Our results suggest that rapid, probabilistic rupture models can play a significant role during emergency response by providing robust information about the extent of the disaster.
Robust nonlinear system identification: Bayesian mixture of experts using the t-distribution
NASA Astrophysics Data System (ADS)
Baldacchino, Tara; Worden, Keith; Rowson, Jennifer
2017-02-01
A novel variational Bayesian mixture of experts model for robust regression of bifurcating and piece-wise continuous processes is introduced. The mixture of experts model is a powerful model which probabilistically splits the input space allowing different models to operate in the separate regions. However, current methods have no fail-safe against outliers. In this paper, a robust mixture of experts model is proposed which consists of Student-t mixture models at the gates and Student-t distributed experts, trained via Bayesian inference. The Student-t distribution has heavier tails than the Gaussian distribution, and so it is more robust to outliers, noise and non-normality in the data. Using both simulated data and real data obtained from the Z24 bridge this robust mixture of experts performs better than its Gaussian counterpart when outliers are present. In particular, it provides robustness to outliers in two forms: unbiased parameter regression models, and robustness to overfitting/complex models.
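The robustness attributed to the Student-t distribution can be seen in a much simpler setting than a mixture of experts. The sketch below estimates a single location parameter under a t model with fixed degrees of freedom and scale, using the standard iteratively reweighted update in which each observation gets weight (nu + 1) / (nu + r^2) for standardized residual r. It is an illustrative assumption, not the variational scheme of the paper; `t_location`, `nu`, `scale`, and the data are hypothetical.

```python
def t_location(data, nu=3.0, scale=1.0, n_iter=100):
    """Location estimate under a Student-t model with known nu and scale."""
    mu = sum(data) / len(data)  # start from the Gaussian (sample mean) estimate
    for _ in range(n_iter):
        # Large residuals receive small weights, which limits outlier influence.
        weights = [(nu + 1.0) / (nu + ((y - mu) / scale) ** 2) for y in data]
        mu = sum(w * y for w, y in zip(weights, data)) / sum(weights)
    return mu

# Five values near 1.0 and one outlier (hypothetical data).
sample = [1.0, 1.2, 0.8, 1.1, 0.9, 15.0]
gaussian_estimate = sum(sample) / len(sample)
t_estimate = t_location(sample)
```

The Gaussian estimate is pulled above 3 by the single outlier, whereas the t-based estimate stays close to the bulk of the data.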
Brazzale, Alessandra R; Küchenhoff, Helmut; Krügel, Stefanie; Schiergens, Tobias S; Trentzsch, Heiko; Hartl, Wolfgang
2018-04-05
We present a new method for estimating a change point in the hazard function of a survival distribution assuming a constant hazard rate after the change point and a decreasing hazard rate before the change point. Our method is based on fitting a stump regression to p values for testing hazard rates in small time intervals. We present three real data examples describing survival patterns of severely ill patients, whose excess mortality rates are known to persist far beyond hospital discharge. For designing survival studies in these patients and for the definition of hospital performance metrics (e.g. mortality), it is essential to define adequate and objective end points. The reliable estimation of a change point will help researchers to identify such end points. By precisely knowing this change point, clinicians can distinguish between the acute phase with high hazard (time elapsed after admission and before the change point was reached), and the chronic phase (time elapsed after the change point) in which hazard is fairly constant. We show in an extensive simulation study that maximum likelihood estimation is not robust in this setting, and we evaluate our new estimation strategy including bootstrap confidence intervals and finite sample bias correction.
Chen, Wansu; Shi, Jiaxiao; Qian, Lei; Azen, Stanley P
2014-06-26
To estimate relative risks or risk ratios for common binary outcomes, the most popular model-based methods are the robust (also known as modified) Poisson and the log-binomial regression. Of the two methods, it is believed that the log-binomial regression yields more efficient estimators because it is maximum likelihood based, while the robust Poisson model may be less affected by outliers. Evidence to support the robustness of robust Poisson models in comparison with log-binomial models is very limited. In this study a simulation was conducted to evaluate the performance of the two methods in several scenarios where outliers existed. The findings indicate that for data coming from a population where the relationship between the outcome and the covariate was in a simple form (e.g. log-linear), the two models yielded comparable biases and mean square errors. However, if the true relationship contained a higher order term, the robust Poisson models consistently outperformed the log-binomial models even when the level of contamination is low. The robust Poisson models are more robust (or less sensitive) to outliers compared to the log-binomial models when estimating relative risks or risk ratios for common binary outcomes. Users should be aware of the limitations when choosing appropriate models to estimate relative risks or risk ratios.
A synthetic phylogeny of freshwater crayfish: insights for conservation.
Owen, Christopher L; Bracken-Grissom, Heather; Stern, David; Crandall, Keith A
2015-02-19
Phylogenetic systematics is heading for a renaissance where we shift from considering our phylogenetic estimates as a static image in a published paper and taxonomies as a hardcopy checklist to treating both the phylogenetic estimate and dynamic taxonomies as metadata for further analyses. The Open Tree of Life project (opentreeoflife.org) is developing synthesis tools for harnessing the power of phylogenetic inference and robust taxonomy to develop a synthetic tree of life. We capitalize on this approach to estimate a synthesis tree for the freshwater crayfish. The crayfish make an exceptional group to demonstrate the utility of the synthesis approach, as there recently have been a number of phylogenetic studies on the crayfishes along with a robust underlying taxonomic framework. Importantly, the crayfish have also been extensively assessed by an IUCN Red List team and therefore have accurate and up-to-date area and conservation status data available for analysis within a phylogenetic context. Here, we develop a synthesis phylogeny for the world's freshwater crayfish and examine the phylogenetic distribution of threat. We also estimate a molecular phylogeny based on all available GenBank crayfish sequences and use this tree to estimate divergence times and test for divergence rate variation. Finally, we conduct EDGE and HEDGE analyses and identify a number of species of freshwater crayfish of highest priority in conservation efforts. © 2015 The Author(s) Published by the Royal Society. All rights reserved.
A synthetic phylogeny of freshwater crayfish: insights for conservation
Owen, Christopher L.; Bracken-Grissom, Heather; Stern, David; Crandall, Keith A.
2015-01-01
Phylogenetic systematics is heading for a renaissance where we shift from considering our phylogenetic estimates as a static image in a published paper and taxonomies as a hardcopy checklist to treating both the phylogenetic estimate and dynamic taxonomies as metadata for further analyses. The Open Tree of Life project (opentreeoflife.org) is developing synthesis tools for harnessing the power of phylogenetic inference and robust taxonomy to develop a synthetic tree of life. We capitalize on this approach to estimate a synthesis tree for the freshwater crayfish. The crayfish make an exceptional group to demonstrate the utility of the synthesis approach, as there recently have been a number of phylogenetic studies on the crayfishes along with a robust underlying taxonomic framework. Importantly, the crayfish have also been extensively assessed by an IUCN Red List team and therefore have accurate and up-to-date area and conservation status data available for analysis within a phylogenetic context. Here, we develop a synthesis phylogeny for the world's freshwater crayfish and examine the phylogenetic distribution of threat. We also estimate a molecular phylogeny based on all available GenBank crayfish sequences and use this tree to estimate divergence times and test for divergence rate variation. Finally, we conduct EDGE and HEDGE analyses and identify a number of species of freshwater crayfish of highest priority in conservation efforts. PMID:25561670
Robust stochastic optimization for reservoir operation
NASA Astrophysics Data System (ADS)
Pan, Limeng; Housh, Mashor; Liu, Pan; Cai, Ximing; Chen, Xin
2015-01-01
Optimal reservoir operation under uncertainty is a challenging engineering problem. Application of classic stochastic optimization methods to large-scale problems is limited due to computational difficulty. Moreover, classic stochastic methods assume that the estimated distribution function or the sample inflow data accurately represents the true probability distribution, which may be invalid and the performance of the algorithms may be undermined. In this study, we introduce a robust optimization (RO) approach, Iterative Linear Decision Rule (ILDR), so as to provide a tractable approximation for a multiperiod hydropower generation problem. The proposed approach extends the existing LDR method by accommodating nonlinear objective functions. It also provides users with the flexibility of choosing the accuracy of ILDR approximations by assigning a desired number of piecewise linear segments to each uncertainty. The performance of the ILDR is compared with benchmark policies including the sampling stochastic dynamic programming (SSDP) policy derived from historical data. The ILDR solves both the single and multireservoir systems efficiently. The single reservoir case study results show that the RO method is as good as SSDP when implemented on the original historical inflows and it outperforms SSDP policy when tested on generated inflows with the same mean and covariance matrix as those in history. For the multireservoir case study, which considers water supply in addition to power generation, numerical results show that the proposed approach performs as well as in the single reservoir case study in terms of optimal value and distributional robustness.
Ellipsoids for anomaly detection in remote sensing imagery
NASA Astrophysics Data System (ADS)
Grosklos, Guenchik; Theiler, James
2015-05-01
For many target and anomaly detection algorithms, a key step is the estimation of a centroid (relatively easy) and a covariance matrix (somewhat harder) that characterize the background clutter. For a background that can be modeled as a multivariate Gaussian, the centroid and covariance lead to an explicit probability density function that can be used in likelihood ratio tests for optimal detection statistics. But ellipsoidal contours can characterize a much larger class of multivariate density function, and the ellipsoids that characterize the outer periphery of the distribution are most appropriate for detection in the low false alarm rate regime. Traditionally the sample mean and sample covariance are used to estimate ellipsoid location and shape, but these quantities are confounded both by large lever-arm outliers and non-Gaussian distributions within the ellipsoid of interest. This paper compares a variety of centroid and covariance estimation schemes with the aim of characterizing the periphery of the background distribution. In particular, we will consider a robust variant of the Khachiyan algorithm for minimum-volume enclosing ellipsoid. The performance of these different approaches is evaluated on multispectral and hyperspectral remote sensing imagery using coverage plots of ellipsoid volume versus false alarm rate.
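The baseline that this abstract calls traditional, a sample mean and sample covariance defining ellipsoidal contours, can be sketched in two dimensions as follows. The squared Mahalanobis distance is constant on each ellipsoid, so thresholding it gives a simple anomaly detector. This is a minimal sketch of the classical estimator only, not the robust Khachiyan variant studied in the paper; the function names and points are hypothetical.

```python
def mean_and_covariance_2d(points):
    """Sample centroid and covariance (sxx, sxy, syy) of 2-D points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)

def mahalanobis_sq(point, centroid, cov):
    """Squared Mahalanobis distance using the closed-form 2x2 inverse."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy ** 2
    dx = point[0] - centroid[0]
    dy = point[1] - centroid[1]
    return (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det

# Hypothetical background pixels and one candidate anomaly.
background = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.5, 0.5)]
centroid, cov = mean_and_covariance_2d(background)
score = mahalanobis_sq((3.0, 3.0), centroid, cov)
```

Because both estimates are computed from all background points, a few extreme samples can inflate the ellipsoid, which is the weakness the abstract describes.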
Robust Portfolio Optimization Using Pseudodistances.
Toma, Aida; Leoni-Aubin, Samuela
2015-01-01
The presence of outliers in financial asset returns is a frequently occurring phenomenon which may lead to unreliable mean-variance optimized portfolios. This fact is due to the unbounded influence that outliers can have on the mean returns and covariance estimators that are inputs in the optimization procedure. In this paper we present robust estimators of mean and covariance matrix obtained by minimizing an empirical version of a pseudodistance between the assumed model and the true model underlying the data. We prove and discuss theoretical properties of these estimators, such as affine equivariance, B-robustness, asymptotic normality and asymptotic relative efficiency. These estimators can be easily used in place of the classical estimators, thereby providing robust optimized portfolios. A Monte Carlo simulation study and applications to real data show the advantages of the proposed approach. We study both in-sample and out-of-sample performance of the proposed robust portfolios comparing them with some other portfolios known in literature.
Robust Portfolio Optimization Using Pseudodistances
2015-01-01
The presence of outliers in financial asset returns is a frequently occurring phenomenon which may lead to unreliable mean-variance optimized portfolios. This fact is due to the unbounded influence that outliers can have on the mean returns and covariance estimators that are inputs in the optimization procedure. In this paper we present robust estimators of mean and covariance matrix obtained by minimizing an empirical version of a pseudodistance between the assumed model and the true model underlying the data. We prove and discuss theoretical properties of these estimators, such as affine equivariance, B-robustness, asymptotic normality and asymptotic relative efficiency. These estimators can be easily used in place of the classical estimators, thereby providing robust optimized portfolios. A Monte Carlo simulation study and applications to real data show the advantages of the proposed approach. We study both in-sample and out-of-sample performance of the proposed robust portfolios comparing them with some other portfolios known in literature. PMID:26468948
NASA Astrophysics Data System (ADS)
Li, Qiang; Zhang, Ying; Lin, Jingran; Wu, Sissi Xiaoxiao
2017-09-01
Consider a full-duplex (FD) bidirectional secure communication system, where two communication nodes, named Alice and Bob, simultaneously transmit and receive confidential information from each other, and an eavesdropper, named Eve, overhears the transmissions. Our goal is to maximize the sum secrecy rate (SSR) of the bidirectional transmissions by optimizing the transmit covariance matrices at Alice and Bob. To tackle this SSR maximization (SSRM) problem, we develop an alternating difference-of-concave (ADC) programming approach to alternately optimize the transmit covariance matrices at Alice and Bob. We show that the ADC iteration has a semi-closed-form beamforming solution, and is guaranteed to converge to a stationary solution of the SSRM problem. Besides the SSRM design, this paper also deals with a robust SSRM transmit design under a moment-based random channel state information (CSI) model, where only some roughly estimated first and second-order statistics of Eve's CSI are available, but the exact distribution and other high-order statistics are not known. This moment-based error model is new and different from the widely used bounded-sphere error model and the Gaussian random error model. Under the considered CSI error model, the robust SSRM is formulated as an outage probability-constrained SSRM problem. By leveraging the Lagrangian duality theory and DC programming, a tractable safe solution to the robust SSRM problem is derived. The effectiveness and the robustness of the proposed designs are demonstrated through simulations.
Preprocessing of gene expression data by optimally robust estimators
2010-01-01
Background The preprocessing of gene expression data obtained from several platforms routinely includes the aggregation of multiple raw signal intensities to one expression value. Examples are the computation of a single expression measure based on the perfect match (PM) and mismatch (MM) probes for the Affymetrix technology, the summarization of bead level values to bead summary values for the Illumina technology or the aggregation of replicated measurements in the case of other technologies including real-time quantitative polymerase chain reaction (RT-qPCR) platforms. The summarization of technical replicates is also performed in other "-omics" disciplines like proteomics or metabolomics. Preprocessing methods like MAS 5.0, Illumina's default summarization method, RMA, or VSN show that the use of robust estimators is widely accepted in gene expression analysis. However, the selection of robust methods seems to be mainly driven by their high breakdown point and not by efficiency. Results We describe how optimally robust radius-minimax (rmx) estimators, i.e. estimators that minimize an asymptotic maximum risk on shrinking neighborhoods about an ideal model, can be used for the aggregation of multiple raw signal intensities to one expression value for Affymetrix and Illumina data. With regard to the Affymetrix data, we have implemented an algorithm which is a variant of MAS 5.0. Using datasets from the literature and Monte-Carlo simulations we provide some reasoning for assuming approximate log-normal distributions of the raw signal intensities by means of the Kolmogorov distance, at least for the discussed datasets, and compare the results of our preprocessing algorithms with the results of Affymetrix's MAS 5.0 and Illumina's default method. The numerical results indicate that when using rmx estimators an accuracy improvement of about 10-20% is obtained compared to Affymetrix's MAS 5.0 and about 1-5% compared to Illumina's default method. 
The improvement is also visible in the analysis of technical replicates where the reproducibility of the values (in terms of Pearson and Spearman correlation) is increased for all Affymetrix and almost all Illumina examples considered. Our algorithms are implemented in the R package named RobLoxBioC which is publicly available via CRAN, The Comprehensive R Archive Network (http://cran.r-project.org/web/packages/RobLoxBioC/). Conclusions Optimally robust rmx estimators have a high breakdown point and are computationally feasible. They can lead to a considerable gain in efficiency for well-established bioinformatics procedures and thus, can increase the reproducibility and power of subsequent statistical analysis. PMID:21118506
NASA Astrophysics Data System (ADS)
Wang, Daosheng; Zhang, Jicai; He, Xianqiang; Chu, Dongdong; Lv, Xianqing; Wang, Ya Ping; Yang, Yang; Fan, Daidu; Gao, Shu
2018-01-01
Model parameters in the suspended cohesive sediment transport models are critical for the accurate simulation of suspended sediment concentrations (SSCs). Difficulties in estimating the model parameters still prevent numerical modeling of the sediment transport from achieving a high level of predictability. Based on a three-dimensional cohesive sediment transport model and its adjoint model, the satellite remote sensing data of SSCs during both spring tide and neap tide, retrieved from Geostationary Ocean Color Imager (GOCI), are assimilated to synchronously estimate four spatially and temporally varying parameters in the Hangzhou Bay in China, including settling velocity, resuspension rate, inflow open boundary conditions and initial conditions. After data assimilation, the model performance is significantly improved. Through several sensitivity experiments, the spatial and temporal variation tendencies of the estimated model parameters are verified to be robust and not affected by model settings. The pattern for the variations of the estimated parameters is analyzed and summarized. The temporal variations and spatial distributions of the estimated settling velocity are negatively correlated with current speed, which can be explained using the combination of flocculation process and Stokes' law. The temporal variations and spatial distributions of the estimated resuspension rate are also negatively correlated with current speed, which are related to the grain size of the seabed sediments under different current velocities. Besides, the estimated inflow open boundary conditions reach the local maximum values near the low water slack conditions and the estimated initial conditions are negatively correlated with water depth, which is consistent with the general understanding. The relationships between the estimated parameters and the hydrodynamic fields can be suggestive for improving the parameterization in cohesive sediment transport models.
Motion estimation of magnetic resonance cardiac images using the Wigner-Ville and Hough transforms
NASA Astrophysics Data System (ADS)
Carranza, N.; Cristóbal, G.; Bayerl, P.; Neumann, H.
2007-12-01
Myocardial motion analysis and quantification is of utmost importance for analyzing contractile heart abnormalities and it can be a symptom of a coronary artery disease. A fundamental problem in processing sequences of images is the computation of the optical flow, which is an approximation of the real image motion. This paper presents a new algorithm for optical flow estimation based on a spatiotemporal-frequency (STF) approach. More specifically it relies on the computation of the Wigner-Ville distribution (WVD) and the Hough Transform (HT) of the motion sequences. The latter is a well-known line and shape detection method that is highly robust against incomplete data and noise. The rationale of using the HT in this context is that it provides a value of the displacement field from the STF representation. In addition, a probabilistic approach based on Gaussian mixtures has been implemented in order to improve the accuracy of the motion detection. Experimental results in the case of synthetic sequences are compared with an implementation of the variational technique for local and global motion estimation, where it is shown that the results are accurate and robust to noise degradations. Results obtained with real cardiac magnetic resonance images are presented.
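The line-detection role of the Hough Transform mentioned above can be shown with a small voting accumulator over the normal form x cos(theta) + y sin(theta) = rho. This is a generic sketch of the classical transform applied to image points, not the paper's pipeline, which applies it to the spatiotemporal-frequency representation; the function name, bin sizes, and points are hypothetical.

```python
import math

def hough_lines(points, n_theta=180, rho_step=1.0):
    """Vote in (theta, rho) space and return the strongest line and its votes."""
    votes = {}
    for x, y in points:
        for t in range(n_theta):
            theta = math.pi * t / n_theta
            rho = x * math.cos(theta) + y * math.sin(theta)
            key = (t, round(rho / rho_step))  # quantize rho into bins
            votes[key] = votes.get(key, 0) + 1
    (t_best, r_best), count = max(votes.items(), key=lambda kv: kv[1])
    return 180.0 * t_best / n_theta, r_best * rho_step, count

# Sixty points on the horizontal line y = 5, plus three stray points.
line_points = [(float(x), 5.0) for x in range(60)]
noise_points = [(3.0, 17.0), (40.0, 2.0), (10.0, 30.0)]
theta_deg, rho, count = hough_lines(line_points + noise_points)
```

Points that do not lie on the line scatter their votes across many cells, which is why the peak survives noise and missing samples.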
NASA Astrophysics Data System (ADS)
Carranza, N.; Cristóbal, G.; Sroubek, F.; Ledesma-Carbayo, M. J.; Santos, A.
2006-08-01
Myocardial motion analysis and quantification is of utmost importance for analyzing contractile heart abnormalities and it can be a symptom of a coronary artery disease. A fundamental problem in processing sequences of images is the computation of the optical flow, which is an approximation to the real image motion. This paper presents a new algorithm for optical flow estimation based on a spatiotemporal-frequency (STF) approach, more specifically on the computation of the Wigner-Ville distribution (WVD) and the Hough Transform (HT) of the motion sequences. The latter is a well-known line and shape detection method that is very robust against incomplete data and noise. The rationale for using the HT in this context is that it provides a value of the displacement field from the STF representation. In addition, a probabilistic approach based on Gaussian mixtures has been implemented in order to improve the accuracy of the motion detection. Experimental results with synthetic sequences are compared against an implementation of the variational technique for local and global motion estimation, where it is shown that the results obtained here are accurate and robust to noise degradations. Real cardiac magnetic resonance images have been tested and evaluated with the current method.
Xu, Maoqi; Chen, Liang
2018-01-01
Individual sample heterogeneity is one of the biggest obstacles in biomarker identification for complex diseases such as cancers. Current statistical models to identify differentially expressed genes between disease and control groups often overlook the substantial human sample heterogeneity. Meanwhile, traditional nonparametric tests lose detailed data information and sacrifice the analysis power, although they are distribution free and robust to heterogeneity. Here, we propose an empirical likelihood ratio test with a mean-variance relationship constraint (ELTSeq) for the differential expression analysis of RNA sequencing (RNA-seq). As a distribution-free nonparametric model, ELTSeq handles individual heterogeneity by estimating an empirical probability for each observation without making any assumption about read-count distribution. It also incorporates a constraint for the read-count overdispersion, which is widely observed in RNA-seq data. ELTSeq demonstrates a significant improvement over existing methods such as edgeR, DESeq, t-tests, Wilcoxon tests and the classic empirical likelihood-ratio test when handling heterogeneous groups. It will significantly advance the transcriptomics studies of cancers and other complex diseases. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
LENSED: a code for the forward reconstruction of lenses and sources from strong lensing observations
NASA Astrophysics Data System (ADS)
Tessore, Nicolas; Bellagamba, Fabio; Metcalf, R. Benton
2016-12-01
Robust modelling of strong lensing systems is fundamental to exploit the information they contain about the distribution of matter in galaxies and clusters. In this work, we present LENSED, a new code which performs forward parametric modelling of strong lenses. LENSED takes advantage of a massively parallel ray-tracing kernel to perform the necessary calculations on a modern graphics processing unit (GPU). This makes the precise rendering of the background lensed sources much faster, and allows the simultaneous optimization of tens of parameters for the selected model. With a single run, the code is able to obtain the full posterior probability distribution for the lens light, the mass distribution and the background source at the same time. LENSED is first tested on mock images which reproduce realistic space-based observations of lensing systems. In this way, we show that it is able to recover unbiased estimates of the lens parameters, even when the sources do not follow exactly the assumed model. Then, we apply it to a subsample of the Sloan Lens ACS Survey lenses, in order to demonstrate its use on real data. The results generally agree with the literature, and highlight the flexibility and robustness of the algorithm.
Robust point matching via vector field consensus.
Jiayi Ma; Ji Zhao; Jinwen Tian; Yuille, Alan L; Zhuowen Tu
2014-04-01
In this paper, we propose an efficient algorithm, called vector field consensus, for establishing robust point correspondences between two sets of points. Our algorithm starts by creating a set of putative correspondences which can contain a very large number of false correspondences, or outliers, in addition to a limited number of true correspondences (inliers). Next, we solve for correspondence by interpolating a vector field between the two point sets, which involves estimating a consensus of inlier points whose matching follows a nonparametric geometrical constraint. We formulate this as a maximum a posteriori (MAP) estimation of a Bayesian model with hidden/latent variables indicating whether matches in the putative set are outliers or inliers. We impose nonparametric geometrical constraints on the correspondence, as a prior distribution, using Tikhonov regularizers in a reproducing kernel Hilbert space. MAP estimation is performed by the EM algorithm which, by also estimating the variance of the prior model (initialized to a large value), is able to obtain good estimates very quickly (e.g., avoiding many of the local minima inherent in this formulation). We illustrate this method on data sets in 2D and 3D and demonstrate that it is robust to a very large number of outliers (even up to 90%). We also show that in the special case where there is an underlying parametric geometrical model (e.g., the epipolar line constraint), we obtain better results than standard alternatives like RANSAC if a large number of outliers are present. This suggests a two-stage strategy, where we use our nonparametric model to reduce the size of the putative set and then apply a parametric variant of our approach to estimate the geometric parameters. Our algorithm is computationally efficient and we provide code for others to use it. In addition, our approach is general and can be applied to other problems, such as learning with a badly corrupted training data set.
Sehgal, Muhammad Shoaib B; Gondal, Iqbal; Dooley, Laurence S
2005-05-15
Microarray data are used in a range of application areas in biology, although they often contain considerable numbers of missing values. These missing values can significantly affect subsequent statistical analysis and machine learning algorithms, so there is a strong motivation to estimate these values as accurately as possible before using these algorithms. While many imputation algorithms have been proposed, more robust techniques need to be developed so that further analysis of biological data can be accurately undertaken. In this paper, an innovative missing value imputation algorithm called collateral missing value estimation (CMVE) is presented which uses multiple covariance-based imputation matrices for the final prediction of missing values. The matrices are computed and optimized using least square regression and linear programming methods. The new CMVE algorithm has been compared with existing estimation techniques including Bayesian principal component analysis imputation (BPCA), least square impute (LSImpute) and K-nearest neighbour (KNN). All these methods were rigorously tested to estimate missing values in three separate non-time series (ovarian cancer based) and one time series (yeast sporulation) dataset. Each method was quantitatively analyzed using the normalized root mean square (NRMS) error measure, covering a wide range of randomly introduced missing value probabilities from 0.01 to 0.2. Experiments were also undertaken on the yeast dataset, which comprised 1.7% actual missing values, to test the hypothesis that CMVE performed better not only for randomly occurring but also for a real distribution of missing values. The results confirmed that CMVE consistently demonstrated superior and robust estimation capability of missing values compared with other methods for both series types of data, for the same order of computational complexity.
A concise theoretical framework has also been formulated to validate the improved performance of the CMVE algorithm. The CMVE software is available upon request from the authors.
Method for hyperspectral imagery exploitation and pixel spectral unmixing
NASA Technical Reports Server (NTRS)
Lin, Ching-Fang (Inventor)
2003-01-01
An efficient hybrid approach to exploit hyperspectral imagery and unmix spectral pixels. This hybrid approach uses a genetic algorithm to solve for the abundance vector of the first pixel of a hyperspectral image cube. This abundance vector is used as the initial state in a robust filter to derive the abundance estimate for the next pixel. By using a Kalman filter, the abundance estimate for a pixel can be obtained in a one-iteration procedure, which is much faster than the genetic algorithm. The output of the robust filter is fed to the genetic algorithm again to derive an accurate abundance estimate for the current pixel. Using the robust filter solution as the starting point of the genetic algorithm speeds up the evolution of the genetic algorithm. After obtaining the accurate abundance estimate, the procedure goes to the next pixel and uses the output of the genetic algorithm as the previous state estimate to derive the abundance estimate for this pixel using the robust filter. The genetic algorithm is then used again to derive an accurate abundance estimate efficiently, based on the robust filter solution. This iteration continues until all pixels in the hyperspectral image cube have been processed.
ERIC Educational Resources Information Center
Tanner-Smith, Emily E.; Tipton, Elizabeth
2014-01-01
Methodologists have recently proposed robust variance estimation as one way to handle dependent effect sizes in meta-analysis. Software macros for robust variance estimation in meta-analysis are currently available for Stata (StataCorp LP, College Station, TX, USA) and SPSS (IBM, Armonk, NY, USA), yet there is little guidance for authors regarding…
Tissue Viscoelasticity Imaging Using Vibration and Ultrasound Coupler Gel
NASA Astrophysics Data System (ADS)
Yamakawa, Makoto; Shiina, Tsuyoshi
2012-07-01
In tissue diagnosis, both elasticity and viscosity are important indexes. Therefore, we propose a method for evaluating tissue viscoelasticity by applying vibration that is usually performed in elastography and using an ultrasound coupler gel with known viscoelasticity. In this method, we use three viscoelasticity parameters based on the coupler strain and tissue strain: the strain ratio as an elasticity parameter, and the phase difference and the normalized hysteresis loop area as viscosity parameters. In the agar phantom experiment, using these viscoelasticity parameters, we were able to estimate the viscoelasticity distribution of the phantom. In particular, the strain ratio and the phase difference were robust to strain estimation error.
Robust shot-noise measurement for continuous-variable quantum key distribution
NASA Astrophysics Data System (ADS)
Kunz-Jacques, Sébastien; Jouguet, Paul
2015-02-01
We study a practical method to measure the shot noise in real time in continuous-variable quantum key distribution systems. The amount of secret key that can be extracted from the raw statistics depends strongly on this quantity since it affects in particular the computation of the excess noise (i.e., noise in excess of the shot noise) added by an eavesdropper on the quantum channel. Some powerful quantum hacking attacks relying on faking the estimated value of the shot noise to hide an intercept and resend strategy were proposed. Here, we provide experimental evidence that our method can defeat the saturation attack and the wavelength attack.
NASA Astrophysics Data System (ADS)
Raj, Rahul; Hamm, Nicholas Alexander Samuel; van der Tol, Christiaan; Stein, Alfred
2016-03-01
Gross primary production (GPP) can be separated from flux tower measurements of net ecosystem exchange (NEE) of CO2. This is used increasingly to validate process-based simulators and remote-sensing-derived estimates of simulated GPP at various time steps. Proper validation includes the uncertainty associated with this separation. In this study, uncertainty assessment was done in a Bayesian framework. It was applied to data from the Speulderbos forest site, The Netherlands. We estimated the uncertainty in GPP at half-hourly time steps, using a non-rectangular hyperbola (NRH) model for its separation from the flux tower measurements. The NRH model provides a robust empirical relationship between radiation and GPP. It includes the degree of curvature of the light response curve, radiation and temperature. Parameters of the NRH model were fitted to the measured NEE data for every 10-day period during the growing season (April to October) in 2009. We defined the prior distribution of each NRH parameter and used Markov chain Monte Carlo (MCMC) simulation to estimate the uncertainty in the separated GPP from the posterior distribution at half-hourly time steps. This time series also allowed us to estimate the uncertainty at daily time steps. We compared the informative with the non-informative prior distributions of the NRH parameters and found that both choices produced similar posterior distributions of GPP. This will provide relevant and important information for the validation of process-based simulators in the future. Furthermore, the obtained posterior distributions of NEE and the NRH parameters are of interest for a range of applications.
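As a rough illustration of the light response curve mentioned in the abstract above, the following sketch evaluates the commonly used form of a non-rectangular hyperbola. It is not the authors' code: the temperature dependence described in the abstract is omitted, and the function name and the parameter names `alpha` (initial slope), `p_max` (light-saturated rate) and `theta` (curvature) are assumptions chosen for this example.

```python
import math


def nrh_gpp(radiation, alpha, p_max, theta):
    """Evaluate a non-rectangular hyperbola light response curve.

    Hypothetical parameterisation (not taken from the paper):
      radiation : incoming radiation, >= 0
      alpha     : initial slope of the curve at low light
      p_max     : asymptotic (light-saturated) rate
      theta     : curvature, 0 < theta <= 1
    The temperature term mentioned in the abstract is left out.
    """
    if not 0.0 < theta <= 1.0:
        raise ValueError("theta must lie in (0, 1]")
    s = alpha * radiation + p_max
    # The discriminant is non-negative for theta <= 1; clamp to guard rounding.
    disc = max(s * s - 4.0 * theta * alpha * radiation * p_max, 0.0)
    # Smaller root of theta*G^2 - s*G + alpha*radiation*p_max = 0.
    return (s - math.sqrt(disc)) / (2.0 * theta)


if __name__ == "__main__":
    for light in (0.0, 100.0, 500.0, 2000.0):
        print(light, round(nrh_gpp(light, alpha=0.05, p_max=30.0, theta=0.7), 3))
```

With `theta = 1` the curve reduces to the minimum of `alpha * radiation` and `p_max`, which is a convenient sanity check; smaller values of `theta` give a smoother transition to saturation.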
Covariate selection with group lasso and doubly robust estimation of causal effects
Koch, Brandon; Vock, David M.; Wolfson, Julian
2017-01-01
Summary: The efficiency of doubly robust estimators of the average causal effect (ACE) of a treatment can be improved by including in the treatment and outcome models only those covariates which are related to both treatment and outcome (i.e., confounders) or related only to the outcome. However, it is often challenging to identify such covariates among the large number that may be measured in a given study. In this paper, we propose GLiDeR (Group Lasso and Doubly Robust Estimation), a novel variable selection technique for identifying confounders and predictors of outcome using an adaptive group lasso approach that simultaneously performs coefficient selection, regularization, and estimation across the treatment and outcome models. The selected variables and corresponding coefficient estimates are used in a standard doubly robust ACE estimator. We provide asymptotic results showing that, for a broad class of data generating mechanisms, GLiDeR yields a consistent estimator of the ACE when either the outcome or treatment model is correctly specified. A comprehensive simulation study shows that GLiDeR is more efficient than doubly robust methods using standard variable selection techniques and has substantial computational advantages over a recently proposed doubly robust Bayesian model averaging method. We illustrate our method by estimating the causal treatment effect of bilateral versus single-lung transplant on forced expiratory volume in one year after transplant using an observational registry. PMID:28636276
Covariate selection with group lasso and doubly robust estimation of causal effects.
Koch, Brandon; Vock, David M; Wolfson, Julian
2018-03-01
The efficiency of doubly robust estimators of the average causal effect (ACE) of a treatment can be improved by including in the treatment and outcome models only those covariates which are related to both treatment and outcome (i.e., confounders) or related only to the outcome. However, it is often challenging to identify such covariates among the large number that may be measured in a given study. In this article, we propose GLiDeR (Group Lasso and Doubly Robust Estimation), a novel variable selection technique for identifying confounders and predictors of outcome using an adaptive group lasso approach that simultaneously performs coefficient selection, regularization, and estimation across the treatment and outcome models. The selected variables and corresponding coefficient estimates are used in a standard doubly robust ACE estimator. We provide asymptotic results showing that, for a broad class of data generating mechanisms, GLiDeR yields a consistent estimator of the ACE when either the outcome or treatment model is correctly specified. A comprehensive simulation study shows that GLiDeR is more efficient than doubly robust methods using standard variable selection techniques and has substantial computational advantages over a recently proposed doubly robust Bayesian model averaging method. We illustrate our method by estimating the causal treatment effect of bilateral versus single-lung transplant on forced expiratory volume in one year after transplant using an observational registry. © 2017, The International Biometric Society.
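The "standard doubly robust ACE estimator" referred to in the abstract above is usually taken to be the augmented inverse-probability-weighted (AIPW) form. The sketch below shows that form only; it does not implement GLiDeR's group lasso selection step, and it assumes the fitted propensity scores and outcome-model predictions are supplied by the caller. All names are hypothetical.

```python
from statistics import fmean


def aipw_ace(y, a, propensity, mu1, mu0):
    """Augmented IPW estimate of the average causal effect.

    Hypothetical inputs, one entry per subject:
      y          : observed outcomes
      a          : treatment indicators (1 = treated, 0 = control)
      propensity : fitted P(A = 1 | covariates), strictly between 0 and 1
      mu1, mu0   : outcome-model predictions under treatment and control
    """
    terms = []
    for yi, ai, ei, m1, m0 in zip(y, a, propensity, mu1, mu0):
        # Outcome-model prediction plus an inverse-probability-weighted
        # residual correction, for each treatment arm.
        treated = m1 + ai * (yi - m1) / ei
        control = m0 + (1 - ai) * (yi - m0) / (1 - ei)
        terms.append(treated - control)
    return fmean(terms)


if __name__ == "__main__":
    y = [3, 1, 4, 2]
    a = [1, 0, 1, 0]
    e = [0.5, 0.5, 0.5, 0.5]
    print(aipw_ace(y, a, e, mu1=[3, 5, 4, 6], mu0=[0, 1, 2, 2]))
```

When the outcome predictions match the observed outcomes exactly, the residual corrections vanish and the estimate is the mean of `mu1 - mu0`; when the outcome model is set to zero, the expression reduces to plain inverse-probability weighting. These two limits illustrate why the estimator stays consistent if either model is correct.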
Using Robust Variance Estimation to Combine Multiple Regression Estimates with Meta-Analysis
ERIC Educational Resources Information Center
Williams, Ryan
2013-01-01
The purpose of this study was to explore the use of robust variance estimation for combining commonly specified multiple regression models and for combining sample-dependent focal slope estimates from diversely specified models. The proposed estimator obviates traditionally required information about the covariance structure of the dependent…
Technical note: Design flood under hydrological uncertainty
NASA Astrophysics Data System (ADS)
Botto, Anna; Ganora, Daniele; Claps, Pierluigi; Laio, Francesco
2017-07-01
Planning and verification of hydraulic infrastructures require a design estimate of hydrologic variables, usually provided by frequency analysis, and neglecting hydrologic uncertainty. However, when hydrologic uncertainty is accounted for, the design flood value for a specific return period is no longer a unique value, but is represented by a distribution of values. As a consequence, the design flood is no longer univocally defined, making the design process undetermined. The Uncertainty Compliant Design Flood Estimation (UNCODE) procedure is a novel approach that, starting from a range of possible design flood estimates obtained in uncertain conditions, converges to a single design value. This is obtained through a cost-benefit criterion with additional constraints that is numerically solved in a simulation framework. This paper contributes to promoting a practical use of the UNCODE procedure without resorting to numerical computation. A modified procedure is proposed by using a correction coefficient that modifies the standard (i.e., uncertainty-free) design value on the basis of sample length and return period only. The procedure is robust and parsimonious, as it does not require additional parameters with respect to the traditional uncertainty-free analysis. Simple equations to compute the correction term are provided for a number of probability distributions commonly used to represent the flood frequency curve. The UNCODE procedure, when coupled with this simple correction factor, provides a robust way to manage the hydrologic uncertainty and to go beyond the use of traditional safety factors. With all the other parameters being equal, an increase in the sample length reduces the correction factor, and thus the construction costs, while still keeping the same safety level.
Quantile rank maps: a new tool for understanding individual brain development.
Chen, Huaihou; Kelly, Clare; Castellanos, F Xavier; He, Ye; Zuo, Xi-Nian; Reiss, Philip T
2015-05-01
We propose a novel method for neurodevelopmental brain mapping that displays how an individual's values for a quantity of interest compare with age-specific norms. By estimating smoothly age-varying distributions at a set of brain regions of interest, we derive age-dependent region-wise quantile ranks for a given individual, which can be presented in the form of a brain map. Such quantile rank maps could potentially be used for clinical screening. Bootstrap-based confidence intervals are proposed for the quantile rank estimates. We also propose a recalibrated Kolmogorov-Smirnov test for detecting group differences in the age-varying distribution. This test is shown to be more robust to model misspecification than a linear regression-based test. The proposed methods are applied to brain imaging data from the Nathan Kline Institute Rockland Sample and from the Autism Brain Imaging Data Exchange (ABIDE) sample. Copyright © 2015 Elsevier Inc. All rights reserved.
Moderation analysis with missing data in the predictors.
Zhang, Qian; Wang, Lijuan
2017-12-01
The most widely used statistical model for conducting moderation analysis is the moderated multiple regression (MMR) model. In MMR modeling, missing data could pose a challenge, mainly because the interaction term is a product of two or more variables and thus is a nonlinear function of the involved variables. In this study, we consider a simple MMR model, where the effect of the focal predictor X on the outcome Y is moderated by a moderator U. The primary interest is to find ways of estimating and testing the moderation effect with the existence of missing data in X. We mainly focus on cases when X is missing completely at random (MCAR) and missing at random (MAR). Three methods are compared: (a) Normal-distribution-based maximum likelihood estimation (NML); (b) Normal-distribution-based multiple imputation (NMI); and (c) Bayesian estimation (BE). Via simulations, we found that NML and NMI could lead to biased estimates of moderation effects under MAR missingness mechanism. The BE method outperformed NMI and NML for MMR modeling with missing data in the focal predictor, missingness depending on the moderator and/or auxiliary variables, and correctly specified distributions for the focal predictor. In addition, more robust BE methods are needed in terms of the distribution mis-specification problem of the focal predictor. An empirical example was used to illustrate the applications of the methods with a simple sensitivity analysis. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
NASA Astrophysics Data System (ADS)
Erazo, Kalil; Nagarajaiah, Satish
2017-06-01
In this paper an offline approach for output-only Bayesian identification of stochastic nonlinear systems is presented. The approach is based on a re-parameterization of the joint posterior distribution of the parameters that define a postulated state-space stochastic model class. In the re-parameterization the state predictive distribution is included, marginalized, and estimated recursively in a state estimation step using an unscented Kalman filter, bypassing state augmentation as required by existing online methods. In applications expectations of functions of the parameters are of interest, which requires the evaluation of potentially high-dimensional integrals; Markov chain Monte Carlo is adopted to sample the posterior distribution and estimate the expectations. The proposed approach is suitable for nonlinear systems subjected to non-stationary inputs whose realization is unknown, and that are modeled as stochastic processes. Numerical verification and experimental validation examples illustrate the effectiveness and advantages of the approach, including: (i) an increased numerical stability with respect to augmented-state unscented Kalman filtering, avoiding divergence of the estimates when the forcing input is unmeasured; (ii) the ability to handle arbitrary prior and posterior distributions. The experimental validation of the approach is conducted using data from a large-scale structure tested on a shake table. It is shown that the approach is robust to inherent modeling errors in the description of the system and forcing input, providing accurate prediction of the dynamic response when the excitation history is unknown.
Graphical Evaluation of the Ridge-Type Robust Regression Estimators in Mixture Experiments
Erkoc, Ali; Emiroglu, Esra
2014-01-01
In mixture experiments, estimation of the parameters is generally based on ordinary least squares (OLS). However, in the presence of multicollinearity and outliers, OLS can result in very poor estimates. In this case, effects due to the combined outlier-multicollinearity problem can be reduced to a certain extent by using alternative approaches. One of these approaches is to use biased-robust regression techniques for the estimation of parameters. In this paper, we evaluate various ridge-type robust estimators in cases where multicollinearity and outliers are both present during the analysis of mixture experiments. Also, for selection of the biasing parameter, we use fraction of design space plots for evaluating the effect of the ridge-type robust estimators with respect to the scaled mean squared error of prediction. The suggested graphical approach is illustrated on the Hald cement data set. PMID:25202738
Graphical evaluation of the ridge-type robust regression estimators in mixture experiments.
Erkoc, Ali; Emiroglu, Esra; Akay, Kadri Ulas
2014-01-01
In mixture experiments, estimation of the parameters is generally based on ordinary least squares (OLS). However, in the presence of multicollinearity and outliers, OLS can result in very poor estimates. In this case, effects due to the combined outlier-multicollinearity problem can be reduced to a certain extent by using alternative approaches. One of these approaches is to use biased-robust regression techniques for the estimation of parameters. In this paper, we evaluate various ridge-type robust estimators in cases where multicollinearity and outliers are both present during the analysis of mixture experiments. Also, for selection of the biasing parameter, we use fraction of design space plots for evaluating the effect of the ridge-type robust estimators with respect to the scaled mean squared error of prediction. The suggested graphical approach is illustrated on the Hald cement data set.
A wavelet-based statistical analysis of FMRI data: I. motivation and data distribution modeling.
Dinov, Ivo D; Boscardin, John W; Mega, Michael S; Sowell, Elizabeth L; Toga, Arthur W
2005-01-01
We propose a new method for statistical analysis of functional magnetic resonance imaging (fMRI) data. The discrete wavelet transformation is employed as a tool for efficient and robust signal representation. We use structural magnetic resonance imaging (MRI) and fMRI to empirically estimate the distribution of the wavelet coefficients of the data both across individuals and spatial locations. An anatomical subvolume probabilistic atlas is used to tessellate the structural and functional signals into smaller regions each of which is processed separately. A frequency-adaptive wavelet shrinkage scheme is employed to obtain essentially optimal estimations of the signals in the wavelet space. The empirical distributions of the signals on all the regions are computed in a compressed wavelet space. These are modeled by heavy-tail distributions because their histograms exhibit slower tail decay than the Gaussian. We discovered that the Cauchy, Bessel K Forms, and Pareto distributions provide the most accurate asymptotic models for the distribution of the wavelet coefficients of the data. Finally, we propose a new model for statistical analysis of functional MRI data using this atlas-based wavelet space representation. In the second part of our investigation, we will apply this technique to analyze a large fMRI dataset involving repeated presentation of sensory-motor response stimuli in young, elderly, and demented subjects.
Bean, William T.; Stafford, Robert; Butterfield, H. Scott; Brashares, Justin S.
2014-01-01
Species distributions are known to be limited by biotic and abiotic factors at multiple temporal and spatial scales. Species distribution models, however, frequently assume a population at equilibrium in both time and space. Studies of habitat selection have repeatedly shown the difficulty of estimating resource selection if the scale or extent of analysis is incorrect. Here, we present a multi-step approach to estimate the realized and potential distribution of the endangered giant kangaroo rat. First, we estimate the potential distribution by modeling suitability at a range-wide scale using static bioclimatic variables. We then examine annual changes in extent at a population level. We define “available” habitat based on the total suitable potential distribution at the range-wide scale. Then, within the available habitat, we model changes in population extent driven by multiple measures of resource availability. By modeling distributions for a population with robust estimates of population extent through time, and ecologically relevant predictor variables, we improved the predictive ability of SDMs, as well as revealed an unanticipated relationship between population extent and precipitation at multiple scales. At a range-wide scale, the best model indicated the giant kangaroo rat was limited to areas that received little to no precipitation in the summer months. In contrast, the best model for shorter time scales showed a positive relation with resource abundance, driven by precipitation, in the current and previous year. These results suggest that the distribution of the giant kangaroo rat was limited to the wettest parts of the drier areas within the study region.
This multi-step approach reinforces the differing relationship species may have with environmental variables at different scales, provides a novel method for defining “available” habitat in habitat selection studies, and suggests a way to create distribution models at spatial and temporal scales relevant to theoretical and applied ecologists. PMID:25237807
A modified weighted function method for parameter estimation of Pearson type three distribution
NASA Astrophysics Data System (ADS)
Liang, Zhongmin; Hu, Yiming; Li, Binquan; Yu, Zhongbo
2014-04-01
In this paper, an unconventional method called Modified Weighted Function (MWF) is presented for the conventional moment estimation of a probability distribution function. The aim of MWF is to reduce the estimation of the coefficient of variation (CV) and coefficient of skewness (CS) from the original higher-order moment computations to first-order moment calculations. The estimators for CV and CS of the Pearson type three distribution function (PE3) were derived by weighting the moments of the distribution with two weight functions, which were constructed by combining two negative exponential-type functions. The selection of these weight functions was based on two considerations: (1) to relate weight functions to sample size in order to reflect the relationship between the quantity of sample information and the role of weight function and (2) to allocate more weights to data close to medium-tail positions in a sample series ranked in an ascending order. A Monte-Carlo experiment was conducted to simulate a large number of samples upon which statistical properties of MWF were investigated. For the PE3 parent distribution, results of MWF were compared to those of the original Weighted Function (WF) and Linear Moments (L-M). The results indicate that MWF was superior to WF and slightly better than L-M, in terms of statistical unbiasedness and effectiveness. In addition, the robustness of MWF, WF, and L-M was compared in a Monte-Carlo experiment in which samples were drawn from the Log-Pearson type three distribution (LPE3), the three-parameter Log-Normal distribution (LN3), and the Generalized Extreme Value distribution (GEV), respectively, but were all treated as samples from the PE3 distribution. The results show that, in terms of statistical unbiasedness, no single method has an overwhelming advantage among MWF, WF, and L-M, while in terms of statistical effectiveness, MWF is superior to WF and L-M.
Estimating the effectiveness of further sampling in species inventories
Keating, K.A.; Quinn, J.F.; Ivie, M.A.; Ivie, L.L.
1998-01-01
Estimators of the number of additional species expected in the next Δn samples offer a potentially important tool for improving cost-effectiveness of species inventories but are largely untested. We used Monte Carlo methods to compare 11 such estimators, across a range of community structures and sampling regimes, and validated our results, where possible, using empirical data from vascular plant and beetle inventories from Glacier National Park, Montana, USA. We found that B. Efron and R. Thisted's 1976 negative binomial estimator was most robust to differences in community structure and that it was among the most accurate estimators when sampling was from model communities with structures resembling the large, heterogeneous communities that are the likely targets of major inventory efforts. Other estimators may be preferred under specific conditions, however. For example, when sampling was from model communities with highly even species-abundance distributions, estimates based on the Michaelis-Menten model were most accurate; when sampling was from moderately even model communities with S=10 species or communities with highly uneven species-abundance distributions, estimates based on Gleason's (1922) species-area model were most accurate. We suggest that use of such methods in species inventories can help improve cost-effectiveness by providing an objective basis for redirecting sampling to more-productive sites, methods, or time periods as the expectation of detecting additional species becomes unacceptably low.
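To make the idea of "additional species expected in further samples" concrete, here is a minimal sketch based on a Michaelis-Menten species accumulation curve, one of the models named in the abstract above. It assumes the two curve parameters have already been fitted to inventory data; the fitting step, the function names, and the parameter names `s_max` and `half_sat` are assumptions for this example and are not taken from the paper.

```python
def mm_expected_species(n_samples, s_max, half_sat):
    """Michaelis-Menten accumulation curve S(n) = s_max * n / (half_sat + n).

    Hypothetical parameters:
      s_max    : asymptotic species richness
      half_sat : number of samples at which half of s_max is expected
    """
    return s_max * n_samples / (half_sat + n_samples)


def expected_new_species(n_samples, extra_samples, s_max, half_sat):
    """Expected number of additional species in the next `extra_samples`."""
    return (mm_expected_species(n_samples + extra_samples, s_max, half_sat)
            - mm_expected_species(n_samples, s_max, half_sat))


if __name__ == "__main__":
    # After 10 samples, how many new species might 10 more samples add?
    print(expected_new_species(10, 10, s_max=100.0, half_sat=10.0))
```

Because the curve flattens as the number of samples grows, the expected gain from a fixed amount of extra effort shrinks, which is the quantity an inventory planner would compare against the cost of further sampling.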
Is Coefficient Alpha Robust to Non-Normal Data?
Sheng, Yanyan; Sheng, Zhaohui
2011-01-01
Coefficient alpha has been a widely used measure by which internal consistency reliability is assessed. In addition to essential tau-equivalence and uncorrelated errors, normality has been noted as another important assumption for alpha. Earlier work on evaluating this assumption considered either exclusively non-normal error score distributions, or limited conditions. In view of this and the availability of advanced methods for generating univariate non-normal data, Monte Carlo simulations were conducted to show that non-normal distributions for true or error scores do create problems for using alpha to estimate the internal consistency reliability. The sample coefficient alpha is affected by leptokurtic true score distributions, or skewed and/or kurtotic error score distributions. Increased sample sizes, not test lengths, help improve the accuracy, bias, or precision of using it with non-normal data. PMID:22363306
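For readers who want the quantity under study in concrete form, sample coefficient alpha for k items is k/(k-1) * (1 - sum of item variances / variance of total scores). A minimal sketch follows, using only the Python standard library. The function name and the toy score matrix are assumptions for illustration, not material from the study.

```python
from statistics import variance


def coefficient_alpha(scores):
    """Sample coefficient alpha.

    scores: list of respondents, each a list of k item scores.
    """
    k = len(scores[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([row[j] for row in scores]) for j in range(k)]
    # Sample variance of the total (summed) score.
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Toy data: three items that differ only by a constant shift,
# so they are perfectly consistent with one another.
scores = [[1, 2, 3], [2, 3, 4], [3, 4, 5], [4, 5, 6]]
alpha = coefficient_alpha(scores)
print(alpha)
```

A simulation like the one described above would repeat this computation over many samples drawn from non-normal true-score and error-score distributions and compare the results with the known population reliability.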
On robust parameter estimation in brain-computer interfacing
NASA Astrophysics Data System (ADS)
Samek, Wojciech; Nakajima, Shinichi; Kawanabe, Motoaki; Müller, Klaus-Robert
2017-12-01
Objective. The reliable estimation of parameters such as mean or covariance matrix from noisy and high-dimensional observations is a prerequisite for successful application of signal processing and machine learning algorithms in brain-computer interfacing (BCI). This challenging task becomes significantly more difficult if the data set contains outliers, e.g. due to subject movements, eye blinks or loose electrodes, as they may heavily bias the estimation and the subsequent statistical analysis. Although various robust estimators have been developed to tackle the outlier problem, they ignore important structural information in the data and thus may not be optimal. Typical structural elements in BCI data are the trials consisting of a few hundred EEG samples and indicating the start and end of a task. Approach. This work discusses the parameter estimation problem in BCI and introduces a novel hierarchical view on robustness which naturally comprises different types of outlierness occurring in structured data. Furthermore, the class of minimum divergence estimators is reviewed and a robust mean and covariance estimator for structured data is derived and evaluated with simulations and on a benchmark data set. Main results. The results show that state-of-the-art BCI algorithms benefit from robustly estimated parameters. Significance. Since parameter estimation is an integral part of various machine learning algorithms, the presented techniques are applicable to many problems beyond BCI.
Adaptive torque estimation of robot joint with harmonic drive transmission
NASA Astrophysics Data System (ADS)
Shi, Zhiguo; Li, Yuankai; Liu, Guangjun
2017-11-01
Robot joint torque estimation using input and output position measurements is a promising technique, but the result may be affected by the load variation of the joint. In this paper, a torque estimation method with adaptive robustness and optimality adjustment according to load variation is proposed for robot joint with harmonic drive transmission. Based on a harmonic drive model and a redundant adaptive robust Kalman filter (RARKF), the proposed approach can adapt torque estimation filtering optimality and robustness to the load variation by self-tuning the filtering gain and self-switching the filtering mode between optimal and robust. The redundant factor of RARKF is designed as a function of the motor current for tolerating the modeling error and load-dependent filtering mode switching. The proposed joint torque estimation method has been experimentally studied in comparison with a commercial torque sensor and two representative filtering methods. The results have demonstrated the effectiveness of the proposed torque estimation technique.
Robust estimation for ordinary differential equation models.
Cao, J; Wang, L; Xu, J
2011-12-01
Applied scientists often like to use ordinary differential equations (ODEs) to model complex dynamic processes that arise in biology, engineering, medicine, and many other areas. It is interesting but challenging to estimate ODE parameters from noisy data, especially when the data have some outliers. We propose a robust method to address this problem. The dynamic process is represented with a nonparametric function, which is a linear combination of basis functions. The nonparametric function is estimated by a robust penalized smoothing method. The penalty term is defined with the parametric ODE model, which controls the roughness of the nonparametric function and maintains the fidelity of the nonparametric function to the ODE model. The basis coefficients and ODE parameters are estimated in two nested levels of optimization. The coefficient estimates are treated as an implicit function of ODE parameters, which enables one to derive the analytic gradients for optimization using the implicit function theorem. Simulation studies show that the robust method gives satisfactory estimates for the ODE parameters from noisy data with outliers. The robust method is demonstrated by estimating a predator-prey ODE model from real ecological data. © 2011, The International Biometric Society.
Liu, W; Mohan, R
2012-06-01
Proton dose distributions, IMPT in particular, are highly sensitive to setup and range uncertainties. We report a novel method, based on per-voxel standard deviation (SD) of dose distributions, to evaluate the robustness of proton plans and to robustly optimize IMPT plans to render them less sensitive to uncertainties. For each optimization iteration, nine dose distributions are computed - the nominal one, and one each for ± setup uncertainties along x, y and z axes and for ± range uncertainty. SD of dose in each voxel is used to create an SD-volume histogram (SVH) for each structure. SVH may be considered a quantitative representation of the robustness of the dose distribution. For optimization, the desired robustness may be specified in terms of an SD-volume (SV) constraint on the CTV and incorporated as a term in the objective function. Results of optimization with and without this constraint were compared in terms of plan optimality and robustness using the so-called 'worst case' dose distributions, which are obtained by assigning the lowest among the nine doses to each voxel in the clinical target volume (CTV) and the highest to normal tissue voxels outside the CTV. The SVH curve and the area under it for each structure were used as quantitative measures of robustness. The penalty parameter of the SV constraint may be varied to control the tradeoff between robustness and plan optimality. We applied these methods to one case each of H&N and lung. In both cases, we found that imposing the SV constraint improved plan robustness but at the cost of normal tissue sparing. SVH-based optimization and evaluation is an effective tool for robustness evaluation and robust optimization of IMPT plans. Studies need to be conducted to test the methods for larger cohorts of patients and for other sites.
This research is supported by National Cancer Institute (NCI) grant P01CA021239, the University Cancer Foundation via the Institutional Research Grant program at the University of Texas MD Anderson Cancer Center, and MD Anderson’s cancer center support grant CA016672. © 2012 American Association of Physicists in Medicine.
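The per-voxel quantities described in this abstract can be sketched in a few lines. The code below is a toy illustration under stated assumptions: doses are plain Python lists indexed by voxel, population SD is used (the abstract does not say which SD convention applies), and three scenarios stand in for the nine described. The function names are hypothetical.

```python
from statistics import pstdev


def per_voxel_sd(scenarios):
    """SD of dose in each voxel across the uncertainty scenarios."""
    return [pstdev(voxel_doses) for voxel_doses in zip(*scenarios)]


def worst_case_dose(scenarios, in_ctv):
    """Lowest scenario dose inside the CTV, highest outside it."""
    return [
        min(voxel_doses) if target else max(voxel_doses)
        for voxel_doses, target in zip(zip(*scenarios), in_ctv)
    ]


def sd_volume_histogram(sd, thresholds):
    """Fraction of voxels whose SD is at least each threshold."""
    return [sum(s >= t for s in sd) / len(sd) for t in thresholds]


# Toy data: 3 scenarios x 2 voxels (voxel 0 in the CTV, voxel 1 normal tissue).
scenarios = [[60.0, 20.0], [58.0, 24.0], [62.0, 22.0]]
in_ctv = [True, False]

sd = per_voxel_sd(scenarios)
worst = worst_case_dose(scenarios, in_ctv)
svh = sd_volume_histogram(sd, thresholds=[1.0, 2.0])
print(sd, worst, svh)
```

In an optimizer, an SV constraint would penalize the histogram value at a chosen SD threshold for the CTV. How that term is weighted is the penalty parameter the abstract mentions.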
A decentralized mechanism for improving the functional robustness of distribution networks.
Shi, Benyun; Liu, Jiming
2012-10-01
Most real-world distribution systems can be modeled as distribution networks, where a commodity can flow from source nodes to sink nodes through junction nodes. One of the fundamental characteristics of distribution networks is the functional robustness, which reflects the ability of maintaining its function in the face of internal or external disruptions. In view of the fact that most distribution networks do not have any centralized control mechanisms, we consider the problem of how to improve the functional robustness in a decentralized way. To achieve this goal, we study two important problems: 1) how to formally measure the functional robustness, and 2) how to improve the functional robustness of a network based on the local interaction of its nodes. First, we derive a utility function in terms of network entropy to characterize the functional robustness of a distribution network. Second, we propose a decentralized network pricing mechanism, where each node need only communicate with its distribution neighbors by sending a "price" signal to its upstream neighbors and receiving "price" signals from its downstream neighbors. By doing so, each node can determine its outflows by maximizing its own payoff function. Our mathematical analysis shows that the decentralized pricing mechanism can produce results equivalent to those of an ideal centralized maximization with complete information. Finally, to demonstrate the properties of our mechanism, we carry out a case study on the U.S. natural gas distribution network. The results validate the convergence and effectiveness of our mechanism when comparing it with an existing algorithm.
Kery, M.; Royle, J. Andrew; Schmid, Hans; Schaub, M.; Volet, B.; Hafliger, G.; Zbinden, N.
2010-01-01
Species' assessments must frequently be derived from opportunistic observations made by volunteers (i.e., citizen scientists). Interpretation of the resulting data to estimate population trends is plagued with problems, including teasing apart genuine population trends from variations in observation effort. We devised a way to correct for annual variation in effort when estimating trends in occupancy (species distribution) from faunal or floral databases of opportunistic observations. First, for all surveyed sites, detection histories (i.e., strings of detection-nondetection records) are generated. Within-season replicate surveys provide information on the detectability of an occupied site. Detectability directly represents observation effort; hence, estimating detectability means correcting for observation effort. Second, site-occupancy models are applied directly to the detection-history data set (i.e., without aggregation by site and year) to estimate detectability and species distribution (occupancy, i.e., the true proportion of sites where a species occurs). Site-occupancy models also provide unbiased estimators of components of distributional change (i.e., colonization and extinction rates). We illustrate our method with data from a large citizen-science project in Switzerland in which field ornithologists record opportunistic observations. We analyzed data collected on four species: the widespread Kingfisher (Alcedo atthis) and Sparrowhawk (Accipiter nisus) and the scarce Rock Thrush (Monticola saxatilis) and Wallcreeper (Tichodroma muraria). Our method requires that all observed species are recorded. Detectability was <1 and varied over the years. Simulations suggested some robustness, but we advocate recording complete species lists (checklists), rather than recording individual records of single species.
The representation of observation effort with its effect on detectability provides a solution to the problem of differences in effort encountered when extracting trend information from haphazard observations. We expect our method is widely applicable for global biodiversity monitoring and modeling of species distributions. © 2010 Society for Conservation Biology.
NASA Astrophysics Data System (ADS)
Manolakis, Dimitris G.
2004-10-01
The linear mixing model is widely used in hyperspectral imaging applications to model the reflectance spectra of mixed pixels in the SWIR atmospheric window or the radiance spectra of plume gases in the LWIR atmospheric window. In both cases it is important to detect the presence of materials or gases and then estimate their amount, if they are present. The detection and estimation algorithms available for these tasks are related but they are not identical. The objective of this paper is to theoretically investigate how the heavy tails observed in hyperspectral background data affect the quality of abundance estimates and how the F-test, used for endmember selection, is robust to the presence of heavy tails when the model fits the data.
Evaluation of the Performance of the Distributed Phased-MIMO Sonar.
Pan, Xiang; Jiang, Jingning; Wang, Nan
2017-01-11
A broadband signal model is proposed for a distributed multiple-input multiple-output (MIMO) sonar system consisting of two transmitters and a receiving linear array. Transmitters are widely separated to illuminate the different aspects of an extended target of interest. The beamforming technique is utilized at the reception ends for enhancement of weak target echoes. A MIMO detector is designed with the estimated target position parameters within the generalized likelihood ratio test (GLRT) framework. For the high signal-to-noise ratio case, the detection performance of the MIMO system is better than that of the phased-array system in the numerical simulations and the tank experiments. The robustness of the distributed phased-MIMO sonar system is further demonstrated in localization of a target in at-lake experiments. PMID:28085071
Integrated direct/indirect adaptive robust motion trajectory tracking control of pneumatic cylinders
NASA Astrophysics Data System (ADS)
Meng, Deyuan; Tao, Guoliang; Zhu, Xiaocong
2013-09-01
This paper studies the precision motion trajectory tracking control of a pneumatic cylinder driven by a proportional-directional control valve. An integrated direct/indirect adaptive robust controller is proposed. The controller employs a physical model based indirect-type parameter estimation to obtain reliable estimates of unknown model parameters, and utilises a robust control method with dynamic compensation type fast adaptation to attenuate the effects of parameter estimation errors, unmodelled dynamics and disturbances. Due to the use of projection mapping, the robust control law and the parameter adaption algorithm can be designed separately. Since the system model uncertainties are unmatched, the recursive backstepping technology is adopted to design the robust control law. Extensive comparative experimental results are presented to illustrate the effectiveness of the proposed controller and its performance robustness to parameter variations and sudden disturbances.
The effectiveness of robust RMCD control chart as outliers’ detector
NASA Astrophysics Data System (ADS)
Darmanto; Astutik, Suci
2017-12-01
A well-known control chart for monitoring a multivariate process is Hotelling’s T², whose parameters are estimated classically; it is very sensitive to outliers and is marred by their masking and swamping effects. To overcome this, robust estimators are strongly recommended. One robust estimator is the re-weighted minimum covariance determinant (RMCD), which has the same robust characteristics as MCD. In this paper, effectiveness means the accuracy of the RMCD control chart in detecting outliers as real outliers; in other words, how effectively the chart can identify outliers and remove their masking and swamping effects. We assessed the effectiveness of the robust control chart by simulation under different scenarios: sample size n, proportion of outliers, and number p of quality characteristics. We found that in some scenarios the RMCD robust control chart works effectively.
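The statistic being charted is T² = (x - m)' S⁻¹ (x - m), where m and S are the estimated mean vector and covariance matrix. The sketch below computes the classical version for two quality characteristics only, with the 2x2 inverse written out by hand. It does not implement MCD or RMCD. A robust chart would swap in robust estimates of m and S at the marked step. The function names and data are illustrative assumptions.

```python
def mean_cov_2d(points):
    """Classical sample mean and covariance (sxx, sxy, syy) of 2-D points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)


def hotelling_t2(point, mean, cov):
    """T^2 distance of one observation, using the closed-form 2x2 inverse."""
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    sxx, sxy, syy = cov
    det = sxx * syy - sxy ** 2
    return (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det


# In-control reference sample (toy data).
reference = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
# A robust chart would replace this classical step with robust estimates.
mean, cov = mean_cov_2d(reference)
t2 = hotelling_t2((3.0, 1.0), mean, cov)
print(t2)
```

Because the classical mean and covariance are themselves pulled toward outliers in the reference sample, contaminated data can shrink T² for genuine outliers (masking) or inflate it for good points (swamping), which is the problem the abstract addresses.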
Ergon, T.; Yoccoz, N.G.; Nichols, J.D.; Thomson, David L.; Cooch, Evan G.; Conroy, Michael J.
2009-01-01
In many species, age or time of maturation and survival costs of reproduction may vary substantially within and among populations. We present a capture-mark-recapture model to estimate the latent individual trait distribution of time of maturation (or other irreversible transitions) as well as survival differences associated with the two states (representing costs of reproduction). Maturation can take place at any point in continuous time, and mortality hazard rates for each reproductive state may vary according to continuous functions over time. Although we explicitly model individual heterogeneity in age/time of maturation, we make the simplifying assumption that death hazard rates do not vary among individuals within groups of animals. However, the estimates of the maturation distribution are fairly robust against individual heterogeneity in survival as long as there is no individual level correlation between mortality hazards and latent time of maturation. We apply the model to biweekly capture-recapture data of overwintering field voles (Microtus agrestis) in cyclically fluctuating populations to estimate time of maturation and survival costs of reproduction. Results show that onset of seasonal reproduction is particularly late and survival costs of reproduction are particularly large in declining populations.
Diffusion MRI noise mapping using random matrix theory
Veraart, Jelle; Fieremans, Els; Novikov, Dmitry S.
2016-01-01
Purpose: To estimate the spatially varying noise map using a redundant magnitude MR series. Methods: We exploit redundancy in non-Gaussian multi-directional diffusion MRI data by identifying its noise-only principal components, based on the theory of noisy covariance matrices. The bulk of PCA eigenvalues, arising due to noise, is described by the universal Marchenko-Pastur distribution, parameterized by the noise level. This allows us to estimate noise level in a local neighborhood based on the singular value decomposition of a matrix combining neighborhood voxels and diffusion directions. Results: We present a model-independent local noise mapping method capable of estimating noise level down to about 1% error. In contrast to current state-of-the-art techniques, the resultant noise maps do not show artifactual anatomical features that often reflect physiological noise, the presence of sharp edges, or a lack of adequate a priori knowledge of the expected form of MR signal. Conclusions: Simulations and experiments show that typical diffusion MRI data exhibit sufficient redundancy that enables accurate, precise, and robust estimation of the local noise level by interpreting the PCA eigenspectrum in terms of the Marchenko-Pastur distribution. PMID:26599599
A Robust Bayesian Approach for Structural Equation Models with Missing Data
ERIC Educational Resources Information Center
Lee, Sik-Yum; Xia, Ye-Mao
2008-01-01
In this paper, normal/independent distributions, including but not limited to the multivariate t distribution, the multivariate contaminated distribution, and the multivariate slash distribution, are used to develop a robust Bayesian approach for analyzing structural equation models with complete or missing data. In the context of a nonlinear…
NASA Astrophysics Data System (ADS)
Juesas, P.; Ramasso, E.
2016-12-01
Condition monitoring aims at ensuring system safety, which is a fundamental requirement for industrial applications and has become an inescapable social demand. This objective is attained by instrumenting the system and developing data analytics methods such as statistical models able to turn data into relevant knowledge. One difficulty is correctly estimating the parameters of those methods from time-series data. This paper suggests the use of the Weighted Distribution Theory together with the Expectation-Maximization algorithm to improve parameter estimation in statistical models with latent variables, with an application to health monitoring under uncertainty. The improvement of estimates is made possible by incorporating uncertain and possibly noisy prior knowledge on latent variables in a sound manner. The latent variables are exploited to build a degradation model of a dynamical system represented as a sequence of discrete states. Examples on Gaussian Mixture Models and Hidden Markov Models (HMM) with discrete and continuous outputs are presented on both simulated data and benchmarks using the turbofan engine datasets. A focus on the application of a discrete HMM to health monitoring under uncertainty highlights the value of the proposed approach in the presence of different operating conditions and fault modes. It is shown that the proposed model exhibits high robustness in the presence of noisy and uncertain prior knowledge.
Laber, Eric B; Zhao, Ying-Qi; Regh, Todd; Davidian, Marie; Tsiatis, Anastasios; Stanford, Joseph B; Zeng, Donglin; Song, Rui; Kosorok, Michael R
2016-04-15
A personalized treatment strategy formalizes evidence-based treatment selection by mapping patient information to a recommended treatment. Personalized treatment strategies can produce better patient outcomes while reducing cost and treatment burden. Thus, among clinical and intervention scientists, there is a growing interest in conducting randomized clinical trials when one of the primary aims is estimation of a personalized treatment strategy. However, at present, there are no appropriate sample size formulae to assist in the design of such a trial. Furthermore, because the sampling distribution of the estimated outcome under an estimated optimal treatment strategy can be highly sensitive to small perturbations in the underlying generative model, sample size calculations based on standard (uncorrected) asymptotic approximations or computer simulations may not be reliable. We offer a simple and robust method for powering a single stage, two-armed randomized clinical trial when the primary aim is estimating the optimal single stage personalized treatment strategy. The proposed method is based on inverting a plugin projection confidence interval and is thereby regular and robust to small perturbations of the underlying generative model. The proposed method requires elicitation of two clinically meaningful parameters from clinical scientists and uses data from a small pilot study to estimate nuisance parameters, which are not easily elicited. The method performs well in simulated experiments and is illustrated using data from a pilot study of time to conception and fertility awareness. Copyright © 2015 John Wiley & Sons, Ltd.
Efficient Robust Regression via Two-Stage Generalized Empirical Likelihood
Bondell, Howard D.; Stefanski, Leonard A.
2013-01-01
Large- and finite-sample efficiency and resistance to outliers are the key goals of robust statistics. Although often not simultaneously attainable, we develop and study a linear regression estimator that comes close. Efficiency obtains from the estimator’s close connection to generalized empirical likelihood, and its favorable robustness properties are obtained by constraining the associated sum of (weighted) squared residuals. We prove maximum attainable finite-sample replacement breakdown point, and full asymptotic efficiency for normal errors. Simulation evidence shows that compared to existing robust regression estimators, the new estimator has relatively high efficiency for small sample sizes, and comparable outlier resistance. The estimator is further illustrated and compared to existing methods via application to a real data set with purported outliers. PMID:23976805
Improving the realism of hydrologic model through multivariate parameter estimation
NASA Astrophysics Data System (ADS)
Rakovec, Oldrich; Kumar, Rohini; Attinger, Sabine; Samaniego, Luis
2017-04-01
Increased availability and quality of near real-time observations should improve understanding of predictive skills of hydrological models. Recent studies have shown the limited capability of river discharge data alone to adequately constrain different components of distributed model parameterizations. In this study, the GRACE satellite-based total water storage (TWS) anomaly is used to complement the discharge data with an aim to improve the fidelity of mesoscale hydrologic model (mHM) through multivariate parameter estimation. The study is conducted in 83 European basins covering a wide range of hydro-climatic regimes. The model parameterization complemented with the TWS anomalies leads to statistically significant improvements in (1) discharge simulations during low-flow period, and (2) evapotranspiration estimates which are evaluated against independent (FLUXNET) data. Overall, there is no significant deterioration in model performance for the discharge simulations when complemented by information from the TWS anomalies. However, considerable changes in the partitioning of precipitation into runoff components are noticed by in-/exclusion of TWS during the parameter estimation. A cross-validation test carried out to assess the transferability and robustness of the calibrated parameters to other locations further confirms the benefit of complementary TWS data. In particular, the evapotranspiration estimates show more robust performance when TWS data are incorporated during the parameter estimation, in comparison with the benchmark model constrained against discharge only. This study highlights the value for incorporating multiple data sources during parameter estimation to improve the overall realism of hydrologic model and its applications over large domains. Rakovec, O., Kumar, R., Attinger, S. and Samaniego, L. (2016): Improving the realism of hydrologic model functioning through multivariate parameter estimation. Water Resour. Res., 52, http://dx.doi.org/10.1002/2016WR019430
Optimal External Wrench Distribution During a Multi-Contact Sit-to-Stand Task.
Bonnet, Vincent; Azevedo-Coste, Christine; Robert, Thomas; Fraisse, Philippe; Venture, Gentiane
2017-07-01
This paper aims at developing and evaluating a new practical method for the real-time estimation of joint torques and external wrenches during a multi-contact sit-to-stand (STS) task using kinematics data only. The proposed method also allows identification of the subject-specific body segment inertial parameters that are required to perform inverse dynamics. The identification phase is performed using simple and repeatable motions. Thanks to an accurately identified model, the estimate of the total external wrench can be used as an input to solve an under-determined multi-contact problem. It is solved using a constrained quadratic optimization process minimizing a hybrid human-like energetic criterion. The weights of this hybrid cost function are adjusted and a sensitivity analysis is performed in order to robustly reproduce the human external wrench distribution. The results showed that the proposed method could successfully estimate the external wrenches under the buttocks, feet, and hands during STS tasks (RMS error lower than 20 N and 6 N.m). The simplicity and generalization abilities of the proposed method pave the way for future diagnosis solutions and rehabilitation applications, including in-home use.
Agricultural mapping using Support Vector Machine-Based Endmember Extraction (SVM-BEE)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Archibald, Richard K; Filippi, Anthony M; Bhaduri, Budhendra L
Extracting endmembers from remotely sensed images of vegetated areas can present difficulties. In this research, we applied a recently developed endmember-extraction algorithm based on Support Vector Machines (SVMs) to the problem of semi-autonomous estimation of vegetation endmembers from a hyperspectral image. This algorithm, referred to as Support Vector Machine-Based Endmember Extraction (SVM-BEE), accurately and rapidly yields a computed representation of hyperspectral data that can accommodate multiple distributions. The number of distributions is identified without prior knowledge, based upon this representation. Prior work established that SVM-BEE is robustly noise-tolerant and can semi-automatically and effectively estimate endmembers; synthetic data and a geologic scene were previously analyzed. Here we compared the efficacies of the SVM-BEE and N-FINDR algorithms in extracting endmembers from a predominantly agricultural scene. SVM-BEE was able to estimate vegetation and other endmembers for all classes in the image, which N-FINDR failed to do. Classifications based on SVM-BEE endmembers were markedly more accurate compared with those based on N-FINDR endmembers.
Liu, Ren; Srivastava, Anurag K.; Bakken, David E.; ...
2017-08-17
Intermittency of wind energy poses a great challenge for power system operation and control. Wind curtailment might be necessary under certain operating conditions to keep the line flow within the limit. A Remedial Action Scheme (RAS) offers a quick control action mechanism to maintain the reliability and security of power system operation with high wind energy integration. In this paper, a new RAS is developed to maximize the wind energy integration without compromising the security and reliability of the power system based on specific utility requirements. A new Distributed Linear State Estimation (DLSE) is also developed to provide fast and accurate input data for the proposed RAS. A distributed computational architecture is designed to guarantee the robustness of the cyber system to support RAS and DLSE implementation. The proposed RAS and DLSE are validated using the modified IEEE-118 Bus system. Simulation results demonstrate the satisfactory performance of the DLSE and the effectiveness of RAS. A real-time cyber-physical testbed has been utilized to validate the cyber-resiliency of the developed RAS against computational node failure.
Dziak, John J.; Bray, Bethany C.; Zhang, Jieting; Zhang, Minqiang; Lanza, Stephanie T.
2016-01-01
Several approaches are available for estimating the relationship of latent class membership to distal outcomes in latent profile analysis (LPA). A three-step approach is commonly used, but has problems with estimation bias and confidence interval coverage. Proposed improvements include the correction method of Bolck, Croon, and Hagenaars (BCH; 2004), Vermunt’s (2010) maximum likelihood (ML) approach, and the inclusive three-step approach of Bray, Lanza, & Tan (2015). These methods have been studied in the related case of latent class analysis (LCA) with categorical indicators, but not as well studied for LPA with continuous indicators. We investigated the performance of these approaches in LPA with normally distributed indicators, under different conditions of distal outcome distribution, class measurement quality, relative latent class size, and strength of association between latent class and the distal outcome. The modified BCH implemented in Latent GOLD had excellent performance. The maximum likelihood and inclusive approaches were not robust to violations of distributional assumptions. These findings broadly agree with and extend the results presented by Bakk and Vermunt (2016) in the context of LCA with categorical indicators. PMID:28630602
Phase unwrapping algorithm using polynomial phase approximation and linear Kalman filter.
Kulkarni, Rishikesh; Rastogi, Pramod
2018-02-01
A noise-robust phase unwrapping algorithm is proposed based on state space analysis and polynomial phase approximation using wrapped phase measurement. The true phase is approximated as a two-dimensional first order polynomial function within a small sized window around each pixel. The estimates of polynomial coefficients provide the measurement of phase and local fringe frequencies. A state space representation of spatial phase evolution and the wrapped phase measurement is considered with the state vector consisting of polynomial coefficients as its elements. Instead of using the traditional nonlinear Kalman filter for the purpose of state estimation, we propose to use the linear Kalman filter operating directly with the wrapped phase measurement. The adaptive window width is selected at each pixel based on the local fringe density to strike a balance between the computation time and the noise robustness. In order to retrieve the unwrapped phase, either a line-scanning approach or a quality guided strategy of pixel selection is used depending on the underlying continuous or discontinuous phase distribution, respectively. Simulation and experimental results are provided to demonstrate the applicability of the proposed method.
NASA Astrophysics Data System (ADS)
Zhang, Jianqiao; Ye, Dong; Sun, Zhaowei; Liu, Chuang
2018-02-01
This paper presents a robust adaptive controller integrated with an extended state observer (ESO) to solve the coupled spacecraft tracking maneuver problem in the presence of model uncertainties, external disturbances, actuator uncertainties including magnitude deviation and misalignment, and even actuator saturation. More specifically, employing the exponential coordinates on the Lie group SE(3) to describe configuration tracking errors, the coupled six-degrees-of-freedom (6-DOF) dynamics are developed for spacecraft relative motion, in which a generic fully actuated thruster distribution is considered and the lumped disturbances are reconstructed by using an anti-windup technique. Then, a novel ESO, developed via a second order sliding mode (SOSM) technique with added linear correction terms to improve performance, is first designed to estimate the disturbances in finite time. Based on the estimated information, an adaptive fast terminal sliding mode (AFTSM) controller is developed to guarantee the almost global asymptotic stability of the resulting closed-loop system such that the trajectory can be tracked with all the aforementioned drawbacks addressed simultaneously. Finally, the effectiveness of the controller is illustrated through numerical examples.
A Regression Design Approach to Optimal and Robust Spacing Selection.
1981-07-01
Hassanein (1968, 1969a, 1969b, 1971, 1972, 1977), Kulldorf (1963), Kulldorf and Vannman (1973), Rhodin (1976), Sarhan and Greenberg (1958, 1962) and...of d0 and Q0 1 d 0 "Q0 ’ are in the reproducing kernel Hilbert space (RKHS) generated by R, the techniques developed by Parzen (1961a, 1961b) may be... Greenberg , B.G. (1958). Estimation problems in the exponential distribution using order statistics. Proceedings of the Statistical Techniques in Missile
Joint groupwise registration and ADC estimation in the liver using a B-value weighted metric.
Sanz-Estébanez, Santiago; Rabanillo-Viloria, Iñaki; Royuela-Del-Val, Javier; Aja-Fernández, Santiago; Alberola-López, Carlos
2018-02-01
The purpose of this work is to develop a groupwise elastic multimodal registration algorithm for robust ADC estimation in the liver on multiple breath hold diffusion weighted images. We introduce a joint formulation to simultaneously solve both the registration and the estimation problems. In order to avoid non-reliable transformations and undesirable noise amplification, we have included appropriate smoothness constraints for both problems. Our metric incorporates the ADC estimation residuals, which are inversely weighted according to the signal content in each diffusion weighted image. Results show that the joint formulation provides a statistically significant improvement in the accuracy of the ADC estimates. Reproducibility has also been measured on real data in terms of the distribution of ADC differences obtained from different b-values subsets. The proposed algorithm is able to effectively deal with both the presence of motion and the geometric distortions, increasing accuracy and reproducibility in diffusion parameters estimation. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.
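As a rough illustration of the estimation half of this joint problem, the sketch below fits the standard mono-exponential model S(b) = S0 · exp(−b · ADC) to a single voxel by weighted log-linear least squares. It is not the authors' metric: the default weighting by squared signal is a common choice assumed here only for illustration, and the function name and sample values are hypothetical.

```python
import math

def fit_adc(b_values, signals, weights=None):
    """Weighted log-linear fit of S(b) = S0 * exp(-b * ADC).

    Returns (S0, ADC). Signals must be strictly positive.
    The default weights (signal squared) are an illustrative assumption.
    """
    if weights is None:
        weights = [s * s for s in signals]
    logs = [math.log(s) for s in signals]
    total_w = sum(weights)
    # Weighted means of b and log-signal.
    b_mean = sum(w * b for w, b in zip(weights, b_values)) / total_w
    y_mean = sum(w * y for w, y in zip(weights, logs)) / total_w
    # Weighted least-squares slope and intercept of log S versus b.
    sxx = sum(w * (b - b_mean) ** 2 for w, b in zip(weights, b_values))
    sxy = sum(w * (b - b_mean) * (y - y_mean)
              for w, b, y in zip(weights, b_values, logs))
    slope = sxy / sxx
    intercept = y_mean - slope * b_mean
    return math.exp(intercept), -slope

# Hypothetical noise-free voxel: S0 = 1000, ADC = 1.2e-3 mm^2/s.
b_vals = [0.0, 50.0, 200.0, 400.0, 800.0]
sig = [1000.0 * math.exp(-b * 1.2e-3) for b in b_vals]
s0_hat, adc_hat = fit_adc(b_vals, sig)
```

In a joint scheme such as the one described, residuals of a fit like this would feed back into the registration metric; here the fit stands alone.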
Soil water content spatial pattern estimated by thermal inertia from air-borne sensors
NASA Astrophysics Data System (ADS)
Coppola, Antonio; Basile, Angelo; Esposito, Marco; Menenti, Massimo; Buonanno, Maurizio
2010-05-01
Remote sensing of soil water content from air- or space-borne platforms offers the possibility of providing large spatial coverage and temporal continuity. The water content can actually be monitored in a thin soil layer, usually up to a depth of 0.05 m below the soil surface. By contrast, difficulties arise in estimating the water content storage along the soil profile and its spatial (horizontal) distribution, which are closely connected to soil hydraulic properties and their spatial distribution. A promising approach for estimating soil water content profiles is the integration of remote sensing of surface water content and hydrological modeling. A major goal of the scientific group is to develop a practical and robust procedure for estimating water contents throughout the soil profile from surface water content. As a first step, in this work, we show some preliminary results from aircraft image analysis and their validation by field campaign data. The data extracted from the airborne sensors provided the opportunity to retrieve land surface temperatures with a very high spatial resolution. The surface water content pattern, as deduced from the thermal inertia estimates, was compared to the surface water content maps measured in situ by time domain reflectometry-based probes.
Relaxation of ferroelectric states in 2D distributions of quantum dots: EELS simulation
NASA Astrophysics Data System (ADS)
Cortés, C. M.; Meza-Montes, L.; Moctezuma, R. E.; Carrillo, J. L.
2016-06-01
The relaxation time of collective electronic states in a 2D distribution of quantum dots is investigated theoretically by simulating EELS experiments. From the numerical calculation of the probability of energy loss of an electron beam, traveling parallel to the distribution, it is possible to estimate the damping time of ferroelectric-like states. We generate this collective response of the distribution by introducing a mean field interaction among the quantum dots, and then, the model is extended incorporating effects of long-range correlations through a Bragg-Williams approximation. The behavior of the dielectric function, the energy loss function, and the relaxation time of ferroelectric-like states is then investigated as a function of the temperature of the distribution and the damping constant of the electronic states in the single quantum dots. The robustness of the trends and tendencies of our results indicate that this scheme of analysis can guide experimentalists to develop tailored quantum dots distributions for specific applications.
Determination of gold nanoparticle shape from absorption spectroscopy and ellipsometry
NASA Astrophysics Data System (ADS)
Battie, Yann; Izquierdo-Lorenzo, Irene; Resano-Garcia, Amandine; Naciri, Aotmane En; Akil, Suzanna; Adam, Pierre Michel; Jradi, Safi
2017-11-01
A new methodology is developed to determine the shape distribution of gold nanoparticles (NPs) from optical spectroscopic measurements. Indeed, the morphology of Au colloids is deduced by fitting their absorption spectra with an effective medium theory which takes into account the nanoparticle shape distribution. The same procedure is applied to ellipsometric measurements recorded on photoresist films which contain Au NPs. Three spaces (L2, r2, P2) are introduced to interpret the NPs shape distribution. In the P2 space, the sphericity, the prolacity and the oblacity estimators are proposed to quantify the shape of NPs. The r2 space enables the determination of the NP aspect ratio distribution. The distributions determined from optical spectroscopy were found to be in very good agreement with the shape distributions obtained by transmission electron microscopy. We found that fitting absorption or ellipsometric spectra with an adequate effective medium theory, provides a robust tool for measuring the shape and concentration of metallic NPs.
A robust background regression based score estimation algorithm for hyperspectral anomaly detection
NASA Astrophysics Data System (ADS)
Zhao, Rui; Du, Bo; Zhang, Liangpei; Zhang, Lefei
2016-12-01
Anomaly detection has become a hot topic in the hyperspectral image analysis and processing fields in recent years. The most important issue for hyperspectral anomaly detection is the background estimation and suppression. Unreasonable or non-robust background estimation usually leads to unsatisfactory anomaly detection results. Furthermore, the inherent nonlinearity of hyperspectral images may cover up the intrinsic data structure in the anomaly detection. In order to implement robust background estimation, as well as to explore the intrinsic data structure of the hyperspectral image, we propose a robust background regression based score estimation algorithm (RBRSE) for hyperspectral anomaly detection. The Robust Background Regression (RBR) is actually a label assignment procedure which segments the hyperspectral data into a robust background dataset and a potential anomaly dataset with an intersection boundary. In the RBR, a kernel expansion technique, which explores the nonlinear structure of the hyperspectral data in a reproducing kernel Hilbert space, is utilized to formulate the data as a density feature representation. A minimum squared loss relationship is constructed between the data density feature and the corresponding assigned labels of the hyperspectral data, to formulate the foundation of the regression. Furthermore, a manifold regularization term which explores the manifold smoothness of the hyperspectral data, and a maximization term of the robust background average density, which suppresses the bias caused by the potential anomalies, are jointly appended in the RBR procedure. After this, a paired-dataset based k-nn score estimation method is undertaken on the robust background and potential anomaly datasets, to implement the detection output. 
The experimental results show that RBRSE achieves better ROC curves, AUC values, and background-anomaly separation than some other state-of-the-art anomaly detection methods, and is easy to implement in practice.
Kaye, T.N.; Pyke, David A.
2003-01-01
Population viability analysis is an important tool for conservation biologists, and matrix models that incorporate stochasticity are commonly used for this purpose. However, stochastic simulations may require assumptions about the distribution of matrix parameters, and modelers often select a statistical distribution that seems reasonable without sufficient data to test its fit. We used data from long-term (5–10 year) studies with 27 populations of five perennial plant species to compare seven methods of incorporating environmental stochasticity. We estimated stochastic population growth rate (a measure of viability) using a matrix-selection method, in which whole observed matrices were selected at random at each time step of the model. In addition, we drew matrix elements (transition probabilities) at random using various statistical distributions: beta, truncated-gamma, truncated-normal, triangular, uniform, or discontinuous/observed. Recruitment rates were held constant at their observed mean values. Two methods of constraining stage-specific survival to ≤100% were also compared. Different methods of incorporating stochasticity and constraining matrix column sums interacted in their effects and resulted in different estimates of stochastic growth rate (differing by up to 16%). Modelers should be aware that when constraining stage-specific survival to 100%, different methods may introduce different levels of bias in transition element means, and when this happens, different distributions for generating random transition elements may result in different viability estimates. There was no species effect on the results and the growth rates derived from all methods were highly correlated with one another. We conclude that the absolute value of population viability estimates is sensitive to model assumptions, but the relative ranking of populations (and management treatments) is robust.
Furthermore, these results are applicable to a range of perennial plants and possibly other life histories.
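The matrix-selection method described above can be sketched in a few lines: at each time step one whole observed projection matrix is drawn at random, the stage vector is projected forward, and the average log of the one-step growth factor estimates the stochastic log growth rate. This is a minimal sketch, not the authors' code; the function name, the two 2x2 matrices, and the equal selection probabilities are hypothetical assumptions.

```python
import math
import random

def stochastic_log_growth(matrices, n0, steps=5000, seed=0):
    """Estimate the stochastic log growth rate by matrix selection.

    matrices: list of square projection matrices (lists of rows).
    n0: initial stage vector. Matrices are drawn with equal
    probability at each step (an assumption of this sketch).
    """
    rng = random.Random(seed)
    total = sum(n0)
    n = [x / total for x in n0]          # work with stage proportions
    log_sum = 0.0
    for _ in range(steps):
        a = rng.choice(matrices)         # select a whole observed matrix
        n = [sum(a[i][j] * n[j] for j in range(len(n)))
             for i in range(len(a))]
        total = sum(n)                   # one-step growth factor
        log_sum += math.log(total)
        n = [x / total for x in n]       # renormalise to avoid overflow
    return log_sum / steps

# Hypothetical "good year" and "bad year" matrices for a two-stage plant.
good = [[0.5, 1.8], [0.4, 0.8]]
bad = [[0.3, 0.6], [0.2, 0.7]]
log_lambda_s = stochastic_log_growth([good, bad], [10.0, 10.0])
```

Drawing individual matrix elements from beta or truncated distributions, as the abstract also describes, would replace the `rng.choice` line with element-wise sampling.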
Tebaldi, Edinaldo; Mohan, Ramesh
2010-01-01
This study utilises eight alternative measures of institutions and the instrumental variable method to examine the impacts of institutions on poverty. The estimates show that an economy with a robust system to control corruption, an effective government, and a stable political system will create the conditions to promote economic growth, minimise income distribution conflicts, and reduce poverty. Corruption, ineffective governments, and political instability will not only hurt income levels through market inefficiencies, but also escalate poverty incidence via increased income inequality. The results also imply that the quality of the regulatory system, rule of law, voice and accountability, and expropriation risk are inversely related to poverty but their effect on poverty is via average income rather than income distribution.
Probability shapes perceptual precision: A study in orientation estimation.
Jabar, Syaheed B; Anderson, Britt
2015-12-01
Probability is known to affect perceptual estimations, but an understanding of mechanisms is lacking. Moving beyond binary classification tasks, we had naive participants report the orientation of briefly viewed gratings where we systematically manipulated contingent probability. Participants rapidly developed faster and more precise estimations for high-probability tilts. The shapes of their error distributions, as indexed by a kurtosis measure, also showed a distortion from Gaussian. This kurtosis metric was robust, capturing probability effects that were graded, contextual, and varying as a function of stimulus orientation. Our data can be understood as a probability-induced reduction in the variability or "shape" of estimation errors, as would be expected if probability affects the perceptual representations. As probability manipulations are an implicit component of many endogenous cuing paradigms, changes at the perceptual level could account for changes in performance that might have traditionally been ascribed to "attention." (c) 2015 APA, all rights reserved.
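To make the idea of indexing distribution shape by kurtosis concrete, the sketch below computes the sample excess kurtosis of a list of estimation errors; a Gaussian has excess kurtosis 0, and a more peaked, heavier-tailed distribution has a positive value. The exact kurtosis measure used in the study is not stated in the abstract, so this moment-based estimator is an assumption, and it treats errors as ordinary numbers without handling the circular wrap-around of orientation data.

```python
def excess_kurtosis(errors):
    """Moment-based excess kurtosis: m4 / m2**2 - 3 (0 for a Gaussian)."""
    n = len(errors)
    mean = sum(errors) / n
    m2 = sum((e - mean) ** 2 for e in errors) / n   # second central moment
    m4 = sum((e - mean) ** 4 for e in errors) / n   # fourth central moment
    if m2 == 0.0:
        raise ValueError("kurtosis is undefined for constant data")
    return m4 / (m2 * m2) - 3.0

# Hypothetical error samples (degrees): mostly small errors with rare large
# ones give a positive value; evenly split errors give a negative one.
peaked = excess_kurtosis([-3.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 3.0])
flat = excess_kurtosis([-1.0, 1.0, -1.0, 1.0])
```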
Rodhouse, Thomas J.; Ormsbee, Patricia C.; Irvine, Kathryn M.; Vierling, Lee A.; Szewczak, Joseph M.; Vierling, Kerri T.
2012-01-01
Despite its common status, M. lucifugus was only detected during ∼50% of the surveys in occupied sample units. The overall naïve estimate for the proportion of the study region occupied by the species was 0.69, but after accounting for imperfect detection, this increased to ∼0.90. Our models provide evidence of an association between NPP and forest cover and M. lucifugus distribution, with implications for the projected effects of accelerated climate change in the region, which include net aridification as snowpack and stream flows decline. Annual turnover, the probability that an occupied sample unit was a newly occupied one, was estimated to be low (∼0.04–0.14), resulting in flat trend estimated with relatively high precision (SD = 0.04). We mapped the variation in predicted occurrence probabilities and corresponding prediction uncertainty along the productivity gradient. Our results provide a much needed baseline against which future anticipated declines in M. lucifugus occurrence can be measured. The dynamic distribution modeling approach has broad applicability to regional bat monitoring efforts now underway in several countries and we suggest ways to improve and expand our grid-based monitoring program to gain robust insights into bat population status and trend across large portions of North America.
4D computerized ionospheric tomography by using GPS measurements and IRI-Plas model
NASA Astrophysics Data System (ADS)
Tuna, Hakan; Arikan, Feza; Arikan, Orhan
2016-07-01
Ionospheric imaging is an important subject in ionospheric studies. GPS based TEC measurements provide very accurate information about the electron density values in the ionosphere. However, since the measurements are generally very sparse and non-uniformly distributed, computation of 3D electron density estimation from measurements alone is an ill-defined problem. Model based 3D electron density estimations provide physically feasible distributions. However, they are not generally compliant with the TEC measurements obtained from GPS receivers. In this study, GPS based TEC measurements and an ionosphere model known as International Reference Ionosphere Extended to Plasmasphere (IRI-Plas) are employed together in order to obtain a physically accurate 3D electron density distribution which is compliant with the real measurements obtained from a GPS satellite-receiver network. Ionospheric parameters input to the IRI-Plas model are perturbed in the region of interest by using parametric perturbation models such that the synthetic TEC measurements calculated from the resultant 3D electron density distribution fit the real TEC measurements. The problem is treated as an optimization problem where the optimization parameters are the parameters of the parametric perturbation models. The proposed technique is applied over Turkey, on both calm and storm days of the ionosphere. Results show that the proposed technique produces 3D electron density distributions which are compliant with the IRI-Plas model, GPS TEC measurements and ionosonde measurements. The effect of the number of GPS receiver stations on the performance of the proposed technique is investigated. Results showed that 7 GPS receiver stations in a region as large as Turkey are sufficient for both calm and storm days of the ionosphere.
Since the ionization levels in the ionosphere are highly correlated in time, the proposed technique is extended to the time domain by applying Kalman based tracking and smoothing approaches to the obtained results. Combining Kalman methods with the proposed 3D CIT technique creates a robust 4D ionospheric electron density estimation model, and has the advantage of decreasing the computational cost of the proposed method. Results on both calm and storm days of the ionosphere show that the new technique produces more robust solutions, especially when the number of GPS receiver stations in the region is small. This study is supported by TUBITAK 114E541, 115E915 and Joint TUBITAK 114E092 and AS CR 14/001 projects.
Robust guaranteed-cost adaptive quantum phase estimation
NASA Astrophysics Data System (ADS)
Roy, Shibdas; Berry, Dominic W.; Petersen, Ian R.; Huntington, Elanor H.
2017-05-01
Quantum parameter estimation plays a key role in many fields like quantum computation, communication, and metrology. Optimal estimation allows one to achieve the most precise parameter estimates, but requires accurate knowledge of the model. Any inevitable uncertainty in the model parameters may heavily degrade the quality of the estimate. It is therefore desired to make the estimation process robust to such uncertainties. Robust estimation was previously studied for a varying phase, where the goal was to estimate the phase at some time in the past, using the measurement results from both before and after that time within a fixed time interval up to current time. Here, we consider a robust guaranteed-cost filter yielding robust estimates of a varying phase in real time, where the current phase is estimated using only past measurements. Our filter minimizes the largest (worst-case) variance in the allowable range of the uncertain model parameter(s) and this determines its guaranteed cost. It outperforms in the worst case the optimal Kalman filter designed for the model with no uncertainty, which corresponds to the center of the possible range of the uncertain parameter(s). Moreover, unlike the Kalman filter, our filter in the worst case always performs better than the best achievable variance for heterodyne measurements, which we consider as the tolerable threshold for our system. Furthermore, we consider effective quantum efficiency and effective noise power, and show that our filter provides the best results by these measures in the worst case.
NASA Astrophysics Data System (ADS)
Tedrow, Christine Atkins
The primary goal in this study was to explore remote sensing, ecological niche modeling, and Geographic Information Systems (GIS) as aids in predicting candidate Rift Valley fever (RVF) competent vector abundance and distribution in Virginia, and as means of estimating where risk of establishment in mosquitoes and risk of transmission to human populations would be greatest in Virginia. A second goal in this study was to determine whether the remotely-sensed Normalized Difference Vegetation Index (NDVI) can be used as a proxy variable of local conditions for the development of mosquitoes to predict mosquito species distribution and abundance in Virginia. As part of this study, a mosquito surveillance database was compiled to archive the historical patterns of mosquito species abundance in Virginia. In addition, linkages between mosquito density and local environmental and climatic patterns were spatially and temporally examined. The present study affirms the potential role of remote sensing imagery for species distribution prediction, and it demonstrates that ecological niche modeling is a valuable predictive tool to analyze the distributions of populations. The MaxEnt ecological niche modeling program was used to model predicted ranges for potential RVF competent vectors in Virginia. The MaxEnt model was shown to be robust, and the candidate RVF competent vector predicted distribution map is presented. The Normalized Difference Vegetation Index (NDVI) was found to be the most useful environmental-climatic variable to predict mosquito species distribution and abundance in Virginia. However, these results indicate that a more robust prediction is obtained by including other environmental-climatic factors correlated to mosquito densities (e.g., temperature, precipitation, elevation) with NDVI. 
The present study demonstrates that remote sensing and GIS can be used with ecological niche and risk modeling methods to estimate risk of virus establishment in mosquitoes and transmission to humans. Maps delineating the geographic areas in Virginia with highest risk for RVF establishment in mosquito populations and RVF disease transmission to human populations were generated in a GIS using human, domestic animal, and white-tailed deer population estimates and the MaxEnt potential RVF competent vector species distribution prediction. The candidate RVF competent vector predicted distribution and RVF risk maps presented in this study can help vector control agencies and public health officials focus Rift Valley fever surveillance efforts in geographic areas with large co-located populations of potential RVF competent vectors and human, domestic animal, and wildlife hosts. Keywords. Rift Valley fever, risk assessment, Ecological Niche Modeling, MaxEnt, Geographic Information System, remote sensing, Pearson's Product-Moment Correlation Coefficient, vectors, mosquito distribution, mosquito density, mosquito surveillance, United States, Virginia, domestic animals, white-tailed deer, ArcGIS
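For readers unfamiliar with the index, NDVI is conventionally computed per pixel from near-infrared and red reflectance as (NIR − Red) / (NIR + Red), giving values between −1 and 1 for non-negative reflectances. The sketch below shows that calculation only; the function name and reflectance values are hypothetical, and nothing here reproduces the study's MaxEnt workflow.

```python
import math

def ndvi(nir, red):
    """Normalized Difference Vegetation Index for one pixel.

    nir, red: surface reflectances in the near-infrared and red bands.
    Returns NaN when both bands are zero (index undefined).
    """
    denom = nir + red
    if denom == 0.0:
        return math.nan
    return (nir - red) / denom

# Hypothetical reflectances: dense vegetation versus bare soil.
vegetated = ndvi(0.50, 0.10)   # high NIR, low red
bare_soil = ndvi(0.25, 0.20)   # similar NIR and red
```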
Bacciu, Davide; Starita, Antonina
2008-11-01
Determining a compact neural coding for a set of input stimuli is an issue that encompasses several biological memory mechanisms as well as various artificial neural network models. In particular, establishing the optimal network structure is still an open problem when dealing with unsupervised learning models. In this paper, we introduce a novel learning algorithm, named competitive repetition-suppression (CoRe) learning, inspired by a cortical memory mechanism called repetition suppression (RS). We show how such a mechanism is used, at various levels of the cerebral cortex, to generate compact neural representations of the visual stimuli. From the general CoRe learning model, we derive a clustering algorithm, named CoRe clustering, that can automatically estimate the unknown cluster number from the data without using a priori information concerning the input distribution. We illustrate how CoRe clustering, besides its biological plausibility, possesses strong theoretical properties in terms of robustness to noise and outliers, and we provide an error function describing CoRe learning dynamics. Such a description is used to analyze CoRe relationships with the state-of-the-art clustering models and to highlight CoRe similarity with rival penalized competitive learning (RPCL), showing how CoRe extends such a model by strengthening the rival penalization estimation by means of loss functions from robust statistics.
Robust radio interferometric calibration using the t-distribution
NASA Astrophysics Data System (ADS)
Kazemi, S.; Yatawatta, S.
2013-10-01
A major stage of radio interferometric data processing is calibration or the estimation of systematic errors in the data and the correction for such errors. A stochastic error (noise) model is assumed, and in most cases, this underlying model is assumed to be Gaussian. However, outliers in the data due to interference or due to errors in the sky model would have adverse effects on processing based on a Gaussian noise model. Most of the shortcomings of calibration such as the loss in flux or coherence, and the appearance of spurious sources, could be attributed to the deviations of the underlying noise model. In this paper, we propose to improve the robustness of calibration by using a noise model based on Student's t-distribution. Student's t-noise is a special case of Gaussian noise when the variance is unknown. Unlike Gaussian-noise-model-based calibration, traditional least-squares minimization would not directly extend to a case when we have a Student's t-noise model. Therefore, we use a variant of the expectation-maximization algorithm, called the expectation-conditional maximization either algorithm, when we have a Student's t-noise model and use the Levenberg-Marquardt algorithm in the maximization step. We give simulation results to show the robustness of the proposed calibration method as opposed to traditional Gaussian-noise-model-based calibration, especially in preserving the flux of weaker sources that are not included in the calibration model.
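The robustness that a Student's t noise model brings can be seen in a much simpler setting than interferometric calibration. The sketch below estimates a scalar location and scale under t-distributed noise with fixed degrees of freedom, using the textbook EM iteration in which each sample is reweighted by (ν + 1) / (ν + r²/σ²). It is only an illustration of the principle: the paper's method uses an ECME variant with Levenberg-Marquardt in the maximization step, which is not reproduced here, and the function name, the fixed ν = 3, and the data are assumptions.

```python
def t_location_scale_em(x, nu=3.0, iters=200):
    """EM estimate of location and squared scale under Student's t noise.

    nu (degrees of freedom) is held fixed, an assumption of this sketch.
    Returns (mu, var), where var is the squared scale parameter.
    """
    n = len(x)
    mu = sum(x) / n                              # start from the sample mean
    var = sum((v - mu) ** 2 for v in x) / n
    for _ in range(iters):
        if var <= 0.0:                           # degenerate (constant) data
            break
        # E-step: samples far from mu get small weights.
        w = [(nu + 1.0) / (nu + (v - mu) ** 2 / var) for v in x]
        # M-step: weighted mean and weighted squared scale.
        mu = sum(wi * v for wi, v in zip(w, x)) / sum(w)
        var = sum(wi * (v - mu) ** 2 for wi, v in zip(w, x)) / n
    return mu, var

# Hypothetical measurements near 10 with one gross outlier.
data = [9.8, 10.1, 10.0, 9.9, 10.2, 50.0]
gaussian_estimate = sum(data) / len(data)        # pulled toward the outlier
robust_estimate, _ = t_location_scale_em(data)   # stays near the bulk
```

The Gaussian (least-squares) estimate is the plain mean, which the single outlier drags far from the bulk of the data, whereas the t-based weights shrink that sample's influence.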
Robinson, Hugh S.; Abarca, Maria; Zeller, Katherine A.; Velasquez, Grisel; Paemelaere, Evi A. D.; Goldberg, Joshua F.; Payan, Esteban; Hoogesteijn, Rafael; Boede, Ernesto O.; Schmidt, Krzysztof; Lampo, Margarita; Viloria, Ángel L.; Carreño, Rafael; Robinson, Nathaniel; Lukacs, Paul M.; Nowak, J. Joshua; Salom-Pérez, Roberto; Castañeda, Franklin; Boron, Valeria; Quigley, Howard
2018-01-01
Broad-scale population estimates of declining species are desired for conservation efforts. However, for many secretive species including large carnivores, such estimates are often difficult to obtain. Based on published density estimates obtained through camera trapping, presence/absence data, and globally available predictive variables derived from satellite imagery, we modelled density and occurrence of a large carnivore, the jaguar, across the species’ entire range. We then combined these models in a hierarchical framework to estimate the total population. Our models indicate that potential jaguar density is best predicted by measures of primary productivity, with the highest densities in the most productive tropical habitats and a clear declining gradient with distance from the equator. Jaguar distribution, in contrast, is determined by the combined effects of human impacts and environmental factors: probability of jaguar occurrence increased with forest cover, mean temperature, and annual precipitation and declined with increases in human footprint index and human density. Probability of occurrence was also significantly higher inside protected areas than outside of them. We estimated the world’s jaguar population at 173,000 (95% CI: 138,000–208,000) individuals, mostly concentrated in the Amazon Basin; elsewhere, populations tend to be small and fragmented. The high number of jaguars results from the large total area still occupied (almost 9 million km²) and low human densities (< 1 person/km²) coinciding with high primary productivity in the core area of jaguar range. Our results show the importance of protected areas for jaguar persistence. We conclude that combining modelling of density and distribution can reveal ecological patterns and processes at global scales, can provide robust estimates for use in species assessments, and can guide broad-scale conservation actions. PMID:29579129
Li, Dongming; Sun, Changming; Yang, Jinhua; Liu, Huan; Peng, Jiaqi; Zhang, Lijuan
2017-04-06
An adaptive optics (AO) system provides real-time compensation for atmospheric turbulence. However, an AO image is usually of poor contrast because of the nature of the imaging process, meaning that the image contains information coming from both out-of-focus and in-focus planes of the object, which also brings about a loss in quality. In this paper, we present a robust multi-frame adaptive optics image restoration algorithm via maximum likelihood estimation. Our proposed algorithm uses a maximum likelihood method with image regularization as the basic principle, and constructs the joint log likelihood function for multi-frame AO images based on a Poisson distribution model. To begin with, a frame selection method based on image variance is applied to the observed multi-frame AO images to select images with better quality to improve the convergence of a blind deconvolution algorithm. Then, by combining the imaging conditions and the AO system properties, a point spread function estimation model is built. Finally, we develop our iterative solutions for AO image restoration addressing the joint deconvolution issue. We conduct a number of experiments to evaluate the performances of our proposed algorithm. Experimental results show that our algorithm produces accurate AO image restoration results and outperforms the current state-of-the-art blind deconvolution methods.
NASA Astrophysics Data System (ADS)
Ombadi, Mohammed; Nguyen, Phu; Sorooshian, Soroosh
2017-12-01
Intensity Duration Frequency (IDF) curves are essential for the resilient design of infrastructure. Since their early development, IDF relationships have been derived using precipitation records from rainfall gauge stations. However, with the recent advancement in satellite observation of precipitation, which provides near-global coverage and high spatiotemporal resolution, it is worth investigating the validity of using the relatively short record length of satellite rainfall to generate robust IDF relationships. Such satellite-based IDF curves can address the paucity of this information in developing countries. Few studies have used satellite precipitation data in IDF development, and those have mainly focused on merging satellite and gauge precipitation. In this study, however, IDF curves have been derived solely from satellite observations using PERSIANN-CDR (Precipitation Estimation from Remotely Sensed Information Using Artificial Neural Networks-Climate Data Record). The unique PERSIANN-CDR attributes of high spatial resolution (0.25°×0.25°), daily temporal resolution and a record dating back to 1983 allow for investigation at fine resolution. The results are compared over most of the contiguous United States against NOAA Atlas 14. The impact of using different methods of sampling, distribution estimators and regionalization on the resulting relationships is investigated. The main challenges in estimating robust and accurate IDF curves from satellite observations are also highlighted.
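One step in building an IDF curve is fitting an extreme-value distribution to the annual maximum rainfall for a given duration and reading off the depth for a chosen return period. The sketch below does this with a Gumbel distribution fitted by the method of moments. The abstract does not say which distribution or estimator the study uses, so both are assumptions made for illustration, as are the function names and the sample of annual maxima.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant

def gumbel_fit_moments(annual_maxima):
    """Method-of-moments Gumbel fit. Returns (location, scale)."""
    n = len(annual_maxima)
    mean = sum(annual_maxima) / n
    var = sum((x - mean) ** 2 for x in annual_maxima) / (n - 1)
    scale = math.sqrt(6.0 * var) / math.pi       # from Var = (pi*scale)^2 / 6
    loc = mean - EULER_GAMMA * scale             # from Mean = loc + gamma*scale
    return loc, scale

def return_level(loc, scale, return_period):
    """Rainfall depth exceeded on average once every `return_period` years."""
    p = 1.0 - 1.0 / return_period                # non-exceedance probability
    return loc - scale * math.log(-math.log(p))

# Hypothetical annual maximum daily rainfall (mm) at one grid cell.
maxima = [30.0, 42.0, 35.0, 50.0, 38.0, 45.0, 33.0, 60.0]
loc, scale = gumbel_fit_moments(maxima)
depth_10yr = return_level(loc, scale, 10.0)
depth_100yr = return_level(loc, scale, 100.0)
```

Repeating the fit for several durations and dividing each depth by its duration would give the intensity values that make up one IDF curve per return period. With a record of only a few decades, as for the satellite product discussed above, long-return-period values from a fit like this carry large uncertainty.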
Li, Dongming; Sun, Changming; Yang, Jinhua; Liu, Huan; Peng, Jiaqi; Zhang, Lijuan
2017-01-01
An adaptive optics (AO) system provides real-time compensation for atmospheric turbulence. However, an AO image is usually of poor contrast because of the nature of the imaging process, meaning that the image contains information coming from both out-of-focus and in-focus planes of the object, which also brings about a loss in quality. In this paper, we present a robust multi-frame adaptive optics image restoration algorithm via maximum likelihood estimation. Our proposed algorithm uses a maximum likelihood method with image regularization as the basic principle, and constructs the joint log likelihood function for multi-frame AO images based on a Poisson distribution model. To begin with, a frame selection method based on image variance is applied to the observed multi-frame AO images to select images with better quality to improve the convergence of a blind deconvolution algorithm. Then, by combining the imaging conditions and the AO system properties, a point spread function estimation model is built. Finally, we develop our iterative solutions for AO image restoration addressing the joint deconvolution issue. We conduct a number of experiments to evaluate the performances of our proposed algorithm. Experimental results show that our algorithm produces accurate AO image restoration results and outperforms the current state-of-the-art blind deconvolution methods. PMID:28383503
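The variance-based frame selection step mentioned in the abstract above can be illustrated with a minimal sketch. The abstract does not state the exact criterion, so this assumes that frames with larger pixel variance have better contrast and are kept; the function name `select_frames` and the parameter `keep` are hypothetical and are not taken from the paper.

```python
from statistics import pvariance

def select_frames(frames, keep):
    """Return the indices of the `keep` frames with the largest pixel variance.

    `frames` is a list of 2D images given as lists of rows.
    Assumption: higher variance is used as a proxy for better image quality.
    """
    # Flatten each frame and compute its population variance
    scores = [pvariance([pixel for row in frame for pixel in row])
              for frame in frames]
    # Rank frame indices from highest to lowest variance
    ranked = sorted(range(len(frames)), key=lambda i: scores[i], reverse=True)
    # Return the selected indices in their original temporal order
    return sorted(ranked[:keep])
```

The selected subset would then be passed to the multi-frame deconvolution; the restoration step itself is not sketched here.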
End-of-winter snow depth variability on glaciers in Alaska
NASA Astrophysics Data System (ADS)
McGrath, Daniel; Sass, Louis; O'Neel, Shad; Arendt, Anthony; Wolken, Gabriel; Gusmeroli, Alessio; Kienholz, Christian; McNeil, Christopher
2015-08-01
A quantitative understanding of snow thickness and snow water equivalent (SWE) on glaciers is essential to a wide range of scientific and resource management topics. However, robust SWE estimates are observationally challenging, in part because SWE can vary abruptly over short distances in complex terrain due to interactions between topography and meteorological processes. In spring 2013, we measured snow accumulation on several glaciers around the Gulf of Alaska using both ground- and helicopter-based ground-penetrating radar surveys, complemented by extensive ground truth observations. We found that SWE can be highly variable (40% difference) over short spatial scales (tens to hundreds of meters), especially in the ablation zone where the underlying ice surfaces are typically rough. Elevation provides the dominant basin-scale influence on SWE, with gradients ranging from 115 to 400 mm/100 m. Regionally, total accumulation and the accumulation gradient are strongly controlled by a glacier's distance from the coastal moisture source. Multiple linear regressions, used to calculate distributed SWE fields, show that robust results require adequate sampling of the true distribution of multiple terrain parameters. Final SWE estimates (comparable to winter balances) show reasonable agreement with both the Parameter-elevation Relationships on Independent Slopes Model climate data set (9-36% difference) and the U.S. Geological Survey Alaska Benchmark Glaciers (6-36% difference). All the glaciers in our study exhibit substantial sensitivity to changing snow-rain fractions, regardless of their location in a coastal or continental climate. While process-based SWE projections remain elusive, the collection of ground-penetrating radar (GPR)-derived data sets provides a greatly enhanced perspective on the spatial distribution of SWE and will pave the way for future work that may eventually allow such projections.
NASA Astrophysics Data System (ADS)
Jankovic, Igor; Maghrebi, Mahdi; Fiori, Aldo; Zarlenga, Antonio; Dagan, Gedeon
2017-04-01
We examine the impact of permeability structures on the Breakthrough Curve (BTC) of solute, at a distance x from the injection plane, under mean uniform flow of mean velocity U. The study is carried out through accurate 3D numerical simulations, rather than the 2D models adopted in most previous works. All structures share the same univariate distribution of the logconductivity Y = ln K and autocorrelation function ρ_Y, but differ in higher order statistics. The main finding is that the BTC of ergodic plumes for the different examined structures is quite robust, displaying a seemingly "universal" behavior. The result is at variance with similar analyses carried out in the past for 2D permeability structures. The basic parameters (i.e. the geometric mean, the logconductivity variance σ_Y² and the horizontal integral scale I) have to be identified from field data (e.g. core analysis, pumping test or other methods). However, prediction requires the knowledge of U, and the results suggest that improvement of the BTC prediction in applications can be achieved by independent estimates of the mean velocity U, e.g. by pumping tests, rather than attempting to characterize the permeability structure beyond its second-order characterization. The BTC prediction made by the Inverse Gaussian (IG) distribution, adopting the macrodispersion coefficient estimated by the First Order approximation α_L = σ_Y² I, is also quite robust, providing a simple and effective solution to be employed in applications. The consequences of the latter result are further explored by modeling the mass distribution that occurred at the MADE-1 natural gradient experiment, for which we show that most of the plume features are adequately captured by the simple First Order approach.
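As an illustration of the Inverse Gaussian prediction described above, the following sketch evaluates an IG travel-time density at distance x. The abstract gives only α_L = σ_Y² I; the density form used here is the standard first-passage solution of 1D advection-dispersion with dispersion coefficient α_L·U, which is an assumption and may differ in detail from the authors' formulation. The function name `ig_btc` and its arguments are hypothetical.

```python
import math

def ig_btc(t, x, U, sigma_y2, I):
    """Inverse Gaussian travel-time density at distance x and time t.

    Assumed form: mean arrival time x/U, with alpha_L = sigma_y2 * I
    taken from the First Order approximation quoted in the abstract.
    """
    if t <= 0.0:
        return 0.0
    alpha_L = sigma_y2 * I  # First Order estimate from the abstract
    prefactor = x / math.sqrt(4.0 * math.pi * alpha_L * U * t ** 3)
    exponent = -((x - U * t) ** 2) / (4.0 * alpha_L * U * t)
    return prefactor * math.exp(exponent)
```

Under this form the mean arrival time is x/U and the variance is 2·α_L·x/U², so only U and the second-order parameters σ_Y² and I enter the prediction, in line with the abstract's point that characterization beyond second order adds little.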
Integrating multiple data sources in species distribution modeling: A framework for data fusion
Pacifici, Krishna; Reich, Brian J.; Miller, David A.W.; Gardner, Beth; Stauffer, Glenn E.; Singh, Susheela; McKerrow, Alexa; Collazo, Jaime A.
2017-01-01
The last decade has seen a dramatic increase in the use of species distribution models (SDMs) to characterize patterns of species’ occurrence and abundance. Efforts to parameterize SDMs often create a tension between the quality and quantity of data available to fit models. Estimation methods that integrate both standardized and non-standardized data types offer a potential solution to the tradeoff between data quality and quantity. Recently several authors have developed approaches for jointly modeling two sources of data (one of high quality and one of lesser quality). We extend their work by allowing for explicit spatial autocorrelation in occurrence and detection error using a Multivariate Conditional Autoregressive (MVCAR) model and develop three models that share information in a less direct manner resulting in more robust performance when the auxiliary data is of lesser quality. We describe these three new approaches (“Shared,” “Correlation,” “Covariates”) for combining data sources and show their use in a case study of the Brown-headed Nuthatch in the Southeastern U.S. and through simulations. All three of the approaches which used the second data source improved out-of-sample predictions relative to a single data source (“Single”). When information in the second data source is of high quality, the Shared model performs the best, but the Correlation and Covariates model also perform well. When the information quality in the second data source is of lesser quality, the Correlation and Covariates model performed better suggesting they are robust alternatives when little is known about auxiliary data collected opportunistically or through citizen scientists. Methods that allow for both data types to be used will maximize the useful information available for estimating species distributions.
Jarnevich, Catherine S.; Talbert, Marian; Morisette, Jeffrey T.; Aldridge, Cameron L.; Brown, Cynthia; Kumar, Sunil; Manier, Daniel; Talbert, Colin; Holcombe, Tracy R.
2017-01-01
Evaluating the conditions where a species can persist is an important question in ecology both to understand tolerances of organisms and to predict distributions across landscapes. Presence data combined with background or pseudo-absence locations are commonly used with species distribution modeling to develop these relationships. However, there is not a standard method to generate background or pseudo-absence locations, and method choice affects model outcomes. We evaluated combinations of both model algorithms (simple and complex generalized linear models, multivariate adaptive regression splines, Maxent, boosted regression trees, and random forest) and background methods (random, minimum convex polygon, and continuous and binary kernel density estimator (KDE)) to assess the sensitivity of model outcomes to choices made. We evaluated six questions related to model results, including five beyond the common comparison of model accuracy assessment metrics (biological interpretability of response curves, cross-validation robustness, independent data accuracy and robustness, and prediction consistency). For our case study with cheatgrass in the western US, random forest was least sensitive to background choice and the binary KDE method was least sensitive to model algorithm choice. While this outcome may not hold for other locations or species, the methods we used can be implemented to help determine appropriate methodologies for particular research questions.
Robust image modeling techniques with an image restoration application
NASA Astrophysics Data System (ADS)
Kashyap, Rangasami L.; Eom, Kie-Bum
1988-08-01
A robust parameter-estimation algorithm for a nonsymmetric half-plane (NSHP) autoregressive model, where the driving noise is a mixture of a Gaussian and an outlier process, is presented. The convergence of the estimation algorithm is proved. An algorithm to estimate parameters and original image intensity simultaneously from the impulse-noise-corrupted image, where the model governing the image is not available, is also presented. The robustness of the parameter estimates is demonstrated by simulation. Finally, an algorithm to restore realistic images is presented. The entire image generally does not obey a simple image model, but a small portion (e.g., 8 x 8) of the image is assumed to obey an NSHP model. The original image is divided into windows and the robust estimation algorithm is applied for each window. The restoration algorithm is tested by comparing it to traditional methods on several different images.
Counting Raindrops and the Distribution of Intervals Between Them.
NASA Astrophysics Data System (ADS)
Van De Giesen, N.; Ten Veldhuis, M. C.; Hut, R.; Pape, J. J.
2017-12-01
Drop size distributions are often assumed to follow a generalized gamma function, characterized by one parameter, Λ, [1]. In principle, this Λ can be estimated by measuring the arrival rate of raindrops. The arrival rate should follow a Poisson distribution. By measuring the distribution of the time intervals between drops arriving at a certain surface area, one should not only be able to estimate the arrival rate but also the robustness of the underlying assumption concerning steady state. It is important to note that many rainfall radar systems also assume fixed drop size distributions, and associated arrival rates, to derive rainfall rates. By testing these relationships with a simple device, we will be able to improve both land-based and space-based radar rainfall estimates. Here, an open-hardware sensor design is presented, consisting of a 3D printed housing for a piezoelectric element, some simple electronics and an Arduino. The target audience for this device are citizen scientists who want to contribute to collecting rainfall information beyond the standard rain gauge. The core of the sensor is a simple piezo-buzzer, as found in many devices such as watches and fire alarms. When a raindrop falls on a piezo-buzzer, a small voltage is generated, which can be used to register the drop's arrival time. By registering the intervals between raindrops, the associated Poisson distribution can be estimated. In addition to the hardware, we will present the first results of a measuring campaign in Myanmar that will have run from August to October 2017. All design files and descriptions are available through GitHub: https://github.com/nvandegiesen/Intervalometer. This research is partially supported through the TWIGA project, funded by the European Commission's H2020 program under call SC5-18-2017 'Novel in-situ observation systems'. Reference [1]: Uijlenhoet, R., and J. N. M. Stricker. "A consistent rainfall parameterization based on the exponential raindrop size distribution." Journal of Hydrology 218, no. 3 (1999): 101-127.
NASA Astrophysics Data System (ADS)
Raj, R.; Hamm, N. A. S.; van der Tol, C.; Stein, A.
2015-08-01
Gross primary production (GPP), separated from flux tower measurements of net ecosystem exchange (NEE) of CO2, is used increasingly to validate process-based simulators and remote sensing-derived estimates of simulated GPP at various time steps. Proper validation should include the uncertainty associated with this separation at different time steps. This can be achieved by using a Bayesian framework. In this study, we estimated the uncertainty in GPP at half-hourly time steps. We used a non-rectangular hyperbola (NRH) model to separate GPP from flux tower measurements of NEE at the Speulderbos forest site, The Netherlands. The NRH model included the variables that influence GPP, in particular radiation and temperature. In addition, the NRH model provided a robust empirical relationship between radiation and GPP by including the degree of curvature of the light response curve. Parameters of the NRH model were fitted to the measured NEE data for every 10-day period during the growing season (April to October) in 2009. Adopting a Bayesian approach, we defined the prior distribution of each NRH parameter. Markov chain Monte Carlo (MCMC) simulation was used to update the prior distribution of each NRH parameter. This allowed us to estimate the uncertainty in the separated GPP at half-hourly time steps. This yielded the posterior distribution of GPP at each half hour and allowed the quantification of uncertainty. The time series of posterior distributions thus obtained allowed us to estimate the uncertainty at daily time steps. We compared the informative with non-informative prior distributions of the NRH parameters. The results showed that both choices of prior produced similar posterior distributions of GPP. This will provide relevant and important information for the validation of process-based simulators in the future. Furthermore, the obtained posterior distributions of NEE and the NRH parameters are of interest for a range of applications.
Monitoring gray wolf populations using multiple survey methods
Ausband, David E.; Rich, Lindsey N.; Glenn, Elizabeth M.; Mitchell, Michael S.; Zager, Pete; Miller, David A.W.; Waits, Lisette P.; Ackerman, Bruce B.; Mack, Curt M.
2013-01-01
The behavioral patterns and large territories of large carnivores make them challenging to monitor. Occupancy modeling provides a framework for monitoring population dynamics and distribution of territorial carnivores. We combined data from hunter surveys, howling and sign surveys conducted at predicted wolf rendezvous sites, and locations of radiocollared wolves to model occupancy and estimate the number of gray wolf (Canis lupus) packs and individuals in Idaho during 2009 and 2010. We explicitly accounted for potential misidentification of occupied cells (i.e., false positives) using an extension of the multi-state occupancy framework. We found agreement between model predictions and distribution and estimates of number of wolf packs and individual wolves reported by Idaho Department of Fish and Game and Nez Perce Tribe from intensive radiotelemetry-based monitoring. Estimates of individual wolves from occupancy models that excluded data from radiocollared wolves were within an average of 12.0% (SD = 6.0) of existing statewide minimum counts. Models using only hunter survey data generally estimated the lowest abundance, whereas models using all data generally provided the highest estimates of abundance, although only marginally higher. Precision across approaches ranged from 14% to 28% of mean estimates and models that used all data streams generally provided the most precise estimates. We demonstrated that an occupancy model based on different survey methods can yield estimates of the number and distribution of wolf packs and individual wolf abundance with reasonable measures of precision. Assumptions of the approach including that average territory size is known, average pack size is known, and territories do not overlap, must be evaluated periodically using independent field data to ensure occupancy estimates remain reliable. Use of multiple survey methods helps to ensure that occupancy estimates are robust to weaknesses or changes in any 1 survey method. 
Occupancy modeling may be useful for standardizing estimates across large landscapes, even if survey methods differ across regions, allowing for inferences about broad-scale population dynamics of wolves.
Robustness enhancement of neurocontroller and state estimator
NASA Technical Reports Server (NTRS)
Troudet, Terry
1993-01-01
The feasibility of enhancing neurocontrol robustness, through training of the neurocontroller and state estimator in the presence of system uncertainties, is investigated on the example of a multivariable aircraft control problem. The performance and robustness of the newly trained neurocontroller are compared to those for an existing neurocontrol design scheme. The newly designed dynamic neurocontroller exhibits a better trade-off between phase and gain stability margins, and it is significantly more robust to degradations of the plant dynamics.
MIDAS robust trend estimator for accurate GPS station velocities without step detection
Kreemer, Corné; Hammond, William C.; Gazeaux, Julien
2016-01-01
Automatic estimation of velocities from GPS coordinate time series is becoming required to cope with the exponentially increasing flood of available data, but problems detectable to the human eye are often overlooked. This motivates us to find an automatic and accurate estimator of trend that is resistant to common problems such as step discontinuities, outliers, seasonality, skewness, and heteroscedasticity. Developed here, Median Interannual Difference Adjusted for Skewness (MIDAS) is a variant of the Theil‐Sen median trend estimator, for which the ordinary version is the median of slopes vij = (xj–xi)/(tj–ti) computed between all data pairs i > j. For normally distributed data, Theil‐Sen and least squares trend estimates are statistically identical, but unlike least squares, Theil‐Sen is resistant to undetected data problems. To mitigate both seasonality and step discontinuities, MIDAS selects data pairs separated by 1 year. This condition is relaxed for time series with gaps so that all data are used. Slopes from data pairs spanning a step function produce one‐sided outliers that can bias the median. To reduce bias, MIDAS removes outliers and recomputes the median. MIDAS also computes a robust and realistic estimate of trend uncertainty. Statistical tests using GPS data in the rigid North American plate interior show ±0.23 mm/yr root‐mean‐square (RMS) accuracy in horizontal velocity. In blind tests using synthetic data, MIDAS velocities have an RMS accuracy of ±0.33 mm/yr horizontal, ±1.1 mm/yr up, with a 5th percentile range smaller than all 20 automatic estimators tested. Considering its general nature, MIDAS has the potential for broader application in the geosciences. PMID:27668140
MIDAS robust trend estimator for accurate GPS station velocities without step detection.
Blewitt, Geoffrey; Kreemer, Corné; Hammond, William C; Gazeaux, Julien
2016-03-01
Automatic estimation of velocities from GPS coordinate time series is becoming required to cope with the exponentially increasing flood of available data, but problems detectable to the human eye are often overlooked. This motivates us to find an automatic and accurate estimator of trend that is resistant to common problems such as step discontinuities, outliers, seasonality, skewness, and heteroscedasticity. Developed here, Median Interannual Difference Adjusted for Skewness (MIDAS) is a variant of the Theil-Sen median trend estimator, for which the ordinary version is the median of slopes vij = (xj-xi)/(tj-ti) computed between all data pairs i > j. For normally distributed data, Theil-Sen and least squares trend estimates are statistically identical, but unlike least squares, Theil-Sen is resistant to undetected data problems. To mitigate both seasonality and step discontinuities, MIDAS selects data pairs separated by 1 year. This condition is relaxed for time series with gaps so that all data are used. Slopes from data pairs spanning a step function produce one-sided outliers that can bias the median. To reduce bias, MIDAS removes outliers and recomputes the median. MIDAS also computes a robust and realistic estimate of trend uncertainty. Statistical tests using GPS data in the rigid North American plate interior show ±0.23 mm/yr root-mean-square (RMS) accuracy in horizontal velocity. In blind tests using synthetic data, MIDAS velocities have an RMS accuracy of ±0.33 mm/yr horizontal, ±1.1 mm/yr up, with a 5th percentile range smaller than all 20 automatic estimators tested. Considering its general nature, MIDAS has the potential for broader application in the geosciences.
MIDAS robust trend estimator for accurate GPS station velocities without step detection
NASA Astrophysics Data System (ADS)
Blewitt, Geoffrey; Kreemer, Corné; Hammond, William C.; Gazeaux, Julien
2016-03-01
Automatic estimation of velocities from GPS coordinate time series is becoming required to cope with the exponentially increasing flood of available data, but problems detectable to the human eye are often overlooked. This motivates us to find an automatic and accurate estimator of trend that is resistant to common problems such as step discontinuities, outliers, seasonality, skewness, and heteroscedasticity. Developed here, Median Interannual Difference Adjusted for Skewness (MIDAS) is a variant of the Theil-Sen median trend estimator, for which the ordinary version is the median of slopes vij = (xj-xi)/(tj-ti) computed between all data pairs i > j. For normally distributed data, Theil-Sen and least squares trend estimates are statistically identical, but unlike least squares, Theil-Sen is resistant to undetected data problems. To mitigate both seasonality and step discontinuities, MIDAS selects data pairs separated by 1 year. This condition is relaxed for time series with gaps so that all data are used. Slopes from data pairs spanning a step function produce one-sided outliers that can bias the median. To reduce bias, MIDAS removes outliers and recomputes the median. MIDAS also computes a robust and realistic estimate of trend uncertainty. Statistical tests using GPS data in the rigid North American plate interior show ±0.23 mm/yr root-mean-square (RMS) accuracy in horizontal velocity. In blind tests using synthetic data, MIDAS velocities have an RMS accuracy of ±0.33 mm/yr horizontal, ±1.1 mm/yr up, with a 5th percentile range smaller than all 20 automatic estimators tested. Considering its general nature, MIDAS has the potential for broader application in the geosciences.
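The two slope medians described in the MIDAS abstract can be sketched as follows. This is a simplified illustration, not the authors' MIDAS code: it computes the ordinary Theil-Sen median over all pairs and a variant restricted to pairs about one year apart, and it omits the outlier removal, the relaxed pairing for series with gaps, and the uncertainty estimate. The function names and the tolerance `tol` are hypothetical.

```python
import statistics
from itertools import combinations

def theil_sen_slope(t, x):
    """Ordinary Theil-Sen trend: median of slopes over all data pairs."""
    slopes = [(xj - xi) / (tj - ti)
              for (ti, xi), (tj, xj) in combinations(zip(t, x), 2)
              if tj != ti]
    return statistics.median(slopes)

def one_year_pair_slope(t, x, tol=1e-3):
    """Median of slopes from pairs separated by about 1 year (t in years).

    Simplified stand-in for the MIDAS pair selection; no outlier trimming.
    """
    slopes = []
    for i in range(len(t)):
        for j in range(i + 1, len(t)):
            dt = t[j] - t[i]
            if abs(dt - 1.0) <= tol:
                slopes.append((x[j] - x[i]) / dt)
    return statistics.median(slopes)
```

With an annual signal, pairs one year apart sample the same seasonal phase, so the seasonal term cancels in each slope. A single step contaminates only the pairs that span it, which the median tolerates as long as those pairs are a minority.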
Experimental considerations for fast kurtosis imaging.
Hansen, Brian; Lund, Torben E; Sangill, Ryan; Stubbe, Ebbe; Finsterbusch, Jürgen; Jespersen, Sune Nørhøj
2016-11-01
The clinical use of kurtosis imaging is impeded by long acquisitions and postprocessing. Recently, estimation of mean kurtosis tensor (W̄) and mean diffusivity (D̄) was made possible from 13 distinct diffusion weighted MRI acquisitions (the 1-3-9 protocol) with simple postprocessing. Here, we analyze the effects of noise and nonideal diffusion encoding, and propose a new correction strategy. We also present a 1-9-9 protocol with increased robustness to experimental imperfections and minimal additional scan time. This refinement does not affect computation time and also provides a fast estimate of fractional anisotropy (FA). 1-3-9/1-9-9 data are acquired in rat and human brains, and estimates of D̄, FA, W̄ from human brains are compared with traditional estimates from an extensive diffusion kurtosis imaging data set. Simulations are used to evaluate the influence of noise and diffusion encodings deviating from the scheme, and the performance of the correction strategy. Optimal b-values are determined from simulations and data. Accuracy and precision in D̄ and W̄ are comparable to nonlinear least squares estimation, and are improved with the 1-9-9 protocol. The compensation strategy vastly improves parameter estimation in nonideal data. The framework offers a robust and compact method for estimating several diffusion metrics. The protocol is easily implemented. Magn Reson Med 76:1455-1468, 2016. © 2015 The Authors. Magnetic Resonance in Medicine published by Wiley Periodicals, Inc. on behalf of International Society for Magnetic Resonance in Medicine. This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.
Defining robustness protocols: a method to include and evaluate robustness in clinical plans
NASA Astrophysics Data System (ADS)
McGowan, S. E.; Albertini, F.; Thomas, S. J.; Lomax, A. J.
2015-04-01
We aim to define a site-specific robustness protocol to be used during the clinical plan evaluation process. Plan robustness of 16 skull base IMPT plans to systematic range and random set-up errors have been retrospectively and systematically analysed. This was determined by calculating the error-bar dose distribution (ebDD) for all the plans and by defining some metrics used to define protocols aiding the plan assessment. Additionally, an example of how to clinically use the defined robustness database is given whereby a plan with sub-optimal brainstem robustness was identified. The advantage of using different beam arrangements to improve the plan robustness was analysed. Using the ebDD it was found range errors had a smaller effect on dose distribution than the corresponding set-up error in a single fraction, and that organs at risk were most robust to the range errors, whereas the target was more robust to set-up errors. A database was created to aid planners in terms of plan robustness aims in these volumes. This resulted in the definition of site-specific robustness protocols. The use of robustness constraints allowed for the identification of a specific patient that may have benefited from a treatment of greater individuality. A new beam arrangement showed to be preferential when balancing conformality and robustness for this case. The ebDD and error-bar volume histogram proved effective in analysing plan robustness. The process of retrospective analysis could be used to establish site-specific robustness planning protocols in proton therapy. These protocols allow the planner to determine plans that, although delivering a dosimetrically adequate dose distribution, have resulted in sub-optimal robustness to these uncertainties. For these cases the use of different beam start conditions may improve the plan robustness to set-up and range uncertainties.
Maximum Likelihood Estimations and EM Algorithms with Length-biased Data
Qin, Jing; Ning, Jing; Liu, Hao; Shen, Yu
2012-01-01
SUMMARY Length-biased sampling has been well recognized in economics, industrial reliability, etiology applications, epidemiological, genetic and cancer screening studies. Length-biased right-censored data have a unique data structure different from traditional survival data. The nonparametric and semiparametric estimations and inference methods for traditional survival data are not directly applicable for length-biased right-censored data. We propose new expectation-maximization algorithms for estimations based on full likelihoods involving infinite dimensional parameters under three settings for length-biased data: estimating nonparametric distribution function, estimating nonparametric hazard function under an increasing failure rate constraint, and jointly estimating baseline hazards function and the covariate coefficients under the Cox proportional hazards model. Extensive empirical simulation studies show that the maximum likelihood estimators perform well with moderate sample sizes and lead to more efficient estimators compared to the estimating equation approaches. The proposed estimates are also more robust to various right-censoring mechanisms. We prove the strong consistency properties of the estimators, and establish the asymptotic normality of the semi-parametric maximum likelihood estimators under the Cox model using modern empirical processes theory. We apply the proposed methods to a prevalent cohort medical study. Supplemental materials are available online. PMID:22323840
Robust spike sorting of retinal ganglion cells tuned to spot stimuli.
Ghahari, Alireza; Badea, Tudor C
2016-08-01
We propose an automatic spike sorting approach for the data recorded from a microelectrode array during visual stimulation of wild type retinas with tiled spot stimuli. The approach first detects individual spikes per electrode by their signature local minima. With the mixture probability distribution of the local minima estimated afterwards, it applies a minimum-squared-error clustering algorithm to sort the spikes into different clusters. A template waveform for each cluster per electrode is defined, and a number of reliability tests are performed on it and its corresponding spikes. Finally, a divisive hierarchical clustering algorithm is used to deal with the correlated templates per cluster type across all the electrodes. According to the measures of performance of the spike sorting approach, it is robust even in the cases of recordings with low signal-to-noise ratio.
DOE Office of Scientific and Technical Information (OSTI.GOV)
De Bernardi, E., E-mail: elisabetta.debernardi@unimib.it; Ricotti, R.; Riboldi, M.
2016-02-15
Purpose: An innovative strategy to improve the sensitivity of positron emission tomography (PET)-based treatment verification in ion beam radiotherapy is proposed. Methods: Low counting statistics PET images acquired during or shortly after the treatment (Measured PET) and a Monte Carlo estimate of the same PET images derived from the treatment plan (Expected PET) are considered as two frames of a 4D dataset. A 4D maximum likelihood reconstruction strategy was adapted to iteratively estimate the annihilation events distribution in a reference frame and the deformation motion fields that map it in the Expected PET and Measured PET frames. The outputs generated by the proposed strategy are as follows: (1) an estimate of the Measured PET with an image quality comparable to the Expected PET and (2) an estimate of the motion field mapping Expected PET to Measured PET. The details of the algorithm are presented and the strategy is preliminarily tested on analytically simulated datasets. Results: The algorithm demonstrates (1) robustness against noise, even in the worst conditions where 1.5 × 10^4 true coincidences and a random fraction of 73% are simulated; (2) a proper sensitivity to different kinds and grades of mismatches ranging between 1 and 10 mm; (3) robustness against bias due to incorrect washout modeling in the Monte Carlo simulation up to 1/3 of the original signal amplitude; and (4) an ability to describe the mismatch even in presence of complex annihilation distributions such as those induced by two perpendicular superimposed ion fields. Conclusions: The promising results obtained in this work suggest the applicability of the method as a quantification tool for PET-based treatment verification in ion beam radiotherapy. An extensive assessment of the proposed strategy on real treatment verification data is planned.
Sidler, Dominik; Schwaninger, Arthur; Riniker, Sereina
2016-10-21
In molecular dynamics (MD) simulations, free-energy differences are often calculated using free energy perturbation or thermodynamic integration (TI) methods. However, both techniques are only suited to calculating free-energy differences between two end states. Enveloping distribution sampling (EDS) presents an attractive alternative that allows multiple free-energy differences to be calculated in a single simulation. In EDS, a reference state is simulated which "envelopes" the end states. The challenge of this methodology is the determination of optimal reference-state parameters to ensure equal sampling of all end states. Currently, the automatic determination of the reference-state parameters for multiple end states is an unsolved issue that limits the application of the methodology. To resolve this, we have generalised the replica-exchange EDS (RE-EDS) approach, introduced by Lee et al. [J. Chem. Theory Comput. 10, 2738 (2014)] for constant-pH MD simulations. By exchanging configurations between replicas with different reference-state parameters, the complexity of the parameter-choice problem can be substantially reduced. A new robust scheme to estimate the reference-state parameters from a short initial RE-EDS simulation with default parameters was developed, which allowed the calculation of 36 free-energy differences between nine small-molecule inhibitors of phenylethanolamine N-methyltransferase from a single simulation. The resulting free-energy differences were in excellent agreement with values obtained previously by TI and two-state EDS simulations.
Stenroos, Matti; Hauk, Olaf
2013-01-01
The conductivity profile of the head has a major effect on EEG signals, but unfortunately the conductivity for the most important compartment, skull, is only poorly known. In dipole modeling studies, errors in modeled skull conductivity have been considered to have a detrimental effect on EEG source estimation. However, as dipole models are very restrictive, those results cannot be generalized to other source estimation methods. In this work, we studied the sensitivity of EEG and combined MEG + EEG source estimation to errors in skull conductivity using a distributed source model and minimum-norm (MN) estimation. We used a MEG/EEG modeling set-up that reflected state-of-the-art practices of experimental research. Cortical surfaces were segmented and realistically-shaped three-layer anatomical head models were constructed, and forward models were built with Galerkin boundary element method while varying the skull conductivity. Lead-field topographies and MN spatial filter vectors were compared across conductivities, and the localization and spatial spread of the MN estimators were assessed using intuitive resolution metrics. The results showed that the MN estimator is robust against errors in skull conductivity: the conductivity had a moderate effect on amplitudes of lead fields and spatial filter vectors, but the effect on corresponding morphologies was small. The localization performance of the EEG or combined MEG + EEG MN estimator was only minimally affected by the conductivity error, while the spread of the estimate varied slightly. Thus, the uncertainty with respect to skull conductivity should not prevent researchers from applying minimum norm estimation to EEG or combined MEG + EEG data. Comparing our results to those obtained earlier with dipole models shows that general judgment on the performance of an imaging modality should not be based on analysis with one source estimation method only. PMID:23639259
Inverse modeling of Asian (222)Rn flux using surface air (222)Rn concentration.
Hirao, Shigekazu; Yamazawa, Hiromi; Moriizumi, Jun
2010-11-01
When used with an atmospheric transport model, the (222)Rn flux distribution estimated in our previous study using soil transport theory caused underestimation of atmospheric (222)Rn concentrations as compared with measurements in East Asia. In this study, we applied a Bayesian synthesis inverse method to produce revised estimates of the annual (222)Rn flux density in Asia by using atmospheric (222)Rn concentrations measured at seven sites in East Asia. The Bayesian synthesis inverse method requires a prior estimate of the flux distribution and its uncertainties. The atmospheric transport model MM5/HIRAT and our previous estimate of the (222)Rn flux distribution as the prior value were used to generate new flux estimates for the eastern half of the Eurasian continent dividing into 10 regions. The (222)Rn flux densities estimated using the Bayesian inversion technique were generally higher than the prior flux densities. The area-weighted average (222)Rn flux density for Asia was estimated to be 33.0 mBq m(-2) s(-1), which is substantially higher than the prior value (16.7 mBq m(-2) s(-1)). The estimated (222)Rn flux densities decrease with increasing latitude as follows: Southeast Asia (36.7 mBq m(-2) s(-1)); East Asia (28.6 mBq m(-2) s(-1)) including China, Korean Peninsula and Japan; and Siberia (14.1 mBq m(-2) s(-1)). Increase of the newly estimated fluxes in Southeast Asia, China, Japan, and the southern part of Eastern Siberia from the prior ones contributed most significantly to improved agreement of the model-calculated concentrations with the atmospheric measurements. The sensitivity analysis of prior flux errors and effects of locally exhaled (222)Rn showed that the estimated fluxes in Northern and Central China, Korea, Japan, and the southern part of Eastern Siberia were robust, but that in Central Asia had a large uncertainty.
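The Bayesian synthesis step described above, combining a prior flux estimate and its uncertainty with measured concentrations, can be sketched as a linear-Gaussian update. This is a minimal illustration, not the authors' implementation: the function name, the matrix names (`P`, `H`, `R`) and the toy numbers are assumptions, and the real study derives the source-receptor matrix from the MM5/HIRAT transport model.

```python
import numpy as np

def bayesian_synthesis_update(x_prior, P, H, y, R):
    """Linear-Gaussian Bayesian update of regional fluxes (hypothetical helper).

    x_prior : prior flux per region, shape (n,)
    P       : prior flux error covariance, shape (n, n)
    H       : source-receptor matrix mapping fluxes to concentrations, shape (m, n)
    y       : measured concentrations, shape (m,)
    R       : observation error covariance, shape (m, m)
    """
    x_prior = np.asarray(x_prior, dtype=float)
    P = np.asarray(P, dtype=float)
    H = np.asarray(H, dtype=float)
    y = np.asarray(y, dtype=float)
    R = np.asarray(R, dtype=float)

    S = H @ P @ H.T + R                  # covariance of the model-data mismatch
    K = P @ H.T @ np.linalg.inv(S)       # gain applied to the mismatch
    x_post = x_prior + K @ (y - H @ x_prior)
    P_post = P - K @ H @ P               # posterior covariance is never larger
    return x_post, P_post

# Toy one-region, one-site case with made-up numbers.
x_post, P_post = bayesian_synthesis_update([1.0], [[1.0]], [[1.0]], [3.0], [[1.0]])
```

With equal prior and observation variances the posterior lands halfway between the prior and the measurement, which is the behaviour one would expect from this kind of update.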
Estimation of blade airloads from rotor blade bending moments
NASA Technical Reports Server (NTRS)
Bousman, William G.
1987-01-01
A method is developed to estimate the blade normal airloads by using measured flap bending moments; that is, the rotor blade is used as a force balance. The blade's rotation is calculated in vacuum modes and the airloads are then expressed as an algebraic sum of the mode shapes, modal amplitudes, mass distribution, and frequency properties. The modal amplitudes are identified from the blade bending moments using the Strain Pattern Analysis Method. The application of the method is examined using simulated flap bending moment data that have been calculated for measured airloads for a full-scale rotor in a wind tunnel. The estimated airloads are compared with the wind tunnel measurements. The effects of the number of measurements, the number of modes, and errors in the measurements and the blade properties are examined, and the method is shown to be robust.
NASA Astrophysics Data System (ADS)
Fattoruso, Grazia; Longobardi, Antonia; Pizzuti, Alfredo; Molinara, Mario; Marocco, Claudio; De Vito, Saverio; Tortorella, Francesco; Di Francia, Girolamo
2017-06-01
Rainfall data gathered continuously by a distributed rain gauge network are instrumental to more effective hydro-geological risk forecasting and management services, though the estimated rainfall fields used as input suffer from prediction uncertainty. Optimal rain gauge networks can generate accurate estimated rainfall fields. In this research work, a methodology has been investigated for evaluating an optimal rain gauge network aimed at robust hydrogeological hazard investigations. The rain gauge network of the Sarno River basin (Southern Italy) has been evaluated by optimizing a two-objective function that maximizes the estimation accuracy and minimizes the total metering cost through the variance reduction algorithm along with the (time-invariant) climatological variogram. This problem has been solved by using an enumerative search algorithm, evaluating the exact Pareto front in an efficient computational time.
Self Calibrated Wireless Distributed Environmental Sensory Networks
Fishbain, Barak; Moreno-Centeno, Erick
2016-01-01
Recent advances in sensory and communication technologies have made Wireless Distributed Environmental Sensory Networks (WDESN) technically and economically feasible. WDESNs present an unprecedented tool for studying many environmental processes in a new way. However, the WDESNs’ calibration process is a major obstacle in them becoming the common practice. Here, we present a new, robust and efficient method for aggregating measurements acquired by an uncalibrated WDESN, and producing accurate estimates of the observed environmental variable’s true levels rendering the network as self-calibrated. The suggested method presents novelty both in group-decision-making and in environmental sensing as it offers a most valuable tool for distributed environmental monitoring data aggregation. Applying the method on an extensive real-life air-pollution dataset showed markedly more accurate results than the common practice and the state-of-the-art. PMID:27098279
An Anisotropic A posteriori Error Estimator for CFD
NASA Astrophysics Data System (ADS)
Feijóo, Raúl A.; Padra, Claudio; Quintana, Fernando
In this article, a robust anisotropic adaptive algorithm is presented, to solve compressible-flow equations using a stabilized CFD solver and automatic mesh generators. The association includes a mesh generator, a flow solver, and an a posteriori error-estimator code. The estimator was selected among several choices available (Almeida et al. (2000). Comput. Methods Appl. Mech. Engng, 182, 379-400; Borges et al. (1998). "Computational mechanics: new trends and applications". Proceedings of the 4th World Congress on Computational Mechanics, Bs.As., Argentina) giving a powerful computational tool. The main aim is to capture solution discontinuities, in this case, shocks, using the least amount of computational resources, i.e. elements, compatible with a solution of good quality. This leads to high aspect-ratio elements (stretching). To achieve this, a directional error estimator was specifically selected. The numerical results show good behavior of the error estimator, resulting in strongly-adapted meshes in few steps, typically three or four iterations, enough to capture shocks using a moderate and well-distributed amount of elements.
Robust THP Transceiver Designs for Multiuser MIMO Downlink with Imperfect CSIT
NASA Astrophysics Data System (ADS)
Ubaidulla, P.; Chockalingam, A.
2009-12-01
We present robust joint nonlinear transceiver designs for multiuser multiple-input multiple-output (MIMO) downlink in the presence of imperfections in the channel state information at the transmitter (CSIT). The base station (BS) is equipped with multiple transmit antennas, and each user terminal is equipped with one or more receive antennas. The BS employs Tomlinson-Harashima precoding (THP) for interuser interference precancellation at the transmitter. We consider robust transceiver designs that jointly optimize the transmit THP filters and receive filter for two models of CSIT errors. The first model is a stochastic error (SE) model, where the CSIT error is Gaussian-distributed. This model is applicable when the CSIT error is dominated by channel estimation error. In this case, the proposed robust transceiver design seeks to minimize a stochastic function of the sum mean square error (SMSE) under a constraint on the total BS transmit power. We propose an iterative algorithm to solve this problem. The other model we consider is a norm-bounded error (NBE) model, where the CSIT error can be specified by an uncertainty set. This model is applicable when the CSIT error is dominated by quantization errors. In this case, we consider a worst-case design. For this model, we consider robust (i) minimum SMSE, (ii) MSE-constrained, and (iii) MSE-balancing transceiver designs. We propose iterative algorithms to solve these problems, wherein each iteration involves a pair of semidefinite programs (SDPs). Further, we consider an extension of the proposed algorithm to the case with per-antenna power constraints. We evaluate the robustness of the proposed algorithms to imperfections in CSIT through simulation, and show that the proposed robust designs outperform nonrobust designs as well as robust linear transceiver designs reported in the recent literature.
The Robustness of LISREL Estimates in Structural Equation Models with Categorical Variables.
ERIC Educational Resources Information Center
Ethington, Corinna A.
1987-01-01
This study examined the effect of type of correlation matrix on the robustness of LISREL maximum likelihood and unweighted least squares structural parameter estimates for models with categorical variables. The analysis of mixed matrices produced estimates that closely approximated the model parameters except where dichotomous variables were…
Statistical distributions of extreme dry spell in Peninsular Malaysia
NASA Astrophysics Data System (ADS)
Zin, Wan Zawiah Wan; Jemain, Abdul Aziz
2010-11-01
Statistical distributions of annual extreme (AE) series and partial duration (PD) series for dry-spell event are analyzed for a database of daily rainfall records of 50 rain-gauge stations in Peninsular Malaysia, with recording period extending from 1975 to 2004. The three-parameter generalized extreme value (GEV) and generalized Pareto (GP) distributions are considered to model both series. In both cases, the parameters of these two distributions are fitted by means of the L-moments method, which provides a robust estimation of them. The goodness-of-fit (GOF) between empirical data and theoretical distributions are then evaluated by means of the L-moment ratio diagram and several goodness-of-fit tests for each of the 50 stations. It is found that for the majority of stations, the AE and PD series are well fitted by the GEV and GP models, respectively. Based on the models that have been identified, we can reasonably predict the risks associated with extreme dry spells for various return periods.
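The L-moments fitting mentioned above starts from sample L-moments, which are linear combinations of the ordered data and are therefore less sensitive to extreme values than conventional moments. The sketch below computes the first two sample L-moments and the L-skewness from unbiased probability-weighted moments; the function name is an assumption, and the further step of matching these values to GEV or GP parameters is not shown.

```python
import numpy as np

def sample_l_moments(x):
    """Return (l1, l2, t3): L-location, L-scale and L-skewness of a sample."""
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    i = np.arange(1, n + 1)  # ranks of the ordered observations

    # Unbiased probability-weighted moments b0, b1, b2.
    b0 = x.mean()
    b1 = np.sum((i - 1) / (n - 1) * x) / n
    b2 = np.sum((i - 1) * (i - 2) / ((n - 1) * (n - 2)) * x) / n

    l1 = b0
    l2 = 2 * b1 - b0
    l3 = 6 * b2 - 6 * b1 + b0
    return l1, l2, l3 / l2

# Symmetric toy sample: the L-skewness should vanish.
l1, l2, t3 = sample_l_moments([1, 2, 3, 4, 5])
```

The pair (l2/l1 or t3, and the L-kurtosis if it is added) is what an L-moment ratio diagram plots against the theoretical curves of candidate distributions.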
NASA Astrophysics Data System (ADS)
Mamalakis, Antonios; Langousis, Andreas; Deidda, Roberto; Marrocu, Marino
2017-03-01
Distribution mapping has been identified as the most efficient approach to bias-correct climate model rainfall, while reproducing its statistics at spatial and temporal resolutions suitable to run hydrologic models. Yet its implementation based on empirical distributions derived from control samples (referred to as nonparametric distribution mapping) makes the method's performance sensitive to sample length variations, the presence of outliers, the spatial resolution of climate model results, and may lead to biases, especially in extreme rainfall estimation. To address these shortcomings, we propose a methodology for simultaneous bias correction and high-resolution downscaling of climate model rainfall products that uses: (a) a two-component theoretical distribution model (i.e., a generalized Pareto (GP) model for rainfall intensities above a specified threshold u*, and an exponential model for lower rainrates), and (b) proper interpolation of the corresponding distribution parameters on a user-defined high-resolution grid, using kriging for uncertain data. We assess the performance of the suggested parametric approach relative to the nonparametric one, using daily raingauge measurements from a dense network in the island of Sardinia (Italy), and rainfall data from four GCM/RCM model chains of the ENSEMBLES project. The obtained results shed light on the competitive advantages of the parametric approach, which is proved more accurate and considerably less sensitive to the characteristics of the calibration period, independent of the GCM/RCM combination used. This is especially the case for extreme rainfall estimation, where the GP assumption allows for more accurate and robust estimates, also beyond the range of the available data.
A note on variance estimation in random effects meta-regression.
Sidik, Kurex; Jonkman, Jeffrey N
2005-01-01
For random effects meta-regression inference, variance estimation for the parameter estimates is discussed. Because estimated weights are used for meta-regression analysis in practice, the assumed or estimated covariance matrix used in meta-regression is not strictly correct, due to possible errors in estimating the weights. Therefore, this note investigates the use of a robust variance estimation approach for obtaining variances of the parameter estimates in random effects meta-regression inference. This method treats the assumed covariance matrix of the effect measure variables as a working covariance matrix. Using an example of meta-analysis data from clinical trials of a vaccine, the robust variance estimation approach is illustrated in comparison with two other methods of variance estimation. A simulation study is presented, comparing the three methods of variance estimation in terms of bias and coverage probability. We find that, despite the seeming suitability of the robust estimator for random effects meta-regression, the improved variance estimator of Knapp and Hartung (2003) yields the best performance among the three estimators, and thus may provide the best protection against errors in the estimated weights.
Robust Distribution Network Reconfiguration
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lee, Changhyeok; Liu, Cong; Mehrotra, Sanjay
2015-03-01
We propose a two-stage robust optimization model for the distribution network reconfiguration problem with load uncertainty. The first-stage decision is to configure the radial distribution network and the second-stage decision is to find the optimal AC power flow of the reconfigured network for a given demand realization. We solve the two-stage robust model by using a column-and-constraint generation algorithm, where the master problem and subproblem are formulated as mixed-integer second-order cone programs. Computational results for 16, 33, 70, and 94-bus test cases are reported. We find that the configuration from the robust model does not much compromise the power loss under the nominal load scenario compared to the configuration from the deterministic model, yet it provides the reliability of the distribution system for all scenarios in the uncertainty set.
Geostatistical assessment of Pb in soil around Paris, France.
Saby, N; Arrouays, D; Boulonne, L; Jolivet, C; Pochot, A
2006-08-15
This paper presents a survey on soil Pb contamination around Paris (France) using the French soil monitoring network. The first aim of this study is to estimate the total amount of anthropogenic Pb inputs in soils and to distinguish Pb due to diffuse pollution from geochemical background Pb. Secondly, this study tries to find the main controlling factors of the spatial distribution of anthropogenic Pb. We used the technique of relative topsoil enhancement to evaluate the anthropogenic stock of Pb and we performed lognormal kriging to map Pb regional distribution. The results show a strong gradient of anthropogenic stock of Pb around the urban Paris area. We estimate a total amount of anthropogenic stock of Pb close to 143,000 metric tons, which corresponds to an average accumulation of 5.9 t km(-2). Our study suggests that a grid-based survey can help to quantify diffuse Pb contamination by using robust techniques of calculation and that it might also be used to validate predictions of deposition models.
Estimating disease prevalence in two-phase studies.
Alonzo, Todd A; Pepe, Margaret Sullivan; Lumley, Thomas
2003-04-01
Disease prevalence is ideally estimated using a 'gold standard' to ascertain true disease status on all subjects in a population of interest. In practice, however, the gold standard may be too costly or invasive to be applied to all subjects, in which case a two-phase design is often employed. Phase 1 data consisting of inexpensive and non-invasive screening tests on all study subjects are used to determine the subjects that receive the gold standard in the second phase. Naive estimates of prevalence in two-phase studies can be biased (verification bias). Imputation and re-weighting estimators are often used to avoid this bias. We contrast the forms and attributes of the various prevalence estimators. Distribution theory and simulation studies are used to investigate their bias and efficiency. We conclude that the semiparametric efficient approach is the preferred method for prevalence estimation in two-phase studies. It is more robust and comparable in its efficiency to imputation and other re-weighting estimators. It is also easy to implement. We use this approach to examine the prevalence of depression in adolescents with data from the Great Smoky Mountain Study.
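Verification bias and its correction by re-weighting can be illustrated with a small inverse-probability-weighted estimator. This is a sketch of one simple re-weighting estimator only, not the semiparametric efficient estimator the abstract recommends; the function names and the ten-subject toy data are assumptions made for illustration.

```python
import numpy as np

def naive_prevalence(verified, disease):
    """Proportion diseased among verified subjects only (prone to verification bias)."""
    verified = np.asarray(verified, dtype=bool)
    disease = np.asarray(disease, dtype=float)
    return disease[verified].mean()

def ipw_prevalence(verified, disease, sel_prob):
    """Re-weighting estimator: each verified subject counts 1 / sel_prob times."""
    verified = np.asarray(verified, dtype=bool)
    disease = np.asarray(disease, dtype=float)
    sel_prob = np.asarray(sel_prob, dtype=float)
    n = verified.size  # all phase 1 subjects, verified or not
    return np.sum(disease[verified] / sel_prob[verified]) / n

# Toy data: 4 screen-positives (all verified, 3 diseased) and
# 6 screen-negatives (verified with probability 0.5, 1 of 3 diseased).
verified = [1, 1, 1, 1, 1, 1, 1, 0, 0, 0]
disease  = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]   # entries for unverified subjects are ignored
sel_prob = [1, 1, 1, 1, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5]

naive = naive_prevalence(verified, disease)
weighted = ipw_prevalence(verified, disease, sel_prob)
```

In this toy case the naive estimate is 4/7 while the weighted estimate is 0.5, because screen-positives are over-represented among the verified subjects.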
Daly, Caitlin H; Higgins, Victoria; Adeli, Khosrow; Grey, Vijay L; Hamid, Jemila S
2017-12-01
To statistically compare and evaluate commonly used methods of estimating reference intervals and to determine which method is best based on characteristics of the distribution of various data sets. Three approaches for estimating reference intervals, i.e. parametric, non-parametric, and robust, were compared with simulated Gaussian and non-Gaussian data. The hierarchy of the performances of each method was examined based on bias and measures of precision. The findings of the simulation study were illustrated through real data sets. In all Gaussian scenarios, the parametric approach provided the least biased and most precise estimates. In non-Gaussian scenarios, no single method provided the least biased and most precise estimates for both limits of a reference interval across all sample sizes, although the non-parametric approach performed the best for most scenarios. The hierarchy of the performances of the three methods was only impacted by sample size and skewness. Differences between reference interval estimates established by the three methods were inflated by variability. Whenever possible, laboratories should attempt to transform data to a Gaussian distribution and use the parametric approach to obtain the most optimal reference intervals. When this is not possible, laboratories should consider sample size and skewness as factors in their choice of reference interval estimation method. The consequences of false positives or false negatives may also serve as factors in this decision. Copyright © 2017 The Canadian Society of Clinical Chemists. Published by Elsevier Inc. All rights reserved.
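Two of the three approaches compared above can be written in a few lines. The sketch assumes a central 95% reference interval, a normal-theory multiplier of 1.96 for the parametric limits, and NumPy's default linear interpolation for the percentiles; the robust approach evaluated in the study is not reproduced here, and the function names are illustrative.

```python
import numpy as np

def parametric_interval(x):
    """Mean +/- 1.96 SD; appropriate only if the data are approximately Gaussian."""
    x = np.asarray(x, dtype=float)
    m, s = x.mean(), x.std(ddof=1)
    return m - 1.96 * s, m + 1.96 * s

def nonparametric_interval(x):
    """Empirical 2.5th and 97.5th percentiles; no distributional assumption."""
    lo, hi = np.percentile(np.asarray(x, dtype=float), [2.5, 97.5])
    return float(lo), float(hi)

# Deterministic toy sample: the integers 1..100.
values = np.arange(1, 101)
p_lo, p_hi = parametric_interval(values)
n_lo, n_hi = nonparametric_interval(values)
```

On skewed data the two intervals can disagree noticeably at one limit, which is the situation in which the abstract suggests weighing sample size and skewness before choosing a method.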
Dichotomisation using a distributional approach when the outcome is skewed.
Sauzet, Odile; Ofuya, Mercy; Peacock, Janet L
2015-04-24
Dichotomisation of continuous outcomes has been rightly criticised by statisticians because of the loss of information incurred. However to communicate a comparison of risks, dichotomised outcomes may be necessary. Peacock et al. developed a distributional approach to the dichotomisation of normally distributed outcomes allowing the presentation of a comparison of proportions with a measure of precision which reflects the comparison of means. Many common health outcomes are skewed so that the distributional method for the dichotomisation of continuous outcomes may not apply. We present a methodology to obtain dichotomised outcomes for skewed variables illustrated with data from several observational studies. We also report the results of a simulation study which tests the robustness of the method to deviation from normality and assess the validity of the newly developed method. The review showed that the pattern of dichotomisation was varying between outcomes. Birthweight, Blood pressure and BMI can either be transformed to normal so that normal distributional estimates for a comparison of proportions can be obtained or better, the skew-normal method can be used. For gestational age, no satisfactory transformation is available and only the skew-normal method is reliable. The normal distributional method is reliable also when there are small deviations from normality. The distributional method with its applicability for common skewed data allows researchers to provide both continuous and dichotomised estimates without losing information or precision. This will have the effect of providing a practical understanding of the difference in means in terms of proportions.
Regression without truth with Markov chain Monte-Carlo
NASA Astrophysics Data System (ADS)
Madan, Hennadii; Pernuš, Franjo; Likar, Boštjan; Špiclin, Žiga
2017-03-01
Regression without truth (RWT) is a statistical technique for estimating error model parameters of each method in a group of methods used for measurement of a certain quantity. A very attractive aspect of RWT is that it does not rely on a reference method or "gold standard" data, which is otherwise difficult to obtain. RWT was used for a reference-free performance comparison of several methods for measuring left ventricular ejection fraction (EF), i.e. the percentage of blood leaving the ventricle each time the heart contracts, and has since been applied to various other quantitative imaging biomarkers (QIBs). Herein, we show how Markov chain Monte-Carlo (MCMC), a computational technique for drawing samples from a statistical distribution with probability density function known only up to a normalizing coefficient, can be used to augment RWT to gain a number of important benefits compared to the original approach based on iterative optimization. For instance, the proposed MCMC-based RWT enables the estimation of the joint posterior distribution of the parameters of the error model, straightforward quantification of the uncertainty of the estimates, and estimation of the true value of the measurand and corresponding credible intervals (CIs); it does not require a finite support for the prior distribution of the measurand and generally has a much improved robustness against convergence to non-global maxima. The proposed approach is validated using synthetic data that emulate the EF data for 45 patients measured with 8 different methods. The obtained results show that the 90% CIs of the corresponding parameter estimates contain the true values of all error model parameters and the measurand. A potential real-world application is to take measurements of a certain QIB with several different methods and then use the proposed framework to compute the estimates of the true values and their uncertainty, vital information for diagnosis based on a QIB.
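The property of MCMC highlighted above, sampling from a density known only up to a normalizing coefficient, can be seen in a minimal random-walk Metropolis sampler. This is a generic sketch on a toy one-dimensional target, not the RWT error model; the function name, step size and seed are assumptions.

```python
import numpy as np

def metropolis(log_density, x0, n_samples, step=1.0, seed=0):
    """Random-walk Metropolis sampler for a 1-D unnormalized log density."""
    rng = np.random.default_rng(seed)
    samples = np.empty(n_samples)
    x, logp = x0, log_density(x0)
    for k in range(n_samples):
        proposal = x + step * rng.normal()
        logp_new = log_density(proposal)
        # Accept with probability min(1, p(proposal) / p(x)); the unknown
        # normalizing constant cancels in this ratio.
        if np.log(rng.uniform()) < logp_new - logp:
            x, logp = proposal, logp_new
        samples[k] = x
    return samples

# Toy target: a standard normal specified without its normalizing constant.
draws = metropolis(lambda x: -0.5 * x**2, x0=0.0, n_samples=20000)

# A 90% credible interval read directly from the draws.
ci_low, ci_high = np.percentile(draws, [5, 95])
```

Credible intervals come straight from percentiles of the draws, which is why uncertainty quantification is described as straightforward once posterior samples are available.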
England, John F.; Salas, José D.; Jarrett, Robert D.
2003-01-01
The expected moments algorithm (EMA) [Cohn et al., 1997] and the Bulletin 17B [Interagency Committee on Water Data, 1982] historical weighting procedure (B17H) for the log Pearson type III distribution are compared by Monte Carlo computer simulation for cases in which historical and/or paleoflood data are available. The relative performance of the estimators was explored for three cases: fixed‐threshold exceedances, a fixed number of large floods, and floods generated from a different parent distribution. EMA can effectively incorporate four types of historical and paleoflood data: floods where the discharge is explicitly known, unknown discharges below a single threshold, floods with unknown discharge that exceed some level, and floods with discharges described in a range. The B17H estimator can utilize only the first two types of historical information. Including historical/paleoflood data in the simulation experiments significantly improved the quantile estimates in terms of mean square error and bias relative to using gage data alone. EMA performed significantly better than B17H in nearly all cases considered. B17H performed as well as EMA for estimating X100 in some limited fixed‐threshold exceedance cases. EMA performed comparatively much better in other fixed‐threshold situations, for the single large flood case, and in cases when estimating extreme floods equal to or greater than X500. B17H did not fully utilize historical information when the historical period exceeded 200 years. Robustness studies using GEV‐simulated data confirmed that EMA performed better than B17H. Overall, EMA is preferred to B17H when historical and paleoflood data are available for flood frequency analysis.
NASA Astrophysics Data System (ADS)
England, John F.; Salas, José D.; Jarrett, Robert D.
2003-09-01
The expected moments algorithm (EMA) [Cohn et al., 1997] and the Bulletin 17B [Interagency Committee on Water Data, 1982] historical weighting procedure (B17H) for the log Pearson type III distribution are compared by Monte Carlo computer simulation for cases in which historical and/or paleoflood data are available. The relative performance of the estimators was explored for three cases: fixed-threshold exceedances, a fixed number of large floods, and floods generated from a different parent distribution. EMA can effectively incorporate four types of historical and paleoflood data: floods where the discharge is explicitly known, unknown discharges below a single threshold, floods with unknown discharge that exceed some level, and floods with discharges described in a range. The B17H estimator can utilize only the first two types of historical information. Including historical/paleoflood data in the simulation experiments significantly improved the quantile estimates in terms of mean square error and bias relative to using gage data alone. EMA performed significantly better than B17H in nearly all cases considered. B17H performed as well as EMA for estimating X100 in some limited fixed-threshold exceedance cases. EMA performed comparatively much better in other fixed-threshold situations, for the single large flood case, and in cases when estimating extreme floods equal to or greater than X500. B17H did not fully utilize historical information when the historical period exceeded 200 years. Robustness studies using GEV-simulated data confirmed that EMA performed better than B17H. Overall, EMA is preferred to B17H when historical and paleoflood data are available for flood frequency analysis.
Distribution-Agnostic Stochastic Optimal Power Flow for Distribution Grids: Preprint
DOE Office of Scientific and Technical Information (OSTI.GOV)
Baker, Kyri; Dall'Anese, Emiliano; Summers, Tyler
2016-09-01
This paper outlines a data-driven, distributionally robust approach to solve chance-constrained AC optimal power flow problems in distribution networks. Uncertain forecasts for loads and power generated by photovoltaic (PV) systems are considered, with the goal of minimizing PV curtailment while meeting power flow and voltage regulation constraints. A data-driven approach is utilized to develop a distributionally robust conservative convex approximation of the chance constraints; particularly, the mean and covariance matrix of the forecast errors are updated online, and leveraged to enforce voltage regulation with predetermined probability via Chebyshev-based bounds. By combining an accurate linear approximation of the AC power flow equations with the distributionally robust chance constraint reformulation, the resulting optimization problem becomes convex and computationally tractable.
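A Chebyshev-type bound of the kind mentioned above turns a chance constraint into a deterministic one that needs only the mean and covariance of the uncertainty. The sketch below uses the standard one-sided form: a linear constraint a'ξ ≤ b holds with probability at least 1 − ε for every distribution with mean μ and covariance Σ whenever a'μ + sqrt((1 − ε)/ε) · sqrt(a'Σa) ≤ b. Whether the paper uses exactly this form is an assumption, and the function name and numbers are illustrative.

```python
import numpy as np

def dr_chance_constraint(a, mu, cov, b, eps):
    """Check a'xi <= b with probability >= 1 - eps using only mean and covariance.

    Returns (holds, lhs), where lhs is the tightened left-hand side
    a'mu + sqrt((1 - eps) / eps) * sqrt(a' cov a).
    """
    a = np.asarray(a, dtype=float)
    mu = np.asarray(mu, dtype=float)
    cov = np.asarray(cov, dtype=float)
    kappa = np.sqrt((1.0 - eps) / eps)     # safety factor grows as eps shrinks
    lhs = a @ mu + kappa * np.sqrt(a @ cov @ a)
    return bool(lhs <= b), float(lhs)

# Toy scalar case: mean 1.0, standard deviation 0.1, 5% violation probability.
holds, lhs = dr_chance_constraint([1.0], [1.0], [[0.01]], b=1.5, eps=0.05)
```

Because the tightened left-hand side is a second-order cone expression in the decision variables, a constraint of this form stays convex, which is consistent with the tractability claim in the abstract.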
Robust Statistical Approaches for RSS-Based Floor Detection in Indoor Localization.
Razavi, Alireza; Valkama, Mikko; Lohan, Elena Simona
2016-05-31
Floor detection for indoor 3D localization of mobile devices is currently an important challenge in the wireless world. Many approaches currently exist, but usually the robustness of such approaches is not addressed or investigated. The goal of this paper is to show how to robustify the floor estimation when probabilistic approaches with a low number of parameters are employed. Indeed, such an approach would allow a building-independent estimation and a lower computing power at the mobile side. Four robustified algorithms are to be presented: a robust weighted centroid localization method, a robust linear trilateration method, a robust nonlinear trilateration method, and a robust deconvolution method. The proposed approaches use the received signal strengths (RSS) measured by the Mobile Station (MS) from various heard WiFi access points (APs) and provide an estimate of the vertical position of the MS, which can be used for floor detection. We will show that robustification can indeed increase the performance of the RSS-based floor detection algorithms.
Predicting climate change: Uncertainties and prospects for surmounting them
NASA Astrophysics Data System (ADS)
Ghil, Michael
2008-03-01
General circulation models (GCMs) are among the most detailed and sophisticated models of natural phenomena in existence. Still, the lack of robust and efficient subgrid-scale parametrizations for GCMs, along with the inherent sensitivity to initial data and the complex nonlinearities involved, present a major and persistent obstacle to narrowing the range of estimates for end-of-century warming. Estimating future changes in the distribution of climatic extrema is even more difficult. Brute-force tuning the large number of GCM parameters does not appear to help reduce the uncertainties. Andronov and Pontryagin (1937) proposed structural stability as a way to evaluate model robustness. Unfortunately, many real-world systems proved to be structurally unstable. We illustrate these concepts with a very simple model for the El Niño--Southern Oscillation (ENSO). Our model is governed by a differential delay equation with a single delay and periodic (seasonal) forcing. Like many of its more or less detailed and realistic precursors, this model exhibits a Devil's staircase. We study the model's structural stability, describe the mechanisms of the observed instabilities, and connect our findings to ENSO phenomenology. In the model's phase-parameter space, regions of smooth dependence on parameters alternate with rough, fractal ones. We then apply the tools of random dynamical systems and stochastic structural stability to the circle map and a torus map. The effect of noise with compact support on these maps is fairly intuitive: it is the most robust structures in phase-parameter space that survive the smoothing introduced by the noise. The nature of the stochastic forcing matters, thus suggesting that certain types of stochastic parametrizations might be better than others in achieving GCM robustness. This talk represents joint work with M. Chekroun, E. Simonnet and I. Zaliapin.
Developing appropriate methods for cost-effectiveness analysis of cluster randomized trials.
Gomes, Manuel; Ng, Edmond S-W; Grieve, Richard; Nixon, Richard; Carpenter, James; Thompson, Simon G
2012-01-01
Cost-effectiveness analyses (CEAs) may use data from cluster randomized trials (CRTs), where the unit of randomization is the cluster, not the individual. However, most studies use analytical methods that ignore clustering. This article compares alternative statistical methods for accommodating clustering in CEAs of CRTs. Our simulation study compared the performance of statistical methods for CEAs of CRTs with 2 treatment arms. The study considered a method that ignored clustering--seemingly unrelated regression (SUR) without a robust standard error (SE)--and 4 methods that recognized clustering--SUR and generalized estimating equations (GEEs), both with robust SE, a "2-stage" nonparametric bootstrap (TSB) with shrinkage correction, and a multilevel model (MLM). The base case assumed CRTs with moderate numbers of balanced clusters (20 per arm) and normally distributed costs. Other scenarios included CRTs with few clusters, imbalanced cluster sizes, and skewed costs. Performance was reported as bias, root mean squared error (rMSE), and confidence interval (CI) coverage for estimating incremental net benefits (INBs). We also compared the methods in a case study. Each method reported low levels of bias. Without the robust SE, SUR gave poor CI coverage (base case: 0.89 v. nominal level: 0.95). The MLM and TSB performed well in each scenario (CI coverage, 0.92-0.95). With few clusters, the GEE and SUR (with robust SE) had coverage below 0.90. In the case study, the mean INBs were similar across all methods, but ignoring clustering underestimated statistical uncertainty and the value of further research. MLMs and the TSB are appropriate analytical methods for CEAs of CRTs with the characteristics described. SUR and GEE are not recommended for studies with few clusters.
Developing Appropriate Methods for Cost-Effectiveness Analysis of Cluster Randomized Trials
Gomes, Manuel; Ng, Edmond S.-W.; Nixon, Richard; Carpenter, James; Thompson, Simon G.
2012-01-01
Aim. Cost-effectiveness analyses (CEAs) may use data from cluster randomized trials (CRTs), where the unit of randomization is the cluster, not the individual. However, most studies use analytical methods that ignore clustering. This article compares alternative statistical methods for accommodating clustering in CEAs of CRTs. Methods. Our simulation study compared the performance of statistical methods for CEAs of CRTs with 2 treatment arms. The study considered a method that ignored clustering—seemingly unrelated regression (SUR) without a robust standard error (SE)—and 4 methods that recognized clustering—SUR and generalized estimating equations (GEEs), both with robust SE, a “2-stage” nonparametric bootstrap (TSB) with shrinkage correction, and a multilevel model (MLM). The base case assumed CRTs with moderate numbers of balanced clusters (20 per arm) and normally distributed costs. Other scenarios included CRTs with few clusters, imbalanced cluster sizes, and skewed costs. Performance was reported as bias, root mean squared error (rMSE), and confidence interval (CI) coverage for estimating incremental net benefits (INBs). We also compared the methods in a case study. Results. Each method reported low levels of bias. Without the robust SE, SUR gave poor CI coverage (base case: 0.89 v. nominal level: 0.95). The MLM and TSB performed well in each scenario (CI coverage, 0.92–0.95). With few clusters, the GEE and SUR (with robust SE) had coverage below 0.90. In the case study, the mean INBs were similar across all methods, but ignoring clustering underestimated statistical uncertainty and the value of further research. Conclusions. MLMs and the TSB are appropriate analytical methods for CEAs of CRTs with the characteristics described. SUR and GEE are not recommended for studies with few clusters. PMID:22016450
Li, Peng; Redden, David T.
2014-01-01
SUMMARY The sandwich estimator in generalized estimating equations (GEE) approach underestimates the true variance in small samples and consequently results in inflated type I error rates in hypothesis testing. This fact limits the application of the GEE in cluster-randomized trials (CRTs) with few clusters. Under various CRT scenarios with correlated binary outcomes, we evaluate the small sample properties of the GEE Wald tests using bias-corrected sandwich estimators. Our results suggest that the GEE Wald z test should be avoided in the analyses of CRTs with few clusters even when bias-corrected sandwich estimators are used. With t-distribution approximation, the Kauermann and Carroll (KC)-correction can keep the test size to nominal levels even when the number of clusters is as low as 10, and is robust to the moderate variation of the cluster sizes. However, in cases with large variations in cluster sizes, the Fay and Graubard (FG)-correction should be used instead. Furthermore, we derive a formula to calculate the power and minimum total number of clusters one needs using the t test and KC-correction for the CRTs with binary outcomes. The power levels as predicted by the proposed formula agree well with the empirical powers from the simulations. The proposed methods are illustrated using real CRT data. We conclude that with appropriate control of type I error rates under small sample sizes, we recommend the use of GEE approach in CRTs with binary outcomes due to fewer assumptions and robustness to the misspecification of the covariance structure. PMID:25345738
Gravity Field Characterization around Small Bodies
NASA Astrophysics Data System (ADS)
Takahashi, Yu
A small body rendezvous mission requires accurate gravity field characterization for safe, accurate navigation purposes. However, the current techniques of gravity field modeling around small bodies have not reached a satisfactory level. This thesis will address how the process of current gravity field characterization can be made more robust for future small body missions. First we perform the covariance analysis around small bodies via multiple slow flybys. Flyby characterization requires less laborious scheduling than its orbit counterpart, simultaneously reducing the risk of impact into the asteroid's surface. It will be shown that the level of initial characterization that can occur with this approach is no less than the orbit approach. Next, we apply the same technique of gravity field characterization to estimate the spin state of 4179 Toutatis, which is a near-Earth asteroid in close to 4:1 resonance with the Earth. The data accumulated from 1992-2008 are processed in a least-squares filter to predict Toutatis' orientation during the 2012 apparition. The center-of-mass offset and the moments of inertia estimated thereof can be used to constrain the internal density distribution within the body. Then, the spin state estimation is developed to a generalized method to estimate the internal density distribution within a small body. The density distribution is estimated from the orbit determination solution of the gravitational coefficients. It will be shown that the surface gravity field reconstructed from the estimated density distribution yields higher accuracy than the conventional gravity field models. Finally, we will investigate two types of relatively unknown gravity fields, namely the interior gravity field and interior spherical Bessel gravity field, in order to investigate how accurately the surface gravity field can be mapped out for proximity operations purposes. 
It will be shown that these formulations compute the surface gravity field with unprecedented accuracy for a well-chosen set of parametric settings, both regionally and globally.
NASA Astrophysics Data System (ADS)
Contreras Quintana, S. H.; Werne, J. P.; Brown, E. T.; Halbur, J.; Sinninghe Damsté, J.; Schouten, S.; Correa-Metrio, A.; Fawcett, P. J.
2014-12-01
Branched glycerol dialkyl glycerol tetraethers (GDGTs) are recently discovered bacterial membrane lipids, ubiquitously present in peat bogs and soils, as well as in rivers, lakes and lake sediments. Their distribution appears to be controlled mainly by soil pH and annual mean air temperature (MAT) and they have been increasingly used as paleoclimate proxies in sedimentary records. In order to validate their application as paleoclimate proxies, it is essential to evaluate the influence of small scale environmental variability on their distribution. Initial application of the original soil-based branched GDGT distribution proxy to lacustrine sediments from Valles Caldera, New Mexico (NM) was promising, producing a viable temperature record spanning two glacial/interglacial cycles. In this study, we assess the influence of analytical and spatial soil heterogeneity on the concentration and distribution of 9 branched GDGTs in soils from Valles Caldera, and show how this variability is propagated to MAT and pH estimates using multiple soil-based branched GDGT transfer functions. Our results show that significant differences in the abundance and distribution of branched GDGTs in soil can be observed even within a small area such as Valles Caldera. Although the original MBT-CBT calibration appears to give robust MAT estimates and the newest calibration provides pH estimates in better agreement with modern local soils in Valles Caldera, the environmental heterogeneity (e.g. vegetation type and soil moisture) appears to affect the precision of MAT and pH estimates. Furthermore, the heterogeneity of soils leads to significant variability among samples taken even from within a square meter. 
While such soil heterogeneity is not unknown (and is typically controlled for by combining multiple samples), this study quantifies heterogeneity relative to branched GDGT-based proxies for the first time, indicating that care must be taken with samples from heterogeneous soils in MAT and pH reconstructions.
Nichols, James D.; Pollock, Kenneth H.; Hines, James E.
1984-01-01
The robust design of Pollock (1982) was used to estimate parameters of a Maryland M. pennsylvanicus population. Closed model tests provided strong evidence of heterogeneity of capture probability, and model M eta (Otis et al., 1978) was selected as the most appropriate model for estimating population size. The Jolly-Seber model goodness-of-fit test indicated rejection of the model for this data set, and the M eta estimates of population size were all higher than the Jolly-Seber estimates. Both of these results are consistent with the evidence of heterogeneous capture probabilities. The authors thus used M eta estimates of population size, Jolly-Seber estimates of survival rate, and estimates of birth-immigration based on a combination of the population size and survival rate estimates. Advantages of the robust design estimates for certain inference procedures are discussed, and the design is recommended for future small mammal capture-recapture studies directed at estimation.
A robust vision-based sensor fusion approach for real-time pose estimation.
Assa, Akbar; Janabi-Sharifi, Farrokh
2014-02-01
Object pose estimation is of great importance to many applications, such as augmented reality, localization and mapping, motion capture, and visual servoing. Although many approaches based on a monocular camera have been proposed, only a few works have concentrated on applying multicamera sensor fusion techniques to pose estimation. Higher accuracy and enhanced robustness toward sensor defects or failures are some of the advantages of these schemes. This paper presents a new Kalman-based sensor fusion approach for pose estimation that offers higher accuracy and precision, and is robust to camera motion and image occlusion, compared to its predecessors. Extensive experiments are conducted to validate the superiority of this fusion method over currently employed vision-based pose estimation algorithms.
NASA Astrophysics Data System (ADS)
Hajdu, Gergely; Dékány, István; Catelan, Márcio; Grebel, Eva K.; Jurcsik, Johanna
2018-04-01
RR Lyrae variables are widely used tracers of Galactic halo structure and kinematics, but they can also serve to constrain the distribution of the old stellar population in the Galactic bulge. With the aim of improving their near-infrared photometric characterization, we investigate their near-infrared light curves, as well as the empirical relationships between their light curve and metallicities using machine learning methods. We introduce a new, robust method for the estimation of the light-curve shapes, hence the average magnitudes of RR Lyrae variables in the Ks band, by utilizing the first few principal components (PCs) as basis vectors, obtained from the PC analysis of a training set of light curves. Furthermore, we use the amplitudes of these PCs to predict the light-curve shape of each star in the J-band, allowing us to precisely determine their average magnitudes (hence colors), even in cases where only one J measurement is available. Finally, we demonstrate that the Ks-band light-curve parameters of RR Lyrae variables, together with the period, allow the estimation of the metallicity of individual stars with an accuracy of ∼0.2–0.25 dex, providing valuable chemical information about old stellar populations bearing RR Lyrae variables. The methods presented here can be straightforwardly adopted for other classes of variable stars, bands, or for the estimation of other physical quantities.
Emergent sensing of complex environments by mobile animal groups.
Berdahl, Andrew; Torney, Colin J; Ioannou, Christos C; Faria, Jolyon J; Couzin, Iain D
2013-02-01
The capacity for groups to exhibit collective intelligence is an often-cited advantage of group living. Previous studies have shown that social organisms frequently benefit from pooling imperfect individual estimates. However, in principle, collective intelligence may also emerge from interactions between individuals, rather than from the enhancement of personal estimates. Here, we reveal that this emergent problem solving is the predominant mechanism by which a mobile animal group responds to complex environmental gradients. Robust collective sensing arises at the group level from individuals modulating their speed in response to local, scalar, measurements of light and through social interaction with others. This distributed sensing requires only rudimentary cognition and thus could be widespread across biological taxa, in addition to being appropriate and cost-effective for robotic agents.
Robust Regression for Slope Estimation in Curriculum-Based Measurement Progress Monitoring
ERIC Educational Resources Information Center
Mercer, Sterett H.; Lyons, Alina F.; Johnston, Lauren E.; Millhoff, Courtney L.
2015-01-01
Although ordinary least-squares (OLS) regression has been identified as a preferred method to calculate rates of improvement for individual students during curriculum-based measurement (CBM) progress monitoring, OLS slope estimates are sensitive to the presence of extreme values. Robust estimators have been developed that are less biased by…
Multiple-Beam Detection of Fast Transient Radio Sources
NASA Technical Reports Server (NTRS)
Thompson, David R.; Wagstaff, Kiri L.; Majid, Walid A.
2011-01-01
A method has been designed for using multiple independent stations to discriminate fast transient radio sources from local anomalies, such as antenna noise or radio frequency interference (RFI). This can improve the sensitivity of incoherent detection for geographically separated stations such as the very long baseline array (VLBA), the future square kilometer array (SKA), or any other coincident observations by multiple separated receivers. The transients are short, broadband pulses of radio energy, often just a few milliseconds long, emitted by a variety of exotic astronomical phenomena. They generally represent rare, high-energy events making them of great scientific value. For RFI-robust adaptive detection of transients, using multiple stations, a family of algorithms has been developed. The technique exploits the fact that the separated stations constitute statistically independent samples of the target. This can be used to adaptively ignore RFI events for superior sensitivity. If the antenna signals are independent and identically distributed (IID), then RFI events are simply outlier data points that can be removed through robust estimation such as a trimmed or Winsorized estimator. The alternative "trimmed" estimator is considered, which excises the strongest n signals from the list of short-beamed intensities. Because local RFI is independent at each antenna, this interference is unlikely to occur at many antennas on the same step. Trimming the strongest signals provides robustness to RFI that can theoretically outperform even the detection performance of the same number of antennas at a single site. This algorithm requires sorting the signals at each time step and dispersion measure, an operation that is computationally tractable for existing array sizes. An alternative uses the various stations to form an ensemble estimate of the conditional density function (CDF) evaluated at each time step. 
Both methods outperform standard detection strategies on a test sequence of VLBA data, and both are efficient enough for deployment in real-time, online transient detection applications.
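The trimming idea described above can be sketched in a few lines. The example assumes a toy array of per-station intensities indexed by station and time step only (the abstract also sorts per dispersion measure, which is omitted here); the function name `trimmed_station_sum` and the data are hypothetical.

```python
import numpy as np


def trimmed_station_sum(intensities, n_trim=1):
    """Sum intensities over stations after discarding the n_trim strongest
    values at each time step. `intensities` has shape (n_stations, n_times)."""
    x = np.asarray(intensities, dtype=float)
    n_stations = x.shape[0]
    if not 0 <= n_trim < n_stations:
        raise ValueError("n_trim must be in [0, n_stations)")
    ordered = np.sort(x, axis=0)  # ascending order at each time step
    return ordered[: n_stations - n_trim].sum(axis=0)


# Toy data: 4 stations, 5 time steps, unit background level.
data = np.ones((4, 5))
data[0, 1] = 50.0  # strong local RFI seen by station 0 only
data[:, 3] = 5.0   # genuine transient seen by every station

plain = data.sum(axis=0)                       # ordinary incoherent sum
trimmed = trimmed_station_sum(data, n_trim=1)  # strongest station dropped
```

In this toy case the plain sum peaks at the RFI step, while the trimmed sum peaks at the step where all stations see the pulse, which is the behaviour the abstract attributes to the trimmed estimator.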
NASA Astrophysics Data System (ADS)
Caulton, D.; Golston, L.; Li, Q.; Bou-Zeid, E.; Pan, D.; Lane, H.; Lu, J.; Fitts, J. P.; Zondlo, M. A.
2015-12-01
Recent work suggests the distribution of methane emissions from fracking operations is skewed, with a small percentage of emitters contributing a large proportion of the total emissions. In order to provide a statistically robust distribution of emitters and determine the presence of super-emitters, errors in current techniques need to be constrained and mitigated. The Marcellus shale, the most productive natural gas shale field in the United States, has received less intense focus for well-level emissions and is here investigated to provide the distribution of methane emissions. In July of 2015 approximately 250 unique well pads were sampled using the Princeton Atmospheric Chemistry Mobile Acquisition Node (PAC-MAN). This mobile lab includes a Garmin GPS unit, Vaisala weather station (WTX520), LICOR 7700 CH4 open path sensor and LICOR 7500 CO2/H2O open path sensor. Sampling sites were preselected based on wind direction, sampling distance and elevation grade. All sites were sampled during low boundary layer conditions (600-1000 and 1800-2200 local time). The majority of sites were sampled 1-3 times while selected test sites were sampled multiple times or resampled several times during the day. For selected sites a sampling tower was constructed consisting of a Metek uSonic-3 Class A sonic anemometer, and an additional LICOR 7700 and 7500. Data were recorded for at least one hour at these sites. A robust study and inter-comparison of different methodologies will be presented. The Gaussian plume model will be used to calculate fluxes for all sites and compare results from test sites with multiple passes. Tower data is used to provide constraints on the Gaussian plume model. Additionally, Large Eddy Simulation (LES) modeling will be used to calculate emissions from the tower sites. Alternative techniques will also be discussed. Results from these techniques will be compared to identify best practices and provide robust error estimates.
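For orientation, a minimal sketch of how a Gaussian plume model can be inverted for an emission rate is given below. It uses the textbook steady-state point-source formula with ground reflection. The dispersion parameters `sigma_y` and `sigma_z` are taken as inputs, whereas in practice they depend on downwind distance and atmospheric stability. All names and numbers are assumptions for illustration and are not taken from the study.

```python
import math


def plume_per_unit_emission(y, z, u, sigma_y, sigma_z, h_source=0.0):
    """Concentration per unit emission rate (s/m^3) at crosswind offset y (m)
    and height z (m) for wind speed u (m/s), with ground reflection."""
    if min(u, sigma_y, sigma_z) <= 0.0:
        raise ValueError("u, sigma_y and sigma_z must be positive")
    lateral = math.exp(-(y ** 2) / (2.0 * sigma_y ** 2))
    vertical = (math.exp(-((z - h_source) ** 2) / (2.0 * sigma_z ** 2))
                + math.exp(-((z + h_source) ** 2) / (2.0 * sigma_z ** 2)))
    return lateral * vertical / (2.0 * math.pi * u * sigma_y * sigma_z)


def estimate_emission_rate(enhancement, y, z, u, sigma_y, sigma_z, h_source=0.0):
    """Emission rate (kg/s) implied by a measured concentration enhancement
    above background (kg/m^3) at the given receptor position."""
    unit = plume_per_unit_emission(y, z, u, sigma_y, sigma_z, h_source)
    return enhancement / unit


# Round trip with made-up values: simulate a measurement, then invert it.
q_true = 0.002  # kg/s, hypothetical
unit = plume_per_unit_emission(y=5.0, z=2.0, u=3.0, sigma_y=10.0, sigma_z=5.0, h_source=1.0)
q_est = estimate_emission_rate(q_true * unit, y=5.0, z=2.0, u=3.0,
                               sigma_y=10.0, sigma_z=5.0, h_source=1.0)
```

The inversion is linear in the measured enhancement, so errors in wind speed or in the dispersion parameters pass directly into the flux estimate, which is one reason independent constraints such as tower data are useful.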
Software For Least-Squares And Robust Estimation
NASA Technical Reports Server (NTRS)
Jeffreys, William H.; Fitzpatrick, Michael J.; Mcarthur, Barbara E.; Mccartney, James
1990-01-01
GAUSSFIT computer program includes full-featured programming language facilitating creation of mathematical models solving least-squares and robust-estimation problems. Programming language designed to make it easy to specify complex reduction models. Written in 100 percent C language.
Rank-preserving regression: a more robust rank regression model against outliers.
Chen, Tian; Kowalski, Jeanne; Chen, Rui; Wu, Pan; Zhang, Hui; Feng, Changyong; Tu, Xin M
2016-08-30
Mean-based semi-parametric regression models such as the popular generalized estimating equations are widely used to improve robustness of inference over parametric models. Unfortunately, such models are quite sensitive to outlying observations. The Wilcoxon-score-based rank regression (RR) provides more robust estimates over generalized estimating equations against outliers. However, the RR and its extensions do not sufficiently address missing data arising in longitudinal studies. In this paper, we propose a new approach to address outliers under a different framework based on the functional response models. This functional-response-model-based alternative not only addresses limitations of the RR and its extensions for longitudinal data, but, with its rank-preserving property, even provides more robust estimates than these alternatives. The proposed approach is illustrated with both real and simulated data. Copyright © 2016 John Wiley & Sons, Ltd.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Van den Heuvel, F; Hackett, S; Fiorini, F
Purpose: Currently, planning systems allow robustness calculations to be performed, but a generalized assessment methodology is not yet available. We introduce and evaluate a methodology to quantify the robustness of a plan on an individual patient basis. Methods: We introduce the notion of characterizing a treatment instance (i.e. one single fraction delivery) by describing the dose distribution within an organ as an alpha-stable distribution. The parameters of the distribution (shape (α), scale (γ), position (δ), and symmetry (β)) will vary continuously (in a mathematical sense) as the distributions change with the different positions. The rate of change of the parameters provides a measure of the robustness of the treatment. The methodology is tested in a planning study of 25 patients with known residual errors at each fraction. Each patient was planned using Eclipse with an IBA-proton beam model. The residual error space for every patient was sampled 30 times, yielding 31 treatment plans for each patient and dose distributions in 5 organs. The parameters’ change rate as a function of Euclidean distance from the original plan was analyzed. Results: More than 1,000 dose distributions were analyzed. For 4 of the 25 patients the change in scale rate (γ) was considerably higher than the lowest change rate, indicating a lack of robustness. The sign of the shape change rate (α) also seemed indicative but the experiment lacked the power to prove significance. Conclusion: There are indications that this robustness measure is a valuable tool to allow a more patient individualized approach to the determination of margins. In a further study we will also evaluate this robustness measure using photon treatments, and evaluate the impact of using breath hold techniques and a Monte Carlo based dose deposition calculation. A principal component analysis is also planned.
Aeroservoelastic Uncertainty Model Identification from Flight Data
NASA Technical Reports Server (NTRS)
Brenner, Martin J.
2001-01-01
Uncertainty modeling is a critical element in the estimation of robust stability margins for stability boundary prediction and robust flight control system development. There has been a serious deficiency to date in aeroservoelastic data analysis with attention to uncertainty modeling. Uncertainty can be estimated from flight data using both parametric and nonparametric identification techniques. The model validation problem addressed in this paper is to identify aeroservoelastic models with associated uncertainty structures from a limited amount of controlled excitation inputs over an extensive flight envelope. The challenge to this problem is to update analytical models from flight data estimates while also deriving non-conservative uncertainty descriptions consistent with the flight data. Multisine control surface command inputs and control system feedbacks are used as signals in a wavelet-based modal parameter estimation procedure for model updates. Transfer function estimates are incorporated in a robust minimax estimation scheme to get input-output parameters and error bounds consistent with the data and model structure. Uncertainty estimates derived from the data in this manner provide an appropriate and relevant representation for model development and robust stability analysis. This model-plus-uncertainty identification procedure is applied to aeroservoelastic flight data from the NASA Dryden Flight Research Center F-18 Systems Research Aircraft.
Estimating open population site occupancy from presence-absence data lacking the robust design.
Dail, D; Madsen, L
2013-03-01
Many animal monitoring studies seek to estimate the proportion of a study area occupied by a target population. The study area is divided into spatially distinct sites where the detected presence or absence of the population is recorded, and this is repeated in time for multiple seasons. However, when occupied sites are detected with probability p < 1, the lack of a detection does not imply lack of occupancy. MacKenzie et al. (2003, Ecology 84, 2200-2207) developed a multiseason model for estimating seasonal site occupancy (ψt) while accounting for unknown p. Their model performs well when observations are collected according to the robust design, where multiple sampling occasions occur during each season; the repeated sampling aids in the estimation of p. However, their model does not perform as well when the robust design is lacking. In this paper, we propose an alternative likelihood model that yields improved seasonal estimates of p and ψt in the absence of the robust design. We construct the marginal likelihood of the observed data by conditioning on, and summing out, the latent number of occupied sites during each season. A simulation study shows that in cases without the robust design, the proposed model estimates p with less bias than the MacKenzie et al. model and hence improves the estimates of ψt. We apply both models to a data set consisting of repeated presence-absence observations of American robins (Turdus migratorius) with yearly survey periods. The two models are compared to a third estimator available when the repeated counts (from the same study) are considered, with the proposed model yielding estimates of ψt closest to estimates from the point count model. Copyright © 2013, The International Biometric Society.
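A small numerical sketch may help show why imperfect detection biases a naive occupancy estimate and why repeated visits within a season help. It assumes a constant, known detection probability p and independent visits within a closed season, which is a deliberate simplification and not the likelihood model proposed in the paper. The function names are hypothetical.

```python
def prob_detected_at_least_once(p, n_visits):
    """Probability that an occupied site yields at least one detection in
    n_visits independent visits, each with detection probability p."""
    if not 0.0 < p <= 1.0 or n_visits < 1:
        raise ValueError("need 0 < p <= 1 and n_visits >= 1")
    return 1.0 - (1.0 - p) ** n_visits


def corrected_occupancy(naive_occupancy, p, n_visits):
    """Rescale the observed fraction of sites with a detection by the chance
    of detecting an occupied site at all; capped at 1."""
    return min(1.0, naive_occupancy / prob_detected_at_least_once(p, n_visits))


# With p = 0.5, one visit misses half of the occupied sites, three visits miss 12.5%.
single_visit = prob_detected_at_least_once(0.5, 1)
three_visits = prob_detected_at_least_once(0.5, 3)
psi_hat = corrected_occupancy(naive_occupancy=0.35, p=0.5, n_visits=3)
```

With a single visit per season, p and occupancy enter the observed detection rate only as a product, so they cannot be separated from one season alone; this is the situation the abstract describes as lacking the robust design.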
Momentum Flux Determination Using the Multi-beam Poker Flat Incoherent Scatter Radar
NASA Technical Reports Server (NTRS)
Nicolls, M. J.; Fritts, D. C.; Janches, Diego; Heinselman, C. J.
2012-01-01
In this paper, we develop an estimator for the vertical flux of horizontal momentum with arbitrary beam pointing, applicable to the case of arbitrary but fixed beam pointing with systems such as the Poker Flat Incoherent Scatter Radar (PFISR). This method uses information from all available beams to resolve the variances of the wind field in addition to the vertical flux of both meridional and zonal momentum, targeted for high-frequency wave motions. The estimator utilises the full covariance of the distributed measurements, which provides a significant reduction in errors over the direct extension of previously developed techniques and allows for the calculation of an error covariance matrix of the estimated quantities. We find that for the PFISR experiment, we can construct an unbiased and robust estimator of the momentum flux if sufficient and proper beam orientations are chosen, which can in the future be optimized for the expected frequency distribution of momentum-containing scales. However, there is a potential trade-off between biases and standard errors introduced with the new approach, which must be taken into account when assessing the momentum fluxes. We apply the estimator to PFISR measurements on 23 April 2008 and 21 December 2007, from 60-85 km altitude, and show expected results as compared to mean winds and in relation to the measured vertical velocity variances.
Wu, Zhijin; Liu, Dongmei; Sui, Yunxia
2008-02-01
The process of identifying active targets (hits) in high-throughput screening (HTS) usually involves 2 steps: first, removing or adjusting for systematic variation in the measurement process so that extreme values represent strong biological activity instead of systematic biases such as plate effect or edge effect and, second, choosing a meaningful cutoff on the calculated statistic to declare positive compounds. Both false-positive and false-negative errors are inevitable in this process. Common control or estimation of error rates is often based on an assumption of normal distribution of the noise. The error rates in hit detection, especially false-negative rates, are hard to verify because in most assays, only compounds selected in primary screening are followed up in confirmation experiments. In this article, the authors take advantage of a quantitative HTS experiment in which all compounds are tested 42 times over a wide range of 14 concentrations so true positives can be found through a dose-response curve. Using the activity status defined by dose curve, the authors analyzed the effect of various data-processing procedures on the sensitivity and specificity of hit detection, the control of error rate, and hit confirmation. A new summary score is proposed and demonstrated to perform well in hit detection and useful in confirmation rate estimation. In general, adjusting for positional effects is beneficial, but a robust test can prevent overadjustment. Error rates estimated based on normal assumption do not agree with actual error rates, for the tails of noise distribution deviate from normal distribution. However, false discovery rate based on empirically estimated null distribution is very close to observed false discovery proportion.
Effect size measures in a two-independent-samples case with nonnormal and nonhomogeneous data.
Li, Johnson Ching-Hong
2016-12-01
In psychological science, the "new statistics" refer to the new statistical practices that focus on effect size (ES) evaluation instead of conventional null-hypothesis significance testing (Cumming, Psychological Science, 25, 7-29, 2014). In a two-independent-samples scenario, Cohen's (1988) standardized mean difference (d) is the most popular ES, but its accuracy relies on two assumptions: normality and homogeneity of variances. Five other ESs, namely the unscaled robust d (d_r*; Hogarty & Kromrey, 2001), scaled robust d (d_r; Algina, Keselman, & Penfield, Psychological Methods, 10, 317-328, 2005), point-biserial correlation (r_pb; McGrath & Meyer, Psychological Methods, 11, 386-401, 2006), common-language ES (CL; Cliff, Psychological Bulletin, 114, 494-509, 1993), and nonparametric estimator for CL (A_w; Ruscio, Psychological Methods, 13, 19-30, 2008), may be robust to violations of these assumptions, but no study has systematically evaluated their performance. Thus, in this simulation study the performance of these six ESs was examined across five factors: data distribution, sample, base rate, variance ratio, and sample size. The results showed that A_w and d_r were generally robust to these violations, and A_w slightly outperformed d_r. Implications for the use of A_w and d_r in real-world research are discussed.
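To make the contrast between a parametric and a nonparametric effect size concrete, the sketch below computes Cohen's d with a pooled standard deviation alongside a pairwise "probability of superiority" estimate. The abstract does not state the formulas, so the second function follows the commonly used definition (share of cross-group pairs in which the first group's value is larger, with ties counted as one half) and should be read as an assumption rather than the authors' exact estimator; the function names are hypothetical.

```python
import numpy as np

def cohens_d(x, y):
    """Standardized mean difference using the pooled sample variance."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * x.var(ddof=1) + (ny - 1) * y.var(ddof=1)) / (nx + ny - 2)
    return (x.mean() - y.mean()) / np.sqrt(pooled_var)

def prob_superiority(x, y):
    """Share of (x_i, y_j) pairs with x_i > y_j; ties count as one half."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    diff = x[:, None] - y[None, :]          # all pairwise differences
    return ((diff > 0).sum() + 0.5 * (diff == 0).sum()) / diff.size

# Illustrative data only.
group_a = [4.0, 5.0, 6.0]
group_b = [1.0, 2.0, 3.0]
d_value = cohens_d(group_a, group_b)
a_value = prob_superiority(group_a, group_b)
```

The pairwise estimate depends only on the ordering of observations, which is one reason such a measure can remain stable when normality or equal variances fail.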
Wakefield, Ewan D; Owen, Ellie; Baer, Julia; Carroll, Matthew J; Daunt, Francis; Dodd, Stephen G; Green, Jonathan A; Guilford, Tim; Mavor, Roddy A; Miller, Peter I; Newell, Mark A; Newton, Stephen F; Robertson, Gail S; Shoji, Akiko; Soanes, Louise M; Votier, Stephen C; Wanless, Sarah; Bolton, Mark
2017-10-01
Population-level estimates of species' distributions can reveal fundamental ecological processes and facilitate conservation. However, these may be difficult to obtain for mobile species, especially colonial central-place foragers (CCPFs; e.g., bats, corvids, social insects), because it is often impractical to determine the provenance of individuals observed beyond breeding sites. Moreover, some CCPFs, especially in the marine realm (e.g., pinnipeds, turtles, and seabirds) are difficult to observe because they range tens to tens of thousands of kilometers from their colonies. It is hypothesized that the distribution of CCPFs depends largely on habitat availability and intraspecific competition. Modeling these effects may therefore allow distributions to be estimated from samples of individual spatial usage. Such data can be obtained for an increasing number of species using tracking technology. However, techniques for estimating population-level distributions using the telemetry data are poorly developed. This is of concern because many marine CCPFs, such as seabirds, are threatened by anthropogenic activities. Here, we aim to estimate the distribution at sea of four seabird species, foraging from approximately 5,500 breeding sites in Britain and Ireland. To do so, we GPS-tracked a sample of 230 European Shags Phalacrocorax aristotelis, 464 Black-legged Kittiwakes Rissa tridactyla, 178 Common Murres Uria aalge, and 281 Razorbills Alca torda from 13, 20, 12, and 14 colonies, respectively. Using Poisson point process habitat use models, we show that distribution at sea is dependent on (1) density-dependent competition among sympatric conspecifics (all species) and parapatric conspecifics (Kittiwakes and Murres); (2) habitat accessibility and coastal geometry, such that birds travel further from colonies with limited access to the sea; and (3) regional habitat availability.
Using these models, we predict space use by birds from unobserved colonies and thereby map the distribution at sea of each species at both the colony and regional level. Space use by all four species' British breeding populations is concentrated in the coastal waters of Scotland, highlighting the need for robust conservation measures in this area. The techniques we present are applicable to any CCPF. © 2017 by the Ecological Society of America.
Liu, Hesheng; Schimpf, Paul H; Dong, Guoya; Gao, Xiaorong; Yang, Fusheng; Gao, Shangkai
2005-10-01
This paper presents a new algorithm called Standardized Shrinking LORETA-FOCUSS (SSLOFO) for solving the electroencephalogram (EEG) inverse problem. Multiple techniques are combined in a single procedure to robustly reconstruct the underlying source distribution with high spatial resolution. This algorithm uses a recursive process which takes the smooth estimate of sLORETA as initialization and then employs the re-weighted minimum norm introduced by FOCUSS. An important technique called standardization is involved in the recursive process to enhance the localization ability. The algorithm is further improved by automatically adjusting the source space according to the estimate of the previous step, and by the inclusion of temporal information. Simulation studies are carried out on both spherical and realistic head models. The algorithm achieves very good localization ability on noise-free data. It is capable of recovering complex source configurations with arbitrary shapes and can produce high quality images of extended source distributions. We also characterized the performance with noisy data in a realistic head model. An important feature of this algorithm is that the temporal waveforms are clearly reconstructed, even for closely spaced sources. This provides a convenient way to estimate neural dynamics directly from the cortical sources.
Experimental design for dynamics identification of cellular processes.
Dinh, Vu; Rundell, Ann E; Buzzard, Gregery T
2014-03-01
We address the problem of using nonlinear models to design experiments to characterize the dynamics of cellular processes by using the approach of the Maximally Informative Next Experiment (MINE), which was introduced in W. Dong et al. (PLoS ONE 3(8):e3105, 2008) and independently in M.M. Donahue et al. (IET Syst. Biol. 4:249-262, 2010). In this approach, existing data is used to define a probability distribution on the parameters; the next measurement point is the one that yields the largest model output variance with this distribution. Building upon this approach, we introduce the Expected Dynamics Estimator (EDE), which is the expected value using this distribution of the output as a function of time. We prove the consistency of this estimator (uniform convergence to true dynamics) even when the chosen experiments cluster in a finite set of points. We extend this proof of consistency to various practical assumptions on noisy data and moderate levels of model mismatch. Through the derivation and proof, we develop a relaxed version of MINE that is more computationally tractable and robust than the original formulation. The results are illustrated with numerical examples on two nonlinear ordinary differential equation models of biomolecular and cellular processes.
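The selection rule and estimator described above can be sketched in a few lines: given parameter samples from the current distribution, evaluate the model at each candidate time, pick the time with the largest output variance as the next measurement, and take the sample mean of the output as the expected dynamics. The exponential-decay model, the uniform parameter samples, and all names below are assumptions chosen only to keep the example self-contained; the abstract's models are nonlinear ODE systems.

```python
import numpy as np

def model_output(times, rate):
    """Toy model (assumed for illustration): y(t) = exp(-rate * t)."""
    return np.exp(-rate * times)

def next_experiment_and_ede(param_samples, times):
    """Return the candidate time with maximal output variance and the
    expected output over the parameter samples at every candidate time."""
    outputs = np.array([model_output(times, k) for k in param_samples])
    variance = outputs.var(axis=0)    # spread of predictions per time point
    expected = outputs.mean(axis=0)   # expected dynamics over the samples
    return times[np.argmax(variance)], expected

rng = np.random.default_rng(0)
param_samples = rng.uniform(0.5, 1.5, size=500)  # stand-in for a posterior sample
times = np.linspace(0.0, 5.0, 51)
t_next, ede = next_experiment_and_ede(param_samples, times)
```

In this toy case every parameter value predicts the same output at t = 0, so the rule never selects the initial time; it favors times where the candidate dynamics disagree most.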
NASA Astrophysics Data System (ADS)
Yin, Hui; Yu, Dejie; Yin, Shengwen; Xia, Baizhan
2018-03-01
The conventional engineering optimization problems considering uncertainties are based on the probabilistic model. However, the probabilistic model may be unavailable because of the lack of sufficient objective information to construct the precise probability distribution of uncertainties. This paper proposes a possibility-based robust design optimization (PBRDO) framework for the uncertain structural-acoustic system based on the fuzzy set model, which can be constructed by expert opinions. The objective of robust design is to optimize the expectation and variability of system performance with respect to uncertainties simultaneously. In the proposed PBRDO, the entropy of the fuzzy system response is used as the variability index; the weighted sum of the entropy and expectation of the fuzzy response is used as the objective function, and the constraints are established in the possibility context. The computations for the constraints and objective function of PBRDO are a triple-loop and a double-loop nested problem, respectively, whose computational costs are considerable. To improve the computational efficiency, the target performance approach is introduced to transform the calculation of the constraints into a double-loop nested problem. To further improve the computational efficiency, a Chebyshev fuzzy method (CFM) based on the Chebyshev polynomials is proposed to estimate the objective function, and the Chebyshev interval method (CIM) is introduced to estimate the constraints, thereby the optimization problem is transformed into a single-loop one. Numerical results on a shell structural-acoustic system verify the effectiveness and feasibility of the proposed methods.
Robust and fast pedestrian detection method for far-infrared automotive driving assistance systems
NASA Astrophysics Data System (ADS)
Liu, Qiong; Zhuang, Jiajun; Ma, Jun
2013-09-01
Although considerable effort has been devoted to night-time pedestrian detection for automotive driving assistance systems in recent years, robust and real-time pedestrian detection is by no means a trivial task and remains an open problem because of moving cameras, uncontrolled outdoor environments, the wide range of possible pedestrian presentations and the stringent performance criteria for automotive applications. This paper presents an alternative night-time pedestrian detection method using a monocular far-infrared (FIR) camera, which includes two modules (regions of interest (ROIs) generation and pedestrian recognition) in a cascade fashion. Pixel-gradient oriented vertical projection is first proposed to estimate the vertical image stripes that might contain pedestrians, and then local thresholding image segmentation is adopted to generate ROIs more accurately within the estimated vertical stripes. A novel descriptor called PEWHOG (pyramid entropy weighted histograms of oriented gradients) is proposed to represent FIR pedestrians in the recognition module. Specifically, PEWHOG is used to capture both the local object shape described by the entropy weighted distribution of oriented gradient histograms and its pyramid spatial layout. Then PEWHOG is fed to a three-branch structured classifier using support vector machines (SVM) with histogram intersection kernel (HIK). An off-line training procedure combining both the bootstrapping and early-stopping strategy is introduced to generate a more robust classifier by exploiting hard negative samples iteratively. Finally, multi-frame validation is utilized to suppress some transient false positives. Experimental results on FIR video sequences from various scenarios demonstrate that the presented method is effective and promising.
NASA Astrophysics Data System (ADS)
Lindner, Robert; Lou, Xinghua; Reinstein, Jochen; Shoeman, Robert L.; Hamprecht, Fred A.; Winkler, Andreas
2014-06-01
Hydrogen-deuterium exchange (HDX) experiments analyzed by mass spectrometry (MS) provide information about the dynamics and the solvent accessibility of protein backbone amide hydrogen atoms. Continuous improvement of MS instrumentation has contributed to the increasing popularity of this method; however, comprehensive automated data analysis is only beginning to mature. We present Hexicon 2, an automated pipeline for data analysis and visualization based on the previously published program Hexicon (Lou et al. 2010). Hexicon 2 employs the sensitive NITPICK peak detection algorithm of its predecessor in a divide-and-conquer strategy and adds new features, such as chromatogram alignment and improved peptide sequence assignment. The unique feature of deuteration distribution estimation was retained in Hexicon 2 and improved using an iterative deconvolution algorithm that is robust even to noisy data. In addition, Hexicon 2 provides a data browser that facilitates quality control and provides convenient access to common data visualization tasks. Analysis of a benchmark dataset demonstrates superior performance of Hexicon 2 compared with its predecessor in terms of deuteration centroid recovery and deuteration distribution estimation. Hexicon 2 greatly reduces data analysis time compared with manual analysis, whereas the increased number of peptides provides redundant coverage of the entire protein sequence. Hexicon 2 is a standalone application available free of charge under http://hx2.mpimf-heidelberg.mpg.de.
Kong, Mei-Fung; Chan, Serena; Wong, Yiu-Chung
2008-01-01
The proficiency testing (PT) program for 97 worldwide laboratories for determining total arsenic, cadmium, and lead in seawater shrimp under the auspices of the Asia-Pacific Laboratory Accreditation Cooperation (APLAC) is discussed. The program is one of the APLAC PT series whose primary purposes are to establish mutual agreement on the equivalence of the operation of APLAC member laboratories and to take corrective actions if testing deficiencies are identified. Pooled data for Cd and Pb were normally distributed with interlaboratory variations of 21.9 and 34.8%, respectively. The corresponding consensus mean values estimated by robust statistics were in good agreement with those obtained in the homogeneity tests. However, a bimodal distribution was observed from the determination of total As, in which 14 out of 74 participants reported much smaller values (0.482-6.4 mg/kg) as compared with the mean values of 60.9 mg/kg in the homogeneity test. The use of consensus mean is known to have significant deviation from the true value in bi- or multimodal distribution. Therefore, the mode value, a better estimate of central tendency, was chosen to assess participants' performance for total As. Estimates of the overall uncertainty from participants varied in this program, and some were recommended to acquire more comprehensive exposure toward important criteria as stipulated in ISO/IEC 17025.
NASA Astrophysics Data System (ADS)
Bowman, Christopher; Haith, Gary; Steinberg, Alan; Morefield, Charles; Morefield, Michael
2013-05-01
This paper describes methods to affordably improve the robustness of distributed fusion systems by opportunistically leveraging non-traditional data sources. Adaptive methods help find relevant data, create models, and characterize the model quality. These methods also can measure the conformity of this non-traditional data with fusion system products including situation modeling and mission impact prediction. Non-traditional data can improve the quantity, quality, availability, timeliness, and diversity of the baseline fusion system sources and therefore can improve prediction and estimation accuracy and robustness at all levels of fusion. Techniques are described that automatically learn to characterize and search non-traditional contextual data to enable operators to integrate the data with the high-level fusion systems and ontologies. These techniques apply the extension of the Data Fusion & Resource Management Dual Node Network (DNN) technical architecture at Level 4. The DNN architecture supports effective assessment and management of the expanded portfolio of data sources, entities of interest, models, and algorithms including data pattern discovery and context conformity. Affordable model-driven and data-driven data mining methods to discover unknown models from non-traditional and 'big data' sources are used to automatically learn entity behaviors and correlations with fusion products [14 and 15]. This paper describes our context assessment software development, and the demonstration of context assessment of non-traditional data to compare to an intelligence surveillance and reconnaissance fusion product based upon an IED POIs workflow.
Scott, JoAnna M; deCamp, Allan; Juraska, Michal; Fay, Michael P; Gilbert, Peter B
2017-04-01
Stepped wedge designs are increasingly commonplace and advantageous for cluster randomized trials when it is both unethical to assign placebo, and it is logistically difficult to allocate an intervention simultaneously to many clusters. We study marginal mean models fit with generalized estimating equations for assessing treatment effectiveness in stepped wedge cluster randomized trials. This approach has advantages over the more commonly used mixed models that (1) the population-average parameters have an important interpretation for public health applications and (2) they avoid untestable assumptions on latent variable distributions and avoid parametric assumptions about error distributions, therefore, providing more robust evidence on treatment effects. However, cluster randomized trials typically have a small number of clusters, rendering the standard generalized estimating equation sandwich variance estimator biased and highly variable and hence yielding incorrect inferences. We study the usual asymptotic generalized estimating equation inferences (i.e., using sandwich variance estimators and asymptotic normality) and four small-sample corrections to generalized estimating equation for stepped wedge cluster randomized trials and for parallel cluster randomized trials as a comparison. We show by simulation that the small-sample corrections provide improvement, with one correction appearing to provide at least nominal coverage even with only 10 clusters per group. These results demonstrate the viability of the marginal mean approach for both stepped wedge and parallel cluster randomized trials. We also study the comparative performance of the corrected methods for stepped wedge and parallel designs, and describe how the methods can accommodate interval censoring of individual failure times and incorporate semiparametric efficient estimators.
Pierrillas, Philippe B; Tod, Michel; Amiel, Magali; Chenel, Marylore; Henin, Emilie
2016-09-01
The purpose of this study was to explore the impact of censoring due to animal sacrifice on parameter estimates and tumor volume calculated from two diameters in larger tumors during tumor growth experiments in preclinical studies. The type of measurement error that can be expected was also investigated. Different scenarios were challenged using the stochastic simulation and estimation process. One thousand datasets were simulated under the design of a typical tumor growth study in xenografted mice, and then, eight approaches were used for parameter estimation with the simulated datasets. The distribution of estimates and simulation-based diagnostics were computed for comparison. The different approaches were robust regarding the choice of residual error and gave equivalent results. However, by not considering missing data induced by sacrificing the animal, parameter estimates were biased and led to false inferences in terms of compound potency; the threshold concentration for tumor eradication when ignoring censoring was 581 ng·ml⁻¹, but the true value was 240 ng·ml⁻¹.
NASA Astrophysics Data System (ADS)
Zhang, Daili
Increasing societal demand for automation has led to considerable efforts to control large-scale complex systems, especially in the area of autonomous intelligent control methods. The control system of a large-scale complex system needs to satisfy four system level requirements: robustness, flexibility, reusability, and scalability. Corresponding to the four system level requirements, there arise four major challenges. First, it is difficult to get accurate and complete information. Second, the system may be physically highly distributed. Third, the system evolves very quickly. Fourth, emergent global behaviors of the system can be caused by small disturbances at the component level. The Multi-Agent Based Control (MABC) method as an implementation of distributed intelligent control has been the focus of research since the 1970s, in an effort to solve the above-mentioned problems in controlling large-scale complex systems. However, to the author's best knowledge, all MABC systems for large-scale complex systems with significant uncertainties are problem-specific and thus difficult to extend to other domains or larger systems. This situation is partly due to the control architecture of multiple agents being determined by agent to agent coupling and interaction mechanisms. Therefore, the research objective of this dissertation is to develop a comprehensive, generalized framework for the control system design of general large-scale complex systems with significant uncertainties, with the focus on distributed control architecture design and distributed inference engine design. A Hybrid Multi-Agent Based Control (HyMABC) architecture is proposed by combining hierarchical control architecture and module control architecture with logical replication rings. 
First, it decomposes a complex system hierarchically; second, it combines the components in the same level as a module, and then designs common interfaces for all of the components in the same module; third, replications are made for critical agents and are organized into logical rings. This architecture maintains clear guidelines for complexity decomposition and also increases the robustness of the whole system. Multiple Sectioned Dynamic Bayesian Networks (MSDBNs) as a distributed dynamic probabilistic inference engine, can be embedded into the control architecture to handle uncertainties of general large-scale complex systems. MSDBNs decomposes a large knowledge-based system into many agents. Each agent holds its partial perspective of a large problem domain by representing its knowledge as a Dynamic Bayesian Network (DBN). Each agent accesses local evidence from its corresponding local sensors and communicates with other agents through finite message passing. If the distributed agents can be organized into a tree structure, satisfying the running intersection property and d-sep set requirements, globally consistent inferences are achievable in a distributed way. By using different frequencies for local DBN agent belief updating and global system belief updating, it balances the communication cost with the global consistency of inferences. In this dissertation, a fully factorized Boyen-Koller (BK) approximation algorithm is used for local DBN agent belief updating, and the static Junction Forest Linkage Tree (JFLT) algorithm is used for global system belief updating. MSDBNs assume a static structure and a stable communication network for the whole system. However, for a real system, sub-Bayesian networks as nodes could be lost, and the communication network could be shut down due to partial damage in the system. 
Therefore, on-line and automatic MSDBNs structure formation is necessary for making robust state estimations and increasing survivability of the whole system. A Distributed Spanning Tree Optimization (DSTO) algorithm, a Distributed D-Sep Set Satisfaction (DDSSS) algorithm, and a Distributed Running Intersection Satisfaction (DRIS) algorithm are proposed in this dissertation. Combining these three distributed algorithms and a Distributed Belief Propagation (DBP) algorithm in MSDBNs makes state estimations robust to partial damage in the whole system. Combining the distributed control architecture design and the distributed inference engine design leads to a process of control system design for a general large-scale complex system. As applications of the proposed methodology, the control system design of a simplified ship chilled water system and a notional ship chilled water system have been demonstrated step by step. Simulation results not only show that the proposed methodology gives a clear guideline for control system design for general large-scale complex systems with dynamic and uncertain environment, but also indicate that the combination of MSDBNs and HyMABC can provide excellent performance for controlling general large-scale complex systems.
Disentangling niche competition from grazing mortality in phytoplankton dilution experiments
Weitz, Joshua S.
2017-01-01
The dilution method is the principal tool used to infer in situ microzooplankton grazing rates. However, grazing is the only mortality process considered in the theoretical model underlying the interpretation of dilution method experiments. Here we evaluate the robustness of mortality estimates inferred from dilution experiments when there is concurrent niche competition amongst phytoplankton. Using a combination of mathematical analysis and numerical simulations, we find that grazing rates may be overestimated—the degree of overestimation is related to the importance of niche competition relative to microzooplankton grazing. In response, we propose a conceptual method to disentangle the effects of niche competition and grazing by diluting out microzooplankton, but not phytoplankton. Our theoretical results suggest this revised “Z-dilution” method can robustly infer grazing mortality, regardless of the dominant phytoplankton mortality driver in our system. Further, we show it is possible to independently estimate both grazing mortality and niche competition if the classical and Z-dilution methods can be used in tandem. We discuss the significance of these results for quantifying phytoplankton mortality rates; and the feasibility of implementing the Z-dilution method in practice, whether in model systems or in complex communities with overlap in the size distributions of phytoplankton and microzooplankton. PMID:28505212
Model-based ultrasound temperature visualization during and following HIFU exposure.
Ye, Guoliang; Smith, Penny Probert; Noble, J Alison
2010-02-01
This paper describes the application of signal processing techniques to improve the robustness of ultrasound feedback for displaying changes in temperature distribution in treatment using high-intensity focused ultrasound (HIFU), especially at the low signal-to-noise ratios that might be expected in in vivo abdominal treatment. Temperature estimation is based on the local displacements in ultrasound images taken during HIFU treatment, and a method to improve robustness to outliers is introduced. The main contribution of the paper is in the application of a Kalman filter, a statistical signal processing technique, which uses a simple analytical temperature model of heat dispersion to improve the temperature estimation from the ultrasound measurements during and after HIFU exposure. To reduce the sensitivity of the method to previous assumptions on the material homogeneity and signal-to-noise ratio, an adaptive form is introduced. The method is illustrated using data from HIFU exposure of ex vivo bovine liver. A particular advantage of the stability it introduces is that the temperature can be visualized not only in the intervals between HIFU exposure but also, for some configurations, during the exposure itself. 2010 World Federation for Ultrasound in Medicine & Biology. Published by Elsevier Inc. All rights reserved.
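For readers unfamiliar with the filtering step, the following is a minimal scalar Kalman filter, assuming a one-dimensional state (temperature rise above baseline) that decays by a fixed factor between frames. The decay model, the noise variances, and all names are illustrative assumptions; the abstract's analytical heat-dispersion model and its adaptive variant are not reproduced here.

```python
import numpy as np

def kalman_filter_1d(measurements, a, q, r, x0, p0):
    """Scalar Kalman filter.

    a  : state transition factor (assumed decay of temperature rise per frame)
    q  : process noise variance
    r  : measurement noise variance
    x0 : initial state estimate
    p0 : initial estimate variance
    """
    x, p = x0, p0
    estimates = []
    for z in measurements:
        # Predict: propagate the state and its variance through the model.
        x = a * x
        p = a * p * a + q
        # Update: blend prediction and measurement using the Kalman gain.
        gain = p / (p + r)
        x = x + gain * (z - x)
        p = (1.0 - gain) * p
        estimates.append(x)
    return np.array(estimates)

# Hypothetical noisy temperature-rise readings (arbitrary units).
readings = [10.2, 9.1, 8.9, 7.6, 7.4]
smoothed = kalman_filter_1d(readings, a=0.95, q=0.01, r=1.0, x0=10.0, p0=1.0)
```

The gain moves toward 1 when the measurement is trusted (small r) and toward 0 when the model is trusted (small p), which is how the filter trades off the temperature model against noisy ultrasound-derived estimates.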
León, Larry F; Cai, Tianxi
2012-04-01
In this paper we develop model checking techniques for assessing functional form specifications of covariates in censored linear regression models. These procedures are based on a censored data analog to taking cumulative sums of "robust" residuals over the space of the covariate under investigation. These cumulative sums are formed by integrating certain Kaplan-Meier estimators and may be viewed as "robust" censored data analogs to the processes considered by Lin, Wei & Ying (2002). The null distributions of these stochastic processes can be approximated by the distributions of certain zero-mean Gaussian processes whose realizations can be generated by computer simulation. Each observed process can then be graphically compared with a few realizations from the Gaussian process. We also develop formal test statistics for numerical comparison. Such comparisons enable one to assess objectively whether an apparent trend seen in a residual plot reflects model misspecification or natural variation. We illustrate the methods with a well known dataset. In addition, we examine the finite sample performance of the proposed test statistics in simulation experiments. In our simulation experiments, the proposed test statistics have good power of detecting misspecification while at the same time controlling the size of the test.
Optimal Magnetic Sensor Vests for Cardiac Source Imaging
Lau, Stephan; Petković, Bojana; Haueisen, Jens
2016-01-01
Magnetocardiography (MCG) non-invasively provides functional information about the heart. New room-temperature magnetic field sensors, specifically magnetoresistive and optically pumped magnetometers, have reached sensitivities in the ultra-low range of cardiac fields while allowing for free placement around the human torso. Our aim is to optimize positions and orientations of such magnetic sensors in a vest-like arrangement for robust reconstruction of the electric current distributions in the heart. We optimized a set of 32 sensors on the surface of a torso model with respect to a 13-dipole cardiac source model under noise-free conditions. The reconstruction robustness was estimated by the condition of the lead field matrix. Optimization improved the condition of the lead field matrix by approximately two orders of magnitude compared to a regular array at the front of the torso. Optimized setups exhibited distributions of sensors over the whole torso with denser sampling above the heart at the front and back of the torso. Sensors close to the heart were arranged predominantly tangential to the body surface. The optimized sensor setup could facilitate the definition of a standard for sensor placement in MCG and the development of a wearable MCG vest for clinical diagnostics. PMID:27231910
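The robustness criterion mentioned above, the condition of the lead field matrix, can be illustrated with a toy comparison. The two 2-by-2 matrices below are invented stand-ins (rows as sensors, columns as sources), not real lead fields; they only show that nearly redundant sensors give a much larger condition number than sensors with complementary sensitivities.

```python
import numpy as np

def reconstruction_condition(lead_field):
    """2-norm condition number: ratio of largest to smallest singular value.
    Smaller values indicate a better-posed inverse problem."""
    return np.linalg.cond(lead_field)

# Hypothetical setups: two sensors that see the sources almost identically,
# versus two sensors with complementary sensitivity.
clustered_sensors = np.array([[1.0, 0.0],
                              [1.0, 0.01]])
spread_sensors = np.array([[1.0, 0.0],
                           [0.0, 1.0]])

cond_clustered = reconstruction_condition(clustered_sensors)
cond_spread = reconstruction_condition(spread_sensors)
```

An optimization over sensor positions and orientations would, under this criterion, search for the arrangement that minimizes such a condition number.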
Multi-stream LSTM-HMM decoding and histogram equalization for noise robust keyword spotting.
Wöllmer, Martin; Marchi, Erik; Squartini, Stefano; Schuller, Björn
2011-09-01
Highly spontaneous, conversational, and potentially emotional and noisy speech is known to be a challenge for today's automatic speech recognition (ASR) systems, which highlights the need for advanced algorithms that improve speech features and models. Histogram Equalization is an efficient method to reduce the mismatch between clean and noisy conditions by normalizing all moments of the probability distribution of the feature vector components. In this article, we propose to combine histogram equalization and multi-condition training for robust keyword detection in noisy speech. To better cope with conversational speaking styles, we show how contextual information can be effectively exploited in a multi-stream ASR framework that dynamically models context-sensitive phoneme estimates generated by a long short-term memory neural network. The proposed techniques are evaluated on the SEMAINE database, a corpus containing emotionally colored conversations with a cognitive system for "Sensitive Artificial Listening".
DOE Office of Scientific and Technical Information (OSTI.GOV)
Baker, Kyri; Dall'Anese, Emiliano; Summers, Tyler
This paper outlines a data-driven, distributionally robust approach to solve chance-constrained AC optimal power flow problems in distribution networks. Uncertain forecasts for loads and power generated by photovoltaic (PV) systems are considered, with the goal of minimizing PV curtailment while meeting power flow and voltage regulation constraints. A data-driven approach is utilized to develop a distributionally robust conservative convex approximation of the chance constraints; in particular, the mean and covariance matrix of the forecast errors are updated online and leveraged to enforce voltage regulation with predetermined probability via Chebyshev-based bounds. By combining an accurate linear approximation of the AC power flow equations with the distributionally robust chance constraint reformulation, the resulting optimization problem becomes convex and computationally tractable.
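As a rough illustration of how a Chebyshev-type bound turns a chance constraint into a deterministic one, the sketch below uses the one-sided Chebyshev (Cantelli) inequality: for any distribution with a given mean and standard deviation, P(v <= mean + k*std) >= 1 - epsilon holds when k = sqrt((1 - epsilon)/epsilon). This is not the authors' formulation; the function names, the scalar voltage variable, and the numbers are hypothetical, and the running-moment update is a standard Welford recursion chosen here only to mirror the "updated online" remark.

```python
import math

def cantelli_factor(epsilon):
    """Safety factor k with P(X <= mean + k*std) >= 1 - epsilon for every
    distribution sharing that mean and std (one-sided Chebyshev bound)."""
    if not 0.0 < epsilon < 1.0:
        raise ValueError("epsilon must lie strictly between 0 and 1")
    return math.sqrt((1.0 - epsilon) / epsilon)

def voltage_constraint_ok(v_mean, v_std, v_max, epsilon):
    """Conservative deterministic check of P(v <= v_max) >= 1 - epsilon."""
    return v_mean + cantelli_factor(epsilon) * v_std <= v_max

def update_moments(count, mean, m2, sample):
    """Welford update of the running mean and the sum of squared deviations.
    The sample variance is m2 / (count - 1) once count >= 2."""
    count += 1
    delta = sample - mean
    mean += delta / count
    m2 += delta * (sample - mean)
    return count, mean, m2

if __name__ == "__main__":
    state = (0, 0.0, 0.0)
    for err in [0.010, -0.004, 0.007, 0.001]:   # hypothetical forecast errors
        state = update_moments(*state, err)
    count, mean, m2 = state
    std = math.sqrt(m2 / (count - 1))
    print(voltage_constraint_ok(1.02 + mean, std, 1.05, epsilon=0.05))
```

The bound is distribution-free, which is why it is conservative: a smaller epsilon inflates the margin quickly (k is about 4.36 at epsilon = 0.05).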
Correlation dimension and phase space contraction via extreme value theory
NASA Astrophysics Data System (ADS)
Faranda, Davide; Vaienti, Sandro
2018-04-01
We show how to obtain theoretical and numerical estimates of correlation dimension and phase space contraction by using the extreme value theory. The maxima of suitable observables sampled along the trajectory of a chaotic dynamical system converge asymptotically to classical extreme value laws where: (i) the inverse of the scale parameter gives the correlation dimension and (ii) the extremal index is associated with the rate of phase space contraction for backward iteration, which in dimension 1 and 2, is closely related to the positive Lyapunov exponent and in higher dimensions is related to the metric entropy. We call it the Dynamical Extremal Index. Numerical estimates are straightforward to obtain as they imply just a simple fit to a univariate distribution. Numerical tests range from low dimensional maps, to generalized Henon maps and climate data. The estimates of the indicators are particularly robust even with relatively short time series.
Robust quantum network architectures and topologies for entanglement distribution
NASA Astrophysics Data System (ADS)
Das, Siddhartha; Khatri, Sumeet; Dowling, Jonathan P.
2018-01-01
Entanglement distribution is a prerequisite for several important quantum information processing and computing tasks, such as quantum teleportation, quantum key distribution, and distributed quantum computing. In this work, we focus on two-dimensional quantum networks based on optical quantum technologies using dual-rail photonic qubits for the building of a fail-safe quantum internet. We lay out a quantum network architecture for entanglement distribution between distant parties using a Bravais lattice topology, with the technological constraint that quantum repeaters equipped with quantum memories are not easily accessible. We provide a robust protocol for simultaneous entanglement distribution between two distant groups of parties on this network. We also discuss a memory-based quantum network architecture that can be implemented on networks with an arbitrary topology. We examine networks with bow-tie lattice and Archimedean lattice topologies and use percolation theory to quantify the robustness of the networks. In particular, we provide figures of merit on the loss parameter of the optical medium that depend only on the topology of the network and quantify the robustness of the network against intermittent photon loss and intermittent failure of nodes. These figures of merit can be used to compare the robustness of different network topologies in order to determine the best topology in a given real-world scenario, which is critical in the realization of the quantum internet.
Image interpolation via regularized local linear regression.
Liu, Xianming; Zhao, Debin; Xiong, Ruiqin; Ma, Siwei; Gao, Wen; Sun, Huifang
2011-12-01
The linear regression model is a very attractive tool to design effective image interpolation schemes. Some regression-based image interpolation algorithms have been proposed in the literature, in which the objective functions are optimized by ordinary least squares (OLS). However, it is shown that interpolation with OLS may have some undesirable properties from a robustness point of view: even small amounts of outliers can dramatically affect the estimates. To address these issues, in this paper we propose a novel image interpolation algorithm based on regularized local linear regression (RLLR). Starting from the linear regression model, we replace the OLS error norm with the moving least squares (MLS) error norm, which leads to a robust estimator of local image structure. To keep the solution stable and avoid overfitting, we incorporate the ℓ2-norm as the estimator complexity penalty. Moreover, motivated by recent progress on manifold-based semi-supervised learning, we explicitly consider the intrinsic manifold structure by making use of both measured and unmeasured data points. Specifically, our framework incorporates the geometric structure of the marginal probability distribution induced by unmeasured samples as an additional local smoothness preserving constraint. The optimal model parameters can be obtained with a closed-form solution by solving a convex optimization problem. Experimental results on benchmark test images demonstrate that the proposed method achieves very competitive performance with the state-of-the-art interpolation algorithms, especially in image edge structure preservation. © 2011 IEEE
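The closed-form character of a locally weighted, ℓ2-penalized fit can be seen in a one-dimensional toy version. The sketch below fits y ≈ a + b(x − x0) with Gaussian kernel weights and a ridge penalty on (a, b), solving the 2×2 normal equations directly. It is only an illustration under stated assumptions: the Gaussian kernel, the penalty on the intercept, and the name `local_linear_ridge` are choices made here, and the manifold smoothness term described in the abstract is left out entirely.

```python
import math

def local_linear_ridge(xs, ys, x0, bandwidth, lam):
    """Kernel-weighted least squares fit of y ~ a + b*(x - x0) with an
    l2 penalty lam*(a^2 + b^2). Returns (a, b); a is the estimate at x0."""
    s_w = s_wd = s_wdd = s_wy = s_wdy = 0.0
    for x, y in zip(xs, ys):
        d = x - x0
        w = math.exp(-0.5 * (d / bandwidth) ** 2)   # moving-least-squares weight
        s_w += w
        s_wd += w * d
        s_wdd += w * d * d
        s_wy += w * y
        s_wdy += w * d * y
    # Normal equations: (X^T W X + lam*I) [a, b]^T = X^T W y
    a11, a12, a22 = s_w + lam, s_wd, s_wdd + lam
    det = a11 * a22 - a12 * a12
    a = (s_wy * a22 - a12 * s_wdy) / det
    b = (a11 * s_wdy - a12 * s_wy) / det
    return a, b

if __name__ == "__main__":
    # Interpolate the missing sample at x0 = 2 from its four neighbours.
    print(local_linear_ridge([0, 1, 3, 4], [1, 3, 7, 9], 2.0, 1.5, 0.0))
```

With `lam = 0` the fit reproduces exactly linear data; a positive `lam` shrinks both coefficients, trading a little bias for stability when few neighbours carry weight.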
Modelling the spatial distribution of ammonia emissions in the UK.
Hellsten, S; Dragosits, U; Place, C J; Vieno, M; Dore, A J; Misselbrook, T H; Tang, Y S; Sutton, M A
2008-08-01
Ammonia emissions (NH3) are characterised by a high spatial variability at a local scale. When modelling the spatial distribution of NH3 emissions, it is important to provide robust emission estimates, since the model output is used to assess potential environmental impacts, e.g. exceedance of critical loads. The aim of this study was to provide a new, updated spatial NH3 emission inventory for the UK for the year 2000, based on an improved modelling approach and the use of updated input datasets. The AENEID model distributes NH3 emissions from a range of agricultural activities, such as grazing and housing of livestock, storage and spreading of manures, and fertilizer application, at a 1-km grid resolution over the most suitable landcover types. The results of the emission calculation for the year 2000 are analysed and the methodology is compared with a previous spatial emission inventory for 1996.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ding, Fei; Ji, Haoran; Wang, Chengshan
Distributed generators (DGs), including photovoltaic panels (PVs), have been widely integrated into active distribution networks (ADNs). Owing to its strong volatility and uncertainty, the high penetration of PV generation greatly exacerbates voltage violations in ADNs. However, the emerging flexible interconnection technology based on soft open points (SOPs) provides increased controllability and flexibility in system operation. To fully exploit the regulation ability of SOPs in addressing the problems caused by PV, this paper proposes a robust optimization method to achieve the robust optimal operation of SOPs in ADNs. A two-stage adjustable robust optimization model is built to tackle the uncertainties of PV outputs, in which robust operation strategies of SOPs are generated to eliminate voltage violations and reduce the power losses of ADNs. A column-and-constraint generation (C&CG) algorithm is developed to solve the proposed robust optimization model, which is formulated as a second-order cone program (SOCP) to improve accuracy and computational efficiency. Case studies on the modified IEEE 33-node system and comparisons with the deterministic optimization approach are conducted to verify the effectiveness and robustness of the proposed method.
Mollah, Mohammad Manir Hossain; Jamal, Rahman; Mokhtar, Norfilza Mohd; Harun, Roslan; Mollah, Md. Nurul Haque
2015-01-01
Background Identifying genes that are differentially expressed (DE) between two or more conditions with multiple patterns of expression is one of the primary objectives of gene expression data analysis. Several statistical approaches, including one-way analysis of variance (ANOVA), are used to identify DE genes. However, most of these methods provide misleading results for two or more conditions with multiple patterns of expression in the presence of outlying genes. In this paper, an attempt is made to develop a hybrid one-way ANOVA approach that unifies the robustness and efficiency of estimation using the minimum β-divergence method to overcome some problems that arise in the existing robust methods for both small- and large-sample cases with multiple patterns of expression. Results The proposed method relies on a β-weight function, which produces values between 0 and 1. The β-weight function with β = 0.2 is used as a measure of outlier detection. It assigns smaller weights (≥ 0) to outlying expressions and larger weights (≤ 1) to typical expressions. The distribution of the β-weights is used to calculate the cut-off point, which is compared to the observed β-weight of an expression to determine whether that gene expression is an outlier. This weight function plays a key role in unifying the robustness and efficiency of estimation in one-way ANOVA. Conclusion Analyses of simulated gene expression profiles revealed that all eight methods (ANOVA, SAM, LIMMA, EBarrays, eLNN, KW, robust BetaEB and proposed) perform almost identically for m = 2 conditions in the absence of outliers. However, the robust BetaEB method and the proposed method exhibited considerably better performance than the other six methods in the presence of outliers. 
In this case, the BetaEB method exhibited slightly better performance than the proposed method for the small-sample cases, but the proposed method exhibited much better performance than the BetaEB method for both the small- and large-sample cases in the presence of more than 50% outlying genes. The proposed method also exhibited better performance than the other methods for m > 2 conditions with multiple patterns of expression, where the BetaEB was not extended for this condition. Therefore, the proposed approach would be more suitable and reliable on average for the identification of DE genes between two or more conditions with multiple patterns of expression. PMID:26413858
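To make the role of the β-weight concrete, the sketch below uses a Gaussian-type weight, w = exp(−(β/2)·((x − μ)/σ)²), which lies between 0 and 1 and decays for observations far from the centre. The abstract does not give the exact expression or the cut-off rule, so this functional form, the fixed μ and σ, and the helper names are assumptions made for illustration only.

```python
import math

def beta_weight(x, mu, sigma, beta=0.2):
    """Gaussian-type beta-weight: 1 at the centre, near 0 for outlying values."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * beta * z * z)

def beta_weighted_mean(values, mu, sigma, beta=0.2):
    """Mean in which each value is down-weighted by its beta-weight."""
    weights = [beta_weight(v, mu, sigma, beta) for v in values]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)

if __name__ == "__main__":
    expressions = [0.1, -0.1, 0.0, 50.0]          # last value is an outlier
    for v in expressions:
        print(v, beta_weight(v, mu=0.0, sigma=1.0))
    print(beta_weighted_mean(expressions, mu=0.0, sigma=1.0))
```

In an actual estimation procedure μ and σ would themselves be updated iteratively with these weights; here they are held fixed to keep the example short.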
NASA Astrophysics Data System (ADS)
Mazidi, Hesam; Nehorai, Arye; Lew, Matthew D.
2018-02-01
In single-molecule (SM) super-resolution microscopy, the complexity of a biological structure, high molecular density, and a low signal-to-background ratio (SBR) may lead to imaging artifacts without a robust localization algorithm. Moreover, engineered point spread functions (PSFs) for 3D imaging pose difficulties due to their intricate features. We develop a Robust Statistical Estimation algorithm, called RoSE, that enables joint estimation of the 3D location and photon counts of SMs accurately and precisely using various PSFs under conditions of high molecular density and low SBR.
Universal properties of knotted polymer rings.
Baiesi, M; Orlandini, E
2012-09-01
By performing Monte Carlo sampling of N-step self-avoiding polygons embedded on different Bravais lattices we explore the robustness of universality in the entropic, metric, and geometrical properties of knotted polymer rings. In particular, by simulating polygons with N up to 10^5 we furnish a sharp estimate of the asymptotic values of the knot probability ratios and show their independence of the lattice type. This universal feature was previously suggested, although with different estimates of the asymptotic values. In addition, we show that the scaling behavior of the mean-squared radius of gyration of polygons depends on their knot type only through its correction to scaling. Finally, as a measure of the geometrical self-entanglement of the self-avoiding polygons we consider the standard deviation of the writhe distribution and estimate its power-law behavior in the large N limit. The estimates of the power exponent depend neither on the lattice nor on the knot type, strongly supporting an extension of the universality property to some features of the geometrical entanglement.
A robust close-range photogrammetric target extraction algorithm for size and type variant targets
NASA Astrophysics Data System (ADS)
Nyarko, Kofi; Thomas, Clayton; Torres, Gilbert
2016-05-01
The Photo-G program conducted by Naval Air Systems Command at the Atlantic Test Range in Patuxent River, Maryland, uses photogrammetric analysis of large amounts of real-world imagery to characterize the motion of objects in a 3-D scene. Current approaches involve several independent processes including target acquisition, target identification, 2-D tracking of image features, and 3-D kinematic state estimation. Each process has its own inherent complications and corresponding degrees of both human intervention and computational complexity. One approach being explored for automated target acquisition relies on exploiting the pixel intensity distributions of photogrammetric targets, which tend to be patterns with bimodal intensity distributions. The bimodal distribution partitioning algorithm utilizes this distribution to automatically deconstruct a video frame into regions of interest (ROI) that are merged and expanded to target boundaries, from which ROI centroids are extracted to mark target acquisition points. This process has proved to be scale, position and orientation invariant, as well as fairly insensitive to global uniform intensity disparities.
An improved method for bivariate meta-analysis when within-study correlations are unknown.
Hong, Chuan; D Riley, Richard; Chen, Yong
2018-03-01
Multivariate meta-analysis, which jointly analyzes multiple and possibly correlated outcomes in a single analysis, is becoming increasingly popular in recent years. An attractive feature of the multivariate meta-analysis is its ability to account for the dependence between multiple estimates from the same study. However, standard inference procedures for multivariate meta-analysis require the knowledge of within-study correlations, which are usually unavailable. This limits standard inference approaches in practice. Riley et al proposed a working model and an overall synthesis correlation parameter to account for the marginal correlation between outcomes, where the only data needed are those required for a separate univariate random-effects meta-analysis. As within-study correlations are not required, the Riley method is applicable to a wide variety of evidence synthesis situations. However, the standard variance estimator of the Riley method is not entirely correct under many important settings. As a consequence, the coverage of a function of pooled estimates may not reach the nominal level even when the number of studies in the multivariate meta-analysis is large. In this paper, we improve the Riley method by proposing a robust variance estimator, which is asymptotically correct even when the model is misspecified (ie, when the likelihood function is incorrect). Simulation studies of a bivariate meta-analysis, in a variety of settings, show a function of pooled estimates has improved performance when using the proposed robust variance estimator. In terms of individual pooled estimates themselves, the standard variance estimator and robust variance estimator give similar results to the original method, with appropriate coverage. The proposed robust variance estimator performs well when the number of studies is relatively large. Therefore, we recommend the use of the robust method for meta-analyses with a relatively large number of studies (eg, m≥50). 
When the sample size is relatively small, we recommend the use of the robust method under the working independence assumption. We illustrate the proposed method through 2 meta-analyses. Copyright © 2017 John Wiley & Sons, Ltd.
Toward Robust Estimation of the Components of Forest Population Change
Francis A. Roesch
2014-01-01
Multiple levels of simulation are used to test the robustness of estimators of the components of change. I first created a variety of spatial-temporal populations based on, but more variable than, an actual forest monitoring data set and then sampled those populations under a variety of sampling error structures. The performance of each of four estimation approaches is...
ERIC Educational Resources Information Center
Thissen, David; Wainer, Howard
Simulation studies of the performance of (potentially) robust statistical estimation produce large quantities of numbers in the form of performance indices of the various estimators under various conditions. This report presents a multivariate graphical display used to aid in the digestion of the plentiful results in a current study of Item…
NASA Astrophysics Data System (ADS)
Zhang, Wenyan; Daewel, Ute; Schrum, Corinna; Wirtz, Kai
2017-04-01
The mutual dependency between sedimentary total organic carbon (TOC) and benthic macrofauna is here for the first time quantified by a mechanistic model. The model describes (i) the vertical distribution of infaunal biomass resulting from a trade-off between nutritional benefit (quantity and quality of TOC) and the costs of burial (respiration), and (ii) the variable distribution of TOC being in turn shaped by bioturbation of local macrobenthos. In contrast to state-of-the-art diagenetic models, our approach resolves variations of bioturbation both in space and time, which depend on the macrobenthic community structure and biomass. Our implementation of the dynamic interaction between sedimentary organic carbon and infaunal macrobenthos is able to capture a real-time benthic response to both depositional and erosional events and provides improved estimates of the material exchange flux at the sediment-water interface. Applications to literature data for the North Sea demonstrate the robustness and accuracy of the model and its potential as an analysis tool for the status of TOC as well as benthic infauna in marine sediments. The model was coupled to two different 3D hydrodynamic-ecological models (ECOSMO and MOSSCO for 10 x 10 and 1 x 1 km setups, respectively) to evaluate the robustness of the estimates with respect to variable forcings on different spatial scales. Hindcast simulations of the benthic status in the southern North Sea from 1980 to 2000 indicate a relatively stable pattern at large temporal and spatial scales but significant variations at small scales.
Zhao, Junbo; Wang, Shaobu; Mili, Lamine; ...
2018-01-08
This paper develops a robust power system state estimation framework that accounts for measurement correlations and imperfect synchronization. In the framework, correlations of SCADA and phasor measurement unit (PMU) measurements are calculated separately through the unscented transformation and a Vector Auto-Regression (VAR) model. In particular, PMU measurements during the waiting period between two SCADA measurement scans are buffered to develop the VAR model, with parameters robustly estimated using the projection statistics approach. The latter takes into account the temporal and spatial correlations of PMU measurements and provides redundant measurements to suppress bad data and mitigate imperfect synchronization. In cases where the SCADA and PMU measurements are not time synchronized, either the forecasted PMU measurements or the prior SCADA measurements from the last estimation run are leveraged to restore system observability. Then, a robust generalized maximum-likelihood (GM)-estimator is extended to integrate measurement error correlations and to handle the outliers in the SCADA and PMU measurements. Simulation results that stem from a comprehensive comparison with other alternatives under various conditions demonstrate the benefits of the proposed framework.
NASA Astrophysics Data System (ADS)
Lawson, Gareth L.; Wiebe, Peter H.; Stanton, Timothy K.; Ashjian, Carin J.
2008-02-01
Methods were refined and tested for identifying the aggregations of Antarctic euphausiids ( Euphausia spp.) and then estimating euphausiid size, abundance, and biomass, based on multi-frequency acoustic survey data. A threshold level of volume backscattering strength for distinguishing euphausiid aggregations from other zooplankton was derived on the basis of published measurements of euphausiid visual acuity and estimates of the minimum density of animals over which an individual can maintain visual contact with its nearest neighbor. Differences in mean volume backscattering strength at 120 and 43 kHz further served to distinguish euphausiids from other sources of scattering. An inversion method was then developed to estimate simultaneously the mean length and density of euphausiids in these acoustically identified aggregations based on measurements of mean volume backscattering strength at four frequencies (43, 120, 200, and 420 kHz). The methods were tested at certain locations within an acoustically surveyed continental shelf region in and around Marguerite Bay, west of the Antarctic Peninsula, where independent evidence was also available from net and video systems. Inversion results at these test sites were similar to net samples for estimated length, but acoustic estimates of euphausiid density exceeded those from nets by one to two orders of magnitude, likely due primarily to avoidance and to a lesser extent to differences in the volumes sampled by the two systems. In a companion study, these methods were applied to the full acoustic survey data in order to examine the distribution of euphausiids in relation to aspects of the physical and biological environment [Lawson, G.L., Wiebe, P.H., Ashjian, C.J., Stanton, T.K., 2008. Euphausiid distribution along the Western Antarctic Peninsula—Part B: Distribution of euphausiid aggregations and biomass, and associations with environmental features. Deep-Sea Research II, this issue [doi:10.1016/j.dsr2.2007.11.014
NASA Astrophysics Data System (ADS)
Addawe, Rizavel C.; Addawe, Joel M.; Magadia, Joselito C.
2016-10-01
Accurate forecasting of dengue cases would significantly improve epidemic prevention and control capabilities. This paper attempts to provide useful models for forecasting dengue epidemics specific to the young and adult population of Baguio City. To capture the seasonal variations in dengue incidence, this paper develops a robust modeling approach to identify and estimate seasonal autoregressive integrated moving average (SARIMA) models in the presence of additive outliers. Since the least squares estimators are not robust in the presence of outliers, we suggest a robust estimation based on winsorized and reweighted least squares estimators. A hybrid algorithm, Differential Evolution - Simulated Annealing (DESA), is used to identify and estimate the parameters of the optimal SARIMA model. The method is applied to the monthly reported dengue cases in Baguio City, Philippines.
NASA Astrophysics Data System (ADS)
Voss, Sebastian; Zimmermann, Beate; Zimmermann, Alexander
2016-09-01
In the last decades, an increasing number of studies analyzed spatial patterns in throughfall by means of variograms. The estimation of the variogram from sample data requires an appropriate sampling scheme: most importantly, a large sample and a layout of sampling locations that often has to serve both variogram estimation and geostatistical prediction. While some recommendations on these aspects exist, they focus on Gaussian data and high ratios of the variogram range to the extent of the study area. However, many hydrological data, and throughfall data in particular, do not follow a Gaussian distribution. In this study, we examined the effect of extent, sample size, sampling design, and calculation method on variogram estimation of throughfall data. For our investigation, we first generated non-Gaussian random fields based on throughfall data with large outliers. Subsequently, we sampled the fields with three extents (plots with edge lengths of 25 m, 50 m, and 100 m), four common sampling designs (two grid-based layouts, transect and random sampling) and five sample sizes (50, 100, 150, 200, 400). We then estimated the variogram parameters by method-of-moments (non-robust and robust estimators) and residual maximum likelihood. Our key findings are threefold. First, the choice of the extent has a substantial influence on the estimation of the variogram. A comparatively small ratio of the extent to the correlation length is beneficial for variogram estimation. Second, a combination of a minimum sample size of 150, a design that ensures the sampling of small distances and variogram estimation by residual maximum likelihood offers a good compromise between accuracy and efficiency. Third, studies relying on method-of-moments based variogram estimation may have to employ at least 200 sampling points for reliable variogram estimates. These suggested sample sizes exceed the number recommended by studies dealing with Gaussian data by up to 100 %. 
Given that most previous throughfall studies relied on method-of-moments variogram estimation and sample sizes ≪200, currently available data are prone to large uncertainties.
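The difference between a non-robust and a robust method-of-moments variogram estimator can be sketched for a single lag class. The code below implements the classical Matheron estimator and the Cressie-Hawkins estimator (using the commonly quoted correction 0.457 + 0.494/N); which robust estimator the study actually used is not stated in the abstract, so the choice here, the lag-tolerance binning, and the function names are assumptions.

```python
import math

def pair_differences(coords, values, lag, tol):
    """Absolute value differences for all point pairs whose separation
    distance lies within `tol` of `lag`."""
    diffs = []
    n = len(coords)
    for i in range(n):
        for j in range(i + 1, n):
            h = math.dist(coords[i], coords[j])
            if abs(h - lag) <= tol:
                diffs.append(abs(values[i] - values[j]))
    return diffs

def matheron(diffs):
    """Classical semivariance: half the mean squared difference."""
    return sum(d * d for d in diffs) / (2 * len(diffs))

def cressie_hawkins(diffs):
    """Robust semivariance based on the fourth power of the mean
    square-root absolute difference, with a bias correction."""
    n = len(diffs)
    mean_root = sum(math.sqrt(d) for d in diffs) / n
    return mean_root ** 4 / (2 * (0.457 + 0.494 / n))

if __name__ == "__main__":
    coords = [(0, 0), (1, 0), (2, 0), (3, 0)]
    throughfall = [1, 2, 4, 7]                  # hypothetical values
    d = pair_differences(coords, throughfall, lag=1.0, tol=0.1)
    print(d, matheron(d), cressie_hawkins(d))
```

A single large difference in a lag class inflates the Matheron estimate quadratically, whereas the square-root transform in the robust estimator damps it, which is the motivation for robust estimators when the data contain large outliers.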
Geophysical Parameter Estimation of Near Surface Materials Using Nuclear Magnetic Resonance
NASA Astrophysics Data System (ADS)
Keating, K.
2017-12-01
Proton nuclear magnetic resonance (NMR), a mature geophysical technology used in petroleum applications, has recently emerged as a promising tool for hydrogeophysicists. The NMR measurement, which can be made in the laboratory, in boreholes, and using a surface-based instrument, is unique in that it is directly sensitive to water, via the initial signal magnitude, and thus provides a robust estimate of water content. In the petroleum industry, rock physics models have been established that relate NMR relaxation times to pore size distributions and permeability. These models are often applied directly for hydrogeophysical applications, despite differences in the material in these two environments (e.g., unconsolidated versus consolidated, and mineral content). Furthermore, the rock physics models linking NMR relaxation times to pore size distributions do not account for partially saturated systems that are important for understanding flow in the vadose zone. In our research, we are developing and refining quantitative rock physics models that relate NMR parameters to hydrogeological parameters. Here we highlight the limitations of directly applying established rock physics models to estimate hydrogeological parameters from NMR measurements, and show some of the successes we have had in model improvement. Using examples drawn from both laboratory and field measurements, we focus on the use of NMR in partially saturated systems to estimate water content, pore-size distributions, and the water retention curve. Despite the challenges in interpreting the measurements, valuable information about hydrogeological parameters can be obtained from NMR relaxation data, and we conclude by outlining pathways for improving the interpretation of NMR data for hydrogeophysical investigations.
A new zonation algorithm with parameter estimation using hydraulic head and subsidence observations.
Zhang, Meijing; Burbey, Thomas J; Nunes, Vitor Dos Santos; Borggaard, Jeff
2014-01-01
Parameter estimation codes such as UCODE_2005 are becoming well-known tools in groundwater modeling investigations. These programs estimate important parameter values such as transmissivity (T) and aquifer storage values (Sa) from known observations of hydraulic head, flow, or other physical quantities. One drawback inherent in these codes is that the parameter zones must be specified by the user. However, such knowledge is often unavailable even if a detailed hydrogeological description exists. To overcome this deficiency, we present a discrete adjoint algorithm for identifying suitable zonations from hydraulic head and subsidence measurements, which are highly sensitive to both elastic (Sske) and inelastic (Sskv) skeletal specific storage coefficients. With the advent of interferometric synthetic aperture radar (InSAR), distributed spatial and temporal subsidence measurements can be obtained. A synthetic conceptual model containing seven transmissivity zones, one aquifer storage zone and three interbed zones for elastic and inelastic storage coefficients was developed to simulate drawdown and subsidence in an aquifer interbedded with clay that exhibits delayed drainage. Simulated delayed land subsidence and groundwater head data are assumed to be the observed measurements, to which the discrete adjoint algorithm is applied to create approximate spatial zonations of T, Sske, and Sskv. UCODE_2005 is then used to obtain the final optimal parameter values. Calibration results indicate that the estimated zonations calculated from the discrete adjoint algorithm closely approximate the true parameter zonations. This automation algorithm reduces the bias established by the initial distribution of zones and provides a robust parameter zonation distribution. © 2013, National Ground Water Association.
Autonomous intelligent assembly systems LDRD 105746 final report.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Anderson, Robert J.
2013-04-01
This report documents a three-year project to develop technology that enables mobile robots to perform autonomous assembly tasks in unstructured outdoor environments. This is a multi-tier problem that requires the integration of a large number of different software technologies, including command and control, estimation and localization, distributed communications, object recognition, pose estimation, real-time scanning, and scene interpretation. Although the project was ultimately unsuccessful in achieving a target brick stacking task autonomously, numerous important component technologies were nevertheless developed. Such technologies include a patent-pending polygon snake algorithm for robust feature tracking, a color grid algorithm for unique identification and calibration, a command and control framework for abstracting robot commands, a scanning capability that utilizes a compact robot-portable scanner, and more. This report describes this project and these developed technologies.
Bayesian experimental design for models with intractable likelihoods.
Drovandi, Christopher C; Pettitt, Anthony N
2013-12-01
In this paper we present a methodology for designing experiments for efficiently estimating the parameters of models with computationally intractable likelihoods. The approach combines a commonly used methodology for robust experimental design, based on Markov chain Monte Carlo sampling, with approximate Bayesian computation (ABC) to ensure that no likelihood evaluations are required. The utility function considered for precise parameter estimation is based upon the precision of the ABC posterior distribution, which we form efficiently via the ABC rejection algorithm based on pre-computed model simulations. Our focus is on stochastic models and, in particular, we investigate the methodology for Markov process models of epidemics and macroparasite population evolution. The macroparasite example involves a multivariate process and we assess the loss of information from not observing all variables. © 2013, The International Biometric Society.
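The ABC rejection step on pre-computed simulations mentioned in the abstract above can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the uniform prior, the one-observation Gaussian toy simulator, the absolute-difference distance, the tolerance, and the function names `precompute`, `abc_rejection` and `abc_precision`.

```python
import random
import statistics

def precompute(n_sims, rng):
    """Draw (parameter, simulated summary) pairs once, before any data are seen."""
    sims = []
    for _ in range(n_sims):
        theta = rng.uniform(0.0, 10.0)       # assumed uniform prior on [0, 10]
        summary = rng.gauss(theta, 1.0)      # toy simulator: one noisy observation
        sims.append((theta, summary))
    return sims

def abc_rejection(sims, observed, tolerance):
    """Keep the parameters whose simulated summary lies within `tolerance` of the data."""
    return [theta for theta, s in sims if abs(s - observed) <= tolerance]

def abc_precision(sims, observed, tolerance):
    """Utility sketch: reciprocal of the variance of the accepted (ABC posterior) sample."""
    accepted = abc_rejection(sims, observed, tolerance)
    return 1.0 / statistics.variance(accepted)

rng = random.Random(42)
sims = precompute(20000, rng)
accepted = abc_rejection(sims, observed=5.0, tolerance=0.2)
posterior_mean = statistics.mean(accepted)
precision = abc_precision(sims, observed=5.0, tolerance=0.2)
print(len(accepted), posterior_mean, precision)
```

Because the simulations are stored, the same bank can be reused for every candidate data set when comparing designs, and no likelihood is evaluated at any point.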
Robust estimation procedure in panel data model
DOE Office of Scientific and Technical Information (OSTI.GOV)
Shariff, Nurul Sima Mohamad; Hamzah, Nor Aishah
2014-06-19
Panel data modeling has received great attention in econometric research recently. This is due to the availability of data sources and the interest in studying cross sections of individuals observed over time. However, problems may arise in modeling the panel in the presence of cross-sectional dependence and outliers. Even though there are a few methods that take into consideration the presence of cross-sectional dependence in the panel, these methods may provide inconsistent parameter estimates and inferences when outliers occur in the panel. As such, an alternative method that is robust to outliers and cross-sectional dependence is introduced in this paper. The properties and construction of the confidence interval for the parameter estimates are also considered in this paper. The robustness of the procedure is investigated and comparisons are made to the existing method via simulation studies. Our results show that the robust approach is able to produce accurate and reliable parameter estimates under the conditions considered.
Robust GNSS and InSAR tomography of neutrospheric refractivity using a Compressive Sensing approach
NASA Astrophysics Data System (ADS)
Heublein, Marion; Alshawaf, Fadwa; Zhu, Xiao Xiang; Hinz, Stefan
2017-04-01
Motivation: An accurate knowledge of the 3D distribution of water vapor in the atmosphere is a key element for weather forecasting and climate research. In addition, a precise determination of water vapor is also required for accurate positioning and deformation monitoring using Global Navigation Satellite Systems (GNSS) and Interferometric Synthetic Aperture Radar (InSAR). Several approaches for 3D tomographic water vapor reconstruction from GNSS-based Slant Wet Delay (SWD) estimates using the least squares (LSQ) adjustment exist. However, the tomographic system is in general ill-conditioned and its solution is unstable. Therefore, additional information or constraints need to be added in order to regularize the system. Goal of this work: In this work, we analyze the potential of Compressive Sensing (CS) for robustly reconstructing neutrospheric refractivity from GNSS SWD estimates. Moreover, the benefit of adding InSAR SWD estimates into the tomographic system is studied. Approach: A sparse representation of the refractivity field is obtained using a dictionary composed of Discrete Cosine Transforms (DCT) in longitude and latitude direction and of an Euler transform in height direction. This sparsity of the signal can be used as a prior for regularization and the CS inversion is solved by minimizing the number of non-zero entries of the sparse solution in the DCT-Euler domain. No other regularization constraints or prior knowledge is applied. The tomographic reconstruction relies on total SWD estimates from GNSS Precise Point Positioning (PPP) and Persistent Scatterer (PS) InSAR. On the one hand, GNSS PPP SWD estimates are included into the system of equations. On the other hand, 2D ZWD maps are obtained by a combination of point-wise estimates of the wet delay using GNSS observations and partial InSAR wet delay maps. 
These ZWD estimates are aggregated to derive realistic wet delay input data at given points as if corresponding to GNSS sites within the study area. The made-up ZWD values can be mapped into different elevation and azimuth angles. Moreover, using the same observation geometry as in the case of the GNSS and InSAR data, a synthetic set of SWD values was generated based on WRF simulations. Results: The CS approach shows particular strength in the case of a small number of SWD estimates. When compared to LSQ, the sparse reconstruction is much more robust. In the case of a low density of GNSS sites, adding InSAR SWD estimates improves the reconstruction accuracy for both LSQ and CS. Based on a synthetic SWD dataset generated using WRF simulations of wet refractivity, the CS based solution of the tomographic system is validated. In the vertical direction, the refractivity distribution deduced from GNSS and InSAR SWD estimates is compared to a tropospheric humidity data set provided by EUMETSAT consisting of daily mean values of specific humidity given on six pressure levels between 1000 hPa and 200 hPa. Study area: The Upper Rhine Graben (URG) characterized by negligible surface deformations is chosen as study area. A network of seven permanent GNSS receivers is used for this study, and a total number of 17 SAR images, acquired by ENVISAT ASAR is available.
Abundance models improve spatial and temporal prioritization of conservation resources.
Johnston, Alison; Fink, Daniel; Reynolds, Mark D; Hochachka, Wesley M; Sullivan, Brian L; Bruns, Nicholas E; Hallstein, Eric; Merrifield, Matt S; Matsumoto, Sandi; Kelling, Steve
2015-10-01
Conservation prioritization requires knowledge about organism distribution and density. This information is often inferred from models that estimate the probability of species occurrence rather than from models that estimate species abundance, because abundance data are harder to obtain and model. However, occurrence and abundance may not display similar patterns and therefore development of robust, scalable, abundance models is critical to ensuring that scarce conservation resources are applied where they can have the greatest benefits. Motivated by a dynamic land conservation program, we develop and assess a general method for modeling relative abundance using citizen science monitoring data. Weekly estimates of relative abundance and occurrence were compared for prioritizing times and locations of conservation actions for migratory waterbird species in California, USA. We found that abundance estimates consistently provided better rankings of observed counts than occurrence estimates. Additionally, the relationship between abundance and occurrence was nonlinear and varied by species and season. Across species, locations prioritized by occurrence models had only 10-58% overlap with locations prioritized by abundance models, highlighting that occurrence models will not typically identify the locations of highest abundance that are vital for conservation of populations.
Xiao, Yongling; Abrahamowicz, Michal
2010-03-30
We propose two bootstrap-based methods to correct the standard errors (SEs) from Cox's model for within-cluster correlation of right-censored event times. The cluster-bootstrap method resamples, with replacement, only the clusters, whereas the two-step bootstrap method resamples (i) the clusters, and (ii) individuals within each selected cluster, with replacement. In simulations, we evaluate both methods and compare them with the existing robust variance estimator and the shared gamma frailty model, which are available in statistical software packages. We simulate clustered event time data, with latent cluster-level random effects, which are ignored in the conventional Cox's model. For cluster-level covariates, both proposed bootstrap methods yield accurate SEs, and type I error rates, and acceptable coverage rates, regardless of the true random effects distribution, and avoid serious variance under-estimation by conventional Cox-based standard errors. However, the two-step bootstrap method over-estimates the variance for individual-level covariates. We also apply the proposed bootstrap methods to obtain confidence bands around flexible estimates of time-dependent effects in a real-life analysis of cluster event times.
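The two resampling schemes described in the abstract above differ only in whether individuals are redrawn inside each selected cluster. A minimal sketch follows, assuming clusters are stored as lists of observations. The statistic used here (a pooled mean) is a stand-in chosen to keep the example self-contained; the paper applies the schemes to Cox model estimates. The function names and the toy data are hypothetical.

```python
import random

def cluster_bootstrap(clusters, rng):
    """Resample whole clusters with replacement; members are kept unchanged."""
    k = len(clusters)
    return [list(clusters[rng.randrange(k)]) for _ in range(k)]

def two_step_bootstrap(clusters, rng):
    """Resample clusters, then resample individuals within each drawn cluster."""
    resampled = []
    for cluster in cluster_bootstrap(clusters, rng):
        n = len(cluster)
        resampled.append([cluster[rng.randrange(n)] for _ in range(n)])
    return resampled

def bootstrap_se(clusters, statistic, resampler, n_boot=500, seed=0):
    """Standard deviation of `statistic` across bootstrap replicates."""
    rng = random.Random(seed)
    values = [statistic(resampler(clusters, rng)) for _ in range(n_boot)]
    mean = sum(values) / n_boot
    return (sum((v - mean) ** 2 for v in values) / (n_boot - 1)) ** 0.5

def pooled_mean(clusters):
    """Stand-in statistic: mean over all individuals in all clusters."""
    flat = [x for cluster in clusters for x in cluster]
    return sum(flat) / len(flat)

# Hypothetical toy data: three clusters of unequal size.
toy = [[1.0, 1.2, 0.9], [3.1, 2.8], [5.0, 5.3, 4.9, 5.1]]
se_cluster = bootstrap_se(toy, pooled_mean, cluster_bootstrap)
se_two_step = bootstrap_se(toy, pooled_mean, two_step_bootstrap)
print(se_cluster, se_two_step)
```

Replacing `pooled_mean` with a function that refits the model of interest on the resampled data would give a bootstrap standard error for a regression coefficient instead.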
Robust Variable Selection with Exponential Squared Loss.
Wang, Xueqin; Jiang, Yunlu; Huang, Mian; Zhang, Heping
2013-04-01
Robust variable selection procedures through penalized regression have been gaining increased attention in the literature. They can be used to perform variable selection and are expected to yield robust estimates. However, to the best of our knowledge, the robustness of those penalized regression procedures has not been well characterized. In this paper, we propose a class of penalized robust regression estimators based on exponential squared loss. The motivation for this new procedure is that it enables us to characterize its robustness, which has not been done for the existing procedures, while its performance is near optimal and superior to some recently developed methods. Specifically, under defined regularity conditions, our estimators are √n-consistent and possess the oracle property. Importantly, we show that our estimators can achieve the highest asymptotic breakdown point of 1/2 and that their influence functions are bounded with respect to the outliers in either the response or the covariate domain. We performed simulation studies to compare our proposed method with some recent methods, using the oracle method as the benchmark. We consider common sources of influential points. Our simulation studies reveal that our proposed method performs similarly to the oracle method in terms of the model error and the positive selection rate even in the presence of influential points. In contrast, other existing procedures have a much lower non-causal selection rate. Furthermore, we re-analyze the Boston Housing Price Dataset and the Plasma Beta-Carotene Level Dataset that are commonly used examples for regression diagnostics of influential points. Our analysis unravels the discrepancies of using our robust method versus the other penalized regression method, underscoring the importance of developing and applying robust penalized regression methods.
Robust Variable Selection with Exponential Squared Loss
Wang, Xueqin; Jiang, Yunlu; Huang, Mian; Zhang, Heping
2013-01-01
Robust variable selection procedures through penalized regression have been gaining increased attention in the literature. They can be used to perform variable selection and are expected to yield robust estimates. However, to the best of our knowledge, the robustness of those penalized regression procedures has not been well characterized. In this paper, we propose a class of penalized robust regression estimators based on exponential squared loss. The motivation for this new procedure is that it enables us to characterize its robustness, which has not been done for the existing procedures, while its performance is near optimal and superior to some recently developed methods. Specifically, under defined regularity conditions, our estimators are √n-consistent and possess the oracle property. Importantly, we show that our estimators can achieve the highest asymptotic breakdown point of 1/2 and that their influence functions are bounded with respect to the outliers in either the response or the covariate domain. We performed simulation studies to compare our proposed method with some recent methods, using the oracle method as the benchmark. We consider common sources of influential points. Our simulation studies reveal that our proposed method performs similarly to the oracle method in terms of the model error and the positive selection rate even in the presence of influential points. In contrast, other existing procedures have a much lower non-causal selection rate. Furthermore, we re-analyze the Boston Housing Price Dataset and the Plasma Beta-Carotene Level Dataset that are commonly used examples for regression diagnostics of influential points. Our analysis unravels the discrepancies of using our robust method versus the other penalized regression method, underscoring the importance of developing and applying robust penalized regression methods. PMID:23913996
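The bounded-influence property claimed in the abstract above can be seen from the shape of the loss. The sketch below assumes the loss has the form 1 - exp(-r²/γ) for a residual r and a tuning parameter γ; the abstract does not spell out this form, so treat it, the default γ = 1, and the function names as illustrative assumptions. No penalty term or fitting routine is shown.

```python
import math

def exp_squared_loss(residual, gamma=1.0):
    """Assumed form of the exponential squared loss: 1 - exp(-r^2 / gamma)."""
    return 1.0 - math.exp(-residual ** 2 / gamma)

def exp_squared_score(residual, gamma=1.0):
    """Derivative of the assumed loss with respect to the residual."""
    return (2.0 * residual / gamma) * math.exp(-residual ** 2 / gamma)

def squared_loss(residual):
    """Ordinary least-squares loss, shown for contrast."""
    return residual ** 2

# The robust loss saturates at 1 while the squared loss keeps growing.
for r in (0.0, 0.5, 2.0, 10.0):
    print(r, exp_squared_loss(r), exp_squared_score(r), squared_loss(r))
```

Under this assumed form, the derivative shrinks toward zero for large residuals, so a gross outlier contributes almost nothing to the estimating equations, whereas its contribution under squared loss grows linearly with the residual.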
Tissue resistivity estimation in the presence of positional and geometrical uncertainties.
Baysal, U; Eyüboğlu, B M
2000-08-01
Geometrical uncertainties (organ boundary variation and electrode position uncertainties) are the biggest sources of error in estimating electrical resistivity of tissues from body surface measurements. In this study, in order to decrease estimation errors, the statistically constrained minimum mean squared error estimation algorithm (MiMSEE) is constrained with a priori knowledge of the geometrical uncertainties in addition to the constraints based on geometry, resistivity range, linearization and instrumentation errors. The MiMSEE calculates an optimum inverse matrix, which maps the surface measurements to the unknown resistivity distribution. The required data are obtained from four-electrode impedance measurements, similar to injected-current electrical impedance tomography (EIT). In this study, the surface measurements are simulated by using a numerical thorax model. The data are perturbed with additive instrumentation noise. Simulated surface measurements are then used to estimate the tissue resistivities by using the proposed algorithm. The results are compared with the results of conventional least squares error estimator (LSEE). Depending on the region, the MiMSEE yields an estimation error between 0.42% and 31.3% compared with 7.12% to 2010% for the LSEE. It is shown that the MiMSEE is quite robust even in the case of geometrical uncertainties.
Testing hypotheses on distribution shifts and changes in phenology of imperfectly detectable species
Chambert, Thierry A.; Kendall, William L.; Hines, James E.; Nichols, James D.; Pedrini, Paolo; Waddle, J. Hardin; Tavecchia, Giacomo; Walls, Susan C.; Tenan, Simone
2015-01-01
With ongoing climate change, many species are expected to shift their spatial and temporal distributions. To document changes in species distribution and phenology, detection/non-detection data have proven very useful. Occupancy models provide a robust way to analyse such data, but inference is usually focused on species spatial distribution, not phenology. We present a multi-season extension of the staggered-entry occupancy model of Kendall et al. (2013, Ecology, 94, 610), which permits inference about the within-season patterns of species arrival and departure at sampling sites. The new model presented here allows investigation of species phenology and spatial distribution across years, as well as site extinction/colonization dynamics. We illustrate the model with two data sets on European migratory passerines and one data set on North American treefrogs. We show how to derive several additional phenological parameters, such as annual mean arrival and departure dates, from estimated arrival and departure probabilities. Given the extent of detection/non-detection data that are available, we believe that this modelling approach will prove very useful to further understand and predict species responses to climate change.
NASA Astrophysics Data System (ADS)
Niu, Chun-Yang; Qi, Hong; Huang, Xing; Ruan, Li-Ming; Tan, He-Ping
2016-11-01
A rapid computational method called the generalized sourced multi-flux method (GSMFM) was developed to simulate outgoing radiative intensities in arbitrary directions at the boundary surfaces of absorbing, emitting, and scattering media, which served as input for the inverse analysis. A hybrid least-square QR decomposition-stochastic particle swarm optimization (LSQR-SPSO) algorithm based on the forward GSMFM solution was developed to simultaneously reconstruct the multi-dimensional temperature distribution and the absorption and scattering coefficients of cylindrical participating media. The retrieval results for axisymmetric and non-axisymmetric temperature distributions indicated that the temperature distribution and the scattering and absorption coefficients could be retrieved accurately using the LSQR-SPSO algorithm even with noisy data. Moreover, the influences of extinction coefficient and scattering albedo on the accuracy of the estimation were investigated, and the results suggested that the reconstruction accuracy decreased as the extinction coefficient and the scattering albedo increased. Finally, a non-contact measurement platform for the flame temperature field based on light field imaging was set up to validate the reconstruction model experimentally.
Bias and robustness of uncertainty components estimates in transient climate projections
NASA Astrophysics Data System (ADS)
Hingray, Benoit; Blanchet, Juliette; Jean-Philippe, Vidal
2016-04-01
A critical issue in climate change studies is the estimation of uncertainties in projections along with the contribution of the different uncertainty sources, including scenario uncertainty, the different components of model uncertainty and internal variability. Quantifying the different uncertainty sources actually faces different problems. For instance and for the sake of simplicity, an estimate of model uncertainty is classically obtained from the empirical variance of the climate responses obtained for the different modeling chains. These estimates are, however, biased. Another difficulty arises from the limited number of members that are classically available for most modeling chains. In this case, the climate response of one given chain and the effect of its internal variability may actually be difficult if not impossible to separate. The estimates of scenario uncertainty, model uncertainty and internal variability components are thus likely not to be very robust. We explore the importance of the bias and the robustness of the estimates for two classical Analysis of Variance (ANOVA) approaches: a Single Time approach (STANOVA), based only on the data available for the considered projection lead time, and a time series based approach (QEANOVA), which assumes quasi-ergodicity of climate outputs over the whole available climate simulation period (Hingray and Saïd, 2014). We explore both issues for a simple but classical configuration where uncertainties in projections are composed of two single sources: model uncertainty and internal climate variability. The bias in model uncertainty estimates is explored from theoretical expressions of unbiased estimators developed for both ANOVA approaches. The robustness of uncertainty estimates is explored for multiple synthetic ensembles of time series projections generated with Monte Carlo simulations.
For both ANOVA approaches, when the empirical variance of climate responses is used to estimate model uncertainty, the bias is always positive. It can be especially high with STANOVA. In the most critical configurations, when the number of members available for each modeling chain is small (< 3) and when internal variability explains most of total uncertainty variance (75% or more), the overestimation is higher than 100% of the true model uncertainty variance. The bias can be considerably reduced with a time series ANOVA approach, owing to the multiple time steps accounted for. The longer the transient time period used for the analysis, the larger the reduction. When a quasi-ergodic ANOVA approach is applied to decadal data for the whole 1980-2100 period, the bias is reduced by a factor of 2.5 to 20 depending on the projection lead time. In all cases, the bias is likely to be non-negligible for a large number of climate impact studies, resulting in a likely large overestimation of the contribution of model uncertainty to total variance. For both approaches, the robustness of all uncertainty estimates is higher when more members are available, when internal variability is smaller and/or the response-to-uncertainty ratio is higher. QEANOVA estimates are much more robust than STANOVA ones: QEANOVA simulated confidence intervals are roughly 3 to 5 times smaller than STANOVA ones. Except for STANOVA when fewer than 3 members are available, the robustness is rather high for total uncertainty and moderate for internal variability estimates. For model uncertainty or response-to-uncertainty ratio estimates, the robustness is conversely low for QEANOVA to very low for STANOVA. In the most critical configurations (small number of members, large internal variability), large over- or underestimation of uncertainty components is thus very likely.
To propose relevant uncertainty analyses and avoid misleading interpretations, estimates of uncertainty components should be therefore bias corrected and ideally come with estimates of their robustness. This work is part of the COMPLEX Project (European Collaborative Project FP7-ENV-2012 number: 308601; http://www.complex.ac.uk/). Hingray, B., Saïd, M., 2014. Partitioning internal variability and model uncertainty components in a multimodel multireplicate ensemble of climate projections. J.Climate. doi:10.1175/JCLI-D-13-00629.1 Hingray, B., Blanchet, J. (revision) Unbiased estimators for uncertainty components in transient climate projections. J. Climate Hingray, B., Blanchet, J., Vidal, J.P. (revision) Robustness of uncertainty components estimates in climate projections. J.Climate
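The positive bias reported in the abstract above can be illustrated with a short derivation. This is a sketch under simplifying assumptions that the abstract does not state: every modeling chain g has the same number m of independent members, internal variability has the same variance for all chains, and the chain response at a single lead time is estimated by the member mean. The symbols m, \sigma_M^2, \sigma_{IV}^2 and \bar{y}_g are introduced here for illustration only.

```latex
% Expected empirical variance of the chain means at a single lead time
\mathbb{E}\left[\widehat{\operatorname{Var}}_g\left(\bar{y}_g\right)\right]
  = \sigma_M^2 + \frac{\sigma_{IV}^2}{m},
\qquad
\text{relative bias} = \frac{\sigma_{IV}^2}{m\,\sigma_M^2}.
% Example: internal variability explains 75% of total variance,
% so \sigma_{IV}^2 = 3\,\sigma_M^2.
% With m = 2 the relative bias is 3/2 = 150%; with m = 3 it is 100%.
```

Under these assumptions the arithmetic is consistent with the abstract's statement that the overestimation exceeds 100% of the true model uncertainty variance when fewer than three members are available and internal variability accounts for 75% or more of the total variance.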
Estimating the Area Under ROC Curve When the Fitted Binormal Curves Demonstrate Improper Shape.
Bandos, Andriy I; Guo, Ben; Gur, David
2017-02-01
The "binormal" model is the most frequently used tool for parametric receiver operating characteristic (ROC) analysis. The binormal ROC curves can have "improper" (non-concave) shapes that are unrealistic in many practical applications, and several tools (eg, PROPROC) have been developed to address this problem. However, due to the general robustness of binormal ROCs, the improperness of the fitted curves might carry little consequence for inferences about global summary indices, such as the area under the ROC curve (AUC). In this work, we investigate the effect of severe improperness of fitted binormal ROC curves on the reliability of AUC estimates when the data arise from an actually proper curve. We designed theoretically proper ROC scenarios that induce severely improper shape of fitted binormal curves in the presence of well-distributed empirical ROC points. The binormal curves were fitted using maximum likelihood approach. Using simulations, we estimated the frequency of severely improper fitted curves, bias of the estimated AUC, and coverage of 95% confidence intervals (CIs). In Appendix S1, we provide additional information on percentiles of the distribution of AUC estimates and bias when estimating partial AUCs. We also compared the results to a reference standard provided by empirical estimates obtained from continuous data. We observed up to 96% of severely improper curves depending on the scenario in question. The bias in the binormal AUC estimates was very small and the coverage of the CIs was close to nominal, whereas the estimates of partial AUC were biased upward in the high specificity range and downward in the low specificity range. Compared to a non-parametric approach, the binormal model led to slightly more variable AUC estimates, but at the same time to CIs with more appropriate coverage. 
The improper shape of the fitted binormal curve, by itself, ie, in the presence of a sufficient number of well-distributed points, does not imply unreliable AUC-based inferences. Copyright © 2017 The Association of University Radiologists. Published by Elsevier Inc. All rights reserved.
The WorkPlace distributed processing environment
NASA Technical Reports Server (NTRS)
Ames, Troy; Henderson, Scott
1993-01-01
Real time control problems require robust, high performance solutions. Distributed computing can offer high performance through parallelism and robustness through redundancy. Unfortunately, implementing distributed systems with these characteristics places a significant burden on the applications programmers. Goddard Code 522 has developed WorkPlace to alleviate this burden. WorkPlace is a small, portable, embeddable network interface which automates message routing, failure detection, and re-configuration in response to failures in distributed systems. This paper describes the design and use of WorkPlace, and its application in the construction of a distributed blackboard system.
Robust detection, isolation and accommodation for sensor failures
NASA Technical Reports Server (NTRS)
Emami-Naeini, A.; Akhter, M. M.; Rock, S. M.
1986-01-01
The objective is to extend the recent advances in robust control system design of multivariable systems to sensor failure detection, isolation, and accommodation (DIA), and estimator design. This effort provides analysis tools to quantify the trade-off between performance robustness and DIA sensitivity, which are to be used to achieve higher levels of performance robustness for given levels of DIA sensitivity. An innovations-based DIA scheme is used. Estimators, which depend upon a model of the process and process inputs and outputs, are used to generate these innovations. Thresholds used to determine failure detection are computed based on bounds on modeling errors, noise properties, and the class of failures. The applicability of the newly developed tools is demonstrated on a multivariable aircraft turbojet engine example. A new concept called the threshold selector was developed. It represents a significant and innovative tool for the analysis and synthesis of DIA algorithms. The estimators were made robust by introduction of an internal model and by frequency shaping. The internal model provides asymptotically unbiased filter estimates. The incorporation of frequency shaping of the Linear Quadratic Gaussian cost functional modifies the estimator design to make it suitable for sensor failure DIA. The results are compared with previous studies which used thresholds that were selected empirically. Comparison of these two techniques on a nonlinear dynamic engine simulation shows improved performance of the new method compared to previous techniques.
Trong Bui, Duong; Nguyen, Nhan Duc; Jeong, Gu-Min
2018-06-25
Human activity recognition and pedestrian dead reckoning are interesting fields because of their important utility in daily-life healthcare. Currently, these fields are facing many challenges, one of which is the lack of a robust algorithm with high performance. This paper proposes a new method to implement a robust step detection and adaptive distance estimation algorithm based on the classification of five daily wrist activities during walking at various speeds using a smart band. The key idea is that the non-parametric adaptive distance estimator is performed after two activity classifiers and a robust step detector. In this study, two classifiers perform two phases of recognizing five wrist activities during walking. Then, a robust step detection algorithm, which is integrated with an adaptive threshold and a peak and valley correction algorithm, is applied to the classified activities to detect the walking steps. In addition, the misclassified activities are fed back to the previous layer. Finally, three adaptive distance estimators, which are based on a non-parametric model of the average walking speed, calculate the length of each strike. The experimental results show that the average classification accuracy is about 99%, and the accuracy of the step detection is 98.7%. The error of the estimated distance is 2.2-4.2% depending on the type of wrist activities.
Xu, Yonghong; Gao, Xiaohuan; Wang, Zhengxi
2014-04-01
Missing data represent a general problem in many scientific fields, especially in medical survival analysis. For dealing with censored data, interpolation is one of the important methods. However, most interpolation methods replace the censored data with exact data, which distorts the real distribution of the censored data and reduces the probability of the real data falling into the interpolated data. To solve this problem, we propose in this paper a nonparametric method of estimating the survival function of right-censored and interval-censored data and compare its performance to the SC (self-consistent) algorithm. Compared to the average interpolation and the nearest neighbor interpolation methods, the proposed method replaces the right-censored data with interval-censored data and greatly improves the probability of the real data falling into the imputation interval. It then uses empirical distribution theory to estimate the survival function of right-censored and interval-censored data. The results of numerical examples and a real breast cancer data set demonstrated that the proposed method had higher accuracy and better robustness for different proportions of censored data. This paper provides a good method to compare the performance of clinical treatments through estimation of patient survival data. This provides some help for medical survival data analysis.
NASA Astrophysics Data System (ADS)
Ait-El-Fquih, Boujemaa; El Gharamti, Mohamad; Hoteit, Ibrahim
2016-08-01
Ensemble Kalman filtering (EnKF) is an efficient approach to addressing uncertainties in subsurface groundwater models. The EnKF sequentially integrates field data into simulation models to obtain a better characterization of the model's state and parameters. These are generally estimated following joint and dual filtering strategies, in which, at each assimilation cycle, a forecast step by the model is followed by an update step with incoming observations. The joint EnKF directly updates the augmented state-parameter vector, whereas the dual EnKF empirically employs two separate filters, first estimating the parameters and then estimating the state based on the updated parameters. To develop a Bayesian consistent dual approach and improve the state-parameter estimates and their consistency, we propose in this paper a one-step-ahead (OSA) smoothing formulation of the state-parameter Bayesian filtering problem from which we derive a new dual-type EnKF, the dual EnKFOSA. Compared with the standard dual EnKF, it imposes a new update step to the state, which is shown to enhance the performance of the dual approach with almost no increase in the computational cost. Numerical experiments are conducted with a two-dimensional (2-D) synthetic groundwater aquifer model to investigate the performance and robustness of the proposed dual EnKFOSA, and to evaluate its results against those of the joint and dual EnKFs. The proposed scheme is able to successfully recover both the hydraulic head and the aquifer conductivity, providing further reliable estimates of their uncertainties. Furthermore, it is found to be more robust to different assimilation settings, such as the spatial and temporal distribution of the observations, and the level of noise in the data. Based on our experimental setups, it yields up to 25 % more accurate state and parameter estimations than the joint and dual approaches.
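For orientation, the update step of the standard joint EnKF that the abstract above uses as a baseline can be sketched for a scalar state and a scalar parameter. This is not the dual EnKF-OSA proposed in the paper. The perturbed-observation variant, the direct observation of the state, the toy linear relation between parameter and state, and all names and numbers below are assumptions made for illustration.

```python
import random

def joint_enkf_update(states, params, y_obs, obs_var, rng):
    """One perturbed-observation EnKF update of an augmented (state, parameter) ensemble.

    The state is assumed to be observed directly with error variance `obs_var`.
    """
    n = len(states)
    mean_x = sum(states) / n
    mean_p = sum(params) / n
    var_x = sum((x - mean_x) ** 2 for x in states) / (n - 1)
    cov_px = sum((p - mean_p) * (x - mean_x) for p, x in zip(params, states)) / (n - 1)
    gain_x = var_x / (var_x + obs_var)     # Kalman gain for the state
    gain_p = cov_px / (var_x + obs_var)    # gain for the parameter via its covariance with the state
    new_states, new_params = [], []
    for x, p in zip(states, params):
        # Each member assimilates its own perturbed copy of the observation.
        innovation = y_obs + rng.gauss(0.0, obs_var ** 0.5) - x
        new_states.append(x + gain_x * innovation)
        new_params.append(p + gain_p * innovation)
    return new_states, new_params

def mean(values):
    return sum(values) / len(values)

def variance(values):
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)

# Toy forecast ensemble: the state is roughly twice the parameter.
rng = random.Random(7)
prior_params = [rng.gauss(1.0, 0.5) for _ in range(200)]
prior_states = [2.0 * p + rng.gauss(0.0, 0.1) for p in prior_params]
y = 4.0  # observation consistent with a "true" parameter of 2.0
post_states, post_params = joint_enkf_update(prior_states, prior_params, y, 0.01, rng)
print(mean(prior_states), mean(post_states), mean(post_params))
```

In this toy setup the parameter is never observed; it is corrected only through its sample covariance with the observed state, which is the mechanism the joint filter relies on.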
Efficient and optimized identification of generalized Maxwell viscoelastic relaxation spectra
Babaei, Behzad; Davarian, Ali; Pryse, Kenneth M.; Elson, Elliot L.; Genin, Guy M.
2017-01-01
Viscoelastic relaxation spectra are essential for predicting and interpreting the mechanical responses of materials and structures. For biological tissues, these spectra must usually be estimated from viscoelastic relaxation tests. Interpreting viscoelastic relaxation tests is challenging because the inverse problem is expensive computationally. We present here an efficient algorithm that enables rapid identification of viscoelastic relaxation spectra. The algorithm was tested against trial data to characterize its robustness and identify its limitations and strengths. The algorithm was then applied to identify the viscoelastic response of reconstituted collagen, revealing an extensive distribution of viscoelastic time constants. PMID:26523785
Catastrophic expenditure to pay for surgery worldwide: a modelling study.
Shrime, Mark G; Dare, Anna J; Alkire, Blake C; O'Neill, Kathleen; Meara, John G
2015-04-27
Approximately 150 million individuals worldwide face catastrophic expenditure each year from medical costs alone, and the non-medical costs of accessing care increase that number. The proportion of this expenditure related to surgery is unknown. Because the World Bank has proposed elimination of medical impoverishment by 2030, the effect of surgical conditions on financial catastrophe should be quantified so that any financial risk protection mechanisms can appropriately incorporate surgery. To estimate the global incidence of catastrophic expenditure due to surgery, we built a stochastic model. The income distribution of each country, the probability of requiring surgery, and the medical and non-medical costs faced for surgery were incorporated. Sensitivity analyses were run to test the robustness of the model. 3·7 billion people (posterior credible interval 3·2-4·2 billion) risk catastrophic expenditure if they need surgery. Each year, 81·3 million people (80·8-81·7 million) worldwide are driven to financial catastrophe: 32·8 million (32·4-33·1 million) from the costs of surgery alone and 48·5 million (47·7-49·3) from associated non-medical costs. The burden of catastrophic expenditure is highest in countries of low and middle income; within any country, it falls on the poor. Estimates were sensitive to the definition of catastrophic expenditure and the costs of care. The inequitable burden distribution was robust to model assumptions. Half the global population is at risk of financial catastrophe from surgery. Each year, surgical conditions cause 81 million individuals to face catastrophic expenditure, of which less than half is attributable to medical costs. These findings highlight the need for financial risk protection for surgery in health-system design. MGS received partial funding from NIH/NCI R25CA92203. Copyright © 2015 Shrime et al. Open Access article distributed under the terms of CC BY-NC-ND. Published by Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Ivanov, Martin; Warrach-Sagi, Kirsten; Wulfmeyer, Volker
2018-04-01
A new approach for rigorous spatial analysis of the downscaling performance of regional climate model (RCM) simulations is introduced. It is based on a multiple comparison of the local tests at the grid cells and is also known as 'field' or 'global' significance. The block length for the local resampling tests is precisely determined to adequately account for the time series structure. New performance measures for estimating the added value of downscaled data relative to the large-scale forcing fields are developed. As an example, the methodology is applied to a standard EURO-CORDEX hindcast simulation with the Weather Research and Forecasting (WRF) model coupled with the land surface model NOAH at 0.11° grid resolution. Daily precipitation climatology for the 1990-2009 period is analysed for Germany for winter and summer in comparison with high-resolution gridded observations from the German Weather Service. The field significance test controls the proportion of falsely rejected local tests in a meaningful way and is robust to spatial dependence. Hence, the spatial patterns of the statistically significant local tests are also meaningful. We interpret them from a process-oriented perspective. While the downscaled precipitation distributions are statistically indistinguishable from the observed ones in most regions in summer, the biases of some distribution characteristics are significant over large areas in winter. WRF-NOAH generates appropriate stationary fine-scale climate features in the daily precipitation field over regions of complex topography in both seasons and appropriate transient fine-scale features almost everywhere in summer. As the added value of global climate model (GCM)-driven simulations cannot be smaller than this perfect-boundary estimate, this work demonstrates in a rigorous manner the clear additional value of dynamical downscaling over global climate simulations. 
The evaluation methodology has a broad spectrum of applicability as it is distribution-free, robust to spatial dependence, and accounts for time series structure.
Rickettsia Phylogenomics: Unwinding the Intricacies of Obligate Intracellular Life
Gillespie, Joseph J.; Williams, Kelly; Shukla, Maulik; Snyder, Eric E.; Nordberg, Eric K.; Ceraul, Shane M.; Dharmanolla, Chitti; Rainey, Daphne; Soneja, Jeetendra; Shallom, Joshua M.; Vishnubhat, Nataraj Dongre; Wattam, Rebecca; Purkayastha, Anjan; Czar, Michael; Crasta, Oswald; Setubal, Joao C.; Azad, Abdu F.; Sobral, Bruno S.
2008-01-01
Background Completed genome sequences are rapidly increasing for Rickettsia, obligate intracellular α-proteobacteria responsible for various human diseases, including epidemic typhus and Rocky Mountain spotted fever. In light of phylogeny, the establishment of orthologous groups (OGs) of open reading frames (ORFs) will distinguish the core rickettsial genes and other group specific genes (class 1 OGs or C1OGs) from those distributed indiscriminately throughout the rickettsial tree (class 2 OG or C2OGs). Methodology/Principal Findings We present 1823 representative (no gene duplications) and 259 non-representative (at least one gene duplication) rickettsial OGs. While the highly reductive (∼1.2 MB) Rickettsia genomes range in predicted ORFs from 872 to 1512, a core of 752 OGs was identified, depicting the essential Rickettsia genes. Unsurprisingly, this core lacks many metabolic genes, reflecting the dependence on host resources for growth and survival. Additionally, we bolster our recent reclassification of Rickettsia by identifying OGs that define the AG (ancestral group), TG (typhus group), TRG (transitional group), and SFG (spotted fever group) rickettsiae. OGs for insect-associated species, tick-associated species and species that harbor plasmids were also predicted. Through superimposition of all OGs over robust phylogeny estimation, we discern between C1OGs and C2OGs, the latter depicting genes either decaying from the conserved C1OGs or acquired laterally. Finally, scrutiny of non-representative OGs revealed high levels of split genes versus gene duplications, with both phenomena confounding gene orthology assignment. Interestingly, non-representative OGs, as well as OGs comprised of several gene families typically involved in microbial pathogenicity and/or the acquisition of virulence factors, fall predominantly within C2OG distributions. 
Conclusion/Significance Collectively, we determined the relative conservation and distribution of 14354 predicted ORFs from 10 rickettsial genomes across robust phylogeny estimation. The data, available at PATRIC (PathoSystems Resource Integration Center), provide novel information for unwinding the intricacies associated with Rickettsia pathogenesis, expanding the range of potential diagnostic, vaccine and therapeutic targets. PMID:19194535
Greenhouse-gas emission targets for limiting global warming to 2 °C.
Meinshausen, Malte; Meinshausen, Nicolai; Hare, William; Raper, Sarah C B; Frieler, Katja; Knutti, Reto; Frame, David J; Allen, Myles R
2009-04-30
More than 100 countries have adopted a global warming limit of 2 °C or below (relative to pre-industrial levels) as a guiding principle for mitigation efforts to reduce climate change risks, impacts and damages. However, the greenhouse gas (GHG) emissions corresponding to a specified maximum warming are poorly known owing to uncertainties in the carbon cycle and the climate response. Here we provide a comprehensive probabilistic analysis aimed at quantifying GHG emission budgets for the 2000-50 period that would limit warming throughout the twenty-first century to below 2 °C, based on a combination of published distributions of climate system properties and observational constraints. We show that, for the chosen class of emission scenarios, both cumulative emissions up to 2050 and emission levels in 2050 are robust indicators of the probability that twenty-first century warming will not exceed 2 °C relative to pre-industrial temperatures. Limiting cumulative CO₂ emissions over 2000-50 to 1,000 Gt CO₂ yields a 25% probability of warming exceeding 2 °C (and a limit of 1,440 Gt CO₂ yields a 50% probability), given a representative estimate of the distribution of climate system properties. As known 2000-06 CO₂ emissions were approximately 234 Gt CO₂, less than half the proven economically recoverable oil, gas and coal reserves can still be emitted up to 2050 to achieve such a goal. Recent G8 Communiqués envisage halved global GHG emissions by 2050, for which we estimate a 12-45% probability of exceeding 2 °C, assuming 1990 as emission base year and a range of published climate sensitivity distributions. Emissions levels in 2020 are a less robust indicator, but for the scenarios considered, the probability of exceeding 2 °C rises to 53-87% if global GHG emissions are still more than 25% above 2000 levels in 2020.
TU-AB-BRB-01: Coverage Evaluation and Probabilistic Treatment Planning as a Margin Alternative
DOE Office of Scientific and Technical Information (OSTI.GOV)
Siebers, J.
The accepted clinical method to accommodate targeting uncertainties inherent in fractionated external beam radiation therapy is to utilize GTV-to-CTV and CTV-to-PTV margins during the planning process to design a PTV-conformal static dose distribution on the planning image set. Ideally, margins are selected to ensure a high (e.g. >95%) target coverage probability (CP) in spite of inherent inter- and intra-fractional positional variations, tissue motions, and initial contouring uncertainties. Robust optimization techniques, also known as probabilistic treatment planning techniques, explicitly incorporate the dosimetric consequences of targeting uncertainties by including CP evaluation into the planning optimization process along with coverage-based planning objectives. The treatment planner no longer needs to use PTV and/or PRV margins; instead robust optimization utilizes probability distributions of the underlying uncertainties in conjunction with CP-evaluation for the underlying CTVs and OARs to design an optimal treated volume. This symposium will describe CP-evaluation methods as well as various robust planning techniques including use of probability-weighted dose distributions, probability-weighted objective functions, and coverage optimized planning. Methods to compute and display the effect of uncertainties on dose distributions will be presented. The use of robust planning to accommodate inter-fractional setup uncertainties, organ deformation, and contouring uncertainties will be examined as will its use to accommodate intra-fractional organ motion. Clinical examples will be used to inter-compare robust and margin-based planning, highlighting advantages of robust-plans in terms of target and normal tissue coverage. Robust-planning limitations as uncertainties approach zero and as the number of treatment fractions becomes small will be presented, as well as the factors limiting clinical implementation of robust planning.
Learning Objectives: To understand robust-planning as a clinical alternative to using margin-based planning. To understand conceptual differences between uncertainty and predictable motion. To understand fundamental limitations of the PTV concept that probabilistic planning can overcome. To understand the major contributing factors to target and normal tissue coverage probability. To understand the similarities and differences of various robust planning techniques. To understand the benefits and limitations of robust planning techniques.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Xu, H.
TU-AB-BRB-02: Stochastic Programming Methods for Handling Uncertainty and Motion in IMRT Planning
DOE Office of Scientific and Technical Information (OSTI.GOV)
Unkelbach, J.
TU-AB-BRB-00: New Methods to Ensure Target Coverage
DOE Office of Scientific and Technical Information (OSTI.GOV)
NONE
2015-06-15
NASA Astrophysics Data System (ADS)
Liu, Guannan; Liu, Dong
2018-06-01
An improved inverse reconstruction model that accounts for the self-absorption effect was proposed, based on flame emission spectrometry, for the temperature distribution and concentration fields of soot and metal-oxide nanoparticles in nanofluid fuel flames. The effects of self-absorption on the temperature profile and concentration fields were investigated for various measurement errors, flame optical thicknesses and numbers of detection lines. The model neglecting self-absorption caused serious reconstruction errors, especially in nanofluid fuel flames with large optical thicknesses, while the improved model successfully recovered the temperature distribution and concentration fields of soot and metal-oxide nanoparticles regardless of the optical thickness. Increasing the number of detection lines greatly improved the reconstruction accuracy because the spectrometer received more flame emission information. With an adequate number of detection lines, the estimates of the temperature distribution and concentration fields of soot and metal-oxide nanoparticles in flames with large optical thicknesses remained satisfactory even from noisy radiation intensities with a signal-to-noise ratio (SNR) as low as 46 dB. The results showed that the improved reconstruction model was effective and robust in concurrently retrieving the temperature distribution and volume fraction fields of soot and metal-oxide nanoparticles from exact and noisy data in nanofluid fuel sooting flames with different optical thicknesses.
Effects of sample size on estimates of population growth rates calculated with matrix models.
Fiske, Ian J; Bruna, Emilio M; Bolker, Benjamin M
2008-08-28
Matrix models are widely used to study the dynamics and demography of populations. An important but overlooked issue is how the number of individuals sampled influences estimates of the population growth rate (lambda) calculated with matrix models. Even unbiased estimates of vital rates do not ensure unbiased estimates of lambda: Jensen's inequality implies that even when the estimates of the vital rates are accurate, small sample sizes lead to biased estimates of lambda due to increased sampling variance. We investigated if sampling variability and the distribution of sampling effort among size classes lead to biases in estimates of lambda. Using data from a long-term field study of plant demography, we simulated the effects of sampling variance by drawing vital rates and calculating lambda for increasingly larger populations drawn from a total population of 3842 plants. We then compared these estimates of lambda with those based on the entire population and calculated the resulting bias. Finally, we conducted a review of the literature to determine the sample sizes typically used when parameterizing matrix models used to study plant demography. We found significant bias at small sample sizes when survival was low (survival = 0.5), and that sampling with a more-realistic inverse J-shaped population structure exacerbated this bias. However, our simulations also demonstrate that these biases rapidly become negligible with increasing sample sizes or as survival increases. For many of the sample sizes used in demographic studies, matrix models are probably robust to the biases resulting from sampling variance of vital rates. However, this conclusion may depend on the structure of populations or the distribution of sampling effort in ways that are unexplored. We suggest more intensive sampling of populations when individual survival is low and greater sampling of stages with high elasticities.
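The Jensen's-inequality effect described above can be illustrated with a small simulation. The sketch below is not the authors' procedure or data: it assumes a hypothetical two-stage matrix [[0, F], [s_j, s_a]], binomially sampled survival rates, and a fixed fecundity of 1.5. The function names are made up for the example.

```python
import numpy as np

def dominant_eigenvalue(juvenile_survival, adult_survival, fecundity):
    """Growth rate lambda for each two-stage matrix [[0, F], [s_j, s_a]]."""
    s_j = np.atleast_1d(np.asarray(juvenile_survival, dtype=float))
    s_a = np.atleast_1d(np.asarray(adult_survival, dtype=float))
    mats = np.zeros((s_j.size, 2, 2))
    mats[:, 0, 1] = fecundity
    mats[:, 1, 0] = s_j
    mats[:, 1, 1] = s_a
    # For a non-negative matrix the dominant eigenvalue equals the spectral radius
    return np.abs(np.linalg.eigvals(mats)).max(axis=1)

def lambda_bias(sample_size, survival=0.5, fecundity=1.5, reps=20000, seed=0):
    """Mean estimated lambda minus true lambda when survival is estimated from a sample."""
    rng = np.random.default_rng(seed)
    # Unbiased survival estimates: proportion of sampled individuals that survive
    s_j_hat = rng.binomial(sample_size, survival, size=reps) / sample_size
    s_a_hat = rng.binomial(sample_size, survival, size=reps) / sample_size
    true_lambda = dominant_eigenvalue(survival, survival, fecundity)[0]
    return dominant_eigenvalue(s_j_hat, s_a_hat, fecundity).mean() - true_lambda
```

Calling `lambda_bias(10)` and `lambda_bias(1000)` shows the pattern reported in the abstract: the survival estimates are unbiased in both cases, but because lambda is a nonlinear function of them, the average estimated lambda departs from the true value when samples are small, and the gap shrinks as the sample size grows.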
Robust Spacecraft Component Detection in Point Clouds.
Wei, Quanmao; Jiang, Zhiguo; Zhang, Haopeng
2018-03-21
Automatic component detection of spacecraft can assist in on-orbit operation and space situational awareness. Spacecraft are generally composed of solar panels and cuboidal or cylindrical modules. These components can be simply represented by geometric primitives like plane, cuboid and cylinder. Based on this prior, we propose a robust automatic detection scheme to automatically detect such basic components of spacecraft in three-dimensional (3D) point clouds. In the proposed scheme, cylinders are first detected in the iteration of the energy-based geometric model fitting and cylinder parameter estimation. Then, planes are detected by Hough transform and further described as bounded patches with their minimum bounding rectangles. Finally, the cuboids are detected with pair-wise geometry relations from the detected patches. After successive detection of cylinders, planar patches and cuboids, a mid-level geometry representation of the spacecraft can be delivered. We tested the proposed component detection scheme on spacecraft 3D point clouds synthesized by computer-aided design (CAD) models and those recovered by image-based reconstruction, respectively. Experimental results illustrate that the proposed scheme can detect the basic geometric components effectively and has fine robustness against noise and point distribution density.
Robust Spacecraft Component Detection in Point Clouds
Wei, Quanmao; Jiang, Zhiguo
2018-01-01
Automatic component detection of spacecraft can assist in on-orbit operation and space situational awareness. Spacecraft are generally composed of solar panels and cuboidal or cylindrical modules. These components can be simply represented by geometric primitives like plane, cuboid and cylinder. Based on this prior, we propose a robust automatic detection scheme to automatically detect such basic components of spacecraft in three-dimensional (3D) point clouds. In the proposed scheme, cylinders are first detected in the iteration of the energy-based geometric model fitting and cylinder parameter estimation. Then, planes are detected by Hough transform and further described as bounded patches with their minimum bounding rectangles. Finally, the cuboids are detected with pair-wise geometry relations from the detected patches. After successive detection of cylinders, planar patches and cuboids, a mid-level geometry representation of the spacecraft can be delivered. We tested the proposed component detection scheme on spacecraft 3D point clouds synthesized by computer-aided design (CAD) models and those recovered by image-based reconstruction, respectively. Experimental results illustrate that the proposed scheme can detect the basic geometric components effectively and has fine robustness against noise and point distribution density. PMID:29561828
Robust range estimation with a monocular camera for vision-based forward collision warning system.
Park, Ki-Yeong; Hwang, Sun-Young
2014-01-01
We propose a range estimation method for vision-based forward collision warning systems with a monocular camera. To solve the problem of variation of camera pitch angle due to vehicle motion and road inclination, the proposed method estimates virtual horizon from size and position of vehicles in captured image at run-time. The proposed method provides robust results even when road inclination varies continuously on hilly roads or lane markings are not seen on crowded roads. For experiments, a vision-based forward collision warning system has been implemented and the proposed method is evaluated with video clips recorded in highway and urban traffic environments. Virtual horizons estimated by the proposed method are compared with horizons manually identified, and estimated ranges are compared with measured ranges. Experimental results confirm that the proposed method provides robust results both in highway and in urban traffic environments.
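The geometry behind horizon-based range estimation can be sketched under a flat-road pinhole-camera model. The abstract does not give the authors' formulas, so the two relations below are the textbook ones and all names and the typical vehicle width of 1.8 m are assumptions. With focal length f (pixels), camera height H (metres) and real vehicle width W (metres), the image width is w = f·W/d and the vehicle's bottom edge lies f·H/d pixels below the horizon row.

```python
def virtual_horizon_row(bottom_row, width_px, camera_height_m, vehicle_width_m=1.8):
    """Horizon row implied by one detected vehicle (flat-road pinhole assumption).

    From w = f*W/d and (bottom_row - horizon_row) = f*H/d it follows that
    bottom_row - horizon_row = H * w / W, independent of the focal length.
    """
    return bottom_row - camera_height_m * width_px / vehicle_width_m

def range_from_horizon(bottom_row, horizon_row, focal_px, camera_height_m):
    """Distance in metres to a vehicle whose bottom edge is at bottom_row."""
    offset = bottom_row - horizon_row
    if offset <= 0:
        raise ValueError("vehicle bottom must lie below the horizon row")
    return focal_px * camera_height_m / offset
```

In practice the horizon estimates from several detected vehicles could be averaged to damp the error from any single vehicle whose true width differs from the assumed one. Because the horizon is re-estimated from the image itself, the range does not depend on a fixed camera pitch.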
Robust Range Estimation with a Monocular Camera for Vision-Based Forward Collision Warning System
2014-01-01
We propose a range estimation method for vision-based forward collision warning systems with a monocular camera. To solve the problem of variation of camera pitch angle due to vehicle motion and road inclination, the proposed method estimates virtual horizon from size and position of vehicles in captured image at run-time. The proposed method provides robust results even when road inclination varies continuously on hilly roads or lane markings are not seen on crowded roads. For experiments, a vision-based forward collision warning system has been implemented and the proposed method is evaluated with video clips recorded in highway and urban traffic environments. Virtual horizons estimated by the proposed method are compared with horizons manually identified, and estimated ranges are compared with measured ranges. Experimental results confirm that the proposed method provides robust results both in highway and in urban traffic environments. PMID:24558344
Robust optimization based upon statistical theory.
Sobotta, B; Söhn, M; Alber, M
2010-08-01
Organ movement is still the biggest challenge in cancer treatment despite advances in online imaging. Due to the resulting geometric uncertainties, the delivered dose cannot be predicted precisely at treatment planning time. Consequently, all associated dose metrics (e.g., EUD and maxDose) are random variables with a patient-specific probability distribution. The method that the authors propose makes these distributions the basis of the optimization and evaluation process. The authors start from a model of motion derived from patient-specific imaging. On a multitude of geometry instances sampled from this model, a dose metric is evaluated. The resulting pdf of this dose metric is termed outcome distribution. The approach optimizes the shape of the outcome distribution based on its mean and variance. This is in contrast to the conventional optimization of a nominal value (e.g., PTV EUD) computed on a single geometry instance. The mean and variance allow for an estimate of the expected treatment outcome along with the residual uncertainty. Besides being applicable to the target, the proposed method also seamlessly includes the organs at risk (OARs). The likelihood that a given value of a metric is reached in the treatment is predicted quantitatively. This information reveals potential hazards that may occur during the course of the treatment, thus helping the expert to find the right balance between the risk of insufficient normal tissue sparing and the risk of insufficient tumor control. By feeding this information to the optimizer, outcome distributions can be obtained where the probability of exceeding a given OAR maximum and that of falling short of a given target goal can be minimized simultaneously. The method is applicable to any source of residual motion uncertainty in treatment delivery. Any model that quantifies organ movement and deformation in terms of probability distributions can be used as basis for the algorithm. 
Thus, it can generate dose distributions that are robust against interfraction and intrafraction motion alike, effectively removing the need for indiscriminate safety margins.
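The notion of an outcome distribution can be illustrated with a one-dimensional toy problem. This is a sketch under strong simplifying assumptions, not the authors' implementation: motion is modelled as a rigid Gaussian shift, the dose metric is the mean target dose, and the function name and arguments are hypothetical.

```python
import numpy as np

def outcome_distribution(dose_profile, positions, target, shift_sd,
                         n_samples=2000, seed=0):
    """Mean and variance of a dose metric over sampled geometry instances.

    dose_profile : dose at each entry of positions (1-D, positions increasing)
    target       : (low, high) extent of the target in the nominal geometry
    shift_sd     : standard deviation of the assumed rigid Gaussian shift
    """
    rng = np.random.default_rng(seed)
    shifts = rng.normal(0.0, shift_sd, size=n_samples)
    low, high = target
    inside = positions[(positions >= low) & (positions <= high)]
    # Position of every target point in every sampled geometry instance
    shifted = inside[None, :] + shifts[:, None]
    doses = np.interp(shifted.ravel(), positions, dose_profile).reshape(shifted.shape)
    metric = doses.mean(axis=1)          # one metric value per geometry instance
    return metric.mean(), metric.var(ddof=1)
```

Comparing two candidate dose profiles by the mean and variance returned here mirrors the idea of judging a plan by its expected outcome and residual uncertainty rather than by a single nominal value.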
Distributive routing and congestion control in wireless multihop ad hoc communication networks
NASA Astrophysics Data System (ADS)
Glauche, Ingmar; Krause, Wolfram; Sollacher, Rudolf; Greiner, Martin
2004-10-01
Due to their inherent complexity, engineered wireless multihop ad hoc communication networks represent a technological challenge. Having no central infrastructure, the nodes have to self-organize in such a way that, for example, network connectivity, good data traffic performance and robustness are guaranteed. In this contribution the focus is on routing and congestion control. First, random data traffic along shortest-path routes is studied by simulations as well as theoretical modeling. Measures of congestion like end-to-end time delay and relaxation times are given. A scaling law of the average time delay with respect to network size is revealed and found to depend on the underlying network topology. In the second step, a distributive routing and congestion control is proposed. Each node locally propagates its routing cost estimates and information about its congestion state to its neighbors, which then update their respective cost estimates. This allows for a flexible adaptation of end-to-end routes to the overall congestion state of the network. Compared to shortest-path routing, the critical network load is significantly increased.
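The local propagation of cost estimates described above can be sketched as a distance-vector style update. The abstract does not specify the cost function, so the one used here (one unit per hop plus the neighbor's queue length) and the function names are assumptions made purely for illustration.

```python
def update_cost_estimates(neighbors, queue_length, destination, rounds):
    """Iteratively refresh each node's cost-to-destination from neighbor broadcasts.

    neighbors    : dict mapping node -> list of adjacent nodes
    queue_length : dict mapping node -> current backlog (assumed congestion measure)
    """
    inf = float("inf")
    cost = {node: (0.0 if node == destination else inf) for node in neighbors}
    for _ in range(rounds):
        advertised = dict(cost)  # values every node broadcast in the previous round
        for node in neighbors:
            if node == destination:
                continue
            # One hop, plus the neighbor's backlog, plus the neighbor's own estimate
            cost[node] = min(
                (1.0 + queue_length[nbr] + advertised[nbr] for nbr in neighbors[node]),
                default=inf,
            )
    return cost

def next_hop(node, neighbors, queue_length, cost):
    """Neighbor that currently offers the cheapest route to the destination."""
    return min(neighbors[node], key=lambda nbr: 1.0 + queue_length[nbr] + cost[nbr])
```

With uniform queues this reduces to shortest-path routing; once a relay builds up a backlog its advertised cost rises and traffic is steered around it using only local information.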
NASA Astrophysics Data System (ADS)
Famiglietti, C.; Fisher, J.; Halverson, G. H.
2017-12-01
This study validates a method of remote sensing near-surface meteorology that vertically interpolates MODIS atmospheric profiles to surface pressure level. The extraction of air temperature and dew point observations at a two-meter reference height from 2001 to 2014 yields global moderate- to fine-resolution near-surface temperature distributions that are compared to geographically and temporally corresponding measurements from 114 ground meteorological stations distributed worldwide. This analysis is the first robust, large-scale validation of the MODIS-derived near-surface air temperature and dew point estimates, both of which serve as key inputs in models of energy, water, and carbon exchange between the land surface and the atmosphere. Results show strong linear correlations between remotely sensed and in-situ near-surface air temperature measurements (R2 = 0.89), as well as between dew point observations (R2 = 0.77). Performance is relatively uniform across climate zones. The extension of mean climate-wise percent errors to the entire remote sensing dataset allows for the determination of MODIS air temperature and dew point uncertainties on a global scale.
Liu, Hong; Wang, Jie; Xu, Xiangyang; Song, Enmin; Wang, Qian; Jin, Renchao; Hung, Chih-Cheng; Fei, Baowei
2014-11-01
A robust and accurate center-frequency (CF) estimation (RACE) algorithm for improving the performance of the local sine-wave modeling (SinMod) method, which is a good motion estimation method for tagged cardiac magnetic resonance (MR) images, is proposed in this study. The RACE algorithm can automatically, effectively and efficiently produce a very appropriate CF estimate for the SinMod method, under the circumstance that the specified tagging parameters are unknown, on account of the following two key techniques: (1) the well-known mean-shift algorithm, which can provide accurate and rapid CF estimation; and (2) an original two-direction-combination strategy, which can further enhance the accuracy and robustness of CF estimation. Some other available CF estimation algorithms are brought out for comparison. Several validation approaches that can work on the real data without ground truths are specially designed. Experimental results on human body in vivo cardiac data demonstrate the significance of accurate CF estimation for SinMod, and validate the effectiveness of RACE in facilitating the motion estimation performance of SinMod. Copyright © 2014 Elsevier Inc. All rights reserved.
Robustness of a distributed neural network controller for locomotion in a hexapod robot
NASA Technical Reports Server (NTRS)
Chiel, Hillel J.; Beer, Randall D.; Quinn, Roger D.; Espenschied, Kenneth S.
1992-01-01
A distributed neural-network controller for locomotion, based on insect neurobiology, has been used to control a hexapod robot. How robust is this controller? Disabling any single sensor, effector, or central component did not prevent the robot from walking. Furthermore, statically stable gaits could be established using either sensor input or central connections. Thus, a complex interplay between central neural elements and sensor inputs is responsible for the robustness of the controller and its ability to generate a continuous range of gaits. These results suggest that biologically inspired neural-network controllers may be a robust method for robotic control.
Robust Parallel Motion Estimation and Mapping with Stereo Cameras in Underground Infrastructure
NASA Astrophysics Data System (ADS)
Liu, Chun; Li, Zhengning; Zhou, Yuan
2016-06-01
We developed a novel robust motion estimation method for localization and mapping in underground infrastructure using a pre-calibrated rigid stereo camera rig. Localization and mapping in underground infrastructure is important to safety, yet it is also nontrivial since most underground infrastructure has poor lighting conditions and featureless structure. To overcome these difficulties, we found that a parallel system is more efficient than the EKF-based SLAM approach, since the parallel system divides the motion estimation and 3D mapping tasks into separate threads, eliminating the data-association problem that is a significant issue in SLAM. Moreover, the motion estimation thread takes advantage of a state-of-the-art robust visual odometry algorithm which is highly functional under low illumination and provides accurate pose information. We designed and built an unmanned vehicle and used it to collect a dataset in an underground garage. The parallel system was evaluated on this dataset. Motion estimation results indicated a relative position error of 0.3%, and 3D mapping results showed a mean position error of 13 cm. Off-line processing reduced the position error to 2 cm. Performance evaluation on the dataset showed that our system is capable of robust motion estimation and accurate 3D mapping in poorly illuminated and featureless underground environments.
The 'robust' capture-recapture design allows components of recruitment to be estimated
Pollock, K.H.; Kendall, W.L.; Nichols, J.D.; Lebreton, J.-D.; North, P.M.
1993-01-01
The 'robust' capture-recapture design (Pollock 1982) allows analyses which combine features of closed population model analyses (Otis et al., 1978, White et al., 1982) and open population model analyses (Pollock et al., 1990). Estimators obtained under these analyses are more robust to unequal catchability than traditional Jolly-Seber estimators (Pollock, 1982; Pollock et al., 1990; Kendall, 1992). The robust design also allows estimation of parameters for population size, survival rate and recruitment numbers for all periods of the study unlike under Jolly-Seber type models. The major advantage of this design that we emphasize in this short review paper is that it allows separate estimation of immigration and in situ recruitment numbers for a two or more age class model (Nichols and Pollock, 1990). This is contrasted with the age-dependent Jolly-Seber model (Pollock, 1981; Stokes, 1984; Pollock et al., 1990) which provides separate estimates for immigration and in situ recruitment for all but the first two age classes where there is at least a three age class model. The ability to achieve this separation of recruitment components can be very important to population modelers and wildlife managers as many species can only be separated into two easily identified age classes in the field.
Extracting information in spike time patterns with wavelets and information theory.
Lopes-dos-Santos, Vítor; Panzeri, Stefano; Kayser, Christoph; Diamond, Mathew E; Quian Quiroga, Rodrigo
2015-02-01
We present a new method to assess the information carried by temporal patterns in spike trains. The method first performs a wavelet decomposition of the spike trains, then uses Shannon information to select a subset of coefficients carrying information, and finally assesses timing information in terms of decoding performance: the ability to identify the presented stimuli from spike train patterns. We show that the method allows: 1) a robust assessment of the information carried by spike time patterns even when this is distributed across multiple time scales and time points; 2) an effective denoising of the raster plots that improves the estimate of stimulus tuning of spike trains; and 3) an assessment of the information carried by temporally coordinated spikes across neurons. Using simulated data, we demonstrate that the Wavelet-Information (WI) method performs better and is more robust to spike time-jitter, background noise, and sample size than well-established approaches, such as principal component analysis, direct estimates of information from digitized spike trains, or a metric-based method. Furthermore, when applied to real spike trains from monkey auditory cortex and from rat barrel cortex, the WI method allows extracting larger amounts of spike timing information. Importantly, the fact that the WI method incorporates multiple time scales makes it robust to the choice of partly arbitrary parameters such as temporal resolution, response window length, number of response features considered, and the number of available trials. These results highlight the potential of the proposed method for accurate and objective assessments of how spike timing encodes information. Copyright © 2015 the American Physiological Society.
NASA Astrophysics Data System (ADS)
Baisden, W. T.; Canessa, S.
2013-01-01
In 1959, Athol Rafter began a substantial programme of systematically monitoring the flow of 14C produced by atmospheric thermonuclear tests through organic matter in New Zealand soils under stable land use. A database of ∼500 soil radiocarbon measurements spanning 50 years has now been compiled, and is used here to identify optimal approaches for soil C-cycle studies. Our results confirm the potential of 14C to determine residence times, by estimating the amount of ‘bomb 14C’ incorporated. High-resolution time series confirm this approach is appropriate, and emphasise that residence times can be calculated routinely with two or more time points as little as 10 years apart. This approach is generally robust to the key assumptions that can create large errors when single time-point 14C measurements are modelled. The three most critical assumptions relate to: (1) the distribution of turnover times, and particularly the proportion of old C (‘passive fraction’), (2) the lag time between photosynthesis and C entering the modelled pool, (3) changes in the rates of C input. When carrying out approaches using robust assumptions on time-series samples, multiple soil layers can be aggregated using a mixing equation. Where good archived samples are available, AMS measurements can develop useful understanding for calibrating models of the soil C cycle at regional to continental scales with sample numbers on the order of hundreds rather than thousands. Sample preparation laboratories and AMS facilities can play an important role in coordinating the efficient delivery of robust calculated residence times for soil carbon.
Efficient Levenberg-Marquardt minimization of the maximum likelihood estimator for Poisson deviates
DOE Office of Scientific and Technical Information (OSTI.GOV)
Laurence, T; Chromy, B
2009-11-10
Histograms of counted events are Poisson distributed, but are typically fitted without justification using nonlinear least squares fitting. The more appropriate maximum likelihood estimator (MLE) for Poisson distributed data is seldom used. We extend the use of the Levenberg-Marquardt algorithm commonly used for nonlinear least squares minimization for use with the MLE for Poisson distributed data. In so doing, we remove any excuse for not using this more appropriate MLE. We demonstrate the use of the algorithm and the superior performance of the MLE using simulations and experiments in the context of fluorescence lifetime imaging. Scientists commonly form histograms of counted events from their data, and extract parameters by fitting to a specified model. Assuming that the probability of occurrence for each bin is small, event counts in the histogram bins will be distributed according to the Poisson distribution. We develop here an efficient algorithm for fitting event counting histograms using the maximum likelihood estimator (MLE) for Poisson distributed data, rather than the non-linear least squares measure. This algorithm is a simple extension of the common Levenberg-Marquardt (L-M) algorithm, is simple to implement, quick and robust. Fitting using a least squares measure is most common, but it is the maximum likelihood estimator only for Gaussian-distributed data. Non-linear least squares methods may be applied to event counting histograms in cases where the number of events is very large, so that the Poisson distribution is well approximated by a Gaussian. However, it is not easy to satisfy this criterion in practice - which requires a large number of events. It has been well-known for years that least squares procedures lead to biased results when applied to Poisson-distributed data; a recent paper providing extensive characterization of these biases in exponential fitting is given.
The more appropriate measure based on the maximum likelihood estimator (MLE) for the Poisson distribution is also well known, but has not become generally used. This is primarily because, in contrast to non-linear least squares fitting, there has been no quick, robust, and general fitting method. In the field of fluorescence lifetime spectroscopy and imaging, there have been some efforts to use this estimator through minimization routines such as Nelder-Mead optimization, exhaustive line searches, and Gauss-Newton minimization. Minimization based on specific one- or multi-exponential models has been used to obtain quick results, but this procedure does not allow the incorporation of the instrument response, and is not generally applicable to models found in other fields. Methods for using the MLE for Poisson-distributed data have been published by the wider spectroscopic community, including iterative minimization schemes based on Gauss-Newton minimization. The slow acceptance of these procedures for fitting event counting histograms may also be explained by the use of the ubiquitous, fast Levenberg-Marquardt (L-M) fitting procedure for fitting non-linear models using least squares fitting (simple searches obtain approximately 10000 references - this doesn't include those who use it, but don't know they are using it). The benefits of L-M include a seamless transition between Gauss-Newton minimization and downward gradient minimization through the use of a regularization parameter. This transition is desirable because Gauss-Newton methods converge quickly, but only within a limited domain of convergence; on the other hand the downward gradient methods have a much wider domain of convergence, but converge extremely slowly nearer the minimum. L-M has the advantages of both procedures: relative insensitivity to initial parameters and rapid convergence. Scientists, when wanting an answer quickly, will fit data using L-M, get an answer, and move on.
Only those that are aware of the bias issues will bother to fit using the more appropriate MLE for Poisson deviates. However, since there is a simple, analytical formula for the appropriate MLE measure for Poisson deviates, it is inexcusable that least squares estimators are used almost exclusively when fitting event counting histograms. There have been ways found to use successive non-linear least squares fitting to obtain similarly unbiased results, but this procedure is justified by simulation, must be re-tested when conditions change significantly, and requires two successive fits. There is a great need for a fitting routine for the MLE estimator for Poisson deviates that has convergence domains and rates comparable to the non-linear least squares L-M fitting. We show in this report that a simple way to achieve that goal is to use the L-M fitting procedure not to minimize the least squares measure, but the MLE for Poisson deviates.
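The contrast between the least squares measure and the Poisson MLE measure can be made concrete with a small sketch. It assumes the standard Poisson deviance form, 2 Σ [m_i − n_i + n_i ln(n_i / m_i)], as the "simple, analytical formula" (the abstract does not write it out), and it uses a crude grid search over the lifetime rather than the authors' Levenberg-Marquardt extension. The single-exponential model, the fixed amplitude, and all function names are illustrative assumptions.

```python
import math

def poisson_deviance(counts, model):
    """Poisson MLE measure: 2 * sum(m - n + n * ln(n / m)).
    The n * ln(n / m) term is taken as 0 for empty bins (n == 0)."""
    total = 0.0
    for n, m in zip(counts, model):
        total += m - n
        if n > 0:
            total += n * math.log(n / m)
    return 2.0 * total

def least_squares(counts, model):
    """Unweighted sum of squared residuals, for comparison."""
    return sum((n - m) ** 2 for n, m in zip(counts, model))

def exp_decay(amplitude, lifetime, n_bins):
    """Expected counts per bin for a single-exponential decay (toy model)."""
    return [amplitude * math.exp(-t / lifetime) for t in range(n_bins)]

def fit_lifetime(counts, amplitude, candidates, measure):
    """Grid search: return the candidate lifetime that minimizes `measure`."""
    return min(
        candidates,
        key=lambda tau: measure(counts, exp_decay(amplitude, tau, len(counts))),
    )

histogram = [9, 7, 4, 6, 2, 3, 1, 0, 1, 0]      # made-up low-count histogram
grid = [1.0 + 0.25 * i for i in range(30)]      # candidate lifetimes in bins
print(fit_lifetime(histogram, 10.0, grid, poisson_deviance))
print(fit_lifetime(histogram, 10.0, grid, least_squares))
```

Swapping the objective is the only change between the two fits, which is the point the abstract makes about reusing an existing least squares minimizer for the Poisson measure.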
Chu, Hui-May; Ette, Ene I
2005-09-02
This study was performed to develop a new nonparametric approach for the estimation of robust tissue-to-plasma ratio from extremely sparsely sampled paired data (ie, one sample each from plasma and tissue per subject). Tissue-to-plasma ratio was estimated from paired/unpaired experimental data using independent time points approach, area under the curve (AUC) values calculated with the naïve data averaging approach, and AUC values calculated using sampling based approaches (eg, the pseudoprofile-based bootstrap [PpbB] approach and the random sampling approach [our proposed approach]). The random sampling approach involves the use of a 2-phase algorithm. The convergence of the sampling/resampling approaches was investigated, as well as the robustness of the estimates produced by different approaches. To evaluate the latter, new data sets were generated by introducing outlier(s) into the real data set. One to 2 concentration values were inflated by 10% to 40% from their original values to produce the outliers. Tissue-to-plasma ratios computed using the independent time points approach varied between 0 and 50 across time points. The ratio obtained from AUC values acquired using the naive data averaging approach was not associated with any measure of uncertainty or variability. Calculating the ratio without regard to pairing yielded poorer estimates. The random sampling and pseudoprofile-based bootstrap approaches yielded tissue-to-plasma ratios with uncertainty and variability. However, the random sampling approach, because of the 2-phase nature of its algorithm, yielded more robust estimates and required fewer replications. Therefore, a 2-phase random sampling approach is proposed for the robust estimation of tissue-to-plasma ratio from extremely sparsely sampled data.
Robustness of Oscillatory Behavior in Correlated Networks
Sasai, Takeyuki; Morino, Kai; Tanaka, Gouhei; Almendral, Juan A.; Aihara, Kazuyuki
2015-01-01
Understanding network robustness against failures of network units is useful for preventing large-scale breakdowns and damages in real-world networked systems. The tolerance of networked systems whose functions are maintained by collective dynamical behavior of the network units has recently been analyzed in the framework called dynamical robustness of complex networks. The effect of network structure on the dynamical robustness has been examined with various types of network topology, but the role of network assortativity, or degree–degree correlations, is still unclear. Here we study the dynamical robustness of correlated (assortative and disassortative) networks consisting of diffusively coupled oscillators. Numerical analyses for the correlated networks with Poisson and power-law degree distributions show that network assortativity enhances the dynamical robustness of the oscillator networks but the impact of network disassortativity depends on the detailed network connectivity. Furthermore, we theoretically analyze the dynamical robustness of correlated bimodal networks with two-peak degree distributions and show the positive impact of the network assortativity. PMID:25894574
A new Bayesian Earthquake Analysis Tool (BEAT)
NASA Astrophysics Data System (ADS)
Vasyura-Bathke, Hannes; Dutta, Rishabh; Jónsson, Sigurjón; Mai, Martin
2017-04-01
Modern earthquake source estimation studies increasingly use non-linear optimization strategies to estimate kinematic rupture parameters, often considering geodetic and seismic data jointly. However, the optimization process is complex and consists of several steps that need to be followed in the earthquake parameter estimation procedure. These include pre-describing or modeling the fault geometry, calculating the Green's Functions (often assuming a layered elastic half-space), and estimating the distributed final slip and possibly other kinematic source parameters. Recently, Bayesian inference has become popular for estimating posterior distributions of earthquake source model parameters given measured/estimated/assumed data and model uncertainties. For instance, some research groups consider uncertainties of the layered medium and propagate these to the source parameter uncertainties. Other groups make use of informative priors to reduce the model parameter space. In addition, innovative sampling algorithms have been developed that efficiently explore the often high-dimensional parameter spaces. Compared to earlier studies, these improvements have resulted in overall more robust source model parameter estimates that include uncertainties. However, the computational demands of these methods are high and estimation codes are rarely distributed along with the published results. Even if codes are made available, it is often difficult to assemble them into a single optimization framework as they are typically coded in different programing languages. Therefore, further progress and future applications of these methods/codes are hampered, while reproducibility and validation of results has become essentially impossible. 
In the spirit of providing open-access and modular codes to facilitate progress and reproducible research in earthquake source estimations, we undertook the effort of producing BEAT, a Python package that comprises all the above-mentioned features in one single programing environment. The package is built on top of the pyrocko seismological toolbox (www.pyrocko.org) and makes use of the pymc3 module for Bayesian statistical model fitting. BEAT is an open-source package (https://github.com/hvasbath/beat) and we encourage and solicit contributions to the project. In this contribution, we present our strategy for developing BEAT, show application examples, and discuss future developments.
Modeling chemical gradients in sediments under losing and gaining flow conditions: The GRADIENT code
NASA Astrophysics Data System (ADS)
Boano, Fulvio; De Falco, Natalie; Arnon, Shai
2018-02-01
Interfaces between sediments and water bodies often represent biochemical hotspots for nutrient reactions and are characterized by steep concentration gradients of different reactive solutes. Vertical profiles of these concentrations are routinely collected to obtain information on nutrient dynamics, and simple codes have been developed to analyze these profiles and determine the magnitude and distribution of reaction rates within sediments. However, existing publicly available codes do not consider the potential contribution of water flow in the sediments to nutrient transport, and their applications to field sites with significant water-borne nutrient fluxes may lead to large errors in the estimated reaction rates. To fill this gap, the present work presents GRADIENT, a novel algorithm to evaluate distributions of reaction rates from observed concentration profiles. GRADIENT is a Matlab code that extends a previously published framework to include the role of nutrient advection, and provides robust estimates of reaction rates in sediments with significant water flow. This work discusses the theoretical basis of the method and shows its performance by comparing the results to a series of synthetic data and to laboratory experiments. The results clearly show that in systems with losing or gaining fluxes, the inclusion of such fluxes is critical for estimating local and overall reaction rates in sediments.
High Resolution Deformation Time Series Estimation for Distributed Scatterers Using Terrasar-X Data
NASA Astrophysics Data System (ADS)
Goel, K.; Adam, N.
2012-07-01
In recent years, several SAR satellites such as TerraSAR-X, COSMO-SkyMed and Radarsat-2 have been launched. These satellites provide high resolution data suitable for sophisticated interferometric applications. With the shorter repeat cycles, smaller orbital tubes and higher bandwidth of these satellites, deformation time series analysis of distributed scatterers (DSs) is now supported by a practical data basis. Techniques for exploiting DSs in non-urban (rural) areas include the Small Baseline Subset Algorithm (SBAS). However, it involves spatial phase unwrapping, and phase unwrapping errors are typically encountered in rural areas and are difficult to detect. In addition, the SBAS technique involves a rectangular multilooking of the differential interferograms to reduce phase noise, resulting in a loss of resolution and superposition of different objects on the ground. In this paper, we introduce a new approach for deformation monitoring with a focus on DSs, in which there is no need to unwrap the differential interferograms and the deformation is mapped at object resolution. It is based on a robust object-adaptive parameter estimation using single look differential interferograms, in which the local tilts of deformation velocity and the local slopes of the residual DEM in the range and azimuth directions are estimated. We present here the technical details and a processing example of this newly developed algorithm.
Water age and stream solute dynamics at the Hubbard Brook Experimental Forest (US)
NASA Astrophysics Data System (ADS)
Botter, Gianluca; Benettin, Paolo; McGuire, Kevin; Rinaldo, Andrea
2016-04-01
The contribution discusses experimental and modeling results from a headwater catchment at the Hubbard Brook Experimental Forest (New Hampshire, USA) to explore the link between stream solute dynamics and water age. A theoretical framework based on water age dynamics, which represents a general basis for characterizing solute transport at the catchment scale, is used to model both conservative and weathering-derived solutes. Based on the available information about the hydrology of the site, an integrated transport model was developed and used to estimate the relevant hydrochemical fluxes. The model was designed to reproduce the deuterium content of streamflow and allowed for the estimate of catchment water storage and dynamic travel time distributions (TTDs). Within this framework, dissolved silicon and sodium concentration in streamflow were simulated by implementing first-order chemical kinetics based explicitly on dynamic TTD, thus upscaling local geochemical processes to catchment scale. Our results highlight the key role of water stored within the subsoil glacial material in both the short-term and long-term solute circulation at Hubbard Brook. The analysis of the results provided by the calibrated model allowed a robust estimate of the emerging concentration-discharge relationship, streamflow age distributions (including the fraction of event water) and storage size, and their evolution in time due to hydrologic variability.
A systematic review of waterborne disease burden methodologies from developed countries.
Murphy, H M; Pintar, K D M; McBean, E A; Thomas, M K
2014-12-01
The true incidence of endemic acute gastrointestinal illness (AGI) attributable to drinking water in Canada is unknown. Using a systematic review framework, the literature was evaluated to identify methods used to attribute AGI to drinking water. Several strategies have been suggested or applied to quantify AGI attributable to drinking water at a national level. These vary from simple point estimates, to quantitative microbial risk assessment, to Monte Carlo simulations, which rely on assumptions and epidemiological data from the literature. Using two methods proposed by researchers in the USA, this paper compares the current approaches and key assumptions. Knowledge gaps are identified to inform future waterborne disease attribution estimates. To improve future estimates, there is a need for robust epidemiological studies that quantify the health risks associated with small, private water systems, groundwater systems and the influence of distribution system intrusions on risk. Quantification of the occurrence of enteric pathogens in water supplies, particularly for groundwater, is needed. In addition, there are unanswered questions regarding the susceptibility of vulnerable sub-populations to these pathogens and the influence of extreme weather events (precipitation) on AGI-related health risks. National centralized data to quantify the proportions of the population served by different water sources, by treatment level, source water quality, and the condition of the distribution system infrastructure, are needed.
Real-time sensor validation and fusion for distributed autonomous sensors
NASA Astrophysics Data System (ADS)
Yuan, Xiaojing; Li, Xiangshang; Buckles, Bill P.
2004-04-01
Multi-sensor data fusion has found widespread applications in industrial and research sectors. The purpose of real time multi-sensor data fusion is to dynamically estimate an improved system model from a set of different data sources, i.e., sensors. This paper presented a systematic and unified real time sensor validation and fusion framework (RTSVFF) based on distributed autonomous sensors. The RTSVFF is an open architecture which consists of four layers - the transaction layer, the process fusion layer, the control layer, and the planning layer. This paradigm facilitates distribution of intelligence to the sensor level and sharing of information among sensors, controllers, and other devices in the system. The openness of the architecture also provides a platform to test different sensor validation and fusion algorithms and thus facilitates the selection of near optimal algorithms for specific sensor fusion application. In the version of the model presented in this paper, confidence weighted averaging is employed to address the dynamic system state issue noted above. The state is computed using an adaptive estimator and dynamic validation curve for numeric data fusion and a robust diagnostic map for decision level qualitative fusion. The framework is then applied to automatic monitoring of a gas-turbine engine, including a performance comparison of the proposed real-time sensor fusion algorithms and a traditional numerical weighted average.
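The confidence weighted averaging mentioned above can be illustrated with a short sketch. The abstract does not specify the validation curve or the adaptive estimator, so the triangular curve, the fixed expected value, the tolerance parameter, and the function names below are all hypothetical.

```python
def validation_confidence(value, expected, tolerance):
    """Hypothetical triangular validation curve: confidence is 1 at the
    expected value and falls linearly to 0 at `tolerance` away from it."""
    return max(0.0, 1.0 - abs(value - expected) / tolerance)

def confidence_weighted_average(readings):
    """Fuse (value, confidence) pairs into a single estimate."""
    total_weight = sum(conf for _, conf in readings)
    if total_weight == 0:
        raise ValueError("no reading has non-zero confidence")
    return sum(value * conf for value, conf in readings) / total_weight

def fuse(values, expected, tolerance):
    """Validate each sensor value against the expected state, then fuse."""
    readings = [(v, validation_confidence(v, expected, tolerance)) for v in values]
    return confidence_weighted_average(readings)

# The third sensor has failed; its reading gets zero confidence.
print(fuse([10.0, 12.0, 50.0], expected=10.0, tolerance=4.0))
```

In a running system the expected value would come from a state estimator updated at each step, and a plain arithmetic mean of the same readings (24.0 here) is the kind of baseline the abstract's comparison refers to.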
Kirby, James B.; Bollen, Kenneth A.
2009-01-01
Structural Equation Modeling with latent variables (SEM) is a powerful tool for social and behavioral scientists, combining many of the strengths of psychometrics and econometrics into a single framework. The most common estimator for SEM is the full-information maximum likelihood estimator (ML), but there is continuing interest in limited information estimators because of their distributional robustness and their greater resistance to structural specification errors. However, the literature discussing model fit for limited information estimators for latent variable models is sparse compared to that for full information estimators. We address this shortcoming by providing several specification tests based on the 2SLS estimator for latent variable structural equation models developed by Bollen (1996). We explain how these tests can be used to not only identify a misspecified model, but to help diagnose the source of misspecification within a model. We present and discuss results from a Monte Carlo experiment designed to evaluate the finite sample properties of these tests. Our findings suggest that the 2SLS tests successfully identify most misspecified models, even those with modest misspecification, and that they provide researchers with information that can help diagnose the source of misspecification. PMID:20419054
Gething, Peter W; Patil, Anand P; Hay, Simon I
2010-04-01
Risk maps estimating the spatial distribution of infectious diseases are required to guide public health policy from local to global scales. The advent of model-based geostatistics (MBG) has allowed these maps to be generated in a formal statistical framework, providing robust metrics of map uncertainty that enhances their utility for decision-makers. In many settings, decision-makers require spatially aggregated measures over large regions such as the mean prevalence within a country or administrative region, or national populations living under different levels of risk. Existing MBG mapping approaches provide suitable metrics of local uncertainty--the fidelity of predictions at each mapped pixel--but have not been adapted for measuring uncertainty over large areas, due largely to a series of fundamental computational constraints. Here the authors present a new efficient approximating algorithm that can generate for the first time the necessary joint simulation of prevalence values across the very large prediction spaces needed for global scale mapping. This new approach is implemented in conjunction with an established model for P. falciparum allowing robust estimates of mean prevalence at any specified level of spatial aggregation. The model is used to provide estimates of national populations at risk under three policy-relevant prevalence thresholds, along with accompanying model-based measures of uncertainty. By overcoming previously unchallenged computational barriers, this study illustrates how MBG approaches, already at the forefront of infectious disease mapping, can be extended to provide large-scale aggregate measures appropriate for decision-makers.
Adolescent mental health and earnings inequalities in adulthood: evidence from the Young-HUNT Study.
Evensen, Miriam; Lyngstad, Torkild Hovde; Melkevik, Ole; Reneflot, Anne; Mykletun, Arnstein
2017-02-01
Previous studies have shown that adolescent mental health problems are associated with lower employment probabilities and risk of unemployment. The evidence on how earnings are affected is much weaker, and few have addressed whether any association reflects unobserved characteristics and whether the consequences of mental health problems vary across the earnings distribution. A population-based Norwegian health survey linked to administrative registry data (N=7885) was used to estimate how adolescents' mental health problems (separate indicators of internalising, conduct, and attention problems and total sum scores) affect earnings (≥30 years) in young adulthood. We used linear regression with fixed-effects models comparing either students within schools or siblings within families. Unconditional quantile regressions were used to explore differentials across the earnings distribution. Mental health problems in adolescence reduce average earnings in adulthood, and associations are robust to control for observed family background and school fixed effects. For some, but not all mental health problems, associations are also robust in sibling fixed-effects models, where all stable family factors are controlled. Further, we found much larger earnings loss below the 25th centile. Adolescent mental health problems reduce adult earnings, especially among individuals in the lower tail of the earnings distribution. Preventing mental health problems in adolescence may increase future earnings. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://www.bmj.com/company/products-services/rights-and-licensing/.
A comparison of solute-transport solution techniques based on inverse modelling results
Mehl, S.; Hill, M.C.
2000-01-01
Five common numerical techniques (finite difference, predictor-corrector, total-variation-diminishing, method-of-characteristics, and modified-method-of-characteristics) were tested using simulations of a controlled conservative tracer-test experiment through a heterogeneous, two-dimensional sand tank. The experimental facility was constructed using randomly distributed homogeneous blocks of five sand types. This experimental model provides an outstanding opportunity to compare the solution techniques because of the heterogeneous hydraulic conductivity distribution of known structure, and the availability of detailed measurements with which to compare simulated concentrations. The present work uses this opportunity to investigate how three common types of results - simulated breakthrough curves, sensitivity analysis, and calibrated parameter values - change in this heterogeneous situation, given the different methods of simulating solute transport. The results show that simulated peak concentrations, even at very fine grid spacings, varied because of different amounts of numerical dispersion. Sensitivity analysis results were robust in that they were independent of the solution technique. They revealed extreme correlation between hydraulic conductivity and porosity, and that the breakthrough curve data did not provide enough information about the dispersivities to estimate individual values for the five sands. However, estimated hydraulic conductivity values are significantly influenced by both the large possible variations in model dispersion and the amount of numerical dispersion present in the solution technique.
Mohammadi, Mohammad Hossein; Vanclooster, Marnik
2012-05-01
Solute transport in partially saturated soils is largely affected by fluid velocity distribution and pore size distribution within the solute transport domain. Hence, it is possible to describe the solute transport process in terms of the pore size distribution of the soil, and indirectly in terms of the soil hydraulic properties. In this paper, we present a conceptual approach that allows predicting the parameters of the Convective Lognormal Transfer model from knowledge of soil moisture and the Soil Moisture Characteristic (SMC), parameterized by means of the closed-form model of Kosugi (1996). It is assumed that in partially saturated conditions, the air-filled pore volume acts as an inert solid phase, allowing the use of the Arya et al. (1999) pragmatic approach to estimate solute travel time statistics from the saturation degree and SMC parameters. The approach is evaluated using a set of partially saturated transport experiments as presented by Mohammadi and Vanclooster (2011). Experimental results showed that the mean solute travel time, μ(t), increases proportionally with the depth (travel distance) and decreases with flow rate. The variance of solute travel time, σ²(t), first decreases with flow rate up to 0.4-0.6 Ks and subsequently increases. For all tested BTCs, solute transport predicted with μ(t) estimated from the conceptual model performed much better than predictions with μ(t) and σ²(t) estimated from calibration of solute transport at shallow soil depths. The use of μ(t) estimated from the conceptual model therefore increases the robustness of the CLT model in predicting solute transport in heterogeneous soils at larger depths. In view of the fact that reasonable indirect estimates of the SMC can be made from basic soil properties using pedotransfer functions, the presented approach may be useful for predicting solute transport at field or watershed scales. Copyright © 2012 Elsevier B.V. All rights reserved.
The end of trend-estimation for extreme floods under climate change?
NASA Astrophysics Data System (ADS)
Schulz, Karsten; Bernhardt, Matthias
2016-04-01
An increased risk of flood events is one of the major threats under future climate change conditions. Therefore, many recent studies have investigated trends in the occurrence of flood extremes using historic long-term river discharge data as well as simulations from combined global/regional climate and hydrological models. Severe floods are relatively rare events and the robust estimation of their probability of occurrence requires long time series of data (6). Following a method outlined by the IPCC research community, trends in extreme floods are calculated based on the difference of discharge values exceeding e.g. a 100-year level (Q100) between two 30-year windows, which represent prevailing conditions in a reference and a future time period, respectively. Following this approach, we analysed multiple, synthetically derived 2,000-year trend-free, yearly maximum runoff data generated using three different extreme value distributions (EVD). The parameters were estimated from long term runoff data of four large European watersheds (Danube, Elbe, Rhine, Thames). Both the Q100 values estimated from 30-year moving windows and the subsequently derived trends showed enormous variations with time: for example, fitting the Extreme Value (Gumbel) distribution to the Danube data, trends of Q100 in the synthetic time series range from -4,480 to 4,028 m³/s per 100 years (Q100 = 10,071 m³/s, for reference). Similar results were found when applying other extreme value distributions (Weibull and log-normal) to all of the watersheds considered. This variability or "background noise" of estimating trends in flood extremes makes it almost impossible to significantly distinguish any real trend in observed as well as modelled data when such an approach is applied. These uncertainties, even though known in principle, are hardly addressed and discussed by the climate change impact community. Any decision making and flood risk management, including the dimensioning of flood protection measures, that is based on such studies might therefore be fundamentally flawed.
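The window-difference procedure described in this abstract can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' code: the Gumbel location and scale (5000 and 1100 m³/s) are hypothetical values, not the fitted Danube parameters; the method-of-moments fit and the 100-year gap between the reference and future windows are assumptions made here for simplicity.

```python
import math
import random
import statistics

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def sample_gumbel(rng, loc, scale):
    """Draw one annual maximum from a Gumbel distribution by inversion."""
    u = rng.random()
    while u <= 0.0:  # avoid log(0)
        u = rng.random()
    return loc - scale * math.log(-math.log(u))


def q100_from_window(window):
    """Method-of-moments Gumbel fit, then the 100-year return level."""
    scale = statistics.stdev(window) * math.sqrt(6) / math.pi
    loc = statistics.fmean(window) - EULER_GAMMA * scale
    return loc - scale * math.log(-math.log(1 - 1 / 100))


def window_trends(series, width=30, gap=100):
    """Q100 difference between each window and the window `gap` years later."""
    trends = []
    for start in range(len(series) - gap - width + 1):
        reference = q100_from_window(series[start:start + width])
        future = q100_from_window(series[start + gap:start + gap + width])
        trends.append(future - reference)
    return trends


# Hypothetical, trend-free 2,000-year series of yearly maximum runoff (m^3/s).
rng = random.Random(0)
series = [sample_gumbel(rng, 5000.0, 1100.0) for _ in range(2000)]
trends = window_trends(series)
print("apparent Q100 trends per 100 years:", min(trends), "to", max(trends))
```

Because the synthetic series has no trend by construction, any spread in the printed range is sampling noise from the 30-year windows, which is the effect the abstract describes.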
NASA Astrophysics Data System (ADS)
Mohammadi, Mohammad Hossein; Vanclooster, Marnik
2012-05-01
Solute transport in partially saturated soils is largely affected by fluid velocity distribution and pore size distribution within the solute transport domain. Hence, it is possible to describe the solute transport process in terms of the pore size distribution of the soil, and indirectly in terms of the soil hydraulic properties. In this paper, we present a conceptual approach that allows predicting the parameters of the Convective Lognormal Transfer model from knowledge of soil moisture and the Soil Moisture Characteristic (SMC), parameterized by means of the closed-form model of Kosugi (1996). It is assumed that in partially saturated conditions, the air-filled pore volume acts as an inert solid phase, allowing the use of the Arya et al. (1999) pragmatic approach to estimate solute travel time statistics from the saturation degree and SMC parameters. The approach is evaluated using a set of partially saturated transport experiments as presented by Mohammadi and Vanclooster (2011). Experimental results showed that the mean solute travel time, μ(t), increases proportionally with the depth (travel distance) and decreases with flow rate. The variance of solute travel time, σ²(t), first decreases with flow rate up to 0.4-0.6 Ks and subsequently increases. For all tested BTCs, solute transport predicted with μ(t) estimated from the conceptual model performed much better than predictions with μ(t) and σ²(t) estimated from calibration of solute transport at shallow soil depths. The use of μ(t) estimated from the conceptual model therefore increases the robustness of the CLT model in predicting solute transport in heterogeneous soils at larger depths. In view of the fact that reasonable indirect estimates of the SMC can be made from basic soil properties using pedotransfer functions, the presented approach may be useful for predicting solute transport at field or watershed scales.
Doubly Robust Additive Hazards Models to Estimate Effects of a Continuous Exposure on Survival.
Wang, Yan; Lee, Mihye; Liu, Pengfei; Shi, Liuhua; Yu, Zhi; Abu Awad, Yara; Zanobetti, Antonella; Schwartz, Joel D
2017-11-01
The effect of an exposure on survival can be biased when the regression model is misspecified. Hazard difference is easier to use in risk assessment than hazard ratio and has a clearer interpretation in the assessment of effect modifications. We proposed two doubly robust additive hazards models to estimate the causal hazard difference of a continuous exposure on survival. The first model is an inverse probability-weighted additive hazards regression. The second model is an extension of the doubly robust estimator for binary exposures by categorizing the continuous exposure. We compared these with the marginal structural model and outcome regression with correct and incorrect model specifications using simulations. We applied doubly robust additive hazard models to the estimation of hazard difference of long-term exposure to PM2.5 (particulate matter with an aerodynamic diameter less than or equal to 2.5 microns) on survival using a large cohort of 13 million older adults residing in seven states of the Southeastern United States. We showed that the proposed approaches are doubly robust. We found that each 1 μg/m³ increase in annual PM2.5 exposure was associated with a causal hazard difference in mortality of 8.0 × 10 (95% confidence interval 7.4 × 10, 8.7 × 10), which was modified by age, medical history, socioeconomic status, and urbanicity. The overall hazard difference translates to approximately 5.5 (5.1, 6.0) thousand deaths per year in the study population. The proposed approaches improve the robustness of the additive hazards model and produce a novel additive causal estimate of PM2.5 on survival and several additive effect modifications, including social inequality.
Variability of space climate and its extremes with successive solar cycles
NASA Astrophysics Data System (ADS)
Chapman, Sandra; Hush, Phillip; Tindale, Elisabeth; Dunlop, Malcolm; Watkins, Nicholas
2016-04-01
Auroral geomagnetic indices coupled with in situ solar wind monitors provide a comprehensive data set, spanning several solar cycles. Space climate can be considered as the distribution of space weather. We can then characterize these observations in terms of changing space climate by quantifying how the statistical properties of ensembles of these observed variables vary between different phases of the solar cycle. We first consider the AE index burst distribution. Bursts are constructed by thresholding the AE time series; the size of a burst is the sum of the excess in the time series for each time interval over which the threshold is exceeded. The distribution of burst sizes is two-component with a crossover in behaviour at thresholds ≈ 1000 nT. Above this threshold, we find [1] a range over which the mean burst size is almost constant with threshold for both solar maxima and minima. The burst size distribution of the largest events has a functional form which is exponential. The relative likelihood of these large events varies from one solar maximum and minimum to the next. If the relative overall activity of a solar maximum/minimum can be estimated, these results then constrain the likelihood of extreme events of a given size for that solar maximum/minimum. We next develop and apply a methodology to quantify how the full distribution of geomagnetic indices and upstream solar wind observables are changing between and across different solar cycles. This methodology [2] estimates how different quantiles of the distribution, or equivalently, how the return times of events of a given size, are changing. [1] Hush, P., S. C. Chapman, M. W. Dunlop, and N. W. Watkins (2015), Robust statistical properties of the size of large burst events in AE, Geophys. Res. Lett., 42, doi:10.1002/2015GL066277. [2] Chapman, S. C., D. A. Stainforth, and N. W. Watkins (2013), On estimating long term local climate trends, Phil. Trans. Royal Soc. A, 371, 20120287, doi:10.1098/rsta.2012.0287.
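The burst construction described in this abstract (threshold the series, then sum the excess over each run of consecutive exceedances) can be illustrated with a short Python sketch. The function name `burst_sizes`, the sample values, and the assumption of a unit sampling interval are all hypothetical and introduced here for illustration only.

```python
def burst_sizes(series, threshold):
    """Return the size of each burst: the summed excess over `threshold`
    across each run of consecutive samples that exceed it.
    Assumes a unit sampling interval."""
    sizes = []
    current = 0.0
    in_burst = False
    for value in series:
        if value > threshold:
            # Still inside (or just entering) a burst: accumulate the excess.
            current += value - threshold
            in_burst = True
        elif in_burst:
            # The series dropped back to or below the threshold: close the burst.
            sizes.append(current)
            current = 0.0
            in_burst = False
    if in_burst:
        # The series ended while still above the threshold.
        sizes.append(current)
    return sizes


# Made-up AE-like values in nT, thresholded at 1000 nT.
ae = [200, 1200, 1500, 900, 1100, 1300, 1050, 400]
print(burst_sizes(ae, 1000))  # [700.0, 450.0]
```

Repeating the call over a range of thresholds gives the burst-size distribution as a function of threshold, which is the quantity the abstract examines.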
Local Estimators for Spacecraft Formation Flying
NASA Technical Reports Server (NTRS)
Fathpour, Nanaz; Hadaegh, Fred Y.; Mesbahi, Mehran; Nabi, Marzieh
2011-01-01
A formation estimation architecture for formation flying builds upon the local information exchange among multiple local estimators. Spacecraft formation flying involves the coordination of states among multiple spacecraft through relative sensing, inter-spacecraft communication, and control. Most existing formation flying estimation algorithms can only be supported via highly centralized, all-to-all, static relative sensing. New algorithms are needed that are scalable, modular, and robust to variations in the topology and link characteristics of the formation exchange network. These distributed algorithms should rely on a local information-exchange network, relaxing the assumptions on existing algorithms. In this research, it was shown that only local observability is required to design a formation estimator and control law. The approach relies on breaking up the overall information-exchange network into a sequence of local subnetworks, and invoking an agreement-type filter to reach consensus among local estimators within each local network. State estimates were obtained by a set of local measurements that were passed through a set of communicating Kalman filters to reach an overall state estimation for the formation. An optimization approach was also presented by means of which diffused estimates over the network can be incorporated in the local estimates obtained by each estimator via local measurements. This approach compares favorably with that obtained by a centralized Kalman filter, which requires complete knowledge of the raw measurement available to each estimator.
Rasmussen, Peter M.; Smith, Amy F.; Sakadžić, Sava; Boas, David A.; Pries, Axel R.; Secomb, Timothy W.; Østergaard, Leif
2017-01-01
Objective In vivo imaging of the microcirculation and network-oriented modeling have emerged as powerful means of studying microvascular function and understanding its physiological significance. Network-oriented modeling may provide the means of summarizing vast amounts of data produced by high-throughput imaging techniques in terms of key, physiological indices. To estimate such indices with sufficient certainty, however, network-oriented analysis must be robust to the inevitable presence of uncertainty due to measurement errors as well as model errors. Methods We propose the Bayesian probabilistic data analysis framework as a means of integrating experimental measurements and network model simulations into a combined and statistically coherent analysis. The framework naturally handles noisy measurements and provides posterior distributions of model parameters as well as physiological indices associated with uncertainty. Results We applied the analysis framework to experimental data from three rat mesentery networks and one mouse brain cortex network. We inferred distributions for more than five hundred unknown pressure and hematocrit boundary conditions. Model predictions were consistent with previous analyses, and remained robust when measurements were omitted from model calibration. Conclusion Our Bayesian probabilistic approach may be suitable for optimizing data acquisition and for analyzing and reporting large datasets acquired as part of microvascular imaging studies. PMID:27987383
Liu, Yanchi; Wang, Xue; Liu, Youda; Cui, Sujin
2016-06-27
Power quality analysis issues, especially the measurement of harmonic and interharmonic in cyber-physical energy systems, are addressed in this paper. As new situations are introduced to the power system, the impact of electric vehicles, distributed generation and renewable energy has introduced extra demands to distributed sensors, waveform-level information and power quality data analytics. Harmonics and interharmonics, as the most significant disturbances, require carefully designed detection methods for an accurate measurement of electric loads whose information is crucial to subsequent analyzing and control. This paper gives a detailed description of the power quality analysis framework in networked environment and presents a fast and resolution-enhanced method for harmonic and interharmonic measurement. The proposed method first extracts harmonic and interharmonic components efficiently using the single-channel version of Robust Independent Component Analysis (RobustICA), then estimates the high-resolution frequency from three discrete Fourier transform (DFT) samples with little additional computation, and finally computes the amplitudes and phases with the adaptive linear neuron network. The experiments show that the proposed method is time-efficient and leads to a better accuracy of the simulated and experimental signals in the presence of noise and fundamental frequency deviation, thus providing a deeper insight into the (inter)harmonic sources or even the whole system.
Liu, Yanchi; Wang, Xue; Liu, Youda; Cui, Sujin
2016-01-01
Power quality analysis issues, especially the measurement of harmonic and interharmonic in cyber-physical energy systems, are addressed in this paper. As new situations are introduced to the power system, the impact of electric vehicles, distributed generation and renewable energy has introduced extra demands to distributed sensors, waveform-level information and power quality data analytics. Harmonics and interharmonics, as the most significant disturbances, require carefully designed detection methods for an accurate measurement of electric loads whose information is crucial to subsequent analyzing and control. This paper gives a detailed description of the power quality analysis framework in networked environment and presents a fast and resolution-enhanced method for harmonic and interharmonic measurement. The proposed method first extracts harmonic and interharmonic components efficiently using the single-channel version of Robust Independent Component Analysis (RobustICA), then estimates the high-resolution frequency from three discrete Fourier transform (DFT) samples with little additional computation, and finally computes the amplitudes and phases with the adaptive linear neuron network. The experiments show that the proposed method is time-efficient and leads to a better accuracy of the simulated and experimental signals in the presence of noise and fundamental frequency deviation, thus providing a deeper insight into the (inter)harmonic sources or even the whole system. PMID:27355946
Tools of Robustness for Item Response Theory.
ERIC Educational Resources Information Center
Jones, Douglas H.
This paper briefly demonstrates a few of the possibilities of a systematic application of robustness theory, concentrating on the estimation of ability when the true item response model does and does not fit the data. The definition of the maximum likelihood estimator (MLE) of ability is briefly reviewed. After introducing the notion of…
The Robustness of LISREL Estimates in Structural Equation Models with Categorical Variables.
ERIC Educational Resources Information Center
Ethington, Corinna A.
This study examined the effect of type of correlation matrix on the robustness of LISREL maximum likelihood and unweighted least squares structural parameter estimates for models with categorical manifest variables. Two types of correlation matrices were analyzed; one containing Pearson product-moment correlations and one containing tetrachoric,…
NASA Astrophysics Data System (ADS)
Ablay, Gunyaz
Using traditional control methods for controller design, parameter estimation and fault diagnosis may lead to poor results with nuclear systems in practice because of approximations and uncertainties in the system models used, possibly resulting in unexpected plant unavailability. This experience has led to an interest in development of robust control, estimation and fault diagnosis methods. One particularly robust approach is the sliding mode control methodology. Sliding mode approaches have been of great interest and importance in industry and engineering in the recent decades due to their potential for producing economic, safe and reliable designs. In order to utilize these advantages, sliding mode approaches are implemented for robust control, state estimation, secure communication and fault diagnosis in nuclear plant systems. In addition, a sliding mode output observer is developed for fault diagnosis in dynamical systems. To validate the effectiveness of the methodologies, several nuclear plant system models are considered for applications, including point reactor kinetics, xenon concentration dynamics, an uncertain pressurizer model, a U-tube steam generator model and a coupled nonlinear nuclear reactor model.
NASA Astrophysics Data System (ADS)
Friedel, M. J.; Daughney, C.
2016-12-01
The development of a successful surface-groundwater management strategy depends on the quality of data provided for analysis. This study evaluates the statistical robustness when using a modified self-organizing map (MSOM) technique to estimate missing values for three hypersurface models: synoptic groundwater-surface water hydrochemistry, time-series of groundwater-surface water hydrochemistry, and mixed-survey (combination of groundwater-surface water hydrochemistry and lithologies) hydrostratigraphic unit data. These models of increasing complexity are developed and validated based on observations from the Southland region of New Zealand. In each case, the estimation method is sufficiently robust to cope with groundwater-surface water hydrochemistry vagaries due to sample size and extreme data insufficiency, even when >80% of the data are missing. The estimation of surface water hydrochemistry time series values enabled the evaluation of seasonal variation, and the imputation of lithologies facilitated the evaluation of hydrostratigraphic controls on groundwater-surface water interaction. The robust statistical results for groundwater-surface water models of increasing data complexity provide justification to apply the MSOM technique in other regions of New Zealand and abroad.
Robust mislabel logistic regression without modeling mislabel probabilities.
Hung, Hung; Jou, Zhi-Yu; Huang, Su-Yun
2018-03-01
Logistic regression is among the most widely used statistical methods for linear discriminant analysis. In many applications, we only observe possibly mislabeled responses. Fitting a conventional logistic regression can then lead to biased estimation. One common resolution is to fit a mislabel logistic regression model, which takes mislabeled responses into consideration. Another common method is to adopt a robust M-estimation by down-weighting suspected instances. In this work, we propose a new robust mislabel logistic regression based on γ-divergence. Our proposal possesses two advantageous features: (1) It does not need to model the mislabel probabilities. (2) The minimum γ-divergence estimation leads to a weighted estimating equation without the need to include any bias correction term, that is, it is automatically bias-corrected. These features make the proposed γ-logistic regression more robust in model fitting and more intuitive for model interpretation through a simple weighting scheme. Our method is also easy to implement, and two types of algorithms are included. Simulation studies and the Pima data application are presented to demonstrate the performance of γ-logistic regression. © 2017, The International Biometric Society.
Networked buffering: a basic mechanism for distributed robustness in complex adaptive systems.
Whitacre, James M; Bender, Axel
2010-06-15
A generic mechanism--networked buffering--is proposed for the generation of robust traits in complex systems. It requires two basic conditions to be satisfied: 1) agents are versatile enough to perform more than one single functional role within a system and 2) agents are degenerate, i.e. there exists partial overlap in the functional capabilities of agents. Given these prerequisites, degenerate systems can readily produce a distributed systemic response to local perturbations. Reciprocally, excess resources related to a single function can indirectly support multiple unrelated functions within a degenerate system. In models of genome:proteome mappings for which localized decision-making and modularity of genetic functions are assumed, we verify that such distributed compensatory effects cause enhanced robustness of system traits. The conditions needed for networked buffering to occur are neither demanding nor rare, supporting the conjecture that degeneracy may fundamentally underpin distributed robustness within several biotic and abiotic systems. For instance, networked buffering offers new insights into systems engineering and planning activities that occur under high uncertainty. It may also help explain recent developments in understanding the origins of resilience within complex ecosystems.
NASA Astrophysics Data System (ADS)
Watkinson, Catherine A.; Majumdar, Suman; Pritchard, Jonathan R.; Mondal, Rajesh
2017-12-01
In this paper, we establish the accuracy and robustness of a fast estimator for the bispectrum - the 'FFT-bispectrum estimator'. The implementation of the estimator presented here offers speed and simplicity benefits over a direct-measurement approach. We also generalize the derivation so it may easily be applied to polyspectra of any order, such as the trispectrum, with the cost of only a handful of Fast-Fourier Transforms (FFTs). All lower order statistics can also be calculated simultaneously for little extra cost. To test the estimator, we make use of a non-linear density field, and for a more strongly non-Gaussian test case, we use a toy-model of reionization in which ionized bubbles at a given redshift are all of equal size and are randomly distributed. Our tests find that the FFT-estimator remains accurate over a wide range of k, and so should be extremely useful for analysis of 21-cm observations. The speed of the FFT-bispectrum estimator makes it suitable for sampling applications, such as Bayesian inference. The algorithm we describe should prove valuable in the analysis of simulations and observations, and whilst we apply it within the field of cosmology, this estimator is useful in any field that deals with non-Gaussian data.
Seismic clusters analysis in Northeastern Italy by the nearest-neighbor approach
NASA Astrophysics Data System (ADS)
Peresan, Antonella; Gentili, Stefania
2018-01-01
The main features of earthquake clusters in Northeastern Italy are explored, with the aim to get new insights on local scale patterns of seismicity in the area. The study is based on a systematic analysis of robustly and uniformly detected seismic clusters, which are identified by a statistical method, based on nearest-neighbor distances of events in the space-time-energy domain. The method permits us to highlight and investigate the internal structure of earthquake sequences, and to differentiate the spatial properties of seismicity according to the different topological features of the clusters structure. To analyze seismicity of Northeastern Italy, we use information from local OGS bulletins, compiled at the National Institute of Oceanography and Experimental Geophysics since 1977. A preliminary reappraisal of the earthquake bulletins is carried out and the area of sufficient completeness is outlined. Various techniques are considered to estimate the scaling parameters that characterize earthquakes occurrence in the region, namely the b-value and the fractal dimension of epicenters distribution, required for the application of the nearest-neighbor technique. Specifically, average robust estimates of the parameters of the Unified Scaling Law for Earthquakes, USLE, are assessed for the whole outlined region and are used to compute the nearest-neighbor distances. Clusters identification by the nearest-neighbor method turn out quite reliable and robust with respect to the minimum magnitude cutoff of the input catalog; the identified clusters are well consistent with those obtained from manual aftershocks identification of selected sequences. We demonstrate that the earthquake clusters have distinct preferred geographic locations, and we identify two areas that differ substantially in the examined clustering properties. 
Specifically, burst-like sequences are associated with the north-western part and swarm-like sequences with the south-eastern part of the study region. The territorial heterogeneity of earthquake clustering is in good agreement with the spatial variability of scaling parameters identified by the USLE. In particular, the fractal dimension is higher to the west (about 1.2-1.4), suggesting a spatially more distributed seismicity, compared to the eastern part of the investigated territory, where the fractal dimension is very low (about 0.8-1.0).
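The nearest-neighbor distance in the space-time-energy domain can be illustrated with a minimal sketch. It assumes the commonly used form eta_ij = dt_ij * r_ij^d_f * 10^(-b * m_i), which the abstract does not spell out. The function name, the default values of b and d_f, and the small distance floor are hypothetical choices made for illustration only.

```python
import numpy as np

def nearest_neighbor_distances(t, x, y, m, b=1.0, d_f=1.2):
    """For each event j, find the earlier event i that minimizes
    eta_ij = dt_ij * r_ij**d_f * 10**(-b * m_i)  (assumed form).

    t: occurrence times, x/y: epicenter coordinates (km), m: magnitudes.
    b and d_f are placeholder values, not estimates from the study.
    Returns (parent index or -1, eta) per event.
    """
    t, x, y, m = (np.asarray(a, dtype=float) for a in (t, x, y, m))
    n = len(t)
    parent = np.full(n, -1)
    eta = np.full(n, np.inf)
    for j in range(n):
        dt = t[j] - t                      # positive only for earlier events
        earlier = dt > 0
        if not earlier.any():
            continue                       # first event has no parent
        r = np.maximum(np.hypot(x[j] - x, y[j] - y), 1e-3)  # floor avoids r = 0
        d = np.where(earlier, dt * r**d_f * 10.0 ** (-b * m), np.inf)
        parent[j] = int(np.argmin(d))
        eta[j] = d[parent[j]]
    return parent, eta
```

Events whose eta falls below a chosen threshold would be linked to their parent to form clusters; the threshold choice is outside this sketch.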
Robust power spectral estimation for EEG data
Melman, Tamar; Victor, Jonathan D.
2016-01-01
Background: Typical electroencephalogram (EEG) recordings often contain substantial artifact. These artifacts, often large and intermittent, can interfere with quantification of the EEG via its power spectrum. To reduce the impact of artifact, EEG records are typically cleaned by a preprocessing stage that removes individual segments or components of the recording. However, such preprocessing can introduce bias, discard available signal, and be labor-intensive. With this motivation, we present a method that uses robust statistics to reduce dependence on preprocessing by minimizing the effect of large intermittent outliers on the spectral estimates. New method: Using the multitaper method (Thomson, 1982) as a starting point, we replaced the final step of the standard power spectrum calculation with a quantile-based estimator, and the Jackknife approach to confidence intervals with a Bayesian approach. The method is implemented in provided MATLAB modules, which extend the widely used Chronux toolbox. Results: Using both simulated and human data, we show that in the presence of large intermittent outliers, the robust method produces improved estimates of the power spectrum, and that the Bayesian confidence intervals yield close-to-veridical coverage factors. Comparison to existing method: The robust method, as compared to the standard method, is less affected by artifact: inclusion of outliers produces fewer changes in the shape of the power spectrum as well as in the coverage factor. Conclusion: In the presence of large intermittent outliers, the robust method can reduce dependence on data preprocessing as compared to standard methods of spectral estimation. PMID:27102041
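The idea of replacing the final averaging step with a quantile-based estimator can be sketched as follows. This is not the authors' MATLAB/Chronux implementation: it swaps the multitaper method for simple Hann-windowed segment periodograms, and the function names are hypothetical. The division by ln 2 assumes that periodogram ordinates of Gaussian noise are approximately exponentially distributed, so that their median equals ln 2 times their mean.

```python
import numpy as np

def segment_periodograms(x, fs, seg_len):
    """Hann-windowed periodogram of each non-overlapping segment (two-sided density)."""
    x = np.asarray(x, dtype=float)
    n_seg = len(x) // seg_len
    segs = x[: n_seg * seg_len].reshape(n_seg, seg_len)
    win = np.hanning(seg_len)
    scale = fs * np.sum(win**2)
    spec = np.abs(np.fft.rfft(segs * win, axis=1)) ** 2 / scale
    freqs = np.fft.rfftfreq(seg_len, d=1.0 / fs)
    return freqs, spec

def mean_spectrum(spec):
    """Standard estimate: average across segments (sensitive to artifacts)."""
    return spec.mean(axis=0)

def median_spectrum(spec):
    """Quantile-based estimate: median across segments, rescaled by ln 2
    (assumes approximately exponential ordinates, see lead-in)."""
    return np.median(spec, axis=0) / np.log(2.0)
```

With a single high-amplitude artifact segment, the mean estimate is inflated at all frequencies while the median estimate stays close to the artifact-free level.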
Robust power spectral estimation for EEG data.
Melman, Tamar; Victor, Jonathan D
2016-08-01
Typical electroencephalogram (EEG) recordings often contain substantial artifact. These artifacts, often large and intermittent, can interfere with quantification of the EEG via its power spectrum. To reduce the impact of artifact, EEG records are typically cleaned by a preprocessing stage that removes individual segments or components of the recording. However, such preprocessing can introduce bias, discard available signal, and be labor-intensive. With this motivation, we present a method that uses robust statistics to reduce dependence on preprocessing by minimizing the effect of large intermittent outliers on the spectral estimates. Using the multitaper method (Thomson, 1982) as a starting point, we replaced the final step of the standard power spectrum calculation with a quantile-based estimator, and the Jackknife approach to confidence intervals with a Bayesian approach. The method is implemented in provided MATLAB modules, which extend the widely used Chronux toolbox. Using both simulated and human data, we show that in the presence of large intermittent outliers, the robust method produces improved estimates of the power spectrum, and that the Bayesian confidence intervals yield close-to-veridical coverage factors. The robust method, as compared to the standard method, is less affected by artifact: inclusion of outliers produces fewer changes in the shape of the power spectrum as well as in the coverage factor. In the presence of large intermittent outliers, the robust method can reduce dependence on data preprocessing as compared to standard methods of spectral estimation. Copyright © 2016 Elsevier B.V. All rights reserved.
Zhu, Bangyan; Li, Jiancheng; Chu, Zhengwei; Tang, Wei; Wang, Bin; Li, Dawei
2016-01-01
Spatial and temporal variations in the vertical stratification of the troposphere introduce significant propagation delays in interferometric synthetic aperture radar (InSAR) observations. Observations of small amplitude surface deformations and regional subsidence rates are plagued by tropospheric delays, and strongly correlated with topographic height variations. Phase-based tropospheric correction techniques assuming a linear relationship between interferometric phase and topography have been exploited and developed, with mixed success. Producing robust estimates of tropospheric phase delay, however, plays a critical role in increasing the accuracy of InSAR measurements. Meanwhile, few phase-based correction methods account for the spatially variable tropospheric delay over larger study regions. Here, we present a robust and multi-weighted approach to estimate the correlation between phase and topography that is relatively insensitive to confounding processes such as regional subsidence over larger regions as well as under varying tropospheric conditions. An expanded form of robust least squares is introduced to estimate the spatially variable correlation between phase and topography by splitting the interferograms into multiple blocks. Within each block, correlation is robustly estimated from the band-filtered phase and topography. Phase-elevation ratios are multiply weighted and extrapolated to each persistent scatterer (PS) pixel. We applied the proposed method to Envisat ASAR images over the Southern California area, USA, and found that our method mitigated the atmospheric noise better than the conventional phase-based method. The corrected ground surface deformation agreed better with those measured from GPS. PMID:27420066
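The per-block robust estimate of the phase-elevation ratio can be illustrated with a small sketch. The abstract does not specify the "expanded form of robust least squares", so the example below uses ordinary iteratively reweighted least squares with Huber weights as a stand-in. The function name, the tuning constant k, and the iteration count are assumptions.

```python
import numpy as np

def huber_line_fit(h, phi, k=1.345, n_iter=50):
    """Fit phi ~ slope * h + intercept within one block using
    Huber-weighted IRLS (a stand-in for the paper's estimator).
    Returns (slope, intercept)."""
    h = np.asarray(h, dtype=float)
    phi = np.asarray(phi, dtype=float)
    A = np.column_stack([h, np.ones_like(h)])
    coef = np.linalg.lstsq(A, phi, rcond=None)[0]        # ordinary LS start
    for _ in range(n_iter):
        res = phi - A @ coef
        # Robust scale from the median absolute deviation of the residuals
        scale = 1.4826 * np.median(np.abs(res - np.median(res))) + 1e-12
        u = np.abs(res) / (k * scale)
        w = 1.0 / np.maximum(u, 1.0)                     # Huber weights in (0, 1]
        sw = np.sqrt(w)
        coef = np.linalg.lstsq(A * sw[:, None], phi * sw, rcond=None)[0]
    return float(coef[0]), float(coef[1])
```

Pixels affected by a confounding signal such as local subsidence receive small weights, so the fitted slope is driven mainly by the topography-correlated part of the phase.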
Zhu, Bangyan; Li, Jiancheng; Chu, Zhengwei; Tang, Wei; Wang, Bin; Li, Dawei
2016-07-12
Spatial and temporal variations in the vertical stratification of the troposphere introduce significant propagation delays in interferometric synthetic aperture radar (InSAR) observations. Observations of small amplitude surface deformations and regional subsidence rates are plagued by tropospheric delays, and strongly correlated with topographic height variations. Phase-based tropospheric correction techniques assuming a linear relationship between interferometric phase and topography have been exploited and developed, with mixed success. Producing robust estimates of tropospheric phase delay, however, plays a critical role in increasing the accuracy of InSAR measurements. Meanwhile, few phase-based correction methods account for the spatially variable tropospheric delay over larger study regions. Here, we present a robust and multi-weighted approach to estimate the correlation between phase and topography that is relatively insensitive to confounding processes such as regional subsidence over larger regions as well as under varying tropospheric conditions. An expanded form of robust least squares is introduced to estimate the spatially variable correlation between phase and topography by splitting the interferograms into multiple blocks. Within each block, correlation is robustly estimated from the band-filtered phase and topography. Phase-elevation ratios are multiply weighted and extrapolated to each persistent scatterer (PS) pixel. We applied the proposed method to Envisat ASAR images over the Southern California area, USA, and found that our method mitigated the atmospheric noise better than the conventional phase-based method. The corrected ground surface deformation agreed better with those measured from GPS.
On the robustness of EC-PC spike detection method for online neural recording.
Zhou, Yin; Wu, Tong; Rastegarnia, Amir; Guan, Cuntai; Keefer, Edward; Yang, Zhi
2014-09-30
Online spike detection is an important step in compressing neural data and performing real-time neural information decoding. An unsupervised, automatic, yet robust signal processing method is strongly desired so that it can support a wide range of applications. We have developed a novel spike detection algorithm called "exponential component-polynomial component" (EC-PC) spike detection. First, we evaluate the robustness of the EC-PC spike detector under different firing rates and SNRs. Second, we show that the detection precision can be quantitatively derived without requiring additional user input parameters. We have implemented the algorithm (including training) in a 0.13 μm CMOS chip, where an unsupervised, nonparametric operation has been demonstrated. Both simulated data and real data are used to evaluate the method under different firing rates (FRs) and SNRs. The results show that the EC-PC spike detector is the most robust in comparison with some popular detectors. Moreover, the EC-PC detector can track changes in the background noise due to its ability to re-estimate the neural data distribution. Both real and synthesized data have been used for testing the proposed algorithm in comparison with other methods, including the absolute thresholding detector (AT), median absolute deviation detector (MAD), nonlinear energy operator detector (NEO), and continuous wavelet detector (CWD). Comparative testing results reveal that the EC-PC detection algorithm performs better than the other algorithms regardless of recording conditions. The EC-PC spike detector can be considered an unsupervised and robust online spike detection method. It is also suitable for hardware implementation. Copyright © 2014 Elsevier B.V. All rights reserved.
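For context, one of the baseline detectors named above, the median absolute deviation (MAD) detector, can be sketched in a few lines. This is not the EC-PC algorithm. The sketch assumes the widely used noise estimate sigma = median(|x|) / 0.6745 and a threshold of k times sigma; the function names, the default k, and the refractory window are illustrative assumptions.

```python
import numpy as np

def mad_threshold(signal, k=4.0):
    """Threshold k * sigma, with sigma estimated robustly as
    median(|signal|) / 0.6745 (assumes roughly Gaussian background noise)."""
    sigma = np.median(np.abs(signal)) / 0.6745
    return k * sigma

def detect_spikes(signal, thr, refractory=30):
    """Return indices where |signal| exceeds thr, keeping only the first
    crossing within each refractory window (in samples)."""
    above = np.flatnonzero(np.abs(signal) > thr)
    spikes = []
    last = -refractory - 1
    for i in above:
        if i - last > refractory:
            spikes.append(int(i))
            last = i
    return spikes
```

Because the median is barely affected by a small fraction of large-amplitude samples, the threshold stays close to the noise level even at high firing rates, which is the property that robust detectors of this kind rely on.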
NASA Astrophysics Data System (ADS)
Olafsdottir, Kristin B.; Mudelsee, Manfred
2013-04-01
Estimating Pearson's correlation coefficient between two time series to evaluate the influence of one time-dependent variable on another is one of the most often used statistical methods in climate sciences. Various methods are used to estimate a confidence interval to support the correlation point estimate. Many of them make strong mathematical assumptions regarding distributional shape and serial correlation, which are rarely met. More robust statistical methods are needed to increase the accuracy of the confidence intervals. Bootstrap confidence intervals are estimated in the Fortran 90 program PearsonT (Mudelsee, 2003), where the main intention was to get an accurate confidence interval for the correlation coefficient between two time series by taking the serial dependence of the process that generated the data into account. However, Monte Carlo experiments show that the coverage accuracy for smaller data sizes can be improved. Here we adapt the PearsonT program into a new version called PearsonT3, by calibrating the confidence interval to increase the coverage accuracy. Calibration is a bootstrap resampling technique, which basically performs a second bootstrap loop or resamples from the bootstrap resamples. It offers, like the non-calibrated bootstrap confidence intervals, robustness against the data distribution. Pairwise moving block bootstrap is used to preserve the serial correlation of both time series. The calibration is applied to standard error based bootstrap Student's t confidence intervals. The performances of the calibrated confidence intervals are examined with Monte Carlo simulations, and compared with the performances of confidence intervals without calibration, that is, PearsonT. The coverage accuracy is evidently better for the calibrated confidence intervals where the coverage error is acceptably small (i.e., within a few percentage points) already for data sizes as small as 20.
One form of climate time series is output from numerical models which simulate the climate system. The method is applied to model data from the high-resolution ocean model INALT01, where the relationship between the Agulhas Leakage and the North Brazil Current is evaluated. Preliminary results show significant correlation between the two variables when there is a 10-year lag between them, which is more or less the time it takes the Agulhas Leakage water to reach the North Brazil Current. Mudelsee, M., 2003. Estimating Pearson's correlation coefficient with bootstrap confidence interval from serially dependent time series. Mathematical Geology 35, 651-665.
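The pairwise moving block bootstrap mentioned above can be sketched as follows. This is not PearsonT or PearsonT3: the example returns the bootstrap standard error and a plain percentile interval instead of the calibrated Student's t interval described in the abstract, and the function names and the block length are assumptions chosen for illustration.

```python
import numpy as np

def pearson(x, y):
    """Pearson's correlation coefficient of two equal-length series."""
    return float(np.corrcoef(x, y)[0, 1])

def block_bootstrap_ci(x, y, block_len, n_boot=2000, alpha=0.05, seed=0):
    """Pairwise moving block bootstrap for Pearson's r.

    Blocks of consecutive (x, y) pairs are resampled together, which keeps
    the serial dependence within blocks and the pairing between series.
    Returns (r_hat, bootstrap standard error, (lower, upper) percentile CI).
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    n_blocks = -(-n // block_len)              # ceiling division
    offsets = np.arange(block_len)
    r_boot = np.empty(n_boot)
    for b in range(n_boot):
        starts = rng.integers(0, n - block_len + 1, size=n_blocks)
        idx = (starts[:, None] + offsets).ravel()[:n]   # trim to length n
        r_boot[b] = pearson(x[idx], y[idx])
    lo, hi = np.percentile(r_boot, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return pearson(x, y), float(r_boot.std(ddof=1)), (float(lo), float(hi))
```

A calibrated interval would wrap this procedure in a second bootstrap loop to adjust the nominal level; that step is omitted here.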
Wavelet Filtering to Reduce Conservatism in Aeroservoelastic Robust Stability Margins
NASA Technical Reports Server (NTRS)
Brenner, Marty; Lind, Rick
1998-01-01
Wavelet analysis for filtering and system identification was used to improve the estimation of aeroservoelastic stability margins. The conservatism of the robust stability margins was reduced with parametric and nonparametric time-frequency analysis of flight data in the model validation process. Nonparametric wavelet processing of data was used to reduce the effects of external disturbances and unmodeled dynamics. Parametric estimates of modal stability were also extracted using the wavelet transform. Computation of robust stability margins for stability boundary prediction depends on uncertainty descriptions derived from the data for model validation. F-18 High Alpha Research Vehicle aeroservoelastic flight test data demonstrated improved robust stability prediction by extension of the stability boundary beyond the flight regime.
A Computational Framework for Analyzing Stochasticity in Gene Expression
Sherman, Marc S.; Cohen, Barak A.
2014-01-01
Stochastic fluctuations in gene expression give rise to distributions of protein levels across cell populations. Despite a mounting number of theoretical models explaining stochasticity in protein expression, we lack a robust, efficient, assumption-free approach for inferring the molecular mechanisms that underlie the shape of protein distributions. Here we propose a method for inferring sets of biochemical rate constants that govern chromatin modification, transcription, translation, and RNA and protein degradation from stochasticity in protein expression. We asked whether the rates of these underlying processes can be estimated accurately from protein expression distributions, in the absence of any limiting assumptions. To do this, we (1) derived analytical solutions for the first four moments of the protein distribution, (2) found that these four moments completely capture the shape of protein distributions, and (3) developed an efficient algorithm for inferring gene expression rate constants from the moments of protein distributions. Using this algorithm we find that most protein distributions are consistent with a large number of different biochemical rate constant sets. Despite this degeneracy, the solution space of rate constants almost always informs on underlying mechanism. For example, we distinguish between regimes where transcriptional bursting occurs from regimes reflecting constitutive transcript production. Our method agrees with the current standard approach, and in the restrictive regime where the standard method operates, also identifies rate constants not previously obtainable. Even without making any assumptions we obtain estimates of individual biochemical rate constants, or meaningful ratios of rate constants, in 91% of tested cases. In some cases our method identified all of the underlying rate constants. 
The framework developed here will be a powerful tool for deducing the contributions of particular molecular mechanisms to specific patterns of gene expression. PMID:24811315
NASA Astrophysics Data System (ADS)
Meher, J. K.; Das, L.
2017-12-01
The Western Himalayan Region (WHR) was subject to a significant negative trend in the annual and monsoon rainfall during 1902-2005. Annual and seasonal rainfall change over the WHR of India was estimated using rainfall data from 22 rain gauge stations of the India Meteorological Department. The performance of 13 global climate models (GCMs) from the Coupled Model Intercomparison Project phase 3 (CMIP3) and 42 GCMs from CMIP5 was evaluated through multiple analyses: the evaluation of the mean annual cycle, annual cycles of interannual variability, spatial patterns, trends and signal-to-noise ratio. In general, CMIP5 GCMs were more skillful in terms of simulating the annual cycle of interannual variability compared to CMIP3 GCMs. The CMIP3 GCMs failed to reproduce the observed trend, whereas 50% of the CMIP5 GCMs reproduced the statistical distribution of short-term (30-year) trend estimates better than that of the longer-term (99-year) estimates. GCMs from both CMIP3 and CMIP5 were able to simulate the spatial distribution of observed rainfall in pre-monsoon and winter months. Based on performance, each model of CMIP3 and CMIP5 was given an overall rank, which puts the high-resolution version of the MIROC3.2 model (MIROC3.2 hires) and MIROC5 at the top in CMIP3 and CMIP5, respectively. Robustness of the ranking was judged through a sensitivity analysis, which indicated that ranks were independent of the addition or removal of any individual method. It also revealed that trend analysis was not a robust method of judging model performance as compared to other methods.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chien, C; Elgorriaga, I; McConaghy, C
2001-07-03
Emerging CMOS and MEMS technologies enable the implementation of a large number of wireless distributed microsensors that can be easily and rapidly deployed to form highly redundant, self-configuring, and ad hoc sensor networks. To facilitate ease of deployment, these sensors should operate on battery for extended periods of time. A particular challenge in maintaining extended battery lifetime lies in achieving communications with low power. This paper presents a direct-sequence spread-spectrum modem architecture that provides robust communications for wireless sensor networks while dissipating very low power. The modem architecture has been verified in an FPGA implementation that dissipates only 33 mW for both transmission and reception. The implementation can be easily mapped to an ASIC technology, with an estimated power performance of less than 1 mW.
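The principle behind direct-sequence spread spectrum can be shown with a textbook baseband sketch. It does not reproduce the modem architecture of the report: the spreading code, its length, and the function names are hypothetical, and synchronization, carrier modulation and filtering are ignored.

```python
import numpy as np

def spread(bits, code):
    """Map bits {0, 1} to symbols {-1, +1} and multiply each symbol
    by the full +/-1 spreading code (one code period per bit)."""
    symbols = 2 * np.asarray(bits) - 1
    return np.repeat(symbols, len(code)) * np.tile(code, len(symbols))

def despread(chips, code):
    """Correlate each code-length chunk of received chips with the code
    and decide the bit from the sign of the correlation."""
    corr = np.asarray(chips).reshape(-1, len(code)) @ code
    return (corr > 0).astype(int)
```

Correlating over the whole code period adds the signal coherently while noise adds incoherently, which is the source of the robustness to noise and interference that makes the technique attractive for low-power links.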
NASA Technical Reports Server (NTRS)
Schenker, Paul S. (Editor)
1992-01-01
Various papers on control paradigms and data structures in sensor fusion are presented. The general topics addressed include: decision models and computational methods, sensor modeling and data representation, active sensing strategies, geometric planning and visualization, task-driven sensing, motion analysis, models motivated by biology and psychology, decentralized detection and distributed decision, data fusion architectures, robust estimation of shapes and features, application and implementation. Some of the individual subjects considered are: the Firefly experiment on neural networks for distributed sensor data fusion, manifold traversing as a model for learning control of autonomous robots, choice of coordinate systems for multiple sensor fusion, continuous motion using task-directed stereo vision, interactive and cooperative sensing and control for advanced teleoperation, knowledge-based imaging for terrain analysis, physical and digital simulations for IVA robotics.
Distribution-dependent robust linear optimization with applications to inventory control
Kang, Seong-Cheol; Brisimi, Theodora S.
2014-01-01
This paper tackles linear programming problems with data uncertainty and applies it to an important inventory control problem. Each element of the constraint matrix is subject to uncertainty and is modeled as a random variable with a bounded support. The classical robust optimization approach to this problem yields a solution with guaranteed feasibility. As this approach tends to be too conservative when applications can tolerate a small chance of infeasibility, one would be interested in obtaining a less conservative solution with a certain probabilistic guarantee of feasibility. A robust formulation in the literature produces such a solution, but it does not use any distributional information on the uncertain data. In this work, we show that the use of distributional information leads to an equally robust solution (i.e., under the same probabilistic guarantee of feasibility) but with a better objective value. In particular, by exploiting distributional information, we establish stronger upper bounds on the constraint violation probability of a solution. These bounds enable us to “inject” less conservatism into the formulation, which in turn yields a more cost-effective solution (by 50% or more in some numerical instances). To illustrate the effectiveness of our methodology, we consider a discrete-time stochastic inventory control problem with certain quality of service constraints. Numerical tests demonstrate that the use of distributional information in the robust optimization of the inventory control problem results in 36%–54% cost savings, compared to the case where such information is not used. PMID:26347579
Digital Mapping of Soil Organic Carbon Contents and Stocks in Denmark
Adhikari, Kabindra; Hartemink, Alfred E.; Minasny, Budiman; Bou Kheir, Rania; Greve, Mette B.; Greve, Mogens H.
2014-01-01
Estimation of carbon contents and stocks is important for carbon sequestration, greenhouse gas emissions and national carbon balance inventories. For Denmark, we modeled the vertical distribution of soil organic carbon (SOC) and bulk density, and mapped its spatial distribution at five standard soil depth intervals (0−5, 5−15, 15−30, 30−60 and 60−100 cm) using 18 environmental variables as predictors. SOC distribution was influenced by precipitation, land use, soil type, wetland, elevation, wetness index, and multi-resolution index of valley bottom flatness. The highest average SOC content of 20 g kg−1 was reported for 0−5 cm soil, whereas there was on average 2.2 g SOC kg−1 at 60−100 cm depth. For SOC and bulk density, prediction precision decreased with soil depth, and a standard error of 2.8 g kg−1 was found at 60−100 cm soil depth. Average SOC stock for 0−30 cm was 72 t ha−1 and in the top 1 m there was 120 t SOC ha−1. In total, the soils stored approximately 570 Tg C within the top 1 m. The soils under agriculture had the highest amount of carbon (444 Tg) followed by forest and semi-natural vegetation that contributed 11% of the total SOC stock. More than 60% of the total SOC stock was present in Podzols and Luvisols. Compared to previous estimates, our approach is more reliable as we adopted a robust quantification technique and mapped the spatial distribution of SOC stock and prediction uncertainty. The estimation was validated using common statistical indices and the data and high-resolution maps could be used for future soil carbon assessment and inventories. PMID:25137066
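How a SOC content and a bulk density combine into a stock per depth interval can be shown with a short unit-conversion sketch. It assumes the standard relation stock (t/ha) = SOC (g/kg) × bulk density (g/cm^3) × thickness (cm) × 0.1 and ignores coarse fragments; the abstract does not state the exact formula used. The function names are hypothetical, and any bulk density values passed in are placeholders, not results from the study.

```python
def soc_stock_t_per_ha(soc_g_per_kg, bulk_density_g_per_cm3, thickness_cm):
    """SOC stock of one soil layer in t C per hectare.

    Unit check: g/kg * g/cm^3 * cm gives mg C per cm^2 of ground,
    and 1 mg/cm^2 equals 0.1 t/ha.
    """
    return soc_g_per_kg * bulk_density_g_per_cm3 * thickness_cm * 0.1

def profile_stock_t_per_ha(layers):
    """Sum the stock over (soc_g_per_kg, bulk_density, thickness_cm) layers."""
    return sum(soc_stock_t_per_ha(*layer) for layer in layers)
```

Summing the layer stocks over the five standard depth intervals gives the stock for the top 1 m at one location; the mapped totals in the abstract additionally integrate over area.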
Scaling range sizes to threats for robust predictions of risks to biodiversity.
Keith, David A; Akçakaya, H Resit; Murray, Nicholas J
2018-04-01
Assessments of risk to biodiversity often rely on spatial distributions of species and ecosystems. Range-size metrics used extensively in these assessments, such as area of occupancy (AOO), are sensitive to measurement scale, prompting proposals to measure them at finer scales or at different scales based on the shape of the distribution or ecological characteristics of the biota. Despite its dominant role in red-list assessments for decades, appropriate spatial scales of AOO for predicting risks of species' extinction or ecosystem collapse remain untested and contentious. There are no quantitative evaluations of the scale-sensitivity of AOO as a predictor of risks, the relationship between optimal AOO scale and threat scale, or the effect of grid uncertainty. We used stochastic simulation models to explore risks to ecosystems and species with clustered, dispersed, and linear distribution patterns subject to regimes of threat events with different frequency and spatial extent. Area of occupancy was an accurate predictor of risk (0.81<|r|<0.98) and performed optimally when measured with grid cells 0.1-1.0 times the largest plausible area threatened by an event. Contrary to previous assertions, estimates of AOO at these relatively coarse scales were better predictors of risk than finer-scale estimates of AOO (e.g., when measurement cells are <1% of the area of the largest threat). The optimal scale depended on the spatial scales of threats more than the shape or size of biotic distributions. Although we found appreciable potential for grid-measurement errors, current IUCN guidelines for estimating AOO neutralize geometric uncertainty and incorporate effective scaling procedures for assessing risks posed by landscape-scale threats to species and ecosystems. © 2017 The Authors. Conservation Biology published by Wiley Periodicals, Inc. on behalf of Society for Conservation Biology.
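The scale sensitivity of area of occupancy (AOO) can be made concrete with a minimal sketch that counts occupied grid cells at a chosen cell size. The function name is hypothetical, coordinates are assumed to be in kilometres, and the grid origin is fixed at (0, 0), so the grid-placement uncertainty discussed in the abstract is not addressed.

```python
import math

def area_of_occupancy(x_km, y_km, cell_km):
    """AOO as (number of occupied square cells) * (cell area), in km^2.

    Each occurrence is assigned to the grid cell containing it;
    the grid origin at (0, 0) is an assumption of this sketch.
    """
    cells = {
        (math.floor(x / cell_km), math.floor(y / cell_km))
        for x, y in zip(x_km, y_km)
    }
    return len(cells) * cell_km**2
```

Evaluating the same occurrences with several cell sizes shows how the estimate changes with measurement scale, which is why the choice of cell size relative to threat size matters.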
Miniature Dual-Corona Ionizer for Bipolar Charging of Aerosol
Qi, Chaolong; Kulkarni, Pramod
2015-01-01
A corona-based bipolar charger has been developed for use in compact, field-portable mobility size spectrometers. The charger employs an aerosol flow cavity exposed to two corona ionizers producing ions of opposite polarity. Each corona ionizer houses two electrodes in parallel needle-mesh configuration and is operated at the same magnitude of corona current. Experimental measurement of the detailed charge distribution of near-monodisperse particles of different diameters in the submicrometer size range showed that the charger is capable of producing well-defined, consistent bipolar charge distributions for flow rates up to 1.5 L/min and aerosol concentrations up to 10^7 per cm^3. For particles with preexisting charge of +1, 0, and −1, the measured charge distributions agreed well with the theoretical distributions within the range of experimental and theoretical uncertainties. The transmission efficiency of the charger was measured to be 80% for 10 nm particles (at 0.3 L/min and 5 μA corona current) and increased with increasing diameter beyond this size. Measurement of uncharged fractions at various combinations of positive and negative corona currents showed the charger performance to be insensitive to fluctuations in corona current. Ion concentrations under positive and negative unipolar operation were estimated to be 8.2 × 10^7 and 3.37 × 10^8 cm^-3 for positive and negative ions; the n·t product value under positive corona operation was independently estimated to be 8.5 × 10^5 s/cm^3. The ion concentration estimates indicate the charger to be capable of “neutralizing” typical atmospheric and industrial aerosols in most measurement applications. The miniature size and the simple, robust operation make the charger suitable for portable mobility spectrometers. PMID:26512158